Your capacity will vary

By an increasing amount Why is your storage capacity always less, sometimes a lot less, than what you see advertised on the box? There is only one rule: you will never get the capacity the vendor advertises.

Storage vendors don't mean to be lying. They just have a world view that you and your OS don't happen to share. In their minds their numbers are justifiable.

The disk problem The major cause of disk drive capacity shrinkage is the difference between how disk drives measure capacity and how your computer measures capacity.

Memory, such as RAM, is measured in powers of two. A gigabyte of RAM is really 1,073,741,824 bytes.

Disk capacity is measured in powers of 10. Thus any gigabyte of disk capacity is one billion bytes.

Your computer measures hard disk capacity in a power of two. Thus 1 million bytes of disk becomes 977 kilobytes and you just lost 2.3% of your apparent capacity.
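The arithmetic is easy to check. Here is a minimal Python sketch, assuming the OS counts 1 KB as 1,024 bytes while the box counts 1 KB as 1,000 bytes; the function names are invented for illustration and do not come from any OS or vendor tool.

```python
def binary_kilobytes(num_bytes: int) -> float:
    """Size as shown by an OS that counts 1 KB = 1,024 bytes (assumed convention)."""
    return num_bytes / 1024


def apparent_shortfall_percent(num_bytes: int) -> float:
    """Percent by which the binary-unit figure is smaller than the decimal-unit figure."""
    decimal_kb = num_bytes / 1000
    return (1 - binary_kilobytes(num_bytes) / decimal_kb) * 100


advertised = 1_000_000  # the "1 million bytes" example from the text
print(f"{binary_kilobytes(advertised):.0f} KB")          # 977 KB
print(f"{apparent_shortfall_percent(advertised):.1f}%")  # 2.3%
```

The number of bytes never changes in this calculation. Only the size of the unit does, which is why the loss is an apparent one.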

As disk drives get bigger the problem gets worse. Here's a table comparing binary powers to decimal powers:

[Table: binary vs. decimal compared]
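The same comparison can be recomputed in a few lines of Python. This is a sketch of the idea, not a reproduction of the original table; the prefix list and the function name `shortfall_percent` are chosen here for illustration.

```python
PREFIXES = ["kilo", "mega", "giga", "tera", "peta"]


def shortfall_percent(power: int) -> float:
    """Percent by which 1000**power falls short of 1024**power."""
    return (1 - 1000**power / 1024**power) * 100


# One row per prefix: the decimal power, the binary power, and the gap between them.
for power, name in enumerate(PREFIXES, start=1):
    print(f"{name:<5} 10^{3 * power:<3} vs 2^{10 * power:<3} shortfall {shortfall_percent(power):5.2f}%")
```

Each step up in prefix multiplies in another factor of 1000/1024, so the gap compounds: a little over 2% at the kilo level and roughly 9% at the tera level.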

Officially, disk vendors have the standards bodies on their side: an MB is officially defined as 1,000,000 bytes. What the memory vendors should use are the binary prefixes kibi, mebi, gibi and the like. The bi stands for binary. Who knows, someday it might catch on.

But even most computer publications stick to the old, unofficial definitions that we all use. The disk drive vendors should switch from decimal to binary prefixes because that is how operating systems measure drive capacity.

And as the table above shows the problem is only getting worse as disk capacities grow.

The array problem Disk arrays have a different problem: raw capacity vs protected capacity. Raw capacity is simply the sum - in decimal - of the capacity of the disk drives in the array. A 4 drive array with 1 TB drives has a 4 TB raw capacity.

But unless you use RAID 0 striping, which doesn't protect your data - lose 1 drive and all your data goes away - your usable capacity will be less. Far less.

With a 4 drive RAID array - like one I recently tried to test - RAID 5 will give you 3 drives' worth of capacity, saving 1 drive for parity data. BTW, I wouldn't use such a configuration with 1 TB SATA drives: you have a 25% chance of losing data during a rebuild.
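One way to sanity-check that rebuild figure is sketched below. It rests on two assumptions that are not stated in the paragraph above: the commonly specified SATA unrecoverable read error (URE) rate of 1 in 10^14 bits, and bit errors that are independent of each other. The constant and function names are invented for illustration.

```python
import math

# Assumed: typical SATA spec of 1 unrecoverable read error per 10^14 bits read.
URE_RATE_PER_BIT = 1e-14


def ure_probability(bytes_read: float, rate_per_bit: float = URE_RATE_PER_BIT) -> float:
    """Chance of at least one URE while reading bytes_read, assuming independent bit errors."""
    bits = bytes_read * 8
    # 1 - (1 - p)^n, computed with log1p/expm1 to stay accurate for a tiny p.
    return -math.expm1(bits * math.log1p(-rate_per_bit))


# A RAID 5 rebuild must read every surviving drive in full: 3 drives x 1 TB (decimal).
rebuild_read = 3 * 10**12
print(f"{ure_probability(rebuild_read):.0%}")
```

The simple linear estimate, 2.4 x 10^13 bits times 10^-14, gives 24%. The exact form above comes out a little over 21%. Under these assumptions, either number is in the neighborhood of the figure quoted above.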

Much more reliable is a mirrored configuration. With a 4 drive array mirroring would give you 2 TB of protected capacity - only 50% of your raw capacity. But your data is much safer mirrored.
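The raw vs protected arithmetic for the three layouts discussed here can be written out as a short Python sketch. The function name and the level labels are illustrative only, and real arrays lose a bit more to metadata and formatting.

```python
def usable_tb(drives: int, drive_tb: float, level: str) -> float:
    """Usable capacity in decimal TB for a few simple layouts (illustrative only)."""
    if level == "raid0":   # striping, no protection
        return drives * drive_tb
    if level == "raid5":   # one drive's worth of capacity goes to parity
        return (drives - 1) * drive_tb
    if level == "mirror":  # every block is stored twice
        return drives * drive_tb / 2
    raise ValueError(f"unsupported level: {level}")


# The 4 drive, 1 TB example from the text.
for level in ("raid0", "raid5", "mirror"):
    print(level, usable_tb(4, 1.0, level), "TB")
```

For the 4 x 1 TB array this gives 4 TB striped, 3 TB with RAID 5 and 2 TB mirrored.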

Array capacity arguments are common among enterprise array vendors - and if you were paying $5/GB raw you might be more interested in the usable capacity too. With hundreds of TB in a single array, even small percentage differences start looking big.

The Storage Bits take There's only 1 good strategy for dealing with storage capacity: have more storage than you need. Most enterprises run with 2-3x the capacity they need - mostly for performance reasons - but the extra comes in handy for end-of-quarter capacity spikes or slower than expected capital approval cycles.

Home users should keep 10-20% of their disk unfilled. Windows and Mac OS X are virtual memory operating systems, which means they use disk space to substitute for DRAM when main memory fills up. Without enough spare capacity the virtual memory system can't do its job efficiently and your system slows down.

The good news: disk capacity is cheap and rapidly getting cheaper. 25 years ago disk cost $25,000 per gigabyte. Today it is less than $0.25 per gig. Fill 'er up!

Comments welcome, of course.

Talkback

28 comments
  • If your SMART is working, you should not lose data.

    "BTW, I wouldn't use such a configuration with 1 TB SATA drives: you have a 25% chance of losing data during a rebuild."

    If your drive has SMART, it also has ECC, and will not lose any data because it catches and fixes errors. This idea that we're losing any more data on large drives is largely a myth; even "small" drives are actually constantly finding and fixing errors. The important point is that they're found and fixed and thus the data is never actually corrupted when it leaves the drive.
    CobraA1
    • Not so SMART

      Check out Robin's storagemojo.com post on disk drive failure:

      How smart is SMART?
      Not very, as Google found, and many in the industry already knew. SMART (Self-Monitoring, Analysis, and Reporting Technology) captures drive error data to predict failure far enough in advance so you can back up. Yet SMART focuses on mechanical failures, while a good deal of a disk drive is electronic, so SMART misses many sudden drive failure modes, like power component failure. The Google team found that 36% of the failed drives did not exhibit a single SMART-monitored failure. They concluded that SMART data is almost useless for predicting the failure of a single drive.

      http://storagemojo.com/2007/02/19/googles-disk-failure-experience/

      Worth a look before relying on a single drive to protect your data.
      thewelshboy
      • Totally different discussion.

        Umm, I'm talking about failure of a few bits, not a total drive failure. Yes, I am aware that total failure is quite likely.
        CobraA1
        • Totally confused discussion

          Hardly a different discussion given the subject line of your comment.

          The reality is that even with SMART disks you can lose data.

          Full stop, point, end of.
          thewelshboy
    • Incorrect

      SATA drives specify an unrecoverable read error rate of 1
      in 10^14 - or one every 12.something TB. The URE comes
      AFTER the drive does its level best to read and reread the
      data.

      So reading 3 TB of SATA drive you have an almost 25%
      chance of encountering an URE. Most consumer-grade
      storage arrays won't inform you of the failure, figuring
      you'll never notice or will blame Windows if you do.

      This is why all the serious array vendors now support -
      and pretty much insist on - RAID6.

      SMART doesn't have anything to do with URE. It is a
      monitoring and reporting protocol.

      Robin
      R Harris
      • re: incorrect

        "SATA drives specify an unrecoverable read error rate of 1 in 10^14 - or one every 12.something TB."

        Sounds to me like a theoretical rate, not an actual URE rate from a real drive.

        "Most consumer-grade storage arrays won't inform you of the failure, figuring you'll never notice or will blame Windows if you do."

        AFAIK, this is false. The drive will generally retry after a failed read, and attempt to recover the data (it has spare sectors for this). If it turns out the data is truly unrecoverable, then as far as I know it does indeed report it to the OS. The OS itself will then attempt a higher level recovery. When all is said and done, the actual chance of a single URE failing all of the retries and recovery attempts at higher levels is pretty small, and it is unlikely that the integrity of the data has been compromised. It *might* be that modern OSes do not report it to the user, but I am certain that drives report it to the OS.

        "SMART doesn't have anything to do with URE. It is a
        monitoring and reporting protocol."

        A protocol that specifies monitoring of ECC rates, among other things. You would be wise to test this for yourself.
        CobraA1
        • Yes, the drive will report it . . .

          But will whatever is controlling the RAID report it? Be it a
          host bus adapter or a single chip onboard RAID controller,
          the designers of the software/firmware have the choice not
          to pass on the drive's report. Many low-end consumer
          products choose not to. After all, what is the chance the
          consumer has a backup anyway?

          You might spend some time boning up on URE. "The actual
          chance of a single URE" is specified by vendors at 1 in
          10^14 for SATA - this is after the drive has exhausted
          every data reading trick. The data I've seen suggests that
          vendors do a *little* better than that, but that is the spec.

          Sure, SMART monitors read errors. Most of those are fixed
          on a retry. But the vendor's URE spec is exactly that: the
          chance of an UNRECOVERABLE read error. Some failure
          modes can be predicted by watching recoverable read
          errors and that is what SMART does. URE's are normal,
          expected behavior in a healthy drive.

          Robin
          R Harris
  • hours of ten?

    Should it be "powers of ten"?
    russguill
    • A "wordo"

      I dictated the first draft of the article, and while the software
      is pretty good about not misspelling words, if it gets the
      wrong word it can be hard to spot. Like "desk" for "disk".

      Thanks for noting it.

      Robin
      R Harris
  • and I have a bridge I would like to sell you

    I am confused as to how anyone would make this claim that a SMART drive will prevent data loss.

    If the drive dies or the heads crash, chances are the data is gone unless you are willing to pay $200 an hour to some recovery firm.

    Yes, the SMART drive is smarter, but guarantees no data loss? I don't think so.
    stso9daa
  • Arrgh!!

    "Your computer measures hard disk capacity in a power of two.
    Thus 1 million bytes of disk becomes 977 kilobytes and you
    just lost 2.3% of your apparent capacity."

    I am getting so sick of this TOTAL ASININE IGNORANCE on the
    part of tech bloggers who should know better. Isn't even a
    basic understanding of science required to blog on technical
    matters in this industry?

    You didn't lose ANYTHING. The capacity is IDENTICAL. 1 million
    bytes EQUALS 977 kilobytes. It's the EXACT SAME VALUE being
    expressed in a different UNIT.

    Do we have to smash your head against the wall of
    mathematics until you finally get it?

    Only a total MORON thinks 10^6 equals 2^20.

    Here's how someone who actually understands it would write
    your paragraph.

    So, since computer makers use a base 10 system to measure
    hard drive capacity, and your computer uses a base 2 system,
    when your hard drive box says 1 million bytes, your computer
    will report 977 kilobytes, an apparent difference of 2.3%, but in
    actuality, the capacity is identical, the two groups are simply
    using different units, like weighing a block of wood in ounces
    troy or ounces avoirdupois and reporting a different number.

    And just to forestall the inevitable ignorant replies. Here's the
    stats on my 250 GB hard drive as reported by disk utility:

    Total Capacity: 232.9 GB (250,059,350,016 Bytes)

    Well, shazam, what do you know? My 250 GB hard drive has
    slightly over 250 billion bytes. Just like the box says. I haven't
    lost any capacity at all. BECAUSE the 232.9 is 232.9 x 2^30
    which EQUALS 250 x 10^9.
    frgough
    • Note the word "apparent" in the sentence

      You are right - you still get the capacity you bought. But it is
      reported by the OS as LESS than what you bought.

      And this confuses many consumers.

      In a consumer-driven industry this is a problem, one we
      don't need.

      Robin
      R Harris
      • Sorry, not buying it

        It is not reported by the OS as less than what you bought. It is
        reported in different units. Period. The confusion occurs
        because people don't realize it's in different units and
        irresponsible tech bloggers like yourself continue to use
        phrases like shrinkage and apparent instead of labeling the
        problem correctly, thusly:

        "But don't worry, the shrinkage is an illusion. The hard drive
        capacity is exactly the same, simply reported in different
        units."

        It's a clear, simple sentence that any consumer can understand.
        And you didn't use it.
        frgough
        • Irresponsible commenting...

          I do not think Robin Harris has become a renowned and highly respected blogger on storage tech by peddling drivel.

          I think to label him irresponsible when he is rightly pointing out an area for consumer confusion is a tad, well, irresponsible.

          If you ever get to blog Mr frgough I hope you get a similar response.

          Think on ....
          thewelshboy
          • Robin Harris is a renowned and highly respected blogger?

            By whom? The guy seems clueless about technology. This is yet another example.
            ye
          • Pot Kettle Black

            That's rich coming from you. Read up on his background mr no fud.
            gtdavies339
          • I don't care what his background is. He has clearly demonstrated...

            ...he is clueless. This is just *another* example.

            And thanks for recognizing that I provide accurate, factual information. It's rare that someone does so in these forums.
            ye
          • 'ye' of little faith..

            Ye, can you please point me to your website stating your credentials to comment in such a vociferous fashion? It's all too easy to make such comments in a cowardly hidden manner.

            Robin's credentials are out there for all to see.
            thewelshboy
          • You took the words right outta my mouth....

            ...or off of my keyboard. Same goes for frgough. Where are YOUR blogs, ye and frgough? If you both are so damned knowledgeable, why are you hanging around here with us unwashed?

            It's a simple case of "Those who can, do. Those who can't, biotch about those who can." And part of the reason they can't is most likely due to the fact that, what they imagine themselves to possess in intelligence, is certainly offset by their apparent lack of tact and finesse.
            MGP2