Five things you never knew about flash drives

Summary: Flash drives only look like disks. In fact, nothing works the way you'd think.

TOPICS: Hardware

Flash drives only look like disks. In fact, nothing works the way you'd think. Flash is really different from magnetic recording, and those differences have a big impact on flash drive performance. How well vendors manage flash oddities has a huge impact on performance and even drive lifespan.

The five weirdest things about flash drives

I've started from the bottom up - the bits - to present flash weirdness logically, along with what it means to users of flash drives.

1) Flash drives can only write zeros. The only way to write a one is to erase first, which sets every bit to one. So every write means an erase followed by a write, which slows performance.
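
As a rough illustration of point 1, here is a minimal sketch that models one byte of flash as an integer. The names `program` and `erase` are made up for this example and are not any real controller's interface; the only assumption is that programming can clear bits but never set them.

```python
ERASED = 0xFF  # an erased flash byte reads as all ones


def program(cell: int, data: int) -> int:
    """Programming can only clear bits (1 -> 0), modeled here as a bitwise AND."""
    return cell & data


def erase() -> int:
    """Erasing resets every bit to one."""
    return ERASED


cell = erase()
cell = program(cell, 0b10100101)      # starts from all ones, so the data lands intact
stale = program(cell, 0b11110000)     # no erase first: old zeros survive, result is wrong
fresh = program(erase(), 0b11110000)  # erase first: result matches the data
```

Writing over old data without an erase leaves every previously cleared bit at zero, which is why the erase step cannot be skipped.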

2) To write a page you must first erase the entire block. NAND flash, the most common kind, is divided into blocks - typically 128 KB - and each block is divided into pages - typically 2 KB. To write a new page, the drive must first copy the entire 128 KB block - minus the pages due for rewriting - then erase the block and rewrite all of it.

This impacts performance even more. You may just need to write 2 KB, but the drive has to erase 128 KB and then write 128 KB.
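
The cost can be sketched in a few lines, assuming the typical 128 KB block and 2 KB page sizes quoted above. The function `rewrite_page` is hypothetical and models the naive copy, erase, and rewrite cycle, not what any particular drive does.

```python
BLOCK_SIZE = 128 * 1024  # bytes per erase block (the article's typical figure)
PAGE_SIZE = 2 * 1024     # bytes per page (the article's typical figure)
PAGES_PER_BLOCK = BLOCK_SIZE // PAGE_SIZE


def rewrite_page(block: bytearray, page_index: int, new_page: bytes) -> int:
    """Update one page of a block and return the number of bytes physically written."""
    if len(new_page) != PAGE_SIZE:
        raise ValueError("new_page must be exactly one page long")
    saved = bytes(block)                     # 1. copy the whole block out
    block[:] = b"\xff" * BLOCK_SIZE          # 2. erase: every bit back to one
    start = page_index * PAGE_SIZE
    merged = saved[:start] + new_page + saved[start + PAGE_SIZE:]
    block[:] = merged                        # 3. write the whole block back
    return len(merged)


block = bytearray(b"\xff" * BLOCK_SIZE)
written = rewrite_page(block, 5, b"\x00" * PAGE_SIZE)
amplification = written // PAGE_SIZE  # physical bytes written per logical byte
```

With these sizes a single 2 KB update turns into 64 times as much physical writing.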

This makes small random writes very slow - even slower than notebook disk drive writes. And since today's PC/Mac file systems perform lots of small random writes, you won't see all the performance flash drives promise after you boot up.

3) There are no random writes in a block. Each block write starts with page 0 and proceeds in order to the 64th and last page. This is great for the blazing sequential write speeds that vendors happily quote, but it means that small random write performance is pretty awful.
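
The ordering rule in point 3 can be pictured with a toy model. The `Block` class below is purely illustrative and assumes 64 pages per block, which follows from the 128 KB and 2 KB figures above.

```python
PAGES_PER_BLOCK = 64  # 128 KB block / 2 KB pages, per the article's figures


class Block:
    """Toy erase block that accepts page writes only in ascending order."""

    def __init__(self) -> None:
        self.next_page = 0  # index of the only page that may be written next

    def erase(self) -> None:
        """Erasing makes the block writable again, starting from page 0."""
        self.next_page = 0

    def write_page(self, index: int) -> None:
        if index != self.next_page:
            raise ValueError(f"page {index} is out of order; expected {self.next_page}")
        if index >= PAGES_PER_BLOCK:
            raise ValueError("block is full; erase it first")
        self.next_page += 1
```

A write to any page other than the next one in sequence is refused, so updating a page in the middle of a block means starting over with an erase.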

4) Block size is a tradeoff, not a given. As flash chip capacities grow, keeping block size constant means more blocks to manage. For example, if flash drives were divided into 512-byte blocks, a 64 GB flash drive's block table would require 128 million entries and about 384 MB of storage. With a 128 KB block, the table size is a more manageable 524,288 entries and less than 2 MB of storage.
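
The arithmetic is easy to check. This sketch assumes binary units (64 GB taken as 64 x 1024^3 bytes) and 3 bytes per table entry; the entry size is not stated in the article and is inferred here from the 384 MB figure.

```python
GIB = 1024 ** 3
ENTRY_BYTES = 3  # assumption: inferred from 384 MB spread over 128 Mi entries


def table_size(drive_bytes: int, block_bytes: int) -> tuple[int, int]:
    """Return (number of entries, table size in bytes) for a block mapping table."""
    entries = drive_bytes // block_bytes
    return entries, entries * ENTRY_BYTES


small_blocks = table_size(64 * GIB, 512)         # 512-byte blocks
large_blocks = table_size(64 * GIB, 128 * 1024)  # 128 KB blocks

small_mib = small_blocks[1] / 1024 ** 2
large_mib = large_blocks[1] / 1024 ** 2
```

Under those assumptions the 512-byte case comes to 134,217,728 entries and 384 MB, and the 128 KB case to 524,288 entries and 1.5 MB.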

This means that vendors have the opportunity to improve flash drive performance through smaller block sizes and better block management techniques. They'll cost more to implement, but you should get more too.

5) The most important piece of a flash drive is the translation layer. This software takes the underlying weirdness of flash and makes it look like a disk. The translation layer is unique to each vendor and none of them are public. Each makes assumptions that can throttle or help performance under certain workloads.

What workloads? Sorry, you'll have to figure that out for yourself. The bottom line is that flash drive write performance will be all over the map as engineers try to optimize for a wide range of workloads.
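
Since real translation layers are not public, the following is only a toy sketch of the general idea: keep a table from logical pages to physical pages and redirect each write to an already erased page instead of overwriting in place. Every name in it, including `ToyTranslationLayer`, is hypothetical, and it leaves out garbage collection and wear leveling entirely.

```python
class ToyTranslationLayer:
    """Maps logical page numbers to physical pages and never overwrites in place."""

    def __init__(self, physical_pages: int) -> None:
        self.mapping: dict[int, int] = {}        # logical page -> physical page
        self.free = list(range(physical_pages))  # erased pages, in write order
        self.stale: set[int] = set()             # old copies awaiting a block erase
        self.flash: dict[int, bytes] = {}        # physical page -> contents

    def write(self, logical: int, data: bytes) -> None:
        if not self.free:
            raise RuntimeError("no erased pages left; garbage collection needed")
        physical = self.free.pop(0)              # take the next erased page
        self.flash[physical] = data
        old = self.mapping.get(logical)
        if old is not None:
            self.stale.add(old)                  # the previous copy is now garbage
        self.mapping[logical] = physical

    def read(self, logical: int) -> bytes:
        return self.flash[self.mapping[logical]]


ftl = ToyTranslationLayer(physical_pages=4)
ftl.write(7, b"first version")
ftl.write(7, b"second version")  # lands on a new physical page
```

The host keeps addressing logical page 7 while the data moves underneath it. How and when the stale pages get reclaimed is one place where a vendor's assumptions could help or hurt a given workload.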

The Storage Bits take

Flash drives' fast access times are a compelling advantage over magnetic disks. Flash prices are dropping faster than disk prices, so the cost differential is shrinking, making flash more attractive each day.

But just because it looks like a disk doesn't mean it acts like a disk. It will be years before we have a good handle on the details of flash drive performance.

Of course, if filesystems stopped issuing lots of small random writes these performance issues would go away. Apple's new ZFS does this, but NTFS doesn't and it isn't clear if it can be modified to reduce the problem.

Update: I added some text to discuss the impact on users. And I tweaked the title, which is why it shows up two different ways.

Update II: I've written a lot more about flash at StorageMojo, my personal blog. If you want to get your storage geek on, check it out.

Comments welcome, of course.

  • More Q's than A's

    Wow, lots to learn about this tech. Does this mean that as the size of the drive grows, you lose more and more space to deal with the block issue or is it the same percentage of space req'd to deal with the blocks?

    I guess this technology wouldn't be good for backing up? Would it be better to have a flash drive with the OS and apps, with a traditional disk for data storage and access??
    • Good question and no

      The capacity the drive promises is actually there and usable and not consumed by
      overhead. The controller chip maintains the logical block to physical block address
      table and does so on dedicated capacity.

      See my article "A random walk down flash street" on StorageMojo for more on how
      flash is organized, including a nifty diagram.

      R Harris
  • Interesting article...

    There are a lot of us who have simply assumed that the flash drive would eventually make the traditional hard drive obsolete. It's clear from this article that there are major obstacles to be overcome before that will become reality. The resiliency of flash drives makes us hope that these obstacles are addressed sooner rather than later. Dropping a hard drive is more often than not a death sentence. I'd think it would also be much easier keeping a flash drive cool since you don't have to worry quite so much about the enclosure. And basic mechanics teaches us that the fewer moving parts you have, the fewer points of failure you typically have. I can see a day where rotating platters storing information on "pixie dust" are relics of a bygone era. Hopefully hardware vendors are dedicated to leading us down that path.
  • I don't see these things as weird...

    This is very typical behavior for programmable chips, even back to the old EEPROM days.
    • Interesting observation

      You are right. When I first started researching flash, the EEPROM aspects threw me,
      because I am a storage guy, and I don't think of writing to a flash drive as
      "programming".

      It is an interesting artifact of moving from one tech culture to another.

      R Harris
    • EEPROM is an oxymoron

      electrically erasable programmable read-only memory. If you can erase it or program it then it is not read-only.
      Beat a Dead Horse
  • Next subject of review?

    Sounds like a potent area for consumer research.
  • There are more questions than answers

    I think the guys at the hardware test sites need to put their thinking caps on.
    Figuring out how to characterize flash drive write performance is not going to be
    easy: the drive's translation layer reacts differently to different write patterns and
    different file systems presumably have different write patterns as well.

    That's why I like ZFS for flash drives: it doesn't overwrite data. It always writes to
    free space and then runs garbage collection to mark as free space old versions of
    the data.

    The basic message is: don't believe the hype. Flash drives are cool technology with
    some substantial advantages over hard drives, such as power, reliability, access
    time and weight. Yet these things come at an economic cost and a performance
    cost if you have lots of random writes. As our desktop computers get faster and
    we run more programs that do more in the background, the random write
    problem could affect a lot of people. Not what you expect when you pay 20x for a
    drive's worth of capacity.

    R Harris
  • Apple's new ZFS?

    I thought Sun created ZFS.
    • "New" as in new to Mac OS & Apple

      I've written extensively on ZFS and met the ZFS engineering team - a
      very bright group of guys.

      A key strategy Apple has used in the last ten years is to combine external
      technology with their own innovations to create great products. ZFS fits right in
      with that strategy.

      Apple has also incorporated Sun's DTrace, developed by the brilliant Bryan Cantrill.
      Sun has been both wise and generous in open sourcing these products.

      R Harris
      • But it's still not clear when Apple will start using it

        It appears Leopard won't.
      • What?

        "Apple's new ZFS" is possessive and attributes ZFS to belonging to Apple. If that's not what you meant, then please don't write it that way.

        For example, if Intel were to use HyperTransport, would it be proper to write "Intel's new HyperTransport"? Absolutely not.
        Uber Dweeb
    • ZFS & Sun

      Yes, but Sun open-sourced it and Apple has been doing a massive rewrite/improvement including the ability to boot on ZFS, something Sun had not been able to do to date. So, in essence, it *would* be "Apple's ZFS."
  • Like old magnetic core memory

    Reminds me of the old core memory cycles where the read and write cycles always took place together. It seems to me that on the first half of the cycle the data was read out into a register and then modified (if "writing") and put back in on the second half.
  • not to mention wear leveling

    The necessity of wear leveling is the sixth weirdest thing about flash, and how it's implemented is part of that middle layer performance.
  • so, what's the best cluster size?

    I just purchased a 2GB flash drive. It came with FAT16 and 32K clusters. I first assumed they used FAT16 for maximum compatibility. I considered reformatting it as FAT32 and 4K clusters. Now it occurs to me perhaps they chose FAT16 for better performance? Assuming a 128K block size, with FAT16 a block (2GB) contains 4 clusters. Each cluster may or may not be full. Not to mention the smaller FAT table for FAT16.

    Somebody writes the firmware then somebody else wrecks it.
  • Many of these are reasons why ReadyBoost doesn't really perform that well

    It's been my experience that ReadyBoost really does very little for me - and these points shed some light on why I'm not seeing any performance gains with it.

    Quite simply put, Windows is trying to use ReadyBoost to speed up random writes - but as this article points out, Flash drives are really optimized for sequential, not random, writes.

    Quite frankly, it's a LOT better to get more memory than to use ReadyBoost. And quite frankly, any boost I had in performance with ReadyBoost was not visible to me - my machine seemed to run the same with or without it.

    So - while ReadyBoost may have a big "coolness" factor, it's not much more than that. If you really want better performance, get more memory. Vista will use all of the memory you give it to help your performance.
    • ReadyBoost

      Hmmm... I have no experience, so I can't say how effective ReadyBoost is. However, according to this article, they did take random writes into consideration.
      • Did they?

        The article seems to focus on ReadyBoost as a read cache. Since Windows scatters
        DLLs and other bits all over a drive, placing them in a flash cache should help.

        The article also asserts that flash sequential reads are slower than hard drives. This is
        not always true for flash-based disks, as Samsung has shown.

        R Harris