Flash drives: your mileage WILL vary

Summary: Flash is an alien technology for disk users. I've noted before that flash drives can have really terrible write performance, but until I ran into it myself I had no idea how bad flash write performance could be.

Last month I wrote (see Five things you never knew about flash drives):

Flash drives only look like disks. In fact, nothing works the way you’d think. Flash is really different from magnetic recording, and those differences have a big impact on flash drive performance. How well vendors manage flash oddities has a huge impact on performance and even drive lifespan.

Honestly, I had no idea how right I was.

How long would it take to load an OS on a thumb drive?

I help friends in my small town (pop. ~10k) with computer problems. I thought it'd be handy to have a thumb drive with my favorite utilities loaded, so I started loading some on a generic 2 GB USB thumb drive. Most loaded as quickly as I expected, until I got to a 16 MB utility.

I dragged it to the thumb drive icon on the desktop, and the progress bar popped up with a 16 minute time estimate. 16 minutes! That's barely a megabyte per minute on a flash drive capable of 15 MB a minute.

The progress bar was moving so slowly that I thought the machine had hung, but no, it was just going 1/15th the speed. Whoa!

I couldn't believe it. Thinking something was wrong, I tested it with a single MP3 file of the same size. No problem; it loaded in less than a minute.

What the heck was happening?

First surprise: thousands of sub-2 KB files

The utility is in the form of a file folder or package. The icon makes it look like a single piece of code, but it contains a couple of thousand files, many of them HTML help pages. These small files, and the write overhead they incur, must be the source of the slow loading.

When I first ran this test, it turned out the thumb drive was formatted with the old Microsoft FAT 16 file system, which in a 2 GB drive gives a cluster size of 32 KB. So not only was the load slow, the resulting file on the thumb drive was huge - about 4x bloat. After I published the first version of this post, a couple of alert readers pointed out the FAT 16 problem.

I reformatted the thumb drive with FAT 32, which uses 4 KB clusters on a 2 GB disk. The file bloat shrank from about 4x to about 15%, but the load time stayed the same, or even got longer.
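The slack-space arithmetic is easy to sketch. The numbers here are assumptions for illustration - roughly 2,000 files averaging 2 KB each, as described above; the utility's actual file sizes vary:

```python
# Rough slack-space model: how much a given cluster size inflates
# many small files. File count and average size are assumed, not
# measured from the actual utility.
import math

def on_disk_size(file_size, cluster_size):
    """Space actually consumed: size rounded up to whole clusters."""
    return math.ceil(file_size / cluster_size) * cluster_size

n_files, avg_size = 2000, 2 * 1024          # ~2,000 files of ~2 KB
for cluster in (32 * 1024, 4 * 1024):       # FAT16 vs FAT32 clusters
    used = n_files * on_disk_size(avg_size, cluster)
    print(f"{cluster // 1024:>2} KB clusters: "
          f"{used / (1024 * 1024):.1f} MB on disk for "
          f"{n_files * avg_size / (1024 * 1024):.1f} MB of data")
```

At 32 KB clusters a 2 KB file wastes 30 KB; at 4 KB clusters the waste drops to 2 KB per file, which is why the bloat shrank so dramatically after the reformat.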

What I think is happening

Being flash, every write has to be preceded by an erase cycle, which wipes an entire block. I can't tell how big the block size is, but it is likely to be at least 64 pages and probably more. It appears that every write of a 2 KB file requires a 128 KB read to preserve the block's existing data, a 128 KB erase, and then a 128 KB program that writes the 2 KB file along with the rest of the data already in the block.

Just eye-balling the numbers it looks like about 10 small writes per second - way worse than even a 1.8" drive would do.
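That read-erase-program cycle can be modeled with back-of-the-envelope numbers. Every constant here is a guess, not a spec from this drive:

```python
# Back-of-the-envelope model of erase-block overhead on small writes.
# Assumed (not from any spec sheet): 128 KB erase blocks, ~15 MB/s
# read/program bandwidth, ~2 ms erase latency per block.
BLOCK = 128 * 1024
BW = 15 * 1024 * 1024      # bytes/s sequential read or program
ERASE = 0.002              # seconds per block erase

def small_write_time(file_size):
    """Naive read-modify-erase-write of one whole block per file."""
    read = BLOCK / BW      # read block to preserve neighboring data
    prog = BLOCK / BW      # re-program block, including the new file
    return read + ERASE + prog

t = small_write_time(2 * 1024)
print(f"{1 / t:.0f} small writes/s, "
      f"{2048 / t / 1024:.1f} KB/s effective")
```

With these guesses the model still predicts more than the ~10 writes per second observed, which suggests additional per-file costs - FAT table updates and USB command latency - piled on top of the erase-block overhead.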

Update: The first two commenters quickly - and correctly - pointed out that I was seeing a FAT 16 problem, not a flash drive problem. I checked the file system on the thumb drive and sure enough, it was FAT 16. It must have come that way from the factory - something else to be aware of. I'm in the process of reformatting the drive with the FAT 32 file system. As soon as that completes - and it's taken about five minutes so far - I'll retest and update again.

Update II: I've reformatted the flash drive to FAT 32 and am loading the same utility. Load time is just as long, maybe even longer. It appears to me that while FAT 16 may explain the file bloat due to the 32 KB cluster size, it does not explain the slow load speed. In fact, loading the files under FAT 32 appears to be taking longer than it did under FAT 16.

Next, I plan to reformat the flash drive with NTFS and see how that works. Stay tuned for more updates.

Update III: Windows XP wouldn't allow me to reformat the thumb drive with NTFS. It looks like FAT 32 is the best I can do. Next I'll try OS X's HFS+ and see what difference, if any, that makes.

I also changed the original text to focus on the load performance rather than the FAT 16 bloat. I'm not a Windows aficionado, so I suspect lots of non-technical users would get caught by this as I was. I wonder how many thumb drives come with FAT 16?

This is just one data point

I can't generalize to all thumb drives or to flash-based solid state disk (SSD) replacements from this one experiment. Why? Because each flash drive has its own way of converting from how flash works to how disks work: the translation layer. As I noted in the earlier post:

The most important piece of a flash drive is the translation layer. This software takes the underlying weirdness of flash and makes it look like a disk. The translation layer is unique to each vendor and none of them are public. Each makes assumptions that can throttle or help performance under certain workloads.

What workloads? Sorry, you’ll have to figure that out for yourself. The bottom line is that flash drive write performance will be all over the map as engineers try to optimize for a wide range of workloads.

This is clearly a case where lots of small files choke this particular translation layer.

The good news is that after the utility finally loaded, using it was almost as fast as running it from a disk. Getting it ON the thumb drive was the problem, not getting it off.

The Storage Bits take

This experience made it clear to me that flash performance and capacity cannot be assumed from the vendor specs. Perhaps my thumb drive is poorly engineered, or optimized for capacity over everything else. In any case, this example made me realize just how different flash storage can be, and how little we actually know about the performance of specific implementations.

With flash the only thing you can be certain of is that your mileage *will* vary.

Comments welcome, of course. Anyone with similar experiences? Can anyone better explain the behavior I saw?

About

Robin Harris has been a computer buff for over 35 years and selling and marketing data storage for over 30 years in companies large and small.

Talkback

  • This isn't a Flash problem, it's a FAT problem

    your drive is most likely formatted with the FAT filesystem. FAT was discarded along with Windows 98 for just this problem, as well as a host of others. the cluster size in this situation is 32k, meaning no matter what size the file is, if it's under 32k it still takes 32k.

    this isn't a problem with flash, if you had a 100 gig hard drive and you formatted it with FAT instead of FAT32 or NTFS you'd find yourself with the exact same problem.

    solution would be to format the stick with either fat32 or ntfs, but i would be careful as some os's can't handle ntfs (win98) so if you're going to fix a win98 machine it wouldn't read the stick if it were formatted in ntfs.

    Valis
    CEO
    Valis Enterprises
    http://www.valissoft.com
    Valis Keogh
    • sorry, this is a flash problem

      It has nothing to do with the file system.

      What he is experiencing is a well-known issue with flash memory.

      I see the exact same thing on flash drives formatted with ntfs.

      You can format a flash drive as fat, then use convert at the command prompt to convert it to ntfs.
      sagec
      • No, sorry, it's a Windows problem...

        See my other post about "write caching".
        jinko
    • some more info:

      http://www.diskeeperblog.com/archives/2007/06/the_impact_of_f.html

      "Flash and SSD devices are good at reading data, but are not as good at writing data. The reason for the poor write performance is that these (NAND based) devices must erase the space used for new file writes, immediately prior to writing the new data. This is known as erase-on-write or erase/write. Improvements in this area are coming (phase-change memory)."
      sagec
  • Might Sound Obvious, But . . .

    . . . with those 32kb cluster sizes on a 2GB drive, you might want to double-check and make sure that the drive wasn't formatted to FAT16. Might not solve your performance issues, but it would explain the filesize bloat you're seeing.
    Whyaylooh
    • You are both right

      I'm updating the post and reformatting the drive to re-run my tests.

      Thanks!

      Robin
      R Harris
      • Unfortunately . . .

        . . . as I mentioned, the file system might not fix things as far as performance. If anything, FAT32 is a tad slower than FAT16, due to the increased overhead from the significantly larger file allocation table, among other things, thus the results you saw in your test when you re-ran the copy after formatting with FAT32. The main benefit of FAT32 is getting back the slack space that FAT16 eats, at about a 5-10% performance hit on average.
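The table-size overhead mentioned above can be ballparked with some quick arithmetic (illustrative only, for the 2 GB drive under discussion; FAT16 entries are 2 bytes, FAT32 entries 4 bytes):

```python
# Size of the file allocation table itself under each scheme,
# for a 2 GB drive. Entry widths: FAT16 = 2 bytes, FAT32 = 4 bytes.
DRIVE = 2 * 1024**3

for name, cluster, entry in (("FAT16", 32 * 1024, 2),
                             ("FAT32", 4 * 1024, 4)):
    clusters = DRIVE // cluster
    table = clusters * entry
    print(f"{name}: {clusters:,} clusters, "
          f"{table // 1024:,} KB allocation table")
```

The FAT32 table is roughly 2 MB against FAT16's 128 KB, which is the larger structure the comment above blames for the modest performance hit.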

        So, on the bright side, with that utility comprised of 2KB files, you're now wasting only 2KB per file instead of 30KB. ;)

        As far as the base argument of performance varying from flash drive to flash drive, you're absolutely correct. An interesting case in point can be found here, in an article testing a variety of USB flash drives for Vista ReadyBoost:

        http://www.extremetech.com/article2/0,1697,2017818,00.asp

        Of the nine USB flash drives tested, six passed the basic criteria to be able to run ReadyBoost. Not that I consider testing a peripheral against Vista to be the be-all, end-all of performance testing, but it does highlight the fact that there really doesn't seem to be much of a standard as to just how slow or fast USB flash memory is from one brand to the next, or even within brands.

        Anyway . . . I am curious to hear more real-world performance testing here, such as what you're doing with actually loading up a flash drive with a USB-boot toolkit (which does well to test things like format times, copy times, etc.), if you manage to pick up a few different brands of memory keys. I've put the same sort of thing together, but never really timed it out, other than noticing that it is at least faster than a CD-boot toolkit, most of the time. ;)
        Whyaylooh
  • Different companies

    have different speeds as well. I have a very old 256 meg cokebottle dongle (FAT32) that we use as a 'sneakernet' around the office. Its speed is incredible. On USB2 (high speed) I can copy a 2 meg PDF in about 1.92 seconds (a little less than 2 seconds). On my newer 1 gig dongle (Adata.com, FAT32) it took 4.76 seconds.

    Gizmo Richard (http://www.techsupportalert.com/) had a review of several dongles, and recommended a few for use with portable apps. Might want to check it out as well.

    - Kc
    kcredden2
  • antivirus or chipset?

    At work I have a Dell system loaded with CA antivirus software. The IT folks have the antivirus tuned to scan everything going onto and off of the system--including thumbdrives. And that really slows things down. Especially with zip files, because it unzips them and scans all the files inside. If I disable the antivirus then the file transfer goes much quicker. I had Norton Antivirus at home for a while and it was less intrusive than CA, but it still scanned the files and slowed things down, though not as badly as the CA at work. But with Norton, and now with AVG Free, I never had a problem with getting the files transferred eventually. With CA - or maybe it is due to the chipset in the Dell machine - if the file is too large, or a zip file is nested too deep, I eventually get a delayed write failure and sometimes I have to reformat my thumbdrive (I have a SanDisk 2GB). If I turn off CA, I don't have any trouble. I have seen the same issue with CA on other USB-attached memory devices, and the aforementioned thumbdrive has no trouble on other machines I plug it into, so I conclude it must be the CA antivirus. (BTW, it's not just my machine at work - other machines I have tried with the same CA antivirus have the same issues, and my coworkers have complained of similar problems.)
    I have also noticed with windows in general that the larger the number of small files, the longer it takes to transfer them.
    And thirdly, your chipset/chipset drivers may be a factor as well. I have a dell at work that reads my u3 capable sandisk thumbdrive and mp3 player okay, but the thumbdrive isn't readable by one of my older systems at home. My lexar 1GB is read just fine by this older system, but not by my dell at work. Go figure.
    BTW...CA sucks (or maybe it's just how the IT folks at my company configured it). When it is running, it takes half my 3 GHz processor and, as I mentioned before, sometimes corrupts my data when I put it on a USB drive.
    NuttyBuddy
  • You have to enable write caching....

    This isn't because of flash, it's because of the horribly inefficient file system writing the same sectors over and over again. The same thing will happen when you try to delete all those little files - it'll take hours.

    But...if you enable "write caching" on the drive it will complete in seconds. The only thing is you have to remember to "safely remove" the drive via the little green icon in your system tray. This ensures that all disk transactions are complete.

    Windows usually does its utmost to disable write caching on all removable drives so this "safe removal" thing is a bit of a joke (does anybody you know actually use it?)

    PS: If you disable write caching on your hard disk you'll think your machine is broken.
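The gap between cached and forced writes is easy to demonstrate. This is a rough sketch, assuming Python is available; it writes a batch of small files normally, then again with an fsync after each one (point the directory at a flash drive to test the device itself):

```python
# Tiny experiment: write many small files normally (the OS may cache
# them) vs. forcing each one to the device with fsync. The file count
# and 2 KB size are arbitrary choices for illustration.
import os, time, tempfile

def write_files(n, sync):
    with tempfile.TemporaryDirectory() as d:
        start = time.perf_counter()
        for i in range(n):
            path = os.path.join(d, f"f{i}.txt")
            with open(path, "wb") as f:
                f.write(b"x" * 2048)       # a 2 KB file
                if sync:
                    f.flush()
                    os.fsync(f.fileno())   # force it to the device
        return time.perf_counter() - start

print(f"cached: {write_files(200, sync=False):.2f}s, "
      f"synced: {write_files(200, sync=True):.2f}s")
```

The synced run approximates what Windows does with write caching disabled on removable drives; the cached run is what you get after enabling it - provided you remember to "safely remove" before pulling the drive.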
    jinko
    • Write caching: More....

      I just tried it on one of my drives and the "right-click->properties->{your_disk here}->policies" thing doesn't seem to work for enabling write caching. It says it does, but a quick timing test shows it doesn't.

      The only way I can find to turn on write-caching is via "Administrative tools->Computer management->Disk manager". This pops up exactly the same "policies" dialog but this one seems to work.

      eg. To delete 3000 files from the disk takes [b]45 minutes[/b] without caching but only a couple of seconds with it enabled.
      jinko
    • Final benchmarks

      I timed two things:

      a) Copy a 2900 files (600Mb) from my hard disk to a pen drive
      b) Delete the folder from the drive

      Without write caching it took about an hour to copy the files and 45 minutes to delete them.

      With write caching enabled it took 16 minutes for the copy and 25 seconds for the delete.
      jinko
    • Sounds about right

      Yeah, sounds about right. In both XP and Vista, write caching is usually turned off for flash drives - mostly because of the corruption that can happen if they are unplugged before the cache is written. That's probably why Microsoft defaults it to off.

      But yeah, enabling it can really give you a performance boost. As long as you remember to safely remove it.

      But - I know a lot of people who just yank out their flash drive without a second thought - and for them it's good to have caching turned off.

      I also know some people who will take care to "safely remove" their drive even with caching turned off, lol.

      "PS: If you disable write caching on your hard disk you'll think your machine is broken."

      You certainly will - harddrives are very slow devices! Don't tell me you actually tried it . . .
      CobraA1
      • Write caching

        I've been turning off write caching on all my drives since it first showed up in DOS. Makes the system much less brittle when you have a system crash. I've only once ever had to reinstall Windows, and that was because somebody else used my PC and got a particularly tenacious virus on the thing. Only just a few months ago did I replace my old Windows98SE system with one running XP Pro, and that was mostly for the hardware upgrade. I partly credit the lesser data loss brought by disabling write caching for letting it last so long. Of course I also disable a lot of other things I don't need with LitePC, so that probably lightens the load enough to make up for the caching performance hit.
        johnay
        • About corruption

          With or without caching, corruption due to crashes is a thing of the past for me - I use a journaling file system, and I have a UPS. I don't have to compromise data integrity for performance.
          CobraA1
          • Sorry dude.

            Journaling won't save you from losing data in a crash or power loss mid-write, or pre-commit in the case of a cached write. The most journaling will do is revalidate the file system faster afterward. It won't recover your unwritten or overwritten data.

            Maybe you're thinking of a transactional file system. Even that isn't perfect.

            You can't write data to the hard drive, or any drive for that matter, without taking the time to write the data to the drive. TANSTAAFL.
            johnay
          • You misunderstand journalling.

            From what I've read, journaling enforces atomicity of writes - either the write is fully finished, or it is rolled back to an older, but still usable, state.

            What happens is a journal of changes is kept (hence "journaling"), and they are applied to a copy of the file before the old file is overwritten.

            Here's how it ensures atomicity:

            -If the power is turned off while writing the journal, the journal is wiped and the old file is left intact. It's as if the write never took place. You're left with an old, but uncorrupted, copy of the file.

            -Once the journal is completely written, a flag is set to tell the computer that it's 100% written. Now you have an intact old file and an intact journal. The computer then starts creating a backup copy of the file.

            -If the power is turned off while creating a copy of the old file, then the copying process is restarted. The old file and journal are intact, so the process can be restarted to create the new file.

            -Once a copy is made of the old file, a flag is set to tell the computer that the old file is done copying. Now you have two copies of the old file, so you can start writing to one while keeping the other intact in case something happens.

            -If the power is turned off while writing the journaled changes to the file, then the copy is still intact. The unchanged file can then be copied and the process of writing the new file can be restarted.


            In all cases, only one of the files is ever half-written, so you can always use the other one if the power is disrupted.

            "You can't write data to the hard drive, or any drive for that matter, without taking the time to write the data to the drive."

            Correct, but you [b]can[/b] make a copy of it before changing it!! That way you have a backup to revert to if something happens! Which is essentially what journaling is in a nutshell: It's creating a backup of the old data before writing the new data.

            "It won't recover your unwritten"

            Unwritten data can never be recovered, since the computer's state is completely wiped in such an event. In this case, you simply revert back to the copy of the file you made before you started writing to it. An old but intact copy is better than a new but corrupted copy.

            "or overwritten data."

            If you have a backup copy of the file, this is false. You only write to one copy, leaving the other copy intact until the process is finished.
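The copy-then-swap scheme described in this thread is essentially how applications do crash-safe saves. A minimal sketch with a hypothetical helper - this is not how any particular journaling file system is implemented, since real journaling works at the metadata/block level rather than per application file:

```python
# Sketch of a crash-safe overwrite: write the new contents to a
# temporary file on the same volume, force it to disk, then
# atomically rename it over the original. A crash at any point
# leaves either the complete old file or the complete new file,
# never a half-written mix.
import os, tempfile

def atomic_write(path, data: bytes):
    d = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=d)      # temp file on same volume
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())           # new data is on the device
        os.replace(tmp, path)              # atomic swap of the names
    except BaseException:
        os.unlink(tmp)                     # clean up on failure
        raise
```

The rename is the commit point: until it happens, readers see the intact old file; after it, they see the intact new one.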
            CobraA1
          • Good explanation. Thanks!

            Robin
            R Harris
          • That's not journaling

            That's a transactional file system.

            Journaling only rolls back changes to the MFT, making it more robust and making running a disk check after every crash unnecessary. It will not, however, prevent data from being directly overwritten, and if a crash occurs mid-write during such an overwrite you will lose data.

            IIRC, Vista does introduce transactional file operations, but it's an optional thing that applications have to be programmed to use.
            johnay