Making MLC safe for the enterprise

Summary: Can 3bpc flash - rated for only about 1,000 write cycles - ever be safe for serious enterprise apps? Of course it can. In fact, the enterprise already uses lower-endurance media today. It's how you use it that matters.

As NAND flash has moved from cameras and cell phones to SSDs and storage arrays, the focus has been on SLC - single-level cell - flash. SLC endures far more write cycles - from 100k to 1,000k - than MLC (multi-level cell) flash. The newest-generation 3bpc (3 bits-per-cell) MLC is rated for only about 1k write cycles. Should this stop designers from using it for mission-critical apps?

The cost factor

The only reason for using MLC is that it is a lot cheaper than SLC. You get 2-3x the capacity on the same chip.

But the actual price difference is closer to 5x. Why? Because MLC is the high-volume product: consumer devices rarely come anywhere near its endurance limit. It takes a lot of photos to fill an 8 GB SD card even 100 times, let alone 1,000.
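To put rough numbers on that, here is a back-of-the-envelope sketch. The 3 MB photo size is an assumption chosen for illustration, not a figure from any vendor, and the function name is made up for this example.

```python
# Back-of-the-envelope: photos needed to fill a card a given number of times.
# ASSUMPTION: 3 MB per photo is an illustrative figure only.

def photos_to_fill(card_gb, photo_mb, fills):
    """Whole photos needed to fill a card of `card_gb` gigabytes `fills` times."""
    photos_per_fill = (card_gb * 1000) // photo_mb  # decimal GB -> MB
    return int(photos_per_fill * fills)

if __name__ == "__main__":
    for fills in (100, 1000):
        print(fills, "fills:", photos_to_fill(8, 3, fills), "photos")
```

Under that assumption, filling an 8 GB card 100 times takes roughly 266,000 photos, and 1,000 times takes about ten times that.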

MLC chips also have a higher failure rate - perhaps as high as 50x that of SLC. That is still only about 0.5%, but for vendors using millions of chips it is a real problem: 0.5% of a million chips is 5,000 failures.

Overcoming the problems

None of these factors are fatal to MLC in mission-critical apps. Enterprises routinely trust their data to LTO tapes that only support 200 head passes - 1/5th of what even 3bpc MLC offers. The key is matching the media to the application.
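One way to make "matching the media to the application" concrete is to estimate how long a device would last under a given write load. The sketch below is only a rough model: it assumes ideal wear-leveling, treats write amplification (extra internal writes per host write) as a guess supplied by the caller, and uses hypothetical workloads.

```python
# Rough endurance estimate: years until a device's rated write cycles are used up.
# ASSUMPTIONS: perfectly even wear across all cells, a constant daily write
# load, and a caller-supplied write-amplification factor. All inputs in the
# demo below are hypothetical.

def years_to_wear_out(capacity_gb, rated_cycles, host_writes_gb_per_day,
                      write_amplification=1.0):
    """Estimate device lifetime in years under a steady write workload."""
    if host_writes_gb_per_day <= 0 or write_amplification <= 0:
        raise ValueError("write load and write amplification must be positive")
    total_writable_gb = capacity_gb * rated_cycles
    flash_writes_gb_per_day = host_writes_gb_per_day * write_amplification
    return total_writable_gb / flash_writes_gb_per_day / 365

if __name__ == "__main__":
    # Same 128 GB, 1,000-cycle device under a light and a heavy workload.
    for daily_gb in (20, 100):
        years = years_to_wear_out(128, 1000, daily_gb, write_amplification=2.0)
        print(f"{daily_gb} GB/day: about {years:.1f} years")
```

The same 1,000-cycle part that lasts many years under a light write load wears out far sooner under a heavy one, which is why the workload matters more than the raw cycle count.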

MLC vendors have used several techniques to overcome the endurance problem:

  • Higher capacity. The larger the box, the longer it takes to fill. While filling a 128 MB flash is easy, filling a 128 GB SD card 1,000 times takes much longer.
  • Over-provisioning. Do what disk drives do: assume bad blocks happen and have extra capacity to replace them when they fail.
  • Wear-leveling. Just because you're hitting the same file doesn't mean you have to hit the same blocks. Wear-leveling spreads the joy and ensures that blocks wear at about the same pace.
  • Improved garbage collection. Since flash has to be erased in whole blocks, moving still-valid data from old blocks to new ones is a major source of wear. Replace some flash capacity with non-volatile DRAM and you remove much of that wear.
  • Enhanced ECC. Like disk vendors before them, flash vendors have strengthened ECC, going from 4-bit to 15-bit correction as geometries shrink and capacities increase. They can go much further, just as disk drives have.
  • Improved signal processing. Signal processing determines what is signal and what is noise. There are multiple techniques for measuring flash cell performance to improve data integrity. The net: MLC that acts more like SLC.
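The wear-leveling idea in the list above can be illustrated with a toy model. This is a sketch under heavy simplifications, not how any real controller works: it ignores garbage collection, over-provisioning and static data, and the function name and block counts are invented for the example.

```python
# Toy model of wear-leveling: each rewrite of one "hot" logical block is
# redirected to the least-worn physical block instead of reusing the same one.
# ASSUMPTION: no garbage collection, over-provisioning, or static data;
# real flash controllers are far more involved.

def simulate(num_blocks, writes, wear_level):
    """Return per-block erase counts after `writes` rewrites of one hot block."""
    erase_counts = [0] * num_blocks
    current = 0  # physical block currently holding the hot data
    for _ in range(writes):
        if wear_level:
            # Choose the physical block with the fewest erases so far.
            current = min(range(num_blocks), key=erase_counts.__getitem__)
        erase_counts[current] += 1  # one erase/program cycle on that block
    return erase_counts

if __name__ == "__main__":
    print("worst block, no leveling:", max(simulate(8, 8000, wear_level=False)))
    print("worst block, leveling:   ", max(simulate(8, 8000, wear_level=True)))
```

With eight blocks and 8,000 rewrites, the unleveled case puts all 8,000 cycles on a single block, far past a 1,000-cycle rating, while the leveled case puts 1,000 cycles on each block.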

The Storage Bits take

It will take time for engineers to scope out all the MLC issues and to develop the software to implement optimized algorithms. But have no doubt, the problems will be solved because the economics are too compelling to ignore.

Once these techniques are embedded in silicon, they'll spread to consumer devices too. That means much cheaper SSDs for us consumers. Cool.

Comments welcome, of course. A presentation by Anobit at the Storage Networking Industry Association's Storage Developer Conference spurred my thinking on this subject.

Talkback

3 comments
  • Correct me if I'm wrong, but ...

    ... isn't SLC also MUCH faster?

    I know it's more expensive, but I try to buy SLC only ...

    Ludo
    Ludovit
  • RE: Making MLC safe for the enterprise

    SLC is faster, but no single flash chip has anywhere near the bandwidth of a disk. Which means that multiple flash die have to be written in parallel to achieve performance.

    With MLC you have to write to more die in parallel for a given level of performance. But since you also want a large capacity AND over-provisioning, this is not much of a negative.

    Robin
    Robin Harris
  • RE: Making MLC safe for the enterprise

    Hi Robin,

    Anyone who looks at MLC and thinks that it will never be "enterprise ready" needs to consider that the same was said about 5.25 inch disks. 3.5 inch disks. 2.5 inch disks. Windows. Need I go on?

    Disruptive technology is a beautiful thing, and it always wins.

    I am looking for the day when Apple announces that they will no longer ship spinning disks in any consumer platform. 2011? 2012?

    MLC in the enterprise will be accepted as a foregone conclusion by the end of 2011.

    Warmest regards,

    Dave
    Dave Nicholson