Replacing hard disk drives with NAND flash-based solid state drives (SSDs) dramatically improved I/O performance, even over top-of-the-line 15,000 RPM enterprise drives. Access times dropped from 6-12 milliseconds to less than 1 ms. For most of us, the move to an SSD felt like getting a whole new computer.
But sub-millisecond access times weren't anywhere near what flash drives could do. The problem was the old hardware and software, optimized over 50 years for HDDs instead of SSDs.
Enter the Dragon
That's where Non-Volatile Memory Express (NVMe) comes in. It is the I/O software stack that is optimized for today's SSDs and future non-volatile storage technologies. NVMe is the new industry standard software interface for PCIe SSDs.
NVMe eliminates the legacy cruft of HDDs. Since hard drives are so much slower than every other layer of the storage stack - except tape - the software stack makes every effort NOT to go to disk, with layers of caching and other techniques to minimize reads and writes.
Worse, the software stack was so much faster than the disk that little attention was given to optimizing the I/O stack for performance. With SSDs though, the I/O stack went from less than 20 percent of the access time to over 50 percent.
With the advent of PCIe SSDs - I think Fusion-io, now part of WD, was the first - the much faster PCIe hardware drove the software's contribution to I/O latency up to 80 percent. Clearly, a new I/O stack was needed.
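A quick back-of-the-envelope sketch shows why the software share balloons as devices get faster. It assumes a fixed per-I/O software overhead and three device latencies; all of the numbers are hypothetical, picked only to show the trend, and are not measurements.

```python
def software_share(software_us: float, device_us: float) -> float:
    """Fraction of total access time spent in the software stack."""
    return software_us / (software_us + device_us)


# Hypothetical latencies in microseconds, chosen only to illustrate the trend.
SOFTWARE_US = 100.0
DEVICES_US = {
    "HDD": 8000.0,
    "SATA SSD": 90.0,
    "PCIe SSD": 25.0,
}

for name, device_us in DEVICES_US.items():
    share = software_share(SOFTWARE_US, device_us)
    print(f"{name:9s} software share of access time: {share:.0%}")
```

The software didn't get any slower. The device under it got so fast that the same overhead went from a rounding error to most of the wait.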
NVMe is a very different beast than current I/O stacks. Key features:
- Supports up to 64k commands per queue and up to 64k queues
- Message Signaled Interrupts (MSI-X), an in-band signalling protocol that supports multi-processor systems
- High performance protocol with only 13 required commands
- Optional features for data center or client systems
- Designed for scalability and NVM independence
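To make the queueing model concrete, here is a toy sketch of paired submission and completion queues, with one pair per CPU core. It is not how a driver is written: real NVMe queues are ring buffers in host memory that the device is told about through doorbell registers. The QueuePair class, the make_queues helper, and the one-pair-per-core layout are illustrative assumptions.

```python
from collections import deque
from dataclasses import dataclass, field

MAX_QUEUE_DEPTH = 65536  # "64k commands per queue" from the list above
MAX_QUEUES = 65536       # "64k queues", treated here as a simple cap


@dataclass
class QueuePair:
    """Toy model of one submission/completion queue pair."""
    depth: int = MAX_QUEUE_DEPTH
    submission: deque = field(default_factory=deque)
    completion: deque = field(default_factory=deque)

    def submit(self, command: str) -> bool:
        """Queue a command; return False if the submission queue is full."""
        if len(self.submission) >= self.depth:
            return False
        self.submission.append(command)
        return True

    def process(self) -> int:
        """Stand in for the device: move every submission to a completion."""
        done = 0
        while self.submission:
            self.completion.append((self.submission.popleft(), "ok"))
            done += 1
        return done


def make_queues(cpu_count: int) -> list:
    """Create one queue pair per core so cores don't contend on a shared queue."""
    if cpu_count > MAX_QUEUES:
        raise ValueError("more queues requested than the interface allows")
    return [QueuePair() for _ in range(cpu_count)]


# Usage: four cores, two of them issue a command each.
queues = make_queues(cpu_count=4)
queues[0].submit("read LBA 0")
queues[3].submit("write LBA 8")
completed = sum(q.process() for q in queues)
print(f"{completed} commands completed across {len(queues)} queue pairs")
```

The point of so many deep queues is that every core can keep its own stream of commands in flight without taking a lock on a single shared queue.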
The goal is to enable next-gen technologies to deliver a 4KB I/O in less than 10μs - about one thousandth of the latency of a 7200 RPM SATA drive.
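The "one thousandth" figure is easy to sanity-check. The sketch below estimates a 7200 RPM drive's access time as average seek plus average rotational latency. The 8.5 ms seek time is an assumed, typical desktop-drive value, not a number from any particular spec sheet.

```python
RPM = 7200
AVG_SEEK_MS = 8.5       # assumption: typical desktop-drive average seek
TARGET_NVME_US = 10.0   # the 4KB I/O goal quoted above

# On average the platter has to spin half a revolution to reach the data.
avg_rotational_ms = 0.5 * 60_000 / RPM
hdd_access_ms = AVG_SEEK_MS + avg_rotational_ms

# Convert the HDD figure to microseconds before comparing.
ratio = (hdd_access_ms * 1000) / TARGET_NVME_US

print(f"average rotational latency: {avg_rotational_ms:.2f} ms")
print(f"estimated HDD access time:  {hdd_access_ms:.2f} ms")
print(f"HDD time / NVMe target:     {ratio:.0f}x")
```

With those assumptions the drive lands a little over 12 ms per access, so a 10μs I/O is indeed roughly a thousand times faster.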
But wait! There's more!
NVM Express, Inc., the organization behind NVMe, is also working on two more pieces: a standard NVMe management interface to simplify adoption, and NVMe over fabrics. The fabrics will include Remote Direct Memory Access (RDMA) transports over Ethernet and InfiniBand, as well as Fibre Channel. A number of demonstrations promise as little as 2-4μs additional latency with RDMA over fabrics, a remarkably low number.
The Storage Bits take
Get used to NVMe. It is already supported on Windows, Linux, and FreeBSD. In combination with PCIe it offers stunningly high performance.
It will be interesting to see how the economics of internal PCIe/NVMe play out against the fabric-based rack scale systems that Intel and others are proposing, especially once high performance non-volatile memory is widely available on optimized memory busses.
But that's a post for another day.
Courteous comments welcome, of course.