I attended the 7th Annual Non-Volatile Memories Workshop 2016 at UC San Diego this month. Frank T. Hady, Intel Fellow and Chief Architect of 3D XPoint Storage, gave a keynote address to a room full of PhDs and competitors. Of course, Intel was not ready to answer a number of vital questions, but what they did offer was illuminating.
First-generation XPoint will be built on a 20nm process and can be used both as storage and as system memory. Unlike SSDs and disks, XPoint is byte addressable, meaning that it can be used much like DRAM. For data storage it is presented as a 4k block device.
Compared with DRAM, XPoint is 10x denser (though not as dense as flash, since the initial product will be single-level cell). Combined with DRAM, 3D XPoint servers will be able to support 4x the memory capacity at a significantly lower cost per bit than DRAM alone.
3D XPoint can be used in DIMMs because of a) byte addressing, b) 1000x more write endurance than NAND flash, and c) 1000x faster I/Os than NAND flash. The 3D XPoint DIMMs will need controllers to perform wear-leveling, as NAND flash devices do, but I'd expect the garbage collection process to be much more granular, and thus much less intrusive, than the whole-block erase cycles of NAND flash devices.
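To make the byte-addressable versus 4k-block distinction concrete, here is a minimal sketch. It is a toy model, not anything Intel showed: the `bytearray` stands in for the medium, and the function names and the read-modify-write policy are assumptions made for illustration.

```python
BLOCK_SIZE = 4096  # the 4k block size mentioned above


def update_byte_addressable(medium: bytearray, offset: int, data: bytes) -> int:
    """Store data in place; return the number of bytes written to the medium."""
    medium[offset:offset + len(data)] = data
    return len(data)


def update_block_device(medium: bytearray, offset: int, data: bytes) -> int:
    """Read-modify-write every block the update touches; return bytes written back."""
    first = offset // BLOCK_SIZE
    last = (offset + len(data) - 1) // BLOCK_SIZE
    written = 0
    for block in range(first, last + 1):
        start = block * BLOCK_SIZE
        buf = bytearray(medium[start:start + BLOCK_SIZE])  # read the whole block
        lo = max(offset, start)                            # overlap of update and block
        hi = min(offset + len(data), start + BLOCK_SIZE)
        buf[lo - start:hi - start] = data[lo - offset:hi - offset]
        medium[start:start + BLOCK_SIZE] = buf             # write the whole block back
        written += BLOCK_SIZE
    return written


memory = bytearray(4 * BLOCK_SIZE)   # hypothetical byte-addressable region
storage = bytearray(4 * BLOCK_SIZE)  # hypothetical 4k block device
payload = b"8 bytes!"

byte_cost = update_byte_addressable(memory, 100, payload)
block_cost = update_block_device(storage, 100, payload)
print(f"byte-addressable: {byte_cost} bytes written, block device: {block_cost} bytes written")
```

Both paths leave the same contents behind, but in this model the block path rewrites a full 4k block to change eight bytes, which is the overhead byte addressing avoids.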
First Optane SSDs
Nomenclature: 3D XPoint is the technology. Optane is the brand for SSDs using 3D XPoint.
Intel is aiming the initial Optane SSDs at enterprise markets, with a PCIe + NVMe interconnect. The common SATA 6Gb/s interface isn't fast enough, and PCIe and NVMe are much more efficient.
They're engineering for an uncorrectable bit error rate (UBER) of 10^-17, a 5-year lifetime, and exceptional performance, all for the enterprise/cloud market.
IOPS and latency
IOPS are significantly greater than those of Intel's data center SSD, the P3700, which also uses PCIe and NVMe. With a 70/30 read/write load and a queue depth of one, the P3700 achieved 15,000 IOPS, while the XPoint demo achieved 78,000 IOPS, roughly 5x as many.
But more importantly, the latency was dramatically lower: 7μs vs the P3700's 85μs, less than a tenth of the latency. These will be dramatically faster drives by every measure.
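For readers who want to check the multiples, a quick sketch of the arithmetic using only the figures quoted above (the variable names are mine):

```python
# Figures quoted from the keynote demo: 70/30 read/write mix, queue depth 1.
p3700_iops, xpoint_iops = 15_000, 78_000
p3700_latency_us, xpoint_latency_us = 85, 7

iops_gain = xpoint_iops / p3700_iops                  # how many times more IOPS
latency_ratio = xpoint_latency_us / p3700_latency_us  # XPoint latency as a fraction of the P3700's
latency_gain = p3700_latency_us / xpoint_latency_us   # how many times lower the latency is

print(f"IOPS gain: {iops_gain:.1f}x")
print(f"Latency: {latency_ratio:.1%} of the P3700's, or {latency_gain:.1f}x lower")
```

The IOPS gain works out to a little over 5x, and the latency to about 8 percent of the P3700's, which is where "less than a tenth" comes from.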
Architecting for low latency storage
A host of changes must occur to harvest the performance of 3D XPoint. These may include:
- Switch to polling from interrupts: more CPU time spent waiting, but lower latency.
- Use 3D XPoint as the page/swap space for lower latency and more predictable performance.
- New instructions for flushing writes.
- A new NVM library, published at pmem.io, available first on Linux.
- Persistent memory-aware file systems, such as NOVA.
- Storage class memory support in Windows, which Microsoft is working on.
- Future Xeon processors designed to use hybrid memory.
That last could spell lock-in for years or forever. How generous will Intel be with APIs and controller specs?
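Two ideas from the list above, byte-granular stores and explicitly flushing writes, can be sketched with nothing more than a memory-mapped file. This is only an analogy under stated assumptions: an ordinary temporary file stands in for persistent memory, `mmap.flush()` is the standard Python call for pushing a mapping's changes back to its file and is not one of the new CPU instructions, and the helper names are hypothetical.

```python
import mmap
import os
import tempfile

REGION_SIZE = 4096  # size of the stand-in "persistent memory" region


def persist_counter(path: str, offset: int, value: int) -> None:
    """Store an 8-byte integer at a byte offset, then flush it to the backing file."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), REGION_SIZE) as region:
            region[offset:offset + 8] = value.to_bytes(8, "little")  # byte-granular store
            region.flush()  # explicit flush: make the store durable before moving on


def read_counter(path: str, offset: int) -> int:
    """Read the integer back through ordinary file I/O to confirm it reached the file."""
    with open(path, "rb") as f:
        f.seek(offset)
        return int.from_bytes(f.read(8), "little")


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "pmem_standin.bin")  # hypothetical stand-in file
    with open(path, "wb") as f:
        f.write(b"\0" * REGION_SIZE)
    persist_counter(path, 64, 2016)
    stored = read_counter(path, 64)

print(stored)
```

The pattern is the point: the program stores bytes directly into mapped memory and then has to say when those bytes must be durable. On real persistent memory that second step is where new flush instructions and an NVM library would come in.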
The Storage Bits take
3D XPoint has massive potential to improve storage and memory. Much will depend on pricing.
Intel says that 3D XPoint will be priced between DRAM (about $5/GB) and flash (about $0.20/GB). That's a lot of wiggle room. I expect they'll price 3D XPoint DIMMs at around $2/GB initially, but even at $1/GB, they'd be 5x NAND flash pricing.
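A quick sketch of where those two guesses sit in the range, using the per-GB figures above. The $2 and $1 price points are my guesses, not Intel's, and the function name is mine:

```python
DRAM_PER_GB = 5.00   # DRAM price quoted above, $/GB
FLASH_PER_GB = 0.20  # flash price quoted above, $/GB


def price_position(xpoint_per_gb: float) -> tuple[float, float]:
    """Return (multiple of the flash price, fraction of the DRAM price)."""
    return xpoint_per_gb / FLASH_PER_GB, xpoint_per_gb / DRAM_PER_GB


for guess in (2.00, 1.00):  # my guessed 3D XPoint prices, $/GB
    vs_flash, vs_dram = price_position(guess)
    print(f"${guess:.2f}/GB: {vs_flash:.0f}x flash, {vs_dram:.0%} of DRAM")
```

At $2/GB, XPoint would cost 10x flash but only 40 percent of DRAM; at $1/GB, 5x flash and 20 percent of DRAM.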
Another issue: 3D XPoint failure modes. Large-scale deployment always uncovers bugs and issues that even diligent betas won't find. Are we going to have to wait eight years for an impartial assessment of the technology's field performance?
But assuming Intel and Micron deliver as promised, or close to it, this will continue the low-latency/high IOPS revolution that flash started. As helpful as 3D XPoint SSDs will be, the real win will come from integrating the technology with DRAM to grow memory capacities and improve server performance.
Comments welcome, as always.