The last few months have been hectic to say the least. After the Intel Developer Forum in late September, I’ve been flying around the planet more or less non-stop. When I was in Europe and Russia last month, before heading to Japan and China just before Thanksgiving, a familiar phrase from the London Underground reminded me of a topic that I’ve wanted to blog about for some time: closing the latency gap between main memory and bulk storage that has plagued computer architecture for the last four decades.
Mind the Gap
At the fast end of the memory hierarchy, excluding on-chip caches, we have low-latency DRAM, but at over $100 per gigabyte, many PCs still ship with only half a gigabyte of it. While a gig of DRAM may seem like a lot to those of us who can remember when the PDP-8 had only 4KB of core memory and a paper tape reader, a gigabyte is not nearly enough to hold my Outlook archive folders, a high-def movie, or a desktop search index file. I’m sure you share that frustrating feeling when you see your hard-drive light turn on and stay on as an application launches or more data gets paged into memory from disk.
Moving out one level in the hierarchy, magnetic disk has been the bulk storage technology of choice for decades. While disk continues to grow in capacity with relatively fixed cost, those capacity improvements have not been matched with similar reductions in random access latency. Over the past 10 years alone, processor performance has increased by over 30X while measured hard-drive performance has increased by only 1.3X. And, the gap will continue to grow as processor performance scaling moves to the new multi-core trajectory.
To put a finer point on it, we’ve had to make do with a factor of 100,000 difference between DRAM and HDD performance (random read latency of 150 nanoseconds vs. 15 milliseconds) and about two orders of magnitude in cost per bit for equivalent capacity. The trade-off between main memory and hard disk performance and cost affects system design and software design in fundamental and profound ways.
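That factor of 100,000 is simple enough to check; here is the back-of-the-envelope arithmetic on the round numbers quoted above:

```python
# Latency figures quoted above (round numbers for mid-2000s hardware).
dram_read_s = 150e-9   # ~150 ns DRAM random read
hdd_read_s = 15e-3     # ~15 ms hard-drive random read (seek + rotation)

gap = hdd_read_s / dram_read_s
print(f"A random disk read costs as much as {gap:,.0f} DRAM reads")
```

Five orders of magnitude in latency for two orders of magnitude in cost per bit: that is the trade-off space system designers have been living in.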
Coping with the Gap
Minding the gap means application developers must constantly manage the placement of data. They need to anticipate huge latency hits that can occur in a seemingly random fashion when a desired datum is not in memory. And, they need to anticipate the different target system configurations which will have a direct bearing on how the user perceives application performance.
OS developers have struggled with the gap for decades and have had some modest success hiding it. Virtual memory was invented to relieve application developers of the hassle of managing overlays, but it is very easy to push the notion of virtual memory too far. Push the ratio of virtual to physical memory too high and “a thrashing we will go” as paging rates turn exponential. The tendency of virtual memory systems to exhibit such poor performance when configured with too little DRAM has given rise to the belief that, “virtual memory is a great idea as long as you never use it.” Fortunately, Moore’s law has made doubling the DRAM in the system the usually affordable fix when the disk activity light never seems to go out.
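The textbook effective-access-time model makes the thrashing cliff easy to see. Here is a quick sketch (my own illustration, reusing the latency figures from earlier, not a measurement of any real system):

```python
def effective_access_ns(fault_rate, mem_ns=150, fault_ns=15_000_000):
    """Average cost of a memory access when a fraction `fault_rate`
    of accesses page-fault and must wait on the hard drive."""
    return (1 - fault_rate) * mem_ns + fault_rate * fault_ns

# Even one fault per thousand accesses makes memory roughly 100x slower:
for p in (0.0, 0.0001, 0.001, 0.01):
    print(f"fault rate {p:.2%}: {effective_access_ns(p):,.0f} ns per access")
```

Because the disk term is so enormous, the page-fault rate dominates everything else the moment it rises above a whisper, which is exactly why adding DRAM is such an effective fix.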
It should come as no surprise that the search for a “gap filling” memory technology has been going on for decades. When I joined Intel three decades ago, we explored (at some considerable expense) magnetic bubble memory and later charge-coupled device (CCD) memory. Neither turned out to be dense enough or cheap enough to replace rotating magnetic storage. Many other technologies (e.g. holographic memory and, more recently, polymer memory) have been heralded over the years as the long-sought “gap filler” that would be a bit slower, but much cheaper, than DRAM. Unfortunately, none of these widely-trumpeted devices panned out.
And the Winner Is?
What is a surprise is that a relatively unheralded technology, NAND flash memory, the same stuff you find in your digital music player or digital camera, looks like it may be the long-sought “gap filler” even though most people had given up looking. There are two approaches to bringing NAND into the memory hierarchy: so-called NAND disks and platform NAND, where the flash memory is integrated onto the motherboard. Let me leave the NAND disk approach for another blog while I focus on platform NAND for this posting. [Note: I have to slightly violate my promise not to tout future Intel products in this blog, but I’ll try to keep my enthusiasm, which is substantial, well in check.] Platform NAND currently goes by the code name Robson Technology at Intel and is slated for introduction with the next-generation mobile platform, codenamed Santa Rosa, in the first half of 2007.
In its initial configurations, Robson consists of up to 1 GB of NAND flash memory and an intelligent controller that fits either on a PCIe mini-card or directly on the motherboard. In this configuration, the NAND memory is used as a disk cache to temporarily store both applications and data. Since NAND has latency characteristics in the range of tens of microseconds and is non-volatile (it maintains the memory image even when power is removed), it enables near “instant” resume from hibernation, and applications launch 2X faster on average on Windows Vista. We also see lower overall platform energy consumption as the hard drive spins up less frequently. The fact that NAND is typically 7X cheaper than DRAM doesn’t hurt either, and makes Robson an excellent technology for filling the gap. Note that I say Robson and not NAND, because plain NAND flash isn’t good enough to do the job on its own.
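To make the disk-cache idea concrete, here is a toy model of a non-volatile read cache sitting between the application and the disk. To be clear, this is my own conceptual sketch with invented names, not Intel’s actual controller logic; it just shows the basic hit/miss/evict behavior that any such cache exhibits:

```python
from collections import OrderedDict

class FlashDiskCache:
    """Toy model of a NAND disk cache (conceptual sketch only).
    Caches fixed-size disk blocks with least-recently-used eviction."""

    def __init__(self, capacity_blocks, disk):
        self.capacity = capacity_blocks
        self.disk = disk            # backing store: block number -> data
        self.cache = OrderedDict()  # in real NAND, survives power loss

    def read(self, block):
        if block in self.cache:
            # Hit: served from flash in tens of microseconds.
            self.cache.move_to_end(block)
            return self.cache[block]
        # Miss: milliseconds while the hard drive spins up and seeks.
        data = self.disk[block]
        self.cache[block] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used
        return data
```

The payoff comes from the hit path: every block served from flash is a disk access (and often a spin-up) that never happens, which is where both the responsiveness and the power savings come from.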
Overcoming the Weakness of NAND Flash
The one big issue with NAND as a gap filler is write endurance: NAND flash supports only a limited number of erase cycles before wearing out. That’s where Robson’s smart controller comes into play. Simply put, it uses clever wear-leveling algorithms to spread the block erasures evenly across the array, giving the NAND flash memory a service life consistent with the rest of the platform.
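For readers curious what “spreading erasures evenly” looks like, here is an illustrative sketch of the general wear-leveling idea (my own simplification, not Robson’s actual algorithm): keep a logical-to-physical block map and steer each rewrite to the least-worn free physical block, so no single block absorbs all the erase cycles:

```python
class WearLeveler:
    """Illustrative wear-leveling sketch (a simplification of the
    general technique, not Robson's actual algorithm)."""

    def __init__(self, n_physical):
        self.erase_counts = [0] * n_physical
        self.logical_to_physical = {}
        self.free = set(range(n_physical))

    def write(self, logical_block):
        old = self.logical_to_physical.get(logical_block)
        # Steer the write to the least-worn free physical block.
        target = min(self.free, key=lambda b: self.erase_counts[b])
        self.free.remove(target)
        self.logical_to_physical[logical_block] = target
        if old is not None:
            # The stale copy must be erased; that is the cycle that wears
            # the cell, so count it and return the block to the free pool.
            self.erase_counts[old] += 1
            self.free.add(old)
        return target
```

Even if the host hammers a single logical block forever, the erasures rotate through the whole array, so the array’s lifetime scales with its total capacity rather than with the endurance of one unlucky block.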
The use of NAND as a disk cache is just the start of a major overhaul of the memory hierarchy. Samsung recently announced notebooks that use NAND to create a solid-state drive, completely eliminating the hard-drive. Further out in time, Intel and others are exploring technologies, such as phase-change memory (PCM), as a replacement for NAND flash. It’s too early to tell if PCM will go the way of magnetic bubble memory or if it will replace NAND flash, but the race is on for the future of non-volatile solid-state memory.
In the not too distant future, we can expect to see magnetic disk drives relegated to the role that tape drives play today, and even DIMMs may vanish from future motherboards. I’ll say more about that in another blog.
These changes will require us to rethink software architecture and implementation, including tuning of the operating system, drivers and applications. But the benefits are so tangible that the course is set and now the work must get done.
Going forward, let’s not just mind the memory/storage gap; it’s time to close it for good.