Why SSDs don't perform

Since SSDs first appeared, people have reported that the drives do not deliver the performance they expected. As SSDs age, for instance, they get slower. Here's why.

SSDs are a new and evolving technology, even though the SATA versions are already obsolete. It's not surprising that we're still learning how and where to use them.

One of the most popular uses of SSDs is in servers hosting virtual machines. The aggregated VMs create the I/O blender effect, where many VMs' I/O streams interleave into what looks like random access at the device. SSDs handle that a lot better than disks do.

But they're far from perfect, as the USENIX FAST '15 paper "Towards SLO Complying SSDs Through OPS Isolation," by Jaeho Kim and Donghee Lee of the University of Seoul and Sam H. Noh of Hongik University, points out:

In this paper, we show through empirical evaluation that performance SLOs cannot be satisfied with current commercial SSDs.

That's a big statement. Here's what's behind it.

The experiments

The researchers used a 128GB commercial MLC SSD, purchased off the shelf, and tested it both clean and aged. They aged the drive by issuing random writes of 4KB through 32KB until the total written exceeded the SSD's capacity, which forces garbage collection.
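To make the aging step concrete, here is a minimal sketch of a generator that plans such a write pattern. It is not the researchers' tool. The name `aging_writes`, the 4KB alignment, the restriction of sizes to multiples of 4KB, and stopping as soon as the total passes the capacity are all assumptions made for illustration; the paper as described here only says the writes range from 4KB through 32KB and exceed the SSD capacity in total.

```python
import random

KB = 1024

def aging_writes(capacity_bytes, seed=0, min_kb=4, max_kb=32):
    """Yield (offset, size) pairs for random writes until the total
    bytes written exceeds capacity_bytes.

    Assumptions (not from the paper): sizes are multiples of 4KB and
    offsets are 4KB-aligned.
    """
    rng = random.Random(seed)
    total = 0
    while total <= capacity_bytes:
        # Pick 4, 8, ..., 32 KB.
        size = rng.randrange(min_kb, max_kb + 1, 4) * KB
        # Pick a 4KB-aligned offset that keeps the write inside the device.
        max_slot = (capacity_bytes - size) // (4 * KB)
        offset = rng.randint(0, max_slot) * 4 * KB
        total += size
        yield offset, size

# Example: plan the aging pass for a (tiny, hypothetical) 64MB device.
plan = list(aging_writes(64 * KB * KB))
print(len(plan), "writes,", sum(size for _, size in plan), "bytes in total")
```

The generator only produces a plan. Actually issuing the writes to a drive destroys its contents and needs a tool that bypasses the OS cache, so that part is deliberately left out.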

They then tested performance in each state using traces from the UMass Trace Repository. The traces were replayed to generate real I/Os to the SSD for three workloads: Financial, MSN, and Exchange.

Variables

In addition to comparing clean and aged SSD performance, they tested each VM with its own partition on a clean SSD, and then with the workloads running concurrently on a single partition on a clean SSD. They repeated the tests using an aged SSD, to notable effect:

Figure: I/O bandwidth with individual and concurrent execution of VMs.

The authors ascribe these massive performance declines to garbage collection inside the flash. Flash is written a page at a time but can be erased only a whole block at a time. Once the number of invalid pages in a block reaches a threshold, the remaining valid data is rewritten to a fresh block - along with other valid data - and the old block, invalid data included, is erased.
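Why garbage collection costs bandwidth is easier to see in a toy model. The sketch below is an illustration only: it is not the paper's method and not how any real FTL works, since those are proprietary. The class `ToyFTL`, the greedy "fewest valid pages" victim choice, the `spare` parameter (the fraction of flash held back from the host), and the block and page counts are all assumptions. It counts how many flash page writes each host write ends up costing once the drive has been filled.

```python
import random

class ToyFTL:
    """Toy page-mapped flash model with greedy garbage collection."""

    def __init__(self, num_blocks, pages_per_block, logical_pages):
        # Keep two blocks' worth of headroom so collection can always progress.
        assert logical_pages <= (num_blocks - 2) * pages_per_block
        self.ppb = pages_per_block
        self.valid = [set() for _ in range(num_blocks)]  # live pages per block
        self.used = [0] * num_blocks                     # programmed pages per block
        self.where = {}                                  # logical page -> block
        self.free = list(range(1, num_blocks))           # erased blocks
        self.active = 0                                  # block currently being filled
        self.host_writes = 0
        self.flash_writes = 0

    def _program(self, lpn):
        """Write one page into the active block, opening a new one if full."""
        if self.used[self.active] == self.ppb:
            self.active = self.free.pop()
        self.valid[self.active].add(lpn)
        self.used[self.active] += 1
        self.where[lpn] = self.active
        self.flash_writes += 1

    def _collect(self):
        """Copy the live pages out of the emptiest full block, then erase it."""
        full = (b for b in range(len(self.used))
                if b != self.active and self.used[b] == self.ppb)
        victim = min(full, key=lambda b: len(self.valid[b]))
        for lpn in list(self.valid[victim]):
            self._program(lpn)            # extra writes the host never asked for
        self.valid[victim].clear()
        self.used[victim] = 0
        self.free.append(victim)

    def write(self, lpn):
        old = self.where.get(lpn)
        if old is not None:
            self.valid[old].discard(lpn)  # the old copy becomes invalid
        while len(self.free) < 2:
            self._collect()
        self.host_writes += 1
        self._program(lpn)

def write_amplification(num_blocks=64, pages_per_block=32, spare=0.25,
                        host_writes=20_000, seed=0):
    """Flash page writes per host write, measured after the device is full."""
    logical = int(num_blocks * pages_per_block * (1 - spare))
    ftl = ToyFTL(num_blocks, pages_per_block, logical)
    for lpn in range(logical):            # fill the device once
        ftl.write(lpn)
    ftl.host_writes = ftl.flash_writes = 0
    rng = random.Random(seed)
    for _ in range(host_writes):          # then overwrite at random
        ftl.write(rng.randrange(logical))
    return ftl.flash_writes / ftl.host_writes

for spare in (0.10, 0.40):
    print(f"spare={spare:.2f}  write amplification={write_amplification(spare=spare):.2f}")
```

In this model every page copied during collection is a flash write that serves no host request, which is the bandwidth an aged drive loses. Less spare space means fuller victim blocks and more copying per reclaimed page.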

But, as they note, it is not possible to know exactly what is going on inside an SSD because the control logic - the flash translation layer (FTL) - is a proprietary black box. We can see the data come into the FTL and see it written, but the internal process is not visible.

The Storage Bits take

Flash storage has revolutionized enterprise data storage. With disks, I/Os are costly. With flash, reads are virtually free - but writes remain expensive.

Good advice:

  • Give each VM its own partition on the SSD.
  • Age SSDs before testing their performance.
  • Expect long tail latencies due to garbage collection.
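If you want to see the tail rather than take it on faith, look at percentiles instead of averages. The helper below is a minimal sketch using the nearest-rank definition of a percentile. The function name and the latency numbers are made up for illustration and are not measurements from the paper.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical I/O latencies in milliseconds: mostly fast, with a few
# slow requests of the kind a garbage collection pause could cause.
latencies_ms = [1.0] * 98 + [50.0, 60.0]

mean = sum(latencies_ms) / len(latencies_ms)
print("mean:", mean)
print("p50: ", percentile(latencies_ms, 50))
print("p99: ", percentile(latencies_ms, 99))
```

With data shaped like this the mean stays close to the fast case while the 99th percentile lands on a slow request, which is why an SLO stated as an average can hide the problem.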

As the paper shows, using an SSD poorly can waste most of its possible performance. And until vendors give users the right controls - the ability to pause garbage collection would be useful - SSDs will inevitably fail to reach their full potential.

Comments welcome, as always. The paper covers a lot more ground. I plan a longer take later this week on StorageMojo.
