The M1 Mac write issue: What's going on with Apple's SSDs?

Some M1 Macs are seeing absurd levels of write activity on their - costly - internal SSDs. Here's my take.

What's the issue? Per iMore: 

There are numerous reports from M1 Mac users that macOS is giving out worrying hard drive health reports which could indicate severe life span problems.

While it sounds pretty bad - especially considering Apple's exorbitant storage prices - it is unlikely to be anywhere near as bad as it sounds.

Why?

Some context is helpful.

Let's start with the fact that NAND flash - the underlying medium in almost all Solid State Drives - is a terrible storage medium. Which means there's a lot of firmware between your write and the medium that you hope it is stored on.

Some key issues include:

  • NAND flash is an analog medium, unlike the magnetic bits in hard drives, or the optical pits in DVD media. Flash stores electrons in tiny physical quantum wells. Electrons, those bad boys, are always trying to sneak out, which is why SSDs aren't forever, unlike, say, M-DISCs.
  • Limited write life. Herding electrons into a quantum well takes a high voltage, way more than the 5 volts common in solid state circuits. That voltage steadily wears away the insulation that lines the quantum well. Which means that the cheapest NAND flash - i.e. what PCs use - may handle as few as 500 write (program/erase) cycles over a cell's lifespan.
  • Writes are WAY slow and take a lot of power. While reads are microseconds quick, writes can take milliseconds - not much better than hard drives.

And there's more, like write amplification and read disturbs, that call for more controller acrobatics to mitigate.
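To put those endurance numbers in perspective, here's a rough back-of-envelope sketch. Every input is an illustrative assumption, not an Apple spec: a 256 GB drive, the 500-cycle worst case mentioned above, a write amplification factor of 2, and perfectly even wear leveling.

```python
def lifetime_years(capacity_gb, pe_cycles, write_amplification, host_gb_per_day):
    """Rough years until rated wear-out, assuming ideal wear leveling."""
    # Total data the flash can absorb if wear is spread evenly over every cell.
    total_nand_writes_gb = capacity_gb * pe_cycles
    # Each GB the host writes costs `write_amplification` GB of real flash writes.
    nand_gb_per_day = host_gb_per_day * write_amplification
    return total_nand_writes_gb / nand_gb_per_day / 365


# Hypothetical workloads: a light user vs. the kind of runaway writes being reported.
light = lifetime_years(256, 500, 2.0, host_gb_per_day=20)
heavy = lifetime_years(256, 500, 2.0, host_gb_per_day=1000)

print(f"light use: {light:.1f} years")   # about 8.8 years
print(f"heavy use: {heavy:.2f} years")   # about 0.18 years
```

The point isn't the exact figures, which depend entirely on the assumed inputs. It's that daily write volume and write amplification both divide directly into drive life, so a runaway writer matters a lot.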

So what's going on with the M1 SSDs?

Apple has some world-class SSD engineers. They've led the industry in integrating flash into their notebooks and in high bandwidth consumer SSDs in general.

I'm confident that whatever problem users are seeing, Apple's engineers will be on top of it in short order.

But where is the problem likely to be? In order of likelihood:

  • The reporting software - based on SMART monitoring from hard drives - is mischaracterizing the SSD's activity. SMART was a good idea haphazardly implemented across many - but not all - drives' firmware.
  • SSD controller firmware is misreporting write activity. SSD controller firmware is complex due to the limitations of NAND flash. Mitigating write amplification requires careful juggling of DRAM buffers in the controller to ensure that only necessary data gets written to flash.
  • A bug specific to how recently ported pro apps like Final Cut Pro and Logic Pro interact with the SSD controller.
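If you want to sanity-check a scary-looking report yourself, here's a minimal sketch. It assumes the numbers come from the NVMe health log as shown by a tool like smartmontools' smartctl, where "Data Units Written" counts units of 1000 512-byte blocks. The sample values are hypothetical, not taken from any real Mac.

```python
BYTES_PER_DATA_UNIT = 512 * 1000  # NVMe health log: one data unit = 1000 x 512 bytes


def host_writes_tb(data_units_written):
    """Convert the raw 'Data Units Written' counter to terabytes (10^12 bytes)."""
    return data_units_written * BYTES_PER_DATA_UNIT / 1e12


def writes_per_day_gb(data_units_written, power_on_hours):
    """Average GB written per 24 hours of powered-on time."""
    days_powered_on = power_on_hours / 24
    return data_units_written * BYTES_PER_DATA_UNIT / 1e9 / days_powered_on


# Hypothetical readings, for illustration only.
units = 30_000_000
hours = 480

total_tb = host_writes_tb(units)             # 15.36 TB
daily_gb = writes_per_day_gb(units, hours)   # 768 GB per powered-on day

print(f"{total_tb:.2f} TB written, {daily_gb:.0f} GB per powered-on day")
```

Note that power-on hours aren't calendar days. A notebook that sleeps most of the day will show a much higher per-powered-on-day rate than its per-calendar-day rate, which is one way these reports can look worse than they are.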

The Take

Storage engineers are a different breed. They know that while everything else in a computer system can be reset by a power cycle, your data is the one part of the system that is not transient. Data corruption is a no-no, despite the fact that the universe hates your data.

Mistakes still happen though. I'd expect Apple's storage engineers are working long hours to get a handle on this and a fix out the door.

As an owner of a newly minted M1 MacBook Air, am I worried? No.

Am I looking for a fix Real Soon Now? Yes.

Comments welcome.