Microsoft gets two-phase immersion cooling running in an Azure data center

Microsoft says it's the first major cloud provider to test and implement two-phase liquid immersion cooling in its data centers.
Written by Mary Jo Foley, Senior Contributing Editor
Credit: Microsoft

Microsoft has been testing two-phase liquid immersion cooling technology for a number of years. Now it is beginning to implement the technology in its Azure data centers, starting with those in Quincy, Wash., officials said on April 6.

This production-environment deployment of two-phase immersion cooling is the next step on the company's journey to deliver more powerful, more reliable, and more environmentally friendly data centers, officials said.

Like other major cloud vendors, Microsoft has been using air to cool down processors that are getting increasingly hot, especially when running certain workloads. Because heat transfer in liquids is "orders of magnitude more efficient than air," immersion cooling could be a much better solution over time, execs say.

Google has already deployed liquid cooling in its data centers to handle the high-powered demands of its Tensor Processing Unit AI processing. Microsoft's claim of being the first revolves around the "two-phase" part of immersion cooling.

With single-phase cooling, the fluid stays in a liquid state, and the heat is carried away through natural or forced convection. In this regard, single-phase cooling is like air cooling. The hot fluid in turn rejects heat through a heat exchanger and is circulated back, company officials explain. But two-phase cooling happens passively. When the fluid comes in contact with heat-generating components, it changes state from liquid to vapor and naturally rises, carrying the heat as latent energy. The vapor rejects the heat at a condenser and naturally transforms back into liquid form, Microsoft execs said.
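To make the single-phase versus two-phase distinction concrete, here is a minimal Python sketch of the two ways a kilogram of coolant can absorb heat: by warming up while staying liquid (sensible heat) or by boiling into vapor (latent heat). The function names and every numeric value are illustrative placeholders, not properties of the fluid Microsoft uses, which the company did not specify.

```python
# Sketch: heat absorbed per unit mass of coolant, single-phase vs. two-phase.
# All numbers below are hypothetical placeholders for illustration only.

def sensible_heat_kj(mass_kg, specific_heat_kj_per_kg_k, delta_t_k):
    """Heat absorbed by a liquid that warms up but stays liquid (single-phase)."""
    return mass_kg * specific_heat_kj_per_kg_k * delta_t_k

def latent_heat_kj(mass_kg, heat_of_vaporization_kj_per_kg):
    """Heat absorbed by a liquid that boils into vapor (two-phase)."""
    return mass_kg * heat_of_vaporization_kj_per_kg

# Hypothetical fluid: 1 kJ/(kg*K) specific heat, 100 kJ/kg heat of vaporization.
single_phase = sensible_heat_kj(1.0, 1.0, 10.0)  # liquid warms by 10 K
two_phase = latent_heat_kj(1.0, 100.0)           # liquid boils at constant temperature

print(f"single-phase: {single_phase} kJ, two-phase: {two_phase} kJ")
```

With these placeholder values the boiling fluid absorbs more heat per kilogram, and it does so without a pump because the vapor rises on its own. Real ratios depend on the actual fluid and operating temperatures.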

The cryptocurrency/bitcoin-mining industry was a big pioneer of liquid-immersion cooling, using it to cool off chips logging digital-currency transactions. Microsoft has been investigating liquid immersion cooling for high-performance workloads like AI and has found that two-phase immersion cooling can reduce power consumption for any given server by 5% to 15%, officials said. The servers can be overclocked (run at elevated power) when needed without the risk of overheating.
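For a sense of scale, the 5% to 15% range can be applied to a single server's power draw. The sketch below is back-of-the-envelope arithmetic only; the 1,000-watt baseline and the function name are assumptions chosen for round numbers, not figures from Microsoft.

```python
# Sketch: apply the reported 5%-15% reduction to a hypothetical server.

def power_after_reduction(baseline_watts, reduction_percent):
    """Return power draw after cutting the baseline by the given percentage."""
    return baseline_watts * (100 - reduction_percent) / 100

baseline = 1000.0  # hypothetical server power draw in watts (assumption)

low_savings = power_after_reduction(baseline, 5)    # smallest reported reduction
high_savings = power_after_reduction(baseline, 15)  # largest reported reduction

print(f"After 5% reduction:  {low_savings} W")
print(f"After 15% reduction: {high_savings} W")
```

Under that assumed baseline, the reported range works out to a saving of 50 to 150 watts per server.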

Liquid cooling is a waterless technology. The steel cooling tank in Quincy is filled with an engineered solution that allows servers to be dunked in the liquid and function as they would in any standard air-cooled rack. The fluid boils at 122 degrees F (90 degrees cooler than boiling water). The coils that run through the tank and enable vapor to condense are connected to a separate closed-loop system that uses fluid to transfer heat from the tank to a dry cooler located outside the container that houses the tank.
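The boiling-point comparison is easy to check, and converting it to Celsius may help readers outside the U.S. This short Python sketch uses the standard Fahrenheit-to-Celsius formula; the function and variable names are just for illustration.

```python
# Sketch: verify the boiling-point gap and convert it to Celsius.

def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9

fluid_boiling_f = 122  # boiling point of the engineered fluid, per the article
water_boiling_f = 212  # boiling point of water at sea level

gap_f = water_boiling_f - fluid_boiling_f
fluid_boiling_c = fahrenheit_to_celsius(fluid_boiling_f)
water_boiling_c = fahrenheit_to_celsius(water_boiling_f)

print(f"Gap: {gap_f} degrees F")
print(f"Fluid boils at {fluid_boiling_c} C; water boils at {water_boiling_c} C")
```

The 90-degree Fahrenheit gap cited above corresponds to a fluid boiling point of 50 degrees C, against 100 degrees C for water.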

For now, Microsoft has one tank running workloads in a hyper-scale Azure data center. For the next several months, the Microsoft team will perform a series of tests on the technology.

"Supercomputing has used this kind of technology for decades, so the risk isn't really that high," said Microsoft Distinguished Engineer Christian Belady, who also is vice president of Datacenter Advanced Development. "The goal is for us to deploy this in all our data center regions, but there are a number of steps before we get there."

Belady noted that Microsoft's "Project Natick" undersea data center work demonstrated the improved reliability of systems when humidity and oxygen are removed from an environment. Failures due to corrosion decline substantially.

"With immersion, you have a similar thing. Essentially, you are displacing oxygen and moisture," he said.

Microsoft is continuing to investigate other liquid-cooling technologies, including cold-plate, which usually relies on tubing filled with a liquid refrigerant. Belady says Microsoft sees a potential opportunity for both, but because immersion is "a one-time design cost for the infrastructure," with no engineering required between server generations, it may find more widespread use.
