It’s all about the data (centre)

Written by Simon Bisson, Contributor, and Mary Branscombe, Contributor

Last week Facebook publicly unveiled its data centre design, showing off its architecture and telling the world that it was open source. I suspect that last bit came as something of a surprise to anyone designing and building modern data centres, as they'd been using similar techniques for the last few years.

If you look at any major data centre, especially one that's focused on cloud, it's going to be DC-powered, use hot and cold corridors to maintain passive airflow, and run quite a bit hotter than most machine rooms. It'll be designed to use minimal cabling, and will give engineers easy access to servers and storage. None of these techniques is new, though bringing them together has been a relatively recent trend.

I recall a presentation at least five years back where Intel's Pat Gelsinger (now elsewhere!) showed a group of journalists a design for a modern, power-efficient data centre, which was then fleshed out at the company's IDF event that autumn. At the heart of Intel's proposed design was the idea of hot and cold corridors, where air was forced in through human-accessible aisles, passed through the servers picking up heat, and was then vented through a tall, chimney-like hot corridor. If you're not cooling an entire data centre, and are relying on passive airflows to remove heat from your servers, you're going to save a lot on your air conditioning bills…
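To put a rough number on why that layout pays off, here's a quick back-of-envelope sketch in Python of how much heat a passive airstream can carry away. The airflow and temperature figures are illustrative assumptions of mine, not numbers from Intel's design.

```python
# Back-of-envelope estimate of how much heat a hot/cold aisle airstream
# carries away. The figures below are illustrative assumptions, not
# numbers from Intel's presentation.

AIR_DENSITY = 1.2          # kg/m^3, roughly, at room temperature
AIR_SPECIFIC_HEAT = 1005   # J/(kg*K)

def heat_removed_kw(airflow_m3_per_s, delta_t_c):
    """Heat carried off by an airstream: Q = mass flow * c_p * temperature rise."""
    mass_flow = airflow_m3_per_s * AIR_DENSITY                  # kg/s
    return mass_flow * AIR_SPECIFIC_HEAT * delta_t_c / 1000.0   # kW

# Example: 10 m^3/s of air warming 15 C between the cold aisle and the hot corridor
print(f"{heat_removed_kw(10, 15):.0f} kW of server heat removed")
```

On those assumed figures the airflow alone shifts well over a hundred kilowatts of heat that a chiller no longer has to handle.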

Talking to data centre engineers from Microsoft's Azure team, it was clear that, yes, the increased temperatures did reduce the lifespan of servers, but that the reduction was more than offset by the savings from switching to evaporative cooling. An Azure container needs three connections: power, bandwidth, and a hose to trickle water into the cooling beds. And if it's a particularly cool day, you can turn off the water and just let the breeze keep your boxes cool. If a server fails, virtual machine management tools simply migrate the VM elsewhere and the cloud service carries on running. There's no need to worry about state, as cloud services are designed for failure, taking advantage of stateless computing techniques.
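That "migrate it and carry on" pattern is easy to picture in code. Here's a minimal sketch of it – not Azure's actual management tooling; the host names and the least-loaded placement rule are invented purely for illustration.

```python
# Minimal sketch of moving VMs off a failed host. An illustrative model,
# not a real VM management API: hosts and placements are plain dictionaries,
# and "migration" just reassigns a VM to the least-loaded healthy host.

from collections import defaultdict

hosts = {"rack1-srv01": True, "rack1-srv02": True, "rack1-srv03": True}  # host -> healthy?
placement = {"vm-a": "rack1-srv01", "vm-b": "rack1-srv01", "vm-c": "rack1-srv02"}

def migrate_off_failed_hosts():
    load = defaultdict(int)
    for host in placement.values():
        load[host] += 1
    for vm, host in list(placement.items()):
        if not hosts[host]:                                   # this host has failed
            healthy = [h for h, ok in hosts.items() if ok]
            target = min(healthy, key=lambda h: load[h])      # least-loaded healthy host
            placement[vm] = target
            load[target] += 1
            print(f"{vm}: {host} -> {target}")

hosts["rack1-srv01"] = False   # a server goes dark...
migrate_off_failed_hosts()     # ...and its VMs come back up elsewhere
```

Because the services themselves are stateless, nothing more than this reshuffle is needed for the workload to keep running.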

Even DC power isn’t new. Telcos standardised on 48V DC a long time ago, and most switching equipment is designed to work at those voltages. Swap out an inefficient switch-mode power supply for a simple voltage regulator or two, and you’re saving on your power bills. Design a motherboard from the ground up to work with DC power (as Google and Facebook have done) and you can drop most of the voltage regulators you’d find throughout a server – something that can cut power demands by up to 50% while running the server cooler. After all, where does the power go when you pass it through a voltage regulator? Regulators aren’t efficient, and they dump a lot of heat into your server’s case. It’s also a lot easier to provide backup power to DC systems – a room full of batteries will do the trick, just like a telephone exchange. Battery systems at that scale can also help even out your power demands, saving money still further.
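A little arithmetic shows why fewer conversion stages matter. The efficiency figures below are assumptions picked for illustration – not measurements from Google's or Facebook's boards – but the shape of the result holds: every stage you drop is heat you neither generate nor have to remove.

```python
# Rough illustration of conversion losses. Stage efficiencies are assumed
# values for the sake of the arithmetic, not measured figures.

def delivered(watts_in, stage_efficiencies):
    """Power left after passing through each conversion stage in turn."""
    for eff in stage_efficiencies:
        watts_in *= eff
    return watts_in

SERVER_DRAW = 300.0  # watts at the wall, purely illustrative

# Traditional chain: double-conversion UPS, switch-mode PSU, several on-board regulators
traditional = delivered(SERVER_DRAW, [0.90, 0.80, 0.85])
# DC design: one facility-level rectifier, then a simple regulator or two on the board
dc_design = delivered(SERVER_DRAW, [0.95, 0.92])

print(f"traditional chain delivers {traditional:.0f} W, loses {SERVER_DRAW - traditional:.0f} W as heat")
print(f"48V DC chain delivers {dc_design:.0f} W, loses {SERVER_DRAW - dc_design:.0f} W as heat")
```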

Using fewer wires and shorter cabling also makes a lot of sense. It simplifies replacing failed hardware, and keeps cabinet airflow as smooth as possible. It’s also easier to document your installations, and helps you standardise on cabinets and wiring looms.

As businesses transition from traditional data centres to virtual infrastructures to private clouds, all data centres are going to be like this. They’re going to be quiet, hot rooms, full of commodity hardware hooked up to DC power, servers humming away. Maybe one or two will be blinking red, but their workloads are already running elsewhere. Someone will be along with a trolley in a day or so to pull the failed machines for repair, dropping in another server to replace them. As soon as a new server is powered up it’ll load a hypervisor and be ready to run.

Brave new data centre indeed.

Simon Bisson
