Data center design drivers

Over the last seven years, the average power consumption in data centers has increased 7-fold, leading to changes in how data centers are built and the assumptions you can make when deploying servers.

I spent yesterday at IBM's Executive Briefing Center in Raleigh. I was there to learn what I could about data centers, power, virtualization, and blades. I've got some of my notes on virtualization over on my blog.

I've always loved data centers. I recently got a chance to visit a data center we built in 1999 and it was like seeing an old friend. One of the main topics at the briefing was power. Lately, power and cooling have turned out to be the big drivers in data center design.

Over the last seven years, the average power consumption in data centers has increased 7-fold, from 20W/sq. foot to over 140W/sq. foot, due to huge leaps in the density of servers. About 44% of the power in a typical server is used by things other than CPUs, memory, disks, and other compute components. Where does it go? The AC to DC conversion in the power supply turns some of it to heat. A little ironically, fans and other air handling components eat their share. And in some designs there's even a DC to DC conversion (HP, for example, distributes 48V through the chassis and then drops it to 5V at the blade) that loses still more to inefficiency.
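To make those figures concrete, here is a rough sketch of the arithmetic. The 20 and 140 W/sq. foot densities and the 44% overhead are the numbers quoted above. The 10,000 sq. foot floor is a made-up size for illustration, and applying a per-server overhead figure to a whole floor is a simplifying assumption.

```python
# Figures quoted at the briefing.
OLD_DENSITY_W_PER_SQFT = 20
NEW_DENSITY_W_PER_SQFT = 140
OVERHEAD_FRACTION = 0.44  # share of server power not reaching compute components

# Hypothetical floor size, chosen only for illustration.
FLOOR_SQFT = 10_000


def growth_factor(old_density, new_density):
    """How many times denser the power draw has become."""
    return new_density / old_density


def split_power(total_watts, overhead_fraction=OVERHEAD_FRACTION):
    """Split a power draw into (compute watts, overhead watts)."""
    overhead = total_watts * overhead_fraction
    return total_watts - overhead, overhead


total_watts = NEW_DENSITY_W_PER_SQFT * FLOOR_SQFT
compute_watts, overhead_watts = split_power(total_watts)

growth = growth_factor(OLD_DENSITY_W_PER_SQFT, NEW_DENSITY_W_PER_SQFT)
print(f"Density growth: {growth:.0f}x")
print(f"Total draw:     {total_watts / 1000:,.0f} kW")
print(f"To compute:     {compute_watts / 1000:,.0f} kW")
print(f"To overhead:    {overhead_watts / 1000:,.0f} kW")
```

On that hypothetical floor, about 616 kW of a 1,400 kW draw never reaches the compute components.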

Power and cooling are a shell game. You solve it at the CPU level by moving it to the chassis. You solve it in the chassis by moving it to the rack. You solve it in the rack by moving it to the data center. Many data centers start off with good air handling set-ups but, like anything else, unless they're maintained and re-evaluated from time to time, performance degrades and the data center just can't take the load. As a result, the shell game comes full circle and some of the heat that started out in the server is pulled right back in with the cooling air instead of being taken out of the data center and dumped.

Chassis cooling has come a long way from one big fan in the power supply. Some manufacturers' fully populated blade chassis contain upwards of 64 fans. An interesting point was that as density increases, serial connections for drives become a factor in removing heat, because the big ribbon cables they replace are not only bulky but also interfere with air flow inside the chassis.

Another way to reduce power and cooling requirements inside the chassis, and then on up the line, is to reduce the number of drives on individual servers. Remote boot from SAN is one way to do that.

Rack cooling has changed a little as well. One cool thing I hadn't seen before was a rear door for a rack that is a big radiator. The unit uses chilled water to remove heat--up to 50,000 BTUs/hour, or roughly 14 kW. This is obviously a point solution for big cooling problems since each door costs about $4K.
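The BTU and kilowatt figures are the same quantity in two units. Here's a quick conversion sketch using the standard factor of about 0.293 W per BTU/hour. The cost-per-kW number is my own derived figure, not something quoted at the briefing.

```python
WATTS_PER_BTU_PER_HOUR = 0.29307107  # standard conversion factor

DOOR_CAPACITY_BTU_PER_HOUR = 50_000  # capacity quoted for the rear-door radiator
DOOR_COST_DOLLARS = 4_000            # approximate price quoted per door


def btu_per_hour_to_kw(btu_per_hour):
    """Convert a heat-removal rate in BTU/hour to kilowatts."""
    return btu_per_hour * WATTS_PER_BTU_PER_HOUR / 1000


door_kw = btu_per_hour_to_kw(DOOR_CAPACITY_BTU_PER_HOUR)
cost_per_kw = DOOR_COST_DOLLARS / door_kw

print(f"Door capacity: {door_kw:.2f} kW")
print(f"Cost per kW of heat removed: ${cost_per_kw:,.0f}")
```

The conversion comes out a little over 14.6 kW, so the quoted kilowatt figure is rounded down slightly, and the door works out to roughly $270 per kW of heat removed.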

One of the biggest factors in data centers not being properly cooled is air flow.  A lot of times people guess at this.  There are some tools that experts use to model air flow in the data center.  TileFlow "uses the technique of Computational Fluid Dynamics (CFD) to calculate airflow rates through perforated tiles. It accounts for the effects of all important factors that control the airflow distribution."  Results from TileFlow can be fed into FloTherm, 3D fluid dynamics software for modeling air flow.  Unless you've got a good feel for fluid dynamics in compressible fluids (the hard part) you may not want to tackle this yourself, but it's not black magic.  There is science here. 

Weight is an often-overlooked factor in data center design. As systems become more and more dense, power and cooling have been joined by weight as a critical factor. If your data center's not on the main floor, you might want to consider floor loading as you increase the density of your server farm.

Power per processor is a key metric for many people. Twelve processors in a blade chassis require about 4 kW (IBM); twelve 1U pizza boxes take 5 kW. That 20% savings in power means about 25% more processors in the same data center power budget.
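A back-of-the-envelope check of those numbers, as a sketch. It assumes the two configurations are twelve comparable compute units each, which is my reading of the figures rather than something stated at the briefing.

```python
BLADE_CHASSIS_KW = 4.0  # twelve processors in a blade chassis
PIZZA_BOXES_KW = 5.0    # twelve 1U pizza boxes
UNITS = 12


def watts_per_unit(total_kw, units=UNITS):
    """Average power draw per processor (or per box), in watts."""
    return total_kw * 1000 / units


blade_watts = watts_per_unit(BLADE_CHASSIS_KW)
box_watts = watts_per_unit(PIZZA_BOXES_KW)

# Power saved per unit by choosing blades over 1U boxes.
power_savings = 1 - blade_watts / box_watts

# Extra units that fit in the same power budget thanks to that savings.
extra_capacity = box_watts / blade_watts - 1

print(f"Blade: {blade_watts:.0f} W each, 1U box: {box_watts:.0f} W each")
print(f"Power savings:  {power_savings:.0%}")
print(f"Extra capacity: {extra_capacity:.0%}")
```

The savings and the extra capacity aren't the same percentage: using 20% less power per unit lets you fit 25% more units into a fixed power budget.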

Since most data centers have power or cooling limitations (everything you put in has to be taken out), the power budget, kW/rack, is an important measurement. Interestingly, on a per rack basis, blades offer greater density regardless of your power budget--about 30 to 35%. If your power budget is 4 kW/rack you can get 17 1U pizza boxes or 23 blades. At 21 kW/rack, you get 93 1U boxes or 123 blades.
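Here's a small sketch that works the rack figures above. The server counts are the ones quoted. The implied watts per server is just the budget divided by the count, which is my own derivation and not necessarily how IBM arrived at its numbers.

```python
# Servers per rack at each power budget (kW/rack), as quoted above.
RACK_FIGURES = {
    4: {"1U": 17, "blade": 23},
    21: {"1U": 93, "blade": 123},
}


def implied_watts_per_server(budget_kw, server_count):
    """Average watts available to each server if the rack budget is fully used."""
    return budget_kw * 1000 / server_count


def density_advantage(blade_count, box_count):
    """Fractional increase in servers per rack from using blades."""
    return blade_count / box_count - 1


for budget_kw, counts in RACK_FIGURES.items():
    box_watts = implied_watts_per_server(budget_kw, counts["1U"])
    blade_watts = implied_watts_per_server(budget_kw, counts["blade"])
    advantage = density_advantage(counts["blade"], counts["1U"])
    print(
        f"{budget_kw:>2} kW/rack: 1U ~{box_watts:.0f} W, "
        f"blade ~{blade_watts:.0f} W, blades +{advantage:.0%}"
    )
```

Both budgets give a blade advantage of roughly a third, which is why the gain holds regardless of how much power you have per rack.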

The bottom line is that regardless of how big a power budget you build into your data center, for the foreseeable future there will come a time when increased density will mean that you have wasted space. Low power solutions will help, but don't provide the same compute power, making the choices difficult since the most important metric of all, performance/kW-sq. foot, is usually hard to come by.

Getting facilities people to understand data center power requirements can be difficult. Telling someone you want 5, 10, or 20 thousand sq. feet of space with a power and cooling requirement of 400 or even 200 Watts/sq. foot will raise a few eyebrows. It might help if we talked to people in terms of the processors/sq. foot that the requirements dictate and then worked backwards to the power and cooling load from there. Coming up with a big number seemingly out of thin air usually doesn't work--even when it's backed up by solid research.
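Working backwards might look something like this sketch. The watts-per-processor figure is derived from the 4 kW, twelve-processor blade chassis mentioned earlier. The target of 1.2 processors/sq. foot and the 10,000 sq. foot floor are hypothetical inputs, and the cooling load is treated as equal to the electrical load since everything you put in has to be taken out.

```python
# Derived from the blade chassis figure: 4 kW for twelve processors.
WATTS_PER_PROCESSOR = 4000 / 12


def power_density(processors_per_sqft, watts_per_processor=WATTS_PER_PROCESSOR):
    """Power (and cooling) load in watts per sq. foot for a target density."""
    return processors_per_sqft * watts_per_processor


def total_load_kw(processors_per_sqft, floor_sqft,
                  watts_per_processor=WATTS_PER_PROCESSOR):
    """Total electrical load in kW for the whole floor."""
    return power_density(processors_per_sqft, watts_per_processor) * floor_sqft / 1000


# Hypothetical requirement, for illustration only.
target_density = 1.2   # processors per sq. foot
floor_sqft = 10_000

print(f"Processors needed: {target_density * floor_sqft:,.0f}")
print(f"Load per sq. foot: {power_density(target_density):.0f} W")
print(f"Total load:        {total_load_kw(target_density, floor_sqft):,.0f} kW")
```

Starting from "we need about 12,000 processors on this floor" and arriving at 400 W/sq. foot is an easier conversation than leading with the 400.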

The alternative is building the data center twice as big as you need and then only using half of it. Don't laugh--it works.