Google's warehouse-size power problem

Written by Robin Harris, Contributor

Power to the internet! Beginning 5 years ago, Google took the lead in making power consumption an issue for IT vendors. No one cared that much before that because no one else was building 100,000+ server data centers using free software and cheap PC hardware. Google wasn't the only factor, but their use of free software, cheap hardware and massive scale meant that energy consumption became one of the few places they could cut costs.

Now, 5 years later, Intel's power-hungry NetBurst architecture is dead and power-efficient kit - multi-core CPUs, LED backlights and even disk drives - is coming on strong across the industry. Google's timing was excellent, true, but they also used their clout to get vendors moving. Bravo!

But Google isn't done. Using the data-intensive methodologies their massive scale makes possible, they've now published their analysis of computer power consumption. The paper offers an interesting peek into the problems of running the world's largest internet data centers as well as some pointers to what we may be seeing in five years.

Googlers take a hard look at power

How accurate are the power requirements we see in specs? Is CPU power scaling really helpful? What determines computer power consumption? Google now gives us one large-scale data point, looking at groups of up to 15,000 servers. That's a tiny fraction of the server population of a recent Google data center, but large enough to be useful.

Power Provisioning for a Warehouse-sized Computer (an 11-page PDF) by Xiaobo Fan, Wolf-Dietrich Weber and Luiz André Barroso is worth a read if you're willing to interpret cumulative distribution function graphs. The paper begins by noting that a datacenter costs $10-$20 per deployed watt of peak computer power, excluding cooling and other loads. That is more than 10 years of power costs, so getting the numbers right pays big dividends.

Google doesn't buy power the way you and I do

Google's newest data center in Oregon has taken the place of decommissioned aluminum smelters as a major power consumer. The cheap Columbia River hydro-power costs roughly 25 cents per watt per year, or $2.50 per watt for 10 years. Not only that, but Google - like other big users - gets charged based on their peak power draw. If they do one hour at 100 MW and the rest of the month at 25 MW, they get charged for consuming 100 MW for the whole month.

So keeping an even strain on a data center's capacity is important. Ideally they want a data center that uses, say, a steady 50 megawatts, so they can build an efficient facility and not get billed for power spikes. That isn't the average homeowner's problem. Google really needs to understand power consumption.
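
To make the billing example concrete, here is a minimal Python sketch of peak-based charging using the 100 MW / 25 MW figures above. The 30-day month and the function and variable names are illustrative assumptions, not details of Google's actual power contract.

```python
# Sketch of peak-demand billing using the article's example figures.
HOURS_PER_MONTH = 30 * 24  # assumption: a 30-day billing month


def peak_and_average_mw(load_profile):
    """Return (peak MW, average MW) for a list of (hours, megawatts) segments."""
    total_hours = sum(hours for hours, _ in load_profile)
    energy_mwh = sum(hours * mw for hours, mw in load_profile)
    peak_mw = max(mw for _, mw in load_profile)
    return peak_mw, energy_mwh / total_hours


# One hour at 100 MW, the rest of the month at 25 MW.
spiky_month = [(1, 100.0), (HOURS_PER_MONTH - 1, 25.0)]
peak_mw, average_mw = peak_and_average_mw(spiky_month)

# Under peak-based billing the site pays as if it drew peak_mw all month.
overcharge_ratio = peak_mw / average_mw

print(f"billed at {peak_mw:.0f} MW, averaged {average_mw:.1f} MW "
      f"({overcharge_ratio:.1f}x)")
```

Under these assumptions a single one-hour spike means paying for roughly four times the power actually used on average, which is why a flat load profile is worth so much.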

So what did they figure out? Well, a whole heck of a lot. Here are some key findings.

  • The gap between aggregate and spec power can be as great as 40% for a datacenter, though Google's applications are better behaved. That is a lot of wasted distribution and cooling capacity.
  • User-operated systems can be more efficient, because the system can be run at close to its rated performance. This is true of home systems and of Google, but not of the average enterprise data center.
  • Power management is more effective at the datacenter level than at the rack level. There is no comment on whether that applies to an individual PC.

How do you measure power on 15,000 systems?

Google uses the cheapest possible - but no cheaper - PC hardware for their servers, which is one thing they have in common with many SOHO users. Even under heavy load a server would use less than 60% of its nameplate power.

Google couldn't measure 15,000 servers directly, so they looked for a reliable and easy-to-monitor indicator. After testing they determined that CPU utilization predicted power usage to within 1% of measured power use. If the CPU is busy, everything else tends to be busy too: memory, disks, fans and - in the home case - video cards. Also, Google measures large numbers of machines, so individual variations get smoothed out.
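
As an illustration of how CPU utilization can stand in for a wattmeter, here is a minimal sketch that assumes power rises linearly from an idle level to a busy level as utilization goes from 0 to 1. The linear form, the function names and the 100 W / 200 W figures are assumptions for illustration only; the article does not give the exact model Google fitted.

```python
# Sketch: estimate server power from CPU utilization with an assumed linear model.

def estimate_power_watts(cpu_utilization, idle_watts, busy_watts):
    """Interpolate between idle and fully busy power (utilization in 0..1)."""
    if not 0.0 <= cpu_utilization <= 1.0:
        raise ValueError("cpu_utilization must be between 0 and 1")
    return idle_watts + (busy_watts - idle_watts) * cpu_utilization


def estimate_group_power_watts(utilizations, idle_watts, busy_watts):
    """Sum the per-server estimates; per-machine errors tend to average out."""
    return sum(
        estimate_power_watts(u, idle_watts, busy_watts) for u in utilizations
    )


# Hypothetical server: 100 W idle, 200 W with the CPU fully busy.
single_server = estimate_power_watts(0.5, idle_watts=100.0, busy_watts=200.0)
small_group = estimate_group_power_watts(
    [0.2, 0.5, 0.8], idle_watts=100.0, busy_watts=200.0
)
```

In practice the idle and busy figures would be measured once on a sample machine of each hardware type, after which only the CPU utilization of each server needs to be collected.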

So how can this help us save power?

Google evaluated two power saving techniques. CPU voltage/frequency scaling has been implemented on some AMD and Intel chips. The idea behind scaling is that when the processor is less busy, it can reduce its input voltage and clock frequency to save power.

Modeling CPU scaling under data center loads, Google found that datacenters could see savings of 15-25%, depending on how aggressively it was used. They also found that I/O bound servers benefitted less than compute-intensive workloads.

Another option is to improve non-peak power efficiency. Most "efficiency per watt" metrics are based on peak loads, but Google found that most systems spent little time at "peak". Google also found that the power drawn by idle systems never dropped below 50% of peak, while ideally an idle system's consumption would drop to zero.

If idle power were only 10% of peak power, an enterprise data center could save 50% on its power. Even Google's well-behaved apps would see savings in the 30%+ range. We can expect to see Google push the non-peak power efficiency issue pretty hard, and since they spend some $500 million a year on servers, I expect vendors will listen.
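
A rough way to see where savings of this size come from is to compare average power with idle draw at 50% versus 10% of peak. The sketch below assumes power scales linearly between idle and peak, and the utilization traces are hypothetical values chosen for illustration, not data from the paper.

```python
# Sketch: savings from lowering idle power, assuming a linear power model.

def average_power_fraction(utilizations, idle_fraction):
    """Mean power as a fraction of peak, with idle draw at idle_fraction of peak."""
    per_sample = [idle_fraction + (1.0 - idle_fraction) * u for u in utilizations]
    return sum(per_sample) / len(per_sample)


def savings_from_lower_idle(utilizations, current_idle=0.5, improved_idle=0.1):
    """Fraction of energy saved when idle power falls from current to improved."""
    before = average_power_fraction(utilizations, current_idle)
    after = average_power_fraction(utilizations, improved_idle)
    return 1.0 - after / before


# Hypothetical traces: a lightly loaded fleet and a busier one.
lightly_loaded = [0.1, 0.2, 0.3, 0.2]   # mean utilization 0.2
busier = [0.3, 0.4, 0.5, 0.4]           # mean utilization 0.4

light_savings = savings_from_lower_idle(lightly_loaded)
busy_savings = savings_from_lower_idle(busier)
```

With these made-up traces the lightly loaded fleet saves a little over half its energy and the busier one about a third, which is in the same range as the figures quoted above. The lower the typical utilization, the more idle power dominates the bill.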

For us home and office users the savings would probably be even greater, since typical office tasks rarely stress a system, let alone take it to a peak load. Our systems spend all their time in "non-peak" performance - gamers excepted.

The Storage Bits take

As computing systems weave their way deeper into the fabric of industrial societies - and I believe we are still in the early stages of the process - the energy demand becomes a greater issue. Whatever you believe about global warming, more efficient computers are a Good Thing, just as more efficient automobiles and trucks are. I applaud Google for the leadership role they've played in getting energy consumption into vendors' roadmaps and I look forward to much more efficient PCs and servers 5 years from now.

Comments welcome, of course. Has anyone played with those wattmeter gadgets to see what their system really does?
