Facebook grapples with datacentre gremlins

The social-networking company sets great store by datacentre efficiency, but is finding the variable airflow of servers and quirky characteristics of networking devices difficult to deal with
Written by Jack Clark, Contributor

Facebook is facing a mismatch between its self-designed Open Compute Project datacentre equipment and gear from mainstream vendors, which could make it harder to meet energy-efficiency goals.

Facebook has seen airflow issues when using mainstream datacentre equipment alongside its Open Compute Project machines. Photo credit: Facebook

The issues stem from the differing patterns of airflow generated by mainstream servers and networking equipment on the one hand, and the Open Compute Project (OCP) servers on the other, according to Tom Furlong, Facebook's head of IT operations, in an interview with ZDNet UK on Wednesday.

"We have a mix of OEM and [OCP] servers. We find from a cooling standpoint our design servers in the OCP architecture are very happy at a very wide range of temperatures and airflow rates," Furlong said. "Servers we buy off the shelf [are] not engineered as optimally for airflow like ours are [and] are a little less happy."

Facebook launched the OCP in April as a clearinghouse for the open sharing of hardware designs that any organisation can use to build cheaper, energy-efficient datacentres. It has provided the specifications needed to build the servers, power delivery systems and other technology used in its Prineville, Oregon facility, and the electrical design system used in its upcoming datacentre in Luleå, Sweden.

Servers built to OCP designs lack many of the features of standard servers and use as few components as possible. Although the OCP servers are efficient, they make up only 60 percent of Facebook's server estate, with off-the-shelf equipment doing specific but undisclosed tasks, Furlong said.

The problem Facebook faces is how the thermal and airflow characteristics of its servers differ from the industry norm, and what happens when the two types are sitting next to one another in its datacentres.

"I would say if there's one issue that I'm paying attention to the most, [it] is what happens if you have a large number of OEM servers in one end of the facility and [OCP servers] in the other end in a particular room," he said.

Facebook's solution has been to pressurise the room in such a way that higher pressure reigns near the OCP servers, allowing their fans to run slower and consume less energy, with pressure tailing off near the mainstream servers so their fans can keep the kit cool.

Networking gear

The second problem Facebook has is with networking hardware. Because it does not design its own networking equipment, it has to make do with off-the-shelf switches and routers. While these bits of kit do the job, they do not exhaust air from the rear as its servers do, and this can disrupt the hot aisle/cold aisle containment the company tries to implement.

"Our concern is that in our current design, we have essentially our network loads where our cluster switches and routers are in the main data hall, [but] they don't have a homogeneous front-to-back airflow like our servers do," Furlong said. "We're trying a bunch of things to make sure we get sufficient cooling to those network devices... barring putting them in their own room."

Facebook is still grappling with the problem and has considered putting suction fans in place to help the networking gear deal with its environment. "We can deal with a few servers being unhappy, [but] it's more impactful when the network equipment is unhappy," Furlong said.

This type of networking problem is not exclusive to Facebook; other datacentre operators, such as Telstra, have scratched their heads over how to handle switches like the Cisco 7600.
