Inside Facebook's lab: A mission to make hardware open source

Summary: A look behind the scenes of Facebook's hardware lab, the spiritual home of the Open Compute datacentre hardware movement, which may radically change the type of IT enterprises use, and who they buy it from.


Amir Michael, Facebook's manager of system engineering, is standing in the company's hardware lab, trying not to get in the way of the assorted engineers, wheelie-chairs and bottles of water scattered around the room, as he describes Facebook's attempt to democratise hardware.

"We're trying to take away a lot of the uniqueness of server design" to create a "clean, open canvas" for companies to base their datacentres around, he explains.

What Amir is talking about are the server and storage systems that Facebook uses in its datacentres and how the social-networking leader is hoping that by publishing the designs and specifications of this low-power, low-cost hardware, it can reduce the cost of infrastructure for businesses large and small. 

Facebook lab shot
In Facebook's lab, the company is trying to reinvent storage (left) and compute (far right). Image: Jack Clark

The equipment in the lab — a novel Open Vault storage array and various versions of the Open Compute server — is being developed by Facebook as part of its Open Compute Project, a cross-industry scheme by the company to bring an open-source approach to physical hardware. 

The Open Compute Project was launched by Facebook in April 2011 as a way of distributing its server designs, but in an attempt to seek broader participation in the scheme, the company spun the project off into its own Foundation in October 2011.

Facebook remains the initiative's de facto leader: its vice president of hardware design and supply chain operations, Frank Frankovsky, is the chairman of its board of directors. That said, the rest of the board are from major enterprises such as Intel, Rackspace, Arista Networks and Goldman Sachs. If these companies are involved in this scheme, you can assume that the Open Compute approach is something that both IT buyers and IT sellers think is worth a bet.

Lifting the industry

"Our goal is to be non-proprietary," Matt Corddry, a senior manager of hardware engineering at Facebook, said during my recent visit to the lab. "We're not trying to maintain an advantage with this gear, we're trying to elevate the industry."

This approach contrasts with other large cloud operators. Google, Amazon and Microsoft are all notoriously secretive about their datacentre infrastructure, though Google occasionally releases research papers outlining some of its more advanced software systems.

The Open Compute scheme has received broad industry interest, with both AMD and Intel contributing motherboard designs and CAD documents. Facebook thinks that in time, its Open Compute designs could shake up the enterprise IT landscape. 

"What I see happening is a lot of these principles that we've shared will start to take root in enterprise systems as well," Michael said. "The server can be lightweight; it can be vanity-free."

In fact, the Open Compute Foundation says it believes upcoming Open Compute motherboards designed by Intel (codenamed 'Decathlete') and AMD ('Roadrunner') could, in time, become "a universal motherboard, in terms of functionality, supporting 70 to 80 percent of target enterprise infrastructure use cases".

As for the hardware itself, both the Open Compute server and storage equipment are designed differently from the types of gear being made by enterprise vendors such as HP, IBM and Dell.

Sled servers lead the way 

Open Compute Server Version 2
The second generation of the Open Compute server integrates an air duct into the chassis. Image: Jack Clark

The servers (pictured) are based on a sled chassis built to work with Facebook's Open Rack. This is a new approach to server rack design that takes equipment typically found in servers (power systems, networking and so on) and plugs it into the rack itself.

They take power in from a power distribution system that lives in a portion of the rack, rather than the server, and the drives are situated at the front to make it simpler to swap them out if they fail. 

The prototype servers (pictured) are version 2 of the Open Compute specification.

The major differences in the new compute server compared with its predecessor are a move to a single motherboard per chassis, larger fans (now 80mm, up from 60mm) that consume less power, and the incorporation of an air duct in the server sled's chassis. This means Facebook can save on the cost of building plastic air ducts then fitting them to its servers. 

In the future, Facebook hopes to entirely remove the drive from the web servers and boot off a low-power, more-reliable mSATA solid-state drive. A 60GB drive should be sufficient to host Facebook's OS and its logs.

mSATA drives are typically used in laptops, but Michael's team has built an adapter that lets Facebook use them in servers.

Corddry is keen on this, as it lets Facebook obtain cost savings from a "really high volume commodity part", he said, noting that "you don't need enterprise-grade equipment to boot a web server".

Open Vault storage push

The other major project the Facebook hardware labs team is working on is a way of redesigning storage arrays to suit large-scale datacentres.

Open Vault 'Knox' storage
The Open Vault storage systems can selectively cut power to rarely used storage, saving power. Image: Jack Clark

The Open Vault equipment, codenamed Knox (pictured), packs multiple hard drives onto a retractable sled. This can be pulled out and then, using a hinge, lowered to allow engineers to easily swap drives out in case of failures. 

Facebook has a constant backlog of equipment that needs maintenance, Michael said, with a rough annualised failure rate of about one percent. For this reason, making it easy to maintain kit and swap out failed drives has become a priority. 
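
To get a feel for what a one percent annualised failure rate means at scale, here is a rough back-of-the-envelope sketch. The fleet size used below is purely hypothetical and is not a figure from Facebook; only the one percent rate comes from Michael's estimate.

```python
def expected_failures_per_year(unit_count, annual_failure_rate=0.01):
    """Expected number of failed units in a year at the given annual rate."""
    return unit_count * annual_failure_rate


def expected_failures_per_day(unit_count, annual_failure_rate=0.01):
    """Average failures per day, spreading the yearly figure evenly over 365 days."""
    return expected_failures_per_year(unit_count, annual_failure_rate) / 365


# Hypothetical fleet size, chosen only for illustration.
HYPOTHETICAL_FLEET = 100_000
per_year = expected_failures_per_year(HYPOTHETICAL_FLEET)
per_day = expected_failures_per_day(HYPOTHETICAL_FLEET)
print(f"{per_year:.0f} failures a year, about {per_day:.1f} a day")
```

Even at a rate this low, a large fleet would produce a steady trickle of repairs, which fits the constant maintenance backlog Michael describes.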

Knox has a feature that lets it cut the power to individual drives when they are not being used, and differing numbers of drives can be attached to each motherboard according to the processing needs of the storage server.

This gives Facebook two useful features. To start with, it can give power to its 30-odd drives according to the frequency with which their data is accessed. In other words, regularly accessed information can be kept on drives that are always switched on, while rarely touched data can be put on drives that are by default powered-down and only switched on when an access request is made. That lets the company save on power.
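
The power policy described above could be sketched roughly as follows. This is an illustration only, not Facebook's implementation: the `Drive` class, the `accesses_per_day` metric and the `hot_threshold` value are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    drive_id: int
    accesses_per_day: float  # hypothetical measure of how often the data is read
    powered_on: bool = True


def apply_power_policy(drives, hot_threshold=10.0):
    """Keep frequently accessed drives powered; power down the rest.

    hot_threshold is an assumed cut-off, not a figure from the article.
    """
    for drive in drives:
        drive.powered_on = drive.accesses_per_day >= hot_threshold
    return drives


def read(drive):
    """Serve an access request, switching the drive on first if needed.

    Returns True if the drive had to be woken up for this request.
    """
    had_to_wake = not drive.powered_on
    drive.powered_on = True
    return had_to_wake


drives = apply_power_policy([Drive(0, 500.0), Drive(1, 0.2)])
woke = read(drives[1])  # a rarely used drive is switched on only on demand
```

The trade-off in a scheme like this is latency: a request that lands on a powered-down drive has to wait for it to start up, which is why only rarely touched data would be a sensible fit for the cold drives.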

Another benefit is that it gives the company a way to solve hardware problems. 

"Drives... actually fail the most in our datacentre," Michael said. "Part of our procedure is when a drive fails we try and power cycle it."

Yes — sometimes it really is a simple matter of turning it off and on again, according to Facebook's team. "A lot of drive manufacturers get returns with no trouble found," Corddry said. 
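
As a rough sketch, a "power cycle before you replace" procedure might look like the following. The `power_off`, `power_on` and `is_healthy` hooks are hypothetical stand-ins supplied by the caller; the article does not describe the actual interface Knox exposes, and the single retry is an assumption.

```python
def handle_drive_failure(drive_id, power_off, power_on, is_healthy, max_cycles=1):
    """Try power cycling a failed drive before flagging it for replacement.

    power_off, power_on and is_healthy are caller-supplied callables that
    take a drive id (hypothetical hooks, not a real Knox API).
    """
    for _ in range(max_cycles):
        power_off(drive_id)
        power_on(drive_id)
        if is_healthy(drive_id):
            return "recovered"
    return "needs_replacement"


# Example with stand-in hooks: this drive comes back after one power cycle.
cycles = {"count": 0}


def fake_power_off(drive_id):
    pass


def fake_power_on(drive_id):
    cycles["count"] += 1


def fake_is_healthy(drive_id):
    return cycles["count"] >= 1


result = handle_drive_failure(7, fake_power_off, fake_power_on, fake_is_healthy)
```

Only drives that still report as unhealthy after the retry would need an engineer to pull the sled, which is where the per-drive power control would save maintenance time.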

An additional benefit of Knox is that its design makes it relatively easy to manipulate the proportion of storage assigned to...

Jack Clark

About Jack Clark

Currently a reporter for ZDNet UK, I previously worked as a technology researcher and reporter for a London-based news agency.

  • Nice to see an informative article and not another iPhone click bait

    Enough with Apple already!
    • Absolutely.

      ZDNet really need to stop focusing so much on the 28% of the consumer phone market at the risk of alienating the rest of the marketplace.

      It was becoming extremely tiring. And this article is also written without the usual brand name fanboyism that "really" irritates.

      If I wanted link bait and flame wars I'd go visit 4chan.
      • We're "customers", not "consumers"

        We don't eat silicon, and one of those two terms strips us of any authority... I am a customer and I demand more than just lamely built garbage in return for my very hard-earned money.
        • "Consumer" does seem to be an odd choice

          But here ya go, straight from the horse's mouth:

          con·sum·er /kuhn-soo-mer/

          1. A person who purchases goods and services for personal use.
          2. A person or thing that eats or uses something.

          user - purchaser - customer

          Source: Google
  • This is a new approach to server rack design?

    "They take power in from a power distribution system that lives in a portion of the rack, rather than the server"

    The military's been using that type of system for quite some time. I'm not seeing how this is some "new approach".
    William Farrel
    • Innovation is taking an idea from one place

      and applying it in another place. In this case, Facebook is taking an idea from the Military, and applying it to civilian use.
      • how it all started !!!!

        how it all started !!!!
        • The internet itself... bingo

          Still, "open source" implies the programs made are free, with source code provided. There are some exceptions, but those tend to be service-oriented.

          Facebook, to those who pay attention, is not to be trusted. They have not earned it, their terms of use is predatory, they take corporate welfare, sold out under those who were gullible enough to invest in this leech of a company, and their opt-in security policies just show how utterly disingenuous they truly are.
      • Isn't that something like a blade server?

        In the sense that power is supplied via a backplane, with the server being that of the cards themselves?
        NoMore MicrosoftEver
      • More info on military

        Hello mheartwood,
        I hadn't heard of military schemes to do this. If you can perhaps remember the names of any programs, equipment or equipment suppliers then I'll take a look into it. Thanks very much for commenting!
        Jack Clark
    • Inside Facebook's lab: A mission to make hardware open source

      @William Farrel
      technology advances always started with your tax money... the europeans are trying to leapfrog by spending multi-billion dollars on their Large Hadron Collider. uncle sam did it before with the manhattan project, the nasa, the arpa/darpa, etc. and as always the military was the major beneficiary for all the advances, before filtering down to the commercial market. so, yes the military was using the technique before... but facebook is trying to innovate from those advances without spending billions in r&d reinventing the wheels.
  • Energy savings don't start at the drive level........

    It starts at the point where fuels are burnt and electricity made. As long as the cloud is powered by big powerplants that push kilovolts down the grid, which are then transformed down to 110 or 220 volts and then again transformed down to 5 and 12 volts to feed your CPU and drive, we are losing this battle. Of the 100 watts that are made by that distant powerplant, less than 40 watts reach your computer. The rest is dissipated along the way.

    We will start making real savings only when the industry realizes that the power that feeds their computers should be produced as close to the data centre as possible and that that power should go from burning fuel to 12 volts via the shortest possible technical route.
    There is plenty of technology that will do just that, but as long as wasting energy in the grid is cheaper than using "clean" technology we are not going to get that.
    • Some of the big players have realized this...

      Microsoft, Google, others, have been, for instance, building large data centers along the Columbia river in Oregon and Washington in the vicinity of dams that produce the power.
  • pretty interesting

    but at the same time it is going to push vendors out of the market.
    • Or force them to change their ways...

      Hey Jimster480,
      I think it could do both things. On the one hand, it could threaten certain systems made by the major OEMs; on the other hand, they could adopt the tech and do their own systems based on it. I think Chinese telcos are involved in Open Rack and I have suspicions that HP is trying to replicate Open Compute stuff with its "Gemini" system. I did a story on this called Gemini: HP's attempt to get a like from Facebook and friends - have a read
      Thanks for commenting
      Jack Clark
  • To think...

    Just how many pwnd suckers are overexposing themselves to the world on each of those disk drives. By the sled load! w00t
    • Exposed or warned?

      All I expose are the daily news issues, my political views and very clear expressions of how good my security system is, how deadly my guns are, and how good my aim is. (Think of it as an "anti-invitation" to criminals).

      No checking in, no vacation plans, no family drama, and no info about where I really live....
      Allen Frady
      • Hey not bad

        You are definitely ahead of the curve. ;)
  • Big Mess

    "We're trying to take away a lot of the uniqueness of server design to create a clean, open canvas"
    Before trying, please clean up that lab and organize those wires.
    • They were a little embarrassed by the wires...

      Hey Rikkrdo,
        Actually Matt and Amir were rather apologetic about the clutter, but as it was a working lab it seemed like a certain amount of clutter came with the territory.
      Jack Clark