Facebook enlists AI to tweak web server performance

Facebook has to run lots of live tests to figure out which configurations are best for its HTTP servers, and it sped up the search for optimal settings by employing a machine learning approach called Bayesian optimization to narrow the list of plausible solutions.
Written by Tiernan Ray, Senior Contributing Writer

Machine learning has been used in recent years to tune the performance of machine learning itself, so why not use it to improve performance on a somewhat humbler level: The performance of a web server?

That's the point of view taken by researchers at Facebook, who on Monday outlined their work tweaking the settings of the servers that make up the social network's infrastructure.

The work, prepared by Benjamin Letham, Brian Karrer, Guilherme Ottoni, and Eytan Bakshy, is presented in a paper in the journal Bayesian Analysis, and also discussed in a post on Facebook's AI research blog.

Like many internet services, Facebook runs so-called A/B tests to gauge how well servers run when this or that variable is altered. Anyone who has seen different versions of a web page, with the look of a button or the layout of text altered, will be familiar with this kind of tweaking, which on a commerce site, say, is used to optimize things such as click-through rates or shopping cart use.

In the case of this research, the scientists altered the options for the just-in-time (JIT) compiler that converts PHP and Hack code to native x86 machine code inside the "HipHop Virtual Machine" (HHVM), the open-source web server that Facebook uses to serve HTTP requests.

For example, the JIT can be set to do things such as in-line a given block of code. Such adjustments can make code size larger, and so A/B testing is required to figure out whether the speed-up of in-lining code is worth the trade-off of consuming more server memory.

The authors used an approach called "Bayesian optimization," a form of machine learning that emphasizes using past, or prior, information to divine an optimal solution. Bayesian optimization has been used in the past decade to optimize the "hyper-parameters" of machine learning itself, such as how big to make the batch size or how rapid the learning rate. Because such Bayesian optimization can remove the drudgery of designing hyper-parameters, one group has, for example, called Bayesian optimization a way to "automate" machine learning.

The Facebook authors used Bayesian optimization to run A/B tests with the JIT compiler's settings in various positions. The big advantage is speed. Because tests have to be done in a production environment in order to observe the effects of the different settings, there's a premium placed on getting the tests done quickly in order to move forward with changes to the web server.

The authors write that compared to typical A/B testing, where a single change in configuration is tested at a time, the Bayesian optimization "allowed us to jointly tune more parameters with fewer experiments and find better values."

The key here is the word "jointly": Bayesian mechanisms rule out certain choices of configurations without having to actually run those as an A/B test, by extrapolating from the results of a given A/B test to other parameter values, to narrow the number of "feasible" configurations. As the authors phrase this broad search power, "A test of a parameter value in a continuous space gives us information about not only its outcome, but also those of nearby points." As experiments are carried out, the Bayesian model gains new data with which to further narrow the search for potentially optimal configurations, so the whole A/B testing affair can get more efficient as it goes along.
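To make the "nearby points" idea concrete, here is a minimal sketch of a Gaussian process, the kind of statistical model commonly used in Bayesian optimization. It is an illustration only, not the authors' code: the kernel, the length scale, the noise level, and the two example measurements are all assumptions chosen for the demo.

```python
import math

def rbf(a, b, length_scale=0.2):
    """Squared-exponential kernel: nearby inputs are strongly correlated."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def gp_posterior(xs, ys, x_new, noise_var=0.01):
    """Posterior mean and variance of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise_var if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    alpha = solve(K, ys)      # K^-1 y
    v = solve(K, k_star)      # K^-1 k_star
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_new, x_new) - sum(k * w for k, w in zip(k_star, v))
    return mean, max(var, 0.0)

# Hypothetical data: two tested values of one normalized setting,
# each with a measured outcome from an A/B test.
tested_settings = [0.2, 0.5]
measured_outcomes = [0.4, 1.0]

mean_near, var_near = gp_posterior(tested_settings, measured_outcomes, 0.55)
mean_far, var_far = gp_posterior(tested_settings, measured_outcomes, 0.95)
print(f"near a tested point: mean={mean_near:.3f}, variance={var_near:.3f}")
print(f"far from any test:   mean={mean_far:.3f}, variance={var_far:.3f}")
```

In this sketch the untested value 0.55 sits close to a tested one, so the model's uncertainty about it is small, while the value 0.95 remains almost as uncertain as before any test was run. That difference is what lets an optimizer decide where the next experiment is worth running.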

A novel contribution of this research is its handling of noise. The authors note that unlike the task of optimizing machine learning networks, testing server settings in A/B experiments involves a lot of noise. There is noise in the measurement of the results of a test, because servers in the real world can see a variety of performance effects as a consequence of changes in settings, and there are also "noisy" constraints, such as needing to keep memory usage in a server within reason. They came up with a method to address such noise in their Bayesian algorithms, and they concluded that the new approach more readily produced optimal solutions than other kinds of Bayesian approaches.
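One common way to reason about a constraint that can only be measured noisily is to weight a candidate's expected improvement by the probability that it satisfies the constraint. The sketch below shows that textbook heuristic, not the method the Facebook authors developed. The candidate names, the predicted means and standard deviations, and the memory limit are all hypothetical numbers.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_pdf(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def expected_improvement(mean, std, best_so_far):
    """Expected gain over best_so_far (maximization), Gaussian prediction."""
    if std == 0.0:
        return max(mean - best_so_far, 0.0)
    z = (mean - best_so_far) / std
    return (mean - best_so_far) * normal_cdf(z) + std * normal_pdf(z)

def prob_feasible(mem_mean, mem_std, mem_limit):
    """Probability that predicted memory use is at or below the limit."""
    if mem_std == 0.0:
        return 1.0 if mem_mean <= mem_limit else 0.0
    return normal_cdf((mem_limit - mem_mean) / mem_std)

def constrained_score(perf_mean, perf_std, best_so_far,
                      mem_mean, mem_std, mem_limit):
    """Expected improvement discounted by the chance of staying in budget."""
    return (expected_improvement(perf_mean, perf_std, best_so_far)
            * prob_feasible(mem_mean, mem_std, mem_limit))

# Hypothetical predictions for two candidate configurations.
best_so_far = 1.0   # best performance measured so far (arbitrary units)
mem_limit = 10.0    # assumed memory budget (arbitrary units)

# Candidate A: modest speed-up, probably within the memory budget.
score_a = constrained_score(1.2, 0.1, best_so_far, 9.5, 0.5, mem_limit)
# Candidate B: bigger speed-up, but probably over the memory budget.
score_b = constrained_score(1.5, 0.1, best_so_far, 11.0, 0.5, mem_limit)

print(f"candidate A score: {score_a:.4f}")
print(f"candidate B score: {score_b:.4f}")
```

Under these made-up numbers, candidate A scores higher even though candidate B promises the larger speed-up, because B is predicted to exceed the memory budget with high probability.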

An interesting wrinkle with this kind of approach to A/B testing is that some configurations will never see the light of day: because the Bayesian optimization analysis predicts which configurations should be ruled out entirely, it will eliminate those configurations from testing. The authors consider this an advantage in terms of potentially reducing the tumult of exposing users to lots of different experiments.
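As a rough illustration of how a model's predictions can rule out configurations before they are ever deployed, the sketch below drops any candidate whose optimistic estimate (predicted mean plus two standard deviations) still falls short of the best result already measured. The rule, the threshold, and the configuration names are assumptions for the example, not the criterion used in the paper.

```python
def prune_candidates(predictions, best_observed, z=2.0):
    """Split candidates into (worth_testing, ruled_out).

    predictions maps a configuration name to a (mean, std) pair predicted
    by some model. A candidate is ruled out when even its optimistic
    estimate, mean + z * std, is below the best outcome observed so far.
    """
    worth_testing, ruled_out = [], []
    for name, (mean, std) in predictions.items():
        optimistic = mean + z * std
        if optimistic >= best_observed:
            worth_testing.append(name)
        else:
            ruled_out.append(name)
    return worth_testing, ruled_out

# Hypothetical model predictions for three untested configurations.
predictions = {
    "config_a": (1.10, 0.05),  # likely better than the current best
    "config_b": (0.80, 0.05),  # confidently worse
    "config_c": (0.90, 0.20),  # probably worse, but too uncertain to dismiss
}

worth_testing, ruled_out = prune_candidates(predictions, best_observed=1.00)
print("worth testing:", worth_testing)
print("ruled out:", ruled_out)
```

Only the configuration that the model is confident about being worse is dropped; the uncertain one stays in the pool, since an experiment could still show it to be an improvement.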
