WebScaleSQL: MySQL for Facebook-sized databases

The MySQL engineering teams at Facebook, Google, LinkedIn, and Twitter have joined forces to create their own version of MySQL, WebScaleSQL, for their monster-sized databases.
Written by Steven Vaughan-Nichols, Senior Contributing Editor

In a Facebook engineering blog post, Steaphan Greene, a Facebook software engineer, announced the creation of a new MySQL branch: WebScaleSQL. This is a version of Oracle's MySQL for gigantic database loads.

Greene explained that to handle the "more than 1.23 billion people who use Facebook to share and connect with each other, we’ve had to build an expansive and incredibly advanced infrastructure—including one of the largest deployments of MySQL in the world." To meet these demands, Facebook's MySQL team reached out to their counterparts at Google, LinkedIn, and Twitter to create WebScaleSQL.

"Our goal, in launching WebScaleSQL," Greene continued, "is to enable the scale-oriented members of the MySQL community to work more closely together in order to prioritize the aspects that are most important to us. We aim to create a more integrated system of knowledge sharing to help companies leverage the great features already found in MySQL 5.6, while building and adding more features that are specific to deployments in large-scale environments."

In the WebScaleSQL FAQ, the group explained that it named the branch WebScaleSQL because "that is exactly what it aims to be." This is not to say that it's a MySQL fork. It's not.

The FAQ added, "As long as the MySQL community releases continue, we are committed to remaining a branch—and not a fork—of MySQL that’s focused specifically on the challenges of deploying MySQL at our scale."

Thus, the group is not following in the footsteps of MariaDB, the MySQL fork. Many organizations, such as Red Hat, some branches of Google, and Wikipedia, have moved from MySQL to MariaDB.

The developers built WebScaleSQL on MySQL 5.6, according to the FAQ, because "We reached a consensus that MySQL 5.6 was the right choice for this, as it has the production-ready features we need to operate at scale, and the features planned for MySQL-5.7 seem like a fitting path forward for us. We will continue to revisit this decision as the ecosystem evolves."

According to Greene, "WebScaleSQL [will remain] open as we go, to encourage others who have the scale and resources to customize MySQL to join in our efforts. And of course we will welcome input from anyone who wants to contribute, regardless of what they’re currently working on." The code is licensed under the GPLv2.

So far, Greene said, "WebScaleSQL has already produced exciting results." Working together, the engineers involved in WebScaleSQL have made major changes to aid in the development of the new branch, including:

  • An automated framework that will, for each proposed change, run and publish the results of MySQL's built-in test system (mtr).

  • A full new suite of stress tests and a prototype automated performance-testing system.

  • Several changes to the tests already found in MySQL, and to the structure of some existing code, to avoid problems where otherwise safe code changes had previously caused tests to fail or caused unnecessary conflicts. These changes make it easier to work on the code and helped us get started creating WebScaleSQL.

  • Several changes to improve the performance of WebScaleSQL, including buffer pool flushing improvements; optimizations to certain types of queries; support for NUMA interleave policy; and more.

  • New features that make operating WebScaleSQL at true web scale easier, such as super_read_only, and the ability to specify sub-second client timeouts.

What they're working on now, according to Greene:

  • Contributing an asynchronous MySQL client, which means that while querying MySQL, we don’t have to wait to connect, send, or retrieve. This non-blocking client is currently being code-reviewed by the other WebScaleSQL teams, after being used in production at Facebook for many months.

  • Preparing to move Facebook's production-tested versions of table, user, and compression statistics into WebScaleSQL.

  • Preparing to push the remaining components of Facebook's current production-tested version of compression that were not already included in MySQL 5.6 into WebScaleSQL.

  • Adding the Logical Read-Ahead mechanism that we have proven in production to achieve large, quantifiable speed improvements (up to 10x) to full table scans, such as nightly logical back-ups.
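
To make the first item in that list concrete, here is a minimal sketch of what "non-blocking" buys you. It is not the WebScaleSQL client or any MySQL API: `fake_query` is a hypothetical stand-in that simulates a database round trip with a timer, and the table names and latencies are invented for illustration. The point is only that three requests issued together finish in roughly the time of the slowest one, rather than the sum of all three.

```python
import asyncio
import time


# Hypothetical stand-in for a database round trip.
# This is NOT a WebScaleSQL or MySQL API; it only simulates latency.
async def fake_query(name: str, latency_s: float) -> str:
    # Awaiting the sleep yields control, so other queries can make progress.
    await asyncio.sleep(latency_s)
    return f"{name}: done"


async def run_queries() -> list:
    # Issue all three simulated queries at once. gather() returns the
    # results in the order the calls were passed in.
    return await asyncio.gather(
        fake_query("users", 0.05),
        fake_query("posts", 0.10),
        fake_query("likes", 0.02),
    )


if __name__ == "__main__":
    start = time.perf_counter()
    results = asyncio.run(run_queries())
    elapsed = time.perf_counter() - start
    print(results)
    print(f"elapsed: {elapsed:.2f}s")
```

A blocking client would pay for each of those waits in turn. With a non-blocking one, the application can do other work, or start other queries, while a connect, send, or retrieve is still in flight.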

With powerhouse open source companies like these behind it, WebScaleSQL looks to be a must-see release for any enterprise that measures its database sizes in petabytes and its database transactions in hundreds of terabytes.
