A couple of months ago, I wrote an in-depth piece on Microsoft's Cosmos DB, which is Redmond's entry into the world of globally-distributed cloud databases. The announcement of Cosmos DB's general availability (GA) took place in coordination with the company's Build event, held in May. The following week, Google announced the GA of Cloud Spanner, a globally-distributed operational database of its own.
Because of the confluence of these announcements, from two of the three leading public cloud providers, I thought it would be useful to follow up my Cosmos DB coverage with coverage of Cloud Spanner of similar scope and depth. Add in a briefing on Spanner from Google Cloud Director of Product Management Dominic Preuss that Google was kind enough to offer, and it was a slam-dunk. So let's proceed.
First let's discuss how Spanner compares -- and how it does not -- to competing offerings from the Amazon and Microsoft cloud platforms. This is more about getting our bearings than providing a competitive analysis.
The first thing to understand about Spanner is that it's a relational database, geared to operational OLTP (online transactional processing) workloads, with full ACID (atomicity, consistency, isolation and durability) functionality. Spanner's not a simple scale-up relational database service -- that's where Google Cloud SQL comes in. Spanner is not a data warehouse; Google BigQuery is designed to handle those workloads. And it's not a NoSQL database, either, as BigTable is Google's offering there.
So Spanner contrasts strongly with Amazon's DynamoDB, which is a NoSQL database employing so-called "eventual consistency," and with Microsoft's Cosmos DB, also a NoSQL database, which is configurable along a full spectrum of consistency models, ranging from an ACID model on one end to eventual consistency on the other, with three more consistency models in between.
And though Spanner is relational and designed for OLTP, it can also handle in-database operational analytics. With all that in mind, it might make more sense to compare Spanner to Azure SQL Database, or Amazon Relational Database Service (RDS), both of which are fully relational, ACID-compliant, and offer some level of operational analytics themselves.
But if the relational/ACID affinity tempts you to compare Spanner to Azure SQL Database and Amazon RDS, it's not that easy. Why? Because -- like Google's own Cloud SQL -- SQL Database and RDS are cloud incarnations of on-premises database management systems, whereas Spanner was designed for the cloud. And Cosmos DB and DynamoDB were too.
And although Spanner uses SQL for querying and data definition (creating tables and the like), it does not do so for data manipulation/write operations. Instead, it employs a "mutation" API, the syntax for which is more object-relational mapping (ORM)-like and property-oriented than it is set-based. That's another point that distinguishes it from services like Azure SQL DB and Amazon RDS. So apples-to-apples comparisons are elusive.
The in-house version of Spanner was originally built by Google to handle workloads like AdWords and Google Play that, according to Google, were previously running on massive, manually sharded MySQL implementations. The problem with those implementations was the manual sharding: while it provided Google with a scale-out mechanism that MySQL didn't support natively, it was unwieldy, so much so that re-sharding the database was a multi-year process.
Google needed a database that had native, flexible sharding capabilities, adhered to relational schema and storage, was ACID-compliant and supported zero downtime. Since such a database didn't exist, Google created its own, and the original Spanner was born. Now, after almost 10 years of battle-testing the product in-house, Google has made Cloud Spanner, a public API in front of that same technology, generally available.
Having your scale-out, and eating your ACID, too
Despite the auto-sharding, Spanner will soon support cross-region transactions. If it can do all that, then why don't conventional relational databases? And why are the traditional platforms based on a scale-up model while Spanner is scale-out, but still retains the other conventional characteristics of relational database systems? How are Spanner customers able to "have it both ways?"
The big reason is the way transactions are committed. Traditional systems, when geographically distributed, must use a protocol known as two-phase commit, which cannot complete until each site finishes its own work. But Spanner makes each site a full replica of the others and uses a Paxos consensus algorithm to commit a transaction when a majority of sites have completed their work. Users of a site that hasn't itself finished updating can be re-routed to a site that has, until their own site is done. That introduces some extra latency for certain users during very specific intervals, but it eliminates the gridlock that standard databases must contend with when configured in a distributed fashion.
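To make the difference concrete, here is a minimal Python sketch of the two commit rules described above. It is an illustration only, not Spanner's implementation and not a real Paxos protocol (there are no proposers, ballots or leader election here); the `Replica` class, the site names and the helper functions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """A hypothetical site holding a full copy of the data."""
    name: str
    acked: bool  # True once this site has finished applying the write


def two_phase_can_commit(replicas):
    # Two-phase commit rule: every participating site must be done.
    return all(r.acked for r in replicas)


def majority_can_commit(replicas):
    # Consensus-style rule: a strict majority of sites is enough.
    acks = sum(1 for r in replicas if r.acked)
    return acks > len(replicas) // 2


def readable_sites(replicas):
    # Sites that are up to date; users of lagging sites could be
    # re-routed to one of these until their own site catches up.
    return [r.name for r in replicas if r.acked]


# Assumed scenario: three sites, one of which is still catching up.
sites = [Replica("us", True), Replica("eu", True), Replica("asia", False)]

print(two_phase_can_commit(sites))  # the slow site blocks the commit
print(majority_can_commit(sites))   # two of three sites suffice
print(readable_sites(sites))
```

In this assumed scenario, the lagging site holds up the two-phase commit, while the majority rule lets the transaction go through and leaves two sites available to serve that site's users in the meantime.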
But wait, there's more...
Paxos/consensus is key in making everything work, but other tricks, like optimized networking and hardware, as well as other software tricks, help too. For example, when data is locked during write operations, Spanner only has to lock cells (a cell is a particular column in a particular row) rather than entire rows. This minimizes contention and accelerates transaction commitment, while still ensuring full consistency of the database. Also, slightly older versions of the data can be made available for read-only operations that have a certain tolerance for "stale" data, thus reducing contention even further.
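The effect of lock granularity on contention can be sketched in a few lines of Python. This is a simplified model under my own assumptions, not Spanner's lock manager: each write is represented as a set of (row key, column) pairs, and the `conflicts` function, table contents and column names are hypothetical.

```python
def conflicts(write_a, write_b, granularity):
    """Return True if two writes would contend for the same lock.

    Each write is a set of (row_key, column) pairs it intends to modify.
    """
    if granularity == "row":
        # Row-level locking: touching any column locks the whole row.
        rows_a = {row for row, _column in write_a}
        rows_b = {row for row, _column in write_b}
        return bool(rows_a & rows_b)
    if granularity == "cell":
        # Cell-level locking: only identical (row, column) pairs collide.
        return bool(write_a & write_b)
    raise ValueError("granularity must be 'row' or 'cell'")


# Two hypothetical transactions updating different columns of one order.
txn1 = {("order-1", "status")}
txn2 = {("order-1", "shipping_address")}

print(conflicts(txn1, txn2, "row"))   # same row, so they contend
print(conflicts(txn1, txn2, "cell"))  # different cells, so they do not
```

Under row locks the two transactions would have to wait on each other; under cell locks they can proceed side by side, which is the contention saving described above.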
Another way Spanner speeds things up is by storing child data -- which in conventional databases would be in a separate, related table -- so that it is physically commingled with its parent data. This allows queries that include hierarchical data (like purchase orders and their line items) to be satisfied in one fell swoop, with a single scan, rather than requiring the database to traverse a join relationship between the two.
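A rough way to picture this layout: if each child row's key begins with its parent's key, then sorting by key places every purchase order directly ahead of its own line items, and one contiguous range scan returns the whole hierarchy. The Python sketch below assumes integer order IDs and tuple keys, both invented for illustration; it models the idea, not Spanner's actual storage format.

```python
import bisect

# Hypothetical rows. Parents are keyed (order_id,), children are keyed
# (order_id, line_no), so a child's key starts with its parent's key.
rows = [
    ((2, 1), {"item": "keyboard"}),
    ((1,), {"customer": "alice"}),
    ((1, 2), {"item": "cable"}),
    ((2,), {"customer": "bob"}),
    ((1, 1), {"item": "monitor"}),
]

# Store everything in one key-ordered sequence, parents and children together.
store = sorted(rows, key=lambda row: row[0])
keys = [key for key, _value in store]


def scan_order(order_id):
    """Fetch an order and all its line items with one contiguous range scan."""
    start = bisect.bisect_left(keys, (order_id,))
    end = bisect.bisect_left(keys, (order_id + 1,))
    return store[start:end]


for key, value in scan_order(1):
    print(key, value)
```

No join is needed in this model: the parent and its children sit next to each other, so reading the order is a matter of finding the start of the range and reading forward.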
So while the CAP theorem states that a database that is partition-tolerant and consistent cannot also be highly available, Spanner can "cheat" that theorem (in a good way) through optimizations that side-step some of the normal constraints imposed by distributed databases.
Spanner is very developer-friendly, featuring a JDBC driver and Software Development Kits (SDKs) for languages like Java, Python, Node.js and others popular among open source stack developers.
For those in the Microsoft/.NET camp, an ODBC driver and a C# SDK are in the pipeline. That will help Spanner compete more robustly against Azure Cosmos DB, SQL Database and SQL Data Warehouse as well as Amazon RDS, all of which are very Microsoft-stack friendly. Even Amazon's DynamoDB service has .NET support, so Spanner's ODBC and C# support can't come quickly enough.
All together now
Again, though, these aren't apples-to-apples comparisons; the Google cloud data stack innovates along different axes than the AWS and Azure ones. One of those axes concerns inter-service integration. For example, Google BigQuery supports the same SQL dialect as Spanner. And while Azure SQL Database and SQL Data Warehouse both use Microsoft's Transact-SQL, Cosmos DB's SQL dialect is different. On the Amazon side, DynamoDB doesn't even offer native SQL support.
Google's integration goes beyond SQL dialects though. For example, BigQuery supports federated queries across its own data, as well as BigTable and files in Google Drive. And while Spanner tables cannot participate in these federated queries today, I wouldn't be surprised if that changed.
Pick your database
So which database is the right one for your application? Since data movement is expensive, a lot will depend on where your data is today. And given that many companies have a lot of data stored in Amazon Simple Storage Service (S3), AWS has the power of incumbency going for it.
Meanwhile, fans of the relational model who need a globally-distributed database may find Spanner offers an irresistible combination of those things. Customers who are very focused on service level agreements (SLAs), whether for reasons of compliance or because of the SLAs they need to offer their own customers, may find Cosmos DB's value proposition there trumps the other two.
No matter which way customers go, though, they're in a good position. Through the combination of DynamoDB, Cosmos DB and Spanner, all three Internet giants are offering customer-facing versions of the globally distributed database services they themselves rely on for first-party offerings. With that as a baseline, competition is (and will continue to be) fierce, and the customer wins out.