Making the jump to Google Cloud Spanner

At Mobile World Congress next week, one of Google Cloud Spanner's first reference clients will be demonstrating a Tier 1 telco charging system ported from Oracle. Optiva, which is demo'ing the system and claiming 10x scalability vs Oracle, explained to us how they made the transition.

A year ago, Google unveiled the beta for Cloud Spanner, the publicly accessible version of the distributed transaction database that powers Google's AdWords revenue machine. A few months ago, it went multi-region. It's part of a new breed of cloud-native distributed databases, which also includes Amazon DynamoDB and Microsoft Cosmos DB, that rethink how to run in distributed mode; Big on Data bro Andrew Brust gave a blow-by-blow description last summer of how they all stacked up.

On the eve of Mobile World Congress (MWC), Google is introducing one of the first public references for Cloud Spanner. Optiva, formerly Redknee, is a provider of call charging systems to the telecom industry, and is announcing a new offering based on the Google platform.

With Optiva challenging mainstays like Amdocs, Huawei, and Ericsson, many of which run on Oracle RAC databases, you have the makings of a David vs. Goliath battle. Optiva is putting Cloud Spanner front and center in its relaunch, building its revamped unified charging system for Tier 1 carriers on it.

During its Redknee days, the company's unified charging system ran on Oracle. With the company relaunching as Optiva, the team set a 10x stretch goal: they wanted the new release to deliver 10x the performance at one-tenth the cost. Based on internal benchmarks, they claim that they can reach this goal with Cloud Spanner.

Oracle is not taking this lying down. While Optiva claims that its original Oracle-based implementations hit the wall at 50,000 transactions/second, Oracle cites customers running 3 to 4x that volume.

At MWC, Optiva is demonstrating its new Cloud Spanner-based charging solution; while it doesn't have production customers yet, porting a telco charging system is a major milestone for Cloud Spanner. And it's a good showcase for how Google platforms run differently than the incumbent systems they replace. Samy Aboel-Nil, COO for the technology arm of Optiva, explained how his team made the transition.

The baseline was an Oracle RAC configuration, which is a clustered database with a shared cache architecture. When deployed in a sharded configuration, reads can be distributed, but writes are controlled by a cluster manager. By contrast, one of Cloud Spanner's key differences is that it can read and write in a fully distributed mode; there is no single write master to limit scalability. This is especially important in large-scale, low-latency applications that have a mix of read and write transactions. Cloud Spanner can scale read and write traffic across nodes while still maintaining transactional consistency because of Google's TrueTime API -- a method of distributed transaction sequencing guided by atomic clocks and GPS data.
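
The article does not go into TrueTime's mechanics. As a rough sketch of the idea, following Google's published Spanner design, the clock reports an interval rather than a single instant, and a writer waits out the uncertainty before acknowledging a commit. The SimulatedTrueTime class, the 4 ms uncertainty, and the integer millisecond clock below are illustrative assumptions, not Cloud Spanner's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    """Bounds within which the true time is assumed to lie."""
    earliest: int
    latest: int


class SimulatedTrueTime:
    """Hypothetical stand-in for a clock with bounded uncertainty (in ms)."""

    def __init__(self, start_ms: int, epsilon_ms: int) -> None:
        self.true_ms = start_ms
        self.epsilon_ms = epsilon_ms

    def now(self) -> TTInterval:
        return TTInterval(self.true_ms - self.epsilon_ms,
                          self.true_ms + self.epsilon_ms)

    def advance(self, ms: int) -> None:
        self.true_ms += ms


def commit(tt: SimulatedTrueTime) -> tuple[int, int]:
    """Pick a commit timestamp, then wait until it is definitely in the past."""
    commit_ts = tt.now().latest
    waited_ms = 0
    # Commit wait: do not acknowledge until every clock agrees commit_ts has passed.
    while tt.now().earliest <= commit_ts:
        tt.advance(1)
        waited_ms += 1
    return commit_ts, waited_ms


clock = SimulatedTrueTime(start_ms=100, epsilon_ms=4)
first_ts, first_wait = commit(clock)
second_ts, second_wait = commit(clock)
```

In this toy model each commit waits roughly twice the clock uncertainty, and a later commit always receives a larger timestamp than an earlier one, which is what lets independent nodes agree on transaction order without a central write master.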

The differences run deeper. Cloud Spanner, like Oracle, supports SQL; on the Google platform, that means ANSI SQL 2011 with extensions. Beneath the hood, Cloud Spanner stores data differently from standard relational databases. In addition to defining a typical relational schema, you get the best performance from Cloud Spanner when you identify parent-child relationships in the data. It uses these relationships to co-locate rows for efficient retrieval. For instance, if you have a customer database, chances are you will want customer records physically grouped with their associated records (say, by household or geographic region). So, when you are laying out the data in Cloud Spanner, it makes sense to identify the strongest parent-child relationships, thereby enabling Cloud Spanner to do the optimization for you.
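
To make the co-location idea concrete, here is a minimal sketch. The DDL string uses Cloud Spanner's INTERLEAVE IN PARENT clause, but the Customers and Accounts tables and their columns are hypothetical, and the Python model only imitates the key ordering; it says nothing about how Spanner physically stores rows.

```python
# Hypothetical schema: the child table's primary key starts with the parent's key.
SCHEMA_DDL = """
CREATE TABLE Customers (
  CustomerId STRING(36) NOT NULL,
  Name       STRING(MAX)
) PRIMARY KEY (CustomerId);

CREATE TABLE Accounts (
  CustomerId STRING(36) NOT NULL,
  AccountId  STRING(36) NOT NULL,
  Balance    INT64
) PRIMARY KEY (CustomerId, AccountId),
  INTERLEAVE IN PARENT Customers ON DELETE CASCADE;
"""

# Toy data keyed by primary-key tuples (illustrative values only).
customers = {("C1",): {"name": "Ada"}, ("C2",): {"name": "Grace"}}
accounts = {
    ("C1", "A1"): {"balance": 40},
    ("C2", "A1"): {"balance": 15},
    ("C1", "A2"): {"balance": 7},
}


def interleaved_layout(parents, children):
    """Return (table, key) pairs in the order a parent-prefixed key sorts."""
    rows = [("Customers", key) for key in parents]
    rows += [("Accounts", key) for key in children]
    # A child key begins with its parent's key, so sorting by key places
    # each parent row immediately before its own child rows.
    return sorted(rows, key=lambda row: row[1])


layout = interleaved_layout(customers, accounts)
for table, key in layout:
    print(table, key)
```

Reading a customer together with its accounts then becomes a scan over one contiguous key range rather than lookups scattered across two separate tables.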

Another difference with Cloud Spanner involves how applications read and write data to the database; instead of using a traditional JDBC/ODBC client, applications perform these operations using a client library that communicates with the Cloud Spanner system over RPC. Apps interact with Cloud Spanner through its distributed architecture -- there is no central hub to go through to identify where the data is.

Then there is functionality that would otherwise be carried in the database itself, such as constraint validation, stored procedures, and triggers. Google's philosophy instead is to keep the database as lean as possible so it can perform with minimal overhead. The application developer must instead implement these operations in the application tier -- a strategy that harkens back to the early days of relational databases before features such as stored procedures were invented.
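
As a minimal sketch of what moving that logic into the application tier can look like, the example below enforces the kind of rules a CHECK constraint, a foreign key, or a trigger might otherwise carry. The ChargeEvent type, the rules, and the in-memory balances dictionary are all hypothetical; this is not Optiva's implementation, and a real system would run the checks inside a database transaction.

```python
from dataclasses import dataclass


class ValidationError(ValueError):
    """Raised when an application-level rule rejects a write."""


@dataclass(frozen=True)
class ChargeEvent:
    subscriber_id: str
    amount_cents: int


def validate_charge(event: ChargeEvent, balances: dict[str, int]) -> None:
    # Stand-in for a CHECK constraint: charges must be positive.
    if event.amount_cents <= 0:
        raise ValidationError("amount must be positive")
    # Stand-in for a foreign-key constraint: the subscriber must exist.
    if event.subscriber_id not in balances:
        raise ValidationError("unknown subscriber")


def apply_charge(event: ChargeEvent, balances: dict[str, int]) -> int:
    """Validate, then debit the balance; returns the new balance."""
    validate_charge(event, balances)
    new_balance = balances[event.subscriber_id] - event.amount_cents
    # Stand-in for trigger logic: never let a prepaid balance go negative.
    if new_balance < 0:
        raise ValidationError("insufficient balance")
    balances[event.subscriber_id] = new_balance
    return new_balance


balances = {"S1": 500}
remaining = apply_charge(ChargeEvent("S1", 120), balances)

try:
    apply_charge(ChargeEvent("S1", 1000), balances)
    rejected = False
except ValidationError:
    rejected = True
```

The trade-off is that every application writing to the database has to apply the same rules, so they usually end up in a shared library or service rather than scattered through the codebase.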

There are also subtle differences in how rows are identified; for example, while many relational databases will let you take shortcuts and refer to rows with an automatically generated row ID, Cloud Spanner forces you to adhere to primary keys, which are a best practice for database design anyway. Common cursor functions, where you iteratively manipulate large result sets, are replaced by a functionally similar system of offsets and limits.
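
The offset-and-limit pattern can be sketched as follows. This example uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere; it is not Cloud Spanner's client library, and the subscribers table and fetch_pages helper are hypothetical.

```python
import sqlite3


def fetch_pages(conn: sqlite3.Connection, page_size: int):
    """Yield successive pages of rows, ordered by primary key."""
    offset = 0
    while True:
        rows = conn.execute(
            "SELECT subscriber_id, balance FROM subscribers "
            "ORDER BY subscriber_id LIMIT ? OFFSET ?",
            (page_size, offset),
        ).fetchall()
        if not rows:
            return
        yield rows
        # Advance past the rows just returned instead of holding a cursor open.
        offset += len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subscribers (subscriber_id TEXT PRIMARY KEY, balance INTEGER)"
)
conn.executemany(
    "INSERT INTO subscribers VALUES (?, ?)",
    [(f"S{i:02d}", i * 10) for i in range(5)],
)

pages = list(fetch_pages(conn, page_size=2))
```

Ordering by the primary key is what keeps the pages stable from one query to the next, which is one reason the explicit primary key matters here.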

As a commercial database, Cloud Spanner is still in its early days. Google is still proving itself as an enterprise supplier, while Oracle has a track record going back decades.

As with much of its technology portfolio, Google has radically redesigned the database for use cases that were not originally envisioned for the Oracles of the world. Thanks to that different approach to database design, Google can make those 10x price/performance promises -- but you must know how to design for them. With Google the upstart in the database game, we'd expect early customers to be those who are also challenging the established order, just like Optiva.