Fauna adds geo-isolation to globally distributed database cloud

While globally distributed databases are no longer rare, few of them offer a key capability for addressing data residency laws. Fauna is adding region groups that allow organizations to keep data within a region or node.

Fauna, a database whose founders came from Twitter, is adding an important capability for enterprises seeking to operate global databases: a new "region groups" feature that enables enterprises to keep data within a specific region or country by specifying the cluster or region where that data should reside.

Fauna is delivered as a managed cloud database-as-a-service (DBaaS). But don't call it a database. Fauna is appealing to developers, not DBAs or data engineers. And, to avoid scaring off developers, the company has deliberately not called itself Fauna DB. Hold that thought.

While Fauna does not position itself as a database, underneath all the positioning, it is. Fauna is one of a growing array of globally distributed transaction databases, joining a crowd that includes Amazon DynamoDB, Azure Cosmos DB, Google Cloud Spanner, Cockroach DB, TiDB, Yugabyte and others. Unlike most databases, which rely on a central or primary node for committing writes, truly distributed databases allow writes to be committed from multiple nodes.

The going notion of globally distributed databases is that reads and writes of data can be handled locally. This capability is often referred to as multi-master (meaning there is no single primary or master node determining database commits) or active-active (referring to how distributed databases replicate updates between nodes).

The irony is that, as global deployment of the same instance of a database becomes more practical, there is also a rising tide of privacy and data sovereignty regulations aimed at protecting PII data and keeping data within a nation's borders. And that's where Fauna's latest announcement comes in.

Emerging concerns over data privacy and data localization are compelling enterprises with global ambitions to geographically partition databases if those instances span multiple regions. At present, very few databases supporting multi-region, multi-master, or active-active replication provide the capability to keep specific data within specific nodes or specific regions. In most cases, the alternative is to set up separate regional instances.

With region groups, Fauna has become one of the few distributed databases to support regional isolation of data; to date, Cockroach DB is the only other one that supports geo-partitioning within the same logical instance of the database. As noted, given the rising tide of privacy and data sovereignty regulations, we expect that such geo-partitioning will soon become a checkbox feature for distributed databases supporting deployments across multiple regions.
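To make the idea of geo-partitioning concrete, here is a minimal Python sketch of a single logical store that pins each record to a region group based on a country field. It is an illustration only, not Fauna's or Cockroach DB's API: the `REGION_GROUPS` mapping, the `GeoPartitionedStore` class and the `country` field are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field

# Hypothetical region groups (assumption for illustration, not Fauna's identifiers).
REGION_GROUPS = {"eu": {"DE", "FR", "IE"}, "us": {"US"}}


@dataclass
class GeoPartitionedStore:
    """One logical store whose data is physically split by region group."""

    # One independent key-value partition per region group.
    partitions: dict = field(
        default_factory=lambda: {group: {} for group in REGION_GROUPS}
    )

    def group_for(self, country: str) -> str:
        """Return the region group that must hold data for this country."""
        for group, countries in REGION_GROUPS.items():
            if country in countries:
                return group
        raise ValueError(f"no region group configured for {country}")

    def write(self, key: str, record: dict) -> str:
        """Store the record only in its own region group and report which one."""
        group = self.group_for(record["country"])
        self.partitions[group][key] = record
        return group

    def read(self, key: str, group: str):
        """Read from a single region group; other groups never see the record."""
        return self.partitions[group].get(key)


store = GeoPartitionedStore()
placed = store.write("user:1", {"name": "Anna", "country": "DE"})
print(placed)
print(store.read("user:1", "us"))
```

In this sketch the German record lands only in the hypothetical "eu" partition, so a read against the "us" partition finds nothing. A real service would also have to handle replication within the group, queries that span groups, and access control, none of which is modeled here.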

So, what is Fauna, anyway? Fauna is based on an API that allows data to be presented in relational or document views. It emphasizes flexibility: you can write stored procedures using FQL, Fauna's SQL-like query language, and then access data using stripped-down GraphQL calls.

It is not easy to pigeonhole Fauna as it has differing mixes of similarities – and differences – with each of its counterparts.

Here are the similarities. Fauna is a distributed operational database that looks a lot like Spanner and Cockroach DB because of its relational support. But Fauna is also API-driven, which makes it a closer cousin to Cosmos DB, and provides two views of data: relational and document. And Fauna is serverless, providing parallels to Amazon DynamoDB and others like DataStax Astra, where the default option for customers is now serverless.

But here are the differences. Fauna departs from Cosmos DB in that it uses a single API for relational and document views, whereas Cosmos DB has different APIs but also offers more views: relational, MongoDB-compatible document, graph, key-value, and Cassandra-compatible wide column. Furthermore, Cosmos DB offers five levels of consistency, whereas Fauna supports strong consistency using its own implementation of the Calvin protocol, developed by Daniel Abadi and colleagues at Yale University, in which write nodes agree in advance on transaction sequencing. Fauna's approach to distributed ACID is also quite distinct from that of its other counterparts: it differs from the consensus-oriented approaches of Google Cloud Spanner and Cockroach DB.
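The "agree in advance on transaction sequencing" idea can be sketched in a few lines of Python. This is a deliberately simplified illustration of Calvin-style deterministic ordering, not Fauna's implementation and not the full protocol (there is no batching, partitioning or failure handling). The `Sequencer`, `Replica`, `deposit` and `transfer` names are hypothetical. The point it shows: once every replica receives the same ordered log and every transaction is deterministic, replicas reach the same state without negotiating each commit.

```python
class Sequencer:
    """Assigns every transaction a position in one global log before it runs."""

    def __init__(self):
        self.log = []

    def submit(self, txn):
        self.log.append(txn)
        return len(self.log) - 1  # position in the agreed order


class Replica:
    """Applies the agreed log in order; holds no ordering logic of its own."""

    def __init__(self):
        self.state = {}
        self.applied = 0  # number of log entries already applied

    def catch_up(self, log):
        for txn in log[self.applied:]:
            txn(self.state)
        self.applied = max(self.applied, len(log))


def deposit(account, amount):
    def txn(state):
        state[account] = state.get(account, 0) + amount
    return txn


def transfer(src, dst, amount):
    def txn(state):
        # Deterministic rule: the outcome depends only on state and log order.
        if state.get(src, 0) >= amount:
            state[src] -= amount
            state[dst] = state.get(dst, 0) + amount
    return txn


sequencer = Sequencer()
sequencer.submit(deposit("alice", 100))
sequencer.submit(transfer("alice", "bob", 70))
sequencer.submit(transfer("alice", "carol", 70))  # no-op: insufficient funds

r1, r2 = Replica(), Replica()
r1.catch_up(sequencer.log)       # applies all three entries at once
r2.catch_up(sequencer.log[:2])   # lags behind, then catches up later
r2.catch_up(sequencer.log)
print(r1.state == r2.state)
```

Both replicas end with the same balances even though one of them applied the log in two steps, because the order was fixed before execution began. By contrast, in the consensus-at-commit designs the article mentions, agreement happens around each transaction's commit rather than up front.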

Probably the biggest differentiator from all the other distributed transaction/operational databases out there is that Fauna takes a MongoDB-like approach in positioning itself as developer-friendly: APIs to simplify access, and serverless to eliminate the hassles of deployment. In an email, Fauna's CEO Eric Berg made the analogy of Fauna as the database equivalent of Stripe from the payments space, and Twilio from the unified communications space; all are known for being developer-friendly. As noted above, Fauna positions itself not as a database but as an API to data. But keep in mind that the region groups feature is very much a reminder that while Fauna is presenting itself as an API for developers, underneath, it is a database. Sorry, Fauna.

Fauna also recently introduced Fauna Labs, providing a sandbox of plugins and tools for developers to embed Fauna into their applications. For instance, the plugin for Fauna's serverless framework allows it to be integrated into test and CI/CD pipelines.

The company views itself as a developer-oriented "replacement" for the usual suspects like SQL Server and Oracle. In Berg's words, the cloud database services against which it is compared are just that: database-as-a-service solutions. None of them, said Berg, is a data API solution that "allows developers to focus on their applications, instead of worrying about the database infrastructure."

Fauna's region groups feature is available now on its cloud service.