DataStax unveils Stargate project to turn Cassandra into a multi-model database

With the Stargate open source project, DataStax hopes to widen Cassandra’s reach to non-Cassandra developers.
Written by Tony Baer (dbInsight), Contributor

DataStax is releasing the first preview of Stargate, a new open source API framework that could eventually turn Apache Cassandra into a multi-model database. The approach has parallels with cloud databases from Microsoft Azure and Google, which also go the API route, and more recently with offerings from household brands like Oracle.

While the project name conjures up memories of David Bowie, the goal of Stargate is to open Cassandra beyond its existing base of developers skilled in CQL (Cassandra Query Language) or Gremlin, reaching JavaScript developers versed in JSON and Java developers accustomed to working with SQL. Access would be via APIs supporting full CRUD (create-read-update-delete) functions. Not surprisingly, the first (and for now, only) API in the preview targets Apache Cassandra itself, through CQL and REST interfaces.

Stargate is designed as a gateway that sits apart from the storage engine, running either on-premises or in the cloud. It's based on the familiar coordinator node proxy that determines how Cassandra handles requests. Because Cassandra is a multi-master database, any node can act as the coordinator that routes the processing of a query; in Stargate, these coordinator nodes are separated from storage. By utilizing the same proxies that handle CQL requests, Cassandra does not have to be rearchitected to handle other APIs.
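To make the gateway idea concrete, here is a minimal sketch in Python of how a REST-style CRUD request could be translated into a parameterized CQL statement before being handed to the usual request path. The function name `crud_to_cql`, its arguments, and the mapping of HTTP verbs to CQL are assumptions made for illustration only; this is not Stargate's actual API or implementation.

```python
import re
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical helper: accept only plain identifiers so that table and
# column names cannot smuggle extra CQL into the statement.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _ident(name: str) -> str:
    if not _IDENT.fullmatch(name):
        raise ValueError(f"invalid identifier: {name!r}")
    return name


def crud_to_cql(
    method: str,
    keyspace: str,
    table: str,
    key: Optional[Dict[str, Any]] = None,
    body: Optional[Dict[str, Any]] = None,
) -> Tuple[str, List[Any]]:
    """Map an HTTP-style verb to a CQL statement plus bound values.

    Illustrative assumption, not Stargate's real mapping:
    POST -> INSERT, GET -> SELECT, PUT -> UPDATE, DELETE -> DELETE.
    """
    target = f"{_ident(keyspace)}.{_ident(table)}"
    key = key or {}
    body = body or {}

    if method == "POST":
        if not body:
            raise ValueError("POST requires a body")
        names = ", ".join(_ident(c) for c in body)
        marks = ", ".join("%s" for _ in body)
        return f"INSERT INTO {target} ({names}) VALUES ({marks})", list(body.values())

    # Every other verb addresses existing rows, so a key is required.
    if not key:
        raise ValueError(f"{method} requires a key")
    where = " AND ".join(f"{_ident(c)} = %s" for c in key)

    if method == "GET":
        return f"SELECT * FROM {target} WHERE {where}", list(key.values())
    if method == "PUT":
        if not body:
            raise ValueError("PUT requires a body")
        sets = ", ".join(f"{_ident(c)} = %s" for c in body)
        values = list(body.values()) + list(key.values())
        return f"UPDATE {target} SET {sets} WHERE {where}", values
    if method == "DELETE":
        return f"DELETE FROM {target} WHERE {where}", list(key.values())
    raise ValueError(f"unsupported method: {method}")
```

In this sketch the storage layer never sees anything but CQL, which is the point the architecture makes: a new API only needs a translation layer at the gateway, not changes to the database underneath.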

The project, which is hosted on GitHub, is available through a standard Apache 2.0 open source license. DataStax has not yet announced further plans for Stargate, but it is likely that the community will tackle SQL, JSON documents (we'd expect a MongoDB-style API), GraphQL, and Gremlin. We also don't yet know how Stargate with its Cassandra API performs compared to what is already natively baked into Apache Cassandra.

Going the API route, Stargate plots a path similar to that of Azure Cosmos DB, which offers five APIs in the same database: SQL, MongoDB wire protocol, Cassandra, table (for key-value), and Gremlin. (In Cosmos DB, once you pick an API for the data set, you're bound to using it.) There's also a parallel with Google, which uses the same storage engine for Cloud Spanner, exposed through a SQL API, and for Cloud Firestore, which adheres to a JSON document API.

Conceivably, Stargate could evolve into the preferred mode of access to Cassandra, but that depends on two big "ifs." First, there must be no performance penalty compared to the existing native access approach. Second, the project would have to be accepted by the Apache Cassandra community and formally become part of the project.

Multi-model support is not new for DataStax Enterprise (DSE), DataStax's commercial distribution of Cassandra. Through an earlier acquisition, the DataStax platform also supported Gremlin, but prior to the DSE 6.8 release, the graph engine wasn't integrated into the core database, so graph data had to be modeled and ingested separately. With DSE 6.8, graph views could work off the same native CQL API and the same data ingest. But graph support was only available to DSE customers and was not part of the core open source platform. If Stargate is accepted by the Apache Cassandra project, it would be a way to mainstream the use of Gremlin, and potentially other APIs, on the mother ship.
