DataStax 5.0: Embracing most everything but SQL

The release of DataStax 5.0 provides a good snapshot of how the database market has evolved over the past decade. DataStax, the company that offers a commercial version of the open-source Apache Cassandra database, has spread its wings in the new release.

The release of DataStax 5.0 places the company squarely in the mainstream of a database market where variety and convergence are the spices of life.

It's the latest evidence that, in a database market where demand for what Martin Fowler termed "polyglot persistence" has become the norm, a decade of innovation in new data platform types has culminated in convergence, with those specialized platforms now crossing over and overlapping.

The core engine of the DataStax platform, Apache Cassandra, is known as a "wide column" store for key/values that can run inside or independently of Hadoop. It is well known for its ability to perform fast writes of huge volumes of data, and it's well-suited for deployment on scale-out clusters. But Cassandra was traditionally not so adept at analytic queries, nor at searching, rolling up, or aggregating data.

With the 5.0 release, DataStax has surrounded the open-source Cassandra database with new faces. It has added the ability to deal with complex JSON data types -- providing full insert (write), update, and delete capabilities.
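The article does not show what working with JSON looks like in practice. As a rough illustration only, open-source Cassandra's query language (CQL) has an INSERT ... JSON form that takes a whole row as a JSON string. The minimal Python sketch below just prepares such a statement and its payload without connecting to any cluster; the table name `demo.users`, the sample record, and both helper functions are hypothetical.

```python
import json

# Hypothetical keyspace/table name, used only for illustration.
TABLE = "demo.users"


def insert_json_statement(table):
    """Return a CQL INSERT ... JSON statement with a bind marker for the payload."""
    return f"INSERT INTO {table} JSON ?"


def to_payload(record):
    """Serialize a record to the JSON string that would be bound to the statement."""
    return json.dumps(record, sort_keys=True)


# Hypothetical sample record with a nested (list) field.
record = {"id": 42, "name": "Ada", "emails": ["ada@example.com"]}

statement = insert_json_statement(TABLE)
payload = to_payload(record)

print(statement)
print(payload)
```

In a real application the statement and payload would be handed to a database driver; that step is left out here because it depends on the deployment.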

And then there's graph: DataStax has added a graph analytics skin (a view, as opposed to a separate data storage engine) using the Apache TinkerPop open-source project, a de facto standard for representing graph entities, along with support for Spark. Graph computing, which represents many-to-many relationships (think: the interrelationships of people across social tribes, or the interrelationships of sensors and devices in the Internet of Things), is a new concept to most developers, so DataStax is providing visual, web-based tooling (complete with code completion and correction) to make graph views and querying more intuitive.
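For readers new to the idea, a many-to-many relationship can be pictured as a set of vertices joined by edges, with queries that walk from neighbor to neighbor. The sketch below is a toy, in-memory Python illustration of that idea, not DataStax's or TinkerPop's implementation; the people, the edges, and the function names are all made up for the example.

```python
from collections import defaultdict

# Hypothetical sample data: each pair means "these two people know each other".
EDGES = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("bob", "dave"),
    ("carol", "dave"),
    ("dave", "erin"),
]


def build_graph(edges):
    """Build an undirected adjacency map from (vertex, vertex) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def friends_of_friends(graph, person):
    """Return vertices reachable in two hops, excluding direct neighbors and the start."""
    direct = graph[person]
    two_hops = set()
    for friend in direct:
        two_hops |= graph[friend]
    return two_hops - direct - {person}


graph = build_graph(EDGES)
print(sorted(friends_of_friends(graph, "alice")))  # ['dave']
```

A graph query language expresses this kind of hop-by-hop walk directly, which is awkward to write as joins over tables once the number of hops grows.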

The missing piece -- which at this point is offered by a partner (K2View) -- is putting a SQL face on Cassandra. That's where NoSQL platform Couchbase, with N1QL, has the edge.

With the addition of JSON and graph support, DataStax is clearly in the mainstream of the market, where most platforms are adding multiple faces or personas. By adding JSON, DataStax is not necessarily going to compete directly with MongoDB or Couchbase, where the application and data are mostly JSON, nor with emerging graph database providers like Neo4j, where the use case is exclusively graph. Instead, the rationale is for edge cases.

The reality for most organizations is that there are going to be multiple data platforms floating around because of the diversity of needs: a SQL OLTP database, a data warehouse for production analytics, a NoSQL data store for operational applications, and Hadoop for big data analytics. The case for overlap is not that you're going to find a single silver bullet. Instead, you'll have multiple options on where to deploy data or functionality, and you might be able to erode some of those database silos that specialization has built up over the last few years.