It's the latest evidence that, in a database market where what Martin Fowler termed "polyglot persistence" has become the norm, a decade of innovation in new data platform types has culminated in a crossing of paths: those specialized platforms are now overlapping.
The core engine of the DataStax platform, Apache Cassandra, is known as a "wide column" store for key/values that can run inside or independently of Hadoop. It is well known for its ability to perform fast writes of huge volumes of data, and it's well-suited for deployment on scale-out clusters. But Cassandra was traditionally not so adept at analytic queries, nor at searching, rolling up, or aggregating data.
With the 5.0 release, DataStax has surrounded the open-source Cassandra database with new faces. It has added the ability to deal with complex JSON data types -- providing full insert (write), update, and delete capabilities.
And then there's graph; DataStax has added a graph analytics skin (a view, as opposed to a separate data storage engine), using the Apache TinkerPop open-source project, which is a de facto standard for representing graph entities, along with support for Spark. Given that graph computing, which represents many-to-many relationships (think: the interrelationships of people across social tribes, or the interrelationships of sensors and devices in the Internet of Things), is a new concept to most developers, DataStax is providing visual, web-based tooling (complete with code completion and correction) to make graph views and querying more intuitive.
With the addition of JSON and graph support, DataStax is clearly in the mainstream of the market, where most platforms are adding multiple faces or personas. By adding JSON, DataStax is not necessarily going to compete directly with MongoDB or Couchbase, where the application and data are mostly JSON, nor with emerging graph database providers like Neo4j, where the use case is exclusively graph. Instead, the rationale is to cover edge cases.
The reality for most organizations is that there are going to be multiple data platforms floating around because of the diversity of needs: a SQL OLTP database, a data warehouse for production analytics, a NoSQL data store for operational applications, and Hadoop for big data analytics. The case for overlap is not that you're going to find one silver bullet. Instead, you'll have multiple options on where to deploy data or functionality, and you might be able to erode some of those database silos that specialization has built up over the last few years.