Microsoft and DataStax tie up Cassandra on Azure deal as new Titan graph database rolls out

It's a big day for Cassandra firm DataStax, with its database offering now on Microsoft Azure, plus the release of the Titan graph database.
Written by Toby Wolpe, Contributor
CEO Billy Bosworth: Microsoft has opened its aperture to very non-traditional Microsoft technologies.
Image: DataStax

After a year's technical collaboration, Microsoft and DataStax have today unveiled a tie-up that puts the distributed database firm's enterprise Apache Cassandra offering on the Azure cloud computing platform.

The two companies say DataStax Enterprise on Microsoft Azure will help developers create and manage internet-of-things, web and mobile apps across public and private clouds.

On top of the Microsoft deal, DataStax is also launching the Titan 1.0 graph database, developed from its February acquisition of Aurelius, along with the technical preview of Cassandra 3.0 with its new storage engine, and DataStax Enterprise 4.8, which features support for Docker and Spark 1.4.

DataStax CEO Billy Bosworth said the partnership between Microsoft and a company offering an open-source database, written in Java, that runs primarily on Linux in production might seem surprising.

"It's been a very interesting shift dealing with Microsoft in this era of Azure, where they've opened their aperture to very non-traditional Microsoft technologies," he said.

"The bottom half of this relationship is on our technical teams, which are working intimately and have been for about a year on making sure the Azure experience can deliver all the things that our customers are going to need.

"Then the mid-level in most of these partnerships is usually just fluff unless you actually meet at the customer. We've managed to figure out the right strategy for field engagement so that the customer experience is really seamless."

According to Bosworth, the common theme that makes Azure and Cassandra attractive to businesses is that most have, or will have, some sort of hybrid cloud strategy driven by their need to rationalise datacenter use.

"The three key things are availability with extreme throughput and very low latency. So the way you accomplish that is by geographically pushing your data out to the places where the endpoints actually exist, whether they be people or machines or cell phones or websites - doesn't matter. Therefore, you have to have your back-end infrastructure distributed in that way," Bosworth said.

Bosworth sees no conflict between the Windows development ecosystem and the widespread use of Linux with Apache Cassandra.

"For production environments with DataStax Enterprise, Linux is the recommended platform, and we've received zero pushback from Microsoft on that, which really tells me it's a new era. They were just 100 percent, 'Doesn't matter. You guys want to run Linux? Great. Windows? Great, no problem,'" he said.

"But with the Windows development ecosystem, we actually have quite a strong community. We provide drivers for the .NET environment like C# and C++. One of our most robust demo applications actually is written all in .NET. So don't confuse the Linux backend running on Azure with the Windows development ecosystem, which actually is quite robust."

Although DataStax already supports other cloud services, including Amazon's, the ties with Microsoft will provide a strong boost for the company and for the use of Apache Cassandra.

"Having a relationship with Microsoft where Microsoft can sit down next to us, talk to the CIO or to the VP and say, 'Of course, this is the right architecture. Here's how it works. Here's how you spin it up in our world' - that brings a level of comfort to the customer.

"They say, 'OK, this technology and this company, they're here to stay if Microsoft is going to give their endorsement. This is a pretty important technology for the future.'"

The Titan graph database, which reaches general availability today, is the second product to come out of the Aurelius acquisition, following the release of the open-source graph framework TinkerPop3 earlier this year.

Graph databases are suited to highly connected data: they identify commonalities and anomalies to describe networks and contexts. Last year, Forrester Research predicted that about 25 percent of enterprises would be using graph databases by 2017 to support next-generation business applications that need connected datasets.
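As a rough illustration of what "commonalities and anomalies" in connected data can mean, here is a minimal Python sketch. It is not Titan or Gremlin code: the purchase data, the `common_neighbours` helper and the `unconnected_to` helper are all hypothetical, and a real graph database would run this kind of query across a distributed store.

```python
from collections import defaultdict

# Hypothetical "customer bought product" relationships.
edges = [
    ("alice", "laptop"), ("alice", "phone"),
    ("bob", "laptop"), ("bob", "phone"),
    ("carol", "laptop"),
    ("dave", "toaster"),
]

# Build an undirected adjacency map: node -> set of neighbouring nodes.
graph = defaultdict(set)
for a, b in edges:
    graph[a].add(b)
    graph[b].add(a)


def common_neighbours(g, a, b):
    """Commonality: nodes that both a and b are connected to."""
    return g[a] & g[b]


def unconnected_to(g, node, others):
    """Anomaly: members of `others` that share no neighbour with `node`."""
    return sorted(o for o in others if not (g[node] & g[o]))


shared = common_neighbours(graph, "alice", "bob")
outliers = unconnected_to(graph, "alice", ["bob", "carol", "dave"])
```

Here `shared` holds the products alice and bob have in common, while `outliers` lists the customers with no purchase overlap with alice.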

DataStax products vice president Robin Schumacher said Titan 1.0 offers support for Spark and for TinkerPop3, which allows the database to deliver transactional and analytical queries in the graph engine.

"Where Titan is particularly strong is they've cracked the nut where scale-out graph is concerned. So, much like Cassandra, it's able to scale out across multiple machines in a distributed fashion," he said.

Titan also now offers what Schumacher describes as a sophisticated query optimiser with a rewrite engine, which helps rewrite queries written in Gremlin, the TinkerPop graph traversal language.

Version 1.0 of the graph database also supports Cassandra 2.2 and Elasticsearch 1.5.

Also out today from the Apache Software Foundation is a technical preview of the first release candidate for Cassandra 3.0. With general availability expected in October, the 3.0 version features a new and more efficient storage framework that offers claimed storage savings of about 50 percent.

Cassandra 3.0 also brings materialized views, which let developers dispense with the hand-written code otherwise needed to keep two tables of similar data in sync.

"When one table is written to, it automatically propagates that data to the other tables and does so in a very performant manner. Our internal tests are showing that materialized views are outperforming the manual denormalisation of that data by around anywhere from 50 to 80 percent, which is pretty meaningful," Schumacher said.
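The idea Schumacher describes can be sketched in a few lines of Python. This is a toy in-memory model, not Cassandra's implementation or its CQL syntax: the `BaseTable` and `ViewByColumn` classes and the user schema are assumptions made purely for illustration. The point is that the application writes to one table only, and the re-keyed copy is maintained for it.

```python
from dataclasses import dataclass, field


@dataclass
class ViewByColumn:
    """Toy 'materialized view': the base rows, re-keyed by another column."""
    column: str
    rows: dict = field(default_factory=dict)

    def apply(self, old, new):
        # Drop the stale entry (the view key may have changed), then add the new row.
        if old is not None:
            self.rows.get(old[self.column], {}).pop(old["user_id"], None)
        self.rows.setdefault(new[self.column], {})[new["user_id"]] = new

    def lookup(self, key):
        return list(self.rows.get(key, {}).values())


@dataclass
class BaseTable:
    """Toy base table keyed by user_id (hypothetical schema)."""
    rows: dict = field(default_factory=dict)
    views: list = field(default_factory=list)

    def write(self, user_id, email, city):
        old = self.rows.get(user_id)
        new = {"user_id": user_id, "email": email, "city": city}
        self.rows[user_id] = new
        # Propagate the change to every registered view automatically,
        # so callers never write to the view themselves.
        for view in self.views:
            view.apply(old, new)


users = BaseTable()
users_by_city = ViewByColumn("city")
users.views.append(users_by_city)

users.write(1, "a@example.com", "London")
users.write(2, "b@example.com", "London")
users.write(1, "a@example.com", "Paris")  # user 1 moves; the view follows

london_ids = [row["user_id"] for row in users_by_city.lookup("London")]
paris_ids = [row["user_id"] for row in users_by_city.lookup("Paris")]
```

Without the view, the application would have to issue the second write itself on every change, which is the manual denormalisation the quote refers to.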

DataStax Enterprise 4.8 is also out with support for Docker and improvements to search through Live Indexing and to analytics, with production certification for Spark 1.4.
