Now Basho gives Riak a time-series data option for IoT apps

Because Internet-of-Things data is mushrooming, Riak firm Basho thinks it's time to add a specific time-series backend to the open-source database.
Written by Toby Wolpe, Contributor
EMEA managing director Emmanuel Marchal: Time-series is strategic for any organisation because that's the future.
Image: Basho

Having unveiled its integrated big-data platform in June, Riak database firm Basho has today introduced a new storage backend option, designed to deal specifically with Internet-of-Things data.

Riak TS, short for time series, now joins Riak KV for key-value data and cloud large-object store Riak S2 as a new storage instance at the bottom of the Basho platform.

On top of those three storage instances sit the core services, such as replication and synchronisation, capped by various options for accessing the data - Spark for handling analytics, Redis for low-latency read access, and Solr for search.

Basho EMEA managing director Emmanuel Marchal said Riak TS is intended to save businesses from having to adapt Riak KV to cope with time-series data. One example is The Weather Company, which handles 40TB of such data per day.

"If you're in the weather industry or in the energy industry, usually you want to query all the sensor readings within, say, 200 miles of a certain point. That has tremendous implications for how you store the data," he said.

"Otherwise you end up with a query performance that's abysmal. Combined with all this is a need for analytics, which is primarily rolling, compressing or aggregating the data for certain dimensions. All that functionality is partially served by Riak KV today.

"But a lot of our customers have had to build things on top of Riak KV to get those capabilities. That's why we're launching Riak TS - so that they no longer need to build those additional things and have all these capabilities."
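
As an illustration of the kind of roll-up Marchal mentions, the following is a minimal Python sketch that averages raw readings into fixed-width time buckets. It is not Basho's implementation and does not use any Riak API; the function name `rollup`, the `(timestamp, value)` tuple layout and the 60-second bucket width are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def rollup(readings, bucket_seconds):
    """Average (timestamp, value) readings into fixed-width time buckets.

    Hypothetical helper for illustration only; not part of Riak.
    """
    buckets = defaultdict(list)
    for ts, value in readings:
        # Integer division maps each timestamp to the start of its bucket.
        bucket_start = (ts // bucket_seconds) * bucket_seconds
        buckets[bucket_start].append(value)
    # Return one averaged value per bucket, ordered by bucket start time.
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Four made-up readings, timestamps in seconds.
readings = [(0, 10.0), (30, 12.0), (60, 20.0), (90, 22.0)]
per_minute = rollup(readings, 60)
```

Here the four raw readings collapse into two per-minute averages. This is the sort of logic customers would otherwise have to build and maintain themselves on top of a plain key-value store.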

The Riak key-value store, which Basho created and continues to develop as an open-source project, notched up a significant success last year when it replaced Oracle as the database behind a national patient database system for the UK's health service.

Marchal said the volume of time-series data continues to grow rapidly, generated by devices ranging from smart meters in the energy industry, IP-enabled healthcare instruments, engines and industrial machinery to wearables and smartphones.

"If you look at the amount of data that's being generated nowadays, time-series is probably the biggest type - and we're only at the start of it. Time-series is strategic for any organisation because that's the future," he said.

"Time-series data comes in extremely fast but the way you query the data predicates how you need to store it."

Marchal said that, to be optimised, applications may need to use a combination of storage backends, each catering for a specific type of data.

"If you look at a typical time-series use case, like weather, maybe you get your temperature reading. But also every 10 minutes you get a picture of the sky to see what it looks like. The picture is not suited for a time-series database. You need Riak S2, which is capable of handling large objects very efficiently," he said.

"As you store all the data, you probably have a need for storing metadata - for example, about the configuration of your sensors - and that's more like a Riak KV use case. So you get applications that are going to use Riak KV for specific types of data, Riak S2 for large objects and Riak TS for storing the time-series readings."

The goal behind the Basho Data Platform is to offer a single system that copes with a variety of data and runs at scale, while remaining available, operationally simple and fault tolerant.

"They get one platform that gets them access to all the different types of use cases without the need for building the glue at the data planes - the integration between the different systems - because that's provided to them through the platform," he said.

According to Marchal, one of the most important features offered by Riak TS is the ability to co-locate data by time, device, or geographic location through simple data modelling.
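
One way to picture co-location is a composite key that groups readings by device and by the slice of time they fall into, so that a query for one device over a short period touches only a few groups. The Python sketch below illustrates that general idea only. It is not Riak TS's data-modelling syntax, and the names `partition_key`, `store` and `QUANTUM_SECONDS`, along with the 15-minute slice, are assumptions.

```python
from collections import defaultdict

QUANTUM_SECONDS = 15 * 60  # hypothetical 15-minute time slice


def partition_key(device_id, ts):
    """Build a key from the device and the time slice the reading falls in."""
    return (device_id, ts // QUANTUM_SECONDS)


def store(partitions, device_id, ts, value):
    """Append a reading to the group that shares its device and time slice."""
    partitions[partition_key(device_id, ts)].append((ts, value))


# Made-up readings: timestamps in seconds, values in degrees.
partitions = defaultdict(list)
store(partitions, "sensor-a", 10, 18.5)
store(partitions, "sensor-a", 600, 18.7)
store(partitions, "sensor-a", 1000, 19.0)
store(partitions, "sensor-b", 20, 7.1)
```

The first two `sensor-a` readings share a group because they fall in the same 15-minute slice, while the third lands in the next slice and `sensor-b` is kept apart. Swapping the device identifier for a region code would group readings geographically instead.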

"We've also introduced a SQL-like query language into Riak TS and that's new because in KV the access to data is key value so you don't have a need for a SQL-like language," he said.

"But in time series, because you do range queries when you, say, select all the sensors between this time and that time, a SQL-like language makes a lot of sense."

Riak TS also features a number of optimisations for rapidly ingesting and reading data, and a Spark connector.
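
To show why a SQL-like language suits the range queries Marchal describes, here is a small Python sketch that uses SQLite purely as a stand-in. It does not show Riak TS's query syntax or client API; the table name `readings`, its columns and the helper `readings_between` are assumptions made for the example.

```python
import sqlite3

# In-memory SQLite database standing in for a time-series store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, ts INTEGER, temperature REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [("a", 100, 18.5), ("a", 200, 18.9), ("b", 250, 7.2), ("a", 400, 19.4)],
)


def readings_between(conn, start, end):
    """Return all readings with start <= ts < end, ordered by time."""
    rows = conn.execute(
        "SELECT sensor, ts, temperature FROM readings "
        "WHERE ts >= ? AND ts < ? ORDER BY ts",
        (start, end),
    )
    return rows.fetchall()


in_range = readings_between(conn, 150, 300)
```

The "between this time and that time" condition is a single `WHERE` clause here, whereas a pure key-value interface would need the application to know or enumerate every key in the range.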

"While a lot of players in the space are trying to be multi-model - they offer a bit of graph and a bit of this and that - we're now onto the next stage already," he said.

"We're not only looking at a multi-model type database but also the integration with all the other tools that you need to make use of your data."
