Couchbase says the integration announced today of its open-source database with Hortonworks Hadoop big-data software will allow firms to run a single system for both operational and analytical data.
The NoSQL database company is using Apache Kafka for distributed messaging and Apache Storm for stream processing to link Couchbase Server 3.0.2 with Hortonworks Data Platform 2.2.
"By integrating Hortonworks Data Platform and Couchbase Server, enterprises can meet both operational and analytical requirements with a single solution to improve both short-term and long-term operations," Couchbase said in a statement.
Conventionally, operational and analytical data are handled separately, with Hadoop employed in analysing large static offline datasets from an historical perspective.
Couchbase explained that under the integration between its database and Hortonworks Hadoop, a Kafka connector uses Couchbase Server's Database Change Protocol (DCP) to stream data from Couchbase Server to the message queue in real time.
Once the messages have been processed by Storm, the data is written to Hadoop for further processing, with the results of that analysis written back to Couchbase Server, where users can access them through real-time reporting and visualisation dashboards.
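The flow described above can be sketched schematically: change events stream out of the operational store, land on a message queue, are processed by a streaming stage, and the aggregated result is written back for dashboards. The sketch below is purely illustrative, assuming toy stand-ins for DCP, Kafka, and Storm; none of these functions are real Couchbase, Kafka, or Storm APIs.

```python
# Schematic sketch of the pipeline: Couchbase (DCP) -> Kafka-style queue
# -> Storm-style processing -> analysis written back for dashboards.
# All names are illustrative stand-ins, not real library APIs.
from collections import deque, Counter

def dcp_stream(mutations):
    """Stand-in for a DCP feed of document mutations."""
    for doc_id, doc in mutations:
        yield {"id": doc_id, "doc": doc}

def enqueue(queue, events):
    """Stand-in for a Kafka producer pushing events onto a topic."""
    for event in events:
        queue.append(event)

def storm_stage(queue):
    """Stand-in for a Storm bolt: count events per document type."""
    counts = Counter()
    while queue:
        event = queue.popleft()
        counts[event["doc"].get("type", "unknown")] += 1
    return counts

# Simulate the flow end to end on three sample mutations.
mutations = [("u1", {"type": "login"}), ("u2", {"type": "login"}),
             ("u3", {"type": "purchase"})]
topic = deque()
enqueue(topic, dcp_stream(mutations))
analysis = storm_stage(topic)  # would be written back to Couchbase
print(dict(analysis))          # {'login': 2, 'purchase': 1}
```

In the real integration, the aggregation would run continuously over the stream rather than draining a finite queue, but the stages map one-to-one onto the architecture Couchbase describes.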
Online payments company PayPal has been using Couchbase for fast access to user information at scale while streaming data into Hadoop.
"The Kafka connector strengthens the streaming data to and from Couchbase to Hadoop ecosystems. Couchbase accommodates fast access to data at scale while leveraging Kafka to stream data to Hadoop for deep analytics," PayPal senior director of engineering Anil Madan said in a statement.
"As the operational data store, Couchbase is easily capable of processing tens of millions of updates a day. Streaming through Kafka into Hadoop, these key events are turned into business insight."
Hortonworks has certified the Couchbase Server plugin for Sqoop to support bi-directional data transfer between Couchbase Server 3.0.2 and Hortonworks Data Platform 2.2.
Apache Sqoop is a tool for moving bulk data between Apache Hadoop and structured datastores, such as relational databases. It graduated from the Apache incubator in March 2012 and is now a top-level project.
"This integration enables enterprises to export operational big data, produced and consumed by enterprise web, mobile, and IoT [internet of things] applications stored in Couchbase Server, to HDP [Hortonworks Data Platform] for offline analysis and refinement; refined data is imported back into Couchbase Server," Couchbase said.
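Illustrative only: a bi-directional transfer with the certified Sqoop plugin might look roughly like the following. The connection URL, `--table` value, and HDFS paths are assumptions for the sketch, and the exact flags depend on the connector and Sqoop versions in use.

```shell
# Export operational data from Couchbase Server into HDP for offline
# analysis (host, table name, and paths are illustrative).
sqoop import \
  --connect http://couchbase-host:8091/pools \
  --table DUMP \
  --target-dir /data/couchbase/raw

# After refinement in Hadoop, move the results back into Couchbase Server.
sqoop export \
  --connect http://couchbase-host:8091/pools \
  --table DUMP \
  --export-dir /data/couchbase/refined
```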