
Confluent announces Infinite Storage for Apache Kafka

The company founded by Apache Kafka's creators introduces infinite data retention on its Confluent Cloud platform. As a result, the company is pushing the event streaming data technology as a "database of record." That's a big change.

Written by Andrew Brust, Contributor July 1, 2020 at 8:11 a.m. PT

Confluent, Inc. is today announcing infinite data retention as a new feature of Confluent Cloud, its managed Apache Kafka service. The company, founded by Kafka's creators, is introducing the feature as part of its "Project Metamorphosis," which aims to imbue Kafka with modern cloud properties. Infinite retention rolls out this month for Confluent Cloud Standard and Dedicated Clusters on AWS, with other cloud providers coming later. By removing the limit on how long data can stay in Kafka, the feature could change the way Kafka is used.


Cloudy Kafka

In a briefing with ZDNet, Confluent CEO Jay Kreps explained that modern cloud properties, like elasticity, fully-managed operations, and separation of compute and storage, have largely eluded Kafka. Instead, companies that have adopted Kafka have had to manage a lot of moving parts and, in particular, have had to manage storage very explicitly and diligently.

Customers using Confluent Cloud dedicated clusters have had to pre-provision the storage they've needed and, typically, only a one-week window of data has been kept in Kafka topics. Now data will simply be able to accumulate, without limit. Customers will of course pay for the cloud storage needed to retain this data, but that is much more cost-efficient than using storage on the Kafka cluster nodes themselves. As a result, customers won't have to pre-provision any such node-level storage, and therefore charges for the clusters themselves won't change, with costs continuing to be based on the volume of data ingested and processed.
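To make the retention change concrete, here is a minimal sketch of how a one-week window and unlimited retention differ in open-source Kafka's topic-level settings, where `retention.ms` and `retention.bytes` accept -1 to mean "no limit." The helper name `retention_config` is hypothetical, and the article does not say how Confluent Cloud exposes this setting, so treating it as a plain topic configuration is an assumption.

```python
from datetime import timedelta
from typing import Optional


def retention_config(window: Optional[timedelta]) -> dict:
    """Hypothetical helper: build Kafka topic-level retention settings.

    Passing None means "keep data indefinitely"; Kafka represents
    that with -1 for both the time and the size limit.
    """
    if window is None:
        retention_ms = -1  # no time limit
    else:
        retention_ms = int(window.total_seconds() * 1000)
    return {
        "retention.ms": str(retention_ms),
        "retention.bytes": "-1",  # no size cap in either case
    }


# The typical one-week window described above, and unlimited retention.
one_week = retention_config(timedelta(days=7))
unlimited = retention_config(None)

print(one_week)
print(unlimited)
```

The resulting dictionaries could be passed as topic configuration to whichever Kafka admin tool is in use; the sketch deliberately stops short of calling one.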

Take off your coat and stay awhile

Since their inception, streaming data systems have been premised on acting as a transfer point for data, not a permanent home. As a result, they've typically served up a short window of recent and real-time data, while historical data has had to be retrieved from other systems. So-called "Lambda architectures," which have sought to integrate real-time and historical/batch data platforms in order to provide applications with both types of data, might be better characterized as Rube Goldberg architectures.

But with infinite data retention, Confluent is promoting the idea that Kafka can be used as a database of record, rather than just a conduit. Even Confluent would likely not suggest that Kafka replace data warehouses and data lakes, or act as an analytics database platform. But just as relational OLTP (online transactional processing) platforms act as permanent database stores for transactional data, Confluent is saying Kafka can now do the same for event stream data. Essentially, Kafka can be data's permanent home, rather than just a hotel where data stays when it first arrives.
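The difference between a conduit and a database of record can be illustrated with a toy model. The sketch below is plain Python, not Kafka's API; the function `rebuild_balances` and the sample events are invented for illustration. It shows that replaying a complete event log reproduces current state, while replaying only a short retention window does not.

```python
from collections import defaultdict


def rebuild_balances(events):
    """Replay (account, delta) events in order to derive current state."""
    balances = defaultdict(int)
    for account, delta in events:
        balances[account] += delta
    return dict(balances)


# Full event history, oldest first (illustrative data only).
log = [("alice", 100), ("bob", 50), ("alice", -30), ("bob", 20)]

# A short retention window keeps only the most recent events.
recent_only = log[-2:]

full_state = rebuild_balances(log)             # derived from the whole history
partial_state = rebuild_balances(recent_only)  # older events are missing

print(full_state)
print(partial_state)
```

With the full log, the derived balances reflect every event; with the truncated log, the earlier deposits are lost and another system would have to supply them.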

It's a (K)SQL world, we're just living in it

The relational database analogy is more than just conceptually useful. About three years ago, Confluent introduced KSQL, a SQL query layer for Kafka. This feature allows developers to work with Kafka as if it were a relational database, and Kreps acknowledged KSQL as a major driver of customer demand that Kafka act as a repository for historical as well as real-time data.
