
Confluent release adds enterprise, developer, IoT savvy to Apache Kafka

Confluent, the company founded by the creators of streaming data platform Apache Kafka, is announcing a new release today. Confluent Platform 5.0, based on yesterday's release of open source Kafka 2.0, adds enterprise security, new disaster recovery capabilities, lots of developer features, and important IoT support.

Apache Kafka, the open source streaming data platform, has gained immense momentum over the past few years, both in terms of its core technology and as an API standard for proprietary streaming data solutions. Yesterday saw the release of version 2.0.0 of this streaming data juggernaut and today Confluent, the company founded by Kafka's creators, is releasing its enterprise distribution of that release, in the form of Confluent Platform 5.0.

In a briefing last week, Confluent Co-Founder and CTO, Neha Narkhede, explained to me how Confluent Platform is much more than the open source code and a support contract. Confluent Platform is engineered as more of an open core Enterprise-strength product that is based on the Apache bits. And this latest release adds a lot on top of that open core foundation.

A console for operations
For one thing, Confluent has become serious about Enterprise security and will ship an LDAP authorizer plugin with Platform 5.0 that brings compatibility with Microsoft's Active Directory. The product will also include a Confluent Kafka Replicator-based solution that automates the cluster failover process for more robust disaster recovery (DR).

Confluent's Control Center application gets a number of enhancements in the Platform 5.0 release. For example, Control Center adds a broker configuration view that works across multiple clusters.

And for developers
Those are nice management features, but the Confluent Platform 5.0 release of Control Center adds developer features too. These include a new topic inspection feature that lets users view the streaming messages in a topic and read the key, header, and value data for each message. Topics in JSON, string, and Avro format are supported.

Topic Management in Confluent Control Center

Credit: Confluent

Control Center now also includes a GUI for composing and executing KSQL queries. KSQL is Confluent's implementation of a Structured Query Language (SQL) dialect for Kafka. Since SQL is such a ubiquitous skill among developers, KSQL provides that broad audience with the ability to query and manipulate topics, streams, and tables without having to use lower-level APIs. This gives developers with no streaming data experience a big head start on writing streaming data applications.

KSQL deluxe
And in this release, KSQL is greatly enhanced. Support for a new STRUCT type lets KSQL developers work with nested data in Avro and JSON formats. KSQL now also supports user-defined functions (UDFs) and user-defined aggregate functions (UDAFs). UDFs and UDAFs can be written in Java and then called from KSQL queries as if they were part of the KSQL language itself.
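
To make the nested-data point concrete, here is a minimal Python sketch; it is not Confluent code. The `orders` stream, its `address` field, and the sample event are all hypothetical. The KSQL text held in the string shows the general shape of a STRUCT declaration and the `->` dereference operator, while the Python function only mimics what such a dereference does on a decoded JSON message.

```python
import json

# Hypothetical KSQL statements (stream and field names are made up for
# illustration). STRUCT declares the nested field; `->` reaches into it.
KSQL_EXAMPLE = """
CREATE STREAM orders (order_id BIGINT, address STRUCT<city VARCHAR, zip VARCHAR>)
  WITH (KAFKA_TOPIC='orders', VALUE_FORMAT='JSON');
SELECT order_id, address->city FROM orders;
"""


def dereference(record, path):
    """Walk a '->'-separated path through nested dicts.

    Returns None if any step along the path is missing.
    """
    value = record
    for field in path.split("->"):
        if not isinstance(value, dict) or field not in value:
            return None
        value = value[field]
    return value


# A made-up JSON message value, as it might appear in the topic.
raw = '{"order_id": 42, "address": {"city": "Palo Alto", "zip": "94301"}}'
event = json.loads(raw)

city = dereference(event, "address->city")
missing = dereference(event, "address->country")
print(city, missing)
```

Without a STRUCT type, a nested field such as `address` would typically have to be flattened or parsed by hand before it could be queried.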

In other KSQL news, the JOIN keyword now supports joins between multiple tables, multiple streams, or even between tables and streams, and JOIN allows the use of the INNER and OUTER keywords in all such cases. Finally, new support for the INSERT INTO command allows KSQL developers to write events from different source streams and queries to the same output stream.

APIs and IoT
Confluent is also announcing general availability of the KSQL REST API, which now works from Python, Go, .NET/C#, Java, JavaScript, and even shell scripting. In addition, the open source Kafka 2.0 release itself has added support for a Scala-based wrapper for the Kafka Streams API.
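
As a rough illustration of why a REST API makes KSQL reachable from so many languages, the Python sketch below builds, but does not send, an HTTP request for a KSQL server using only the standard library. The server address, the stream names (`all_orders`, `web_orders`, `store_orders`), and the helper function are assumptions made for this example; the `/ksql` endpoint and the JSON body shape follow the KSQL REST API as commonly documented, and should be checked against the documentation for the version in use.

```python
import json
import urllib.request

# Assumed server address; 8088 is the usual default KSQL server port.
KSQL_SERVER = "http://localhost:8088"


def build_ksql_request(statement, server=KSQL_SERVER):
    """Build (but do not send) a POST request for the server's /ksql endpoint."""
    body = json.dumps({"ksql": statement, "streamsProperties": {}}).encode("utf-8")
    return urllib.request.Request(
        url=server + "/ksql",
        data=body,
        headers={"Content-Type": "application/vnd.ksql.v1+json; charset=utf-8"},
        method="POST",
    )


# Hypothetical streams: both queries write into the same output stream.
statement = (
    "INSERT INTO all_orders SELECT * FROM web_orders; "
    "INSERT INTO all_orders SELECT * FROM store_orders;"
)
request = build_ksql_request(statement)
print(request.get_method(), request.full_url)

# To run it against a live server (not done here):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Because the exchange is plain HTTP and JSON, the same request could just as easily be issued from Go, C#, JavaScript, or a shell script.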

Last but not least, in the Internet of Things (IoT) department, Confluent Platform 5.0 now includes a proxy for the Message Queue Telemetry Transport (MQTT) standard, a lightweight messaging protocol that is heavily, and increasingly, used in IoT scenarios.

Since IoT analytics is, by and large, an application of streaming data analytics, this alignment between Kafka and MQTT makes a great deal of sense; in fact, it's a bit overdue. Meanwhile, the old but still widely used Advanced Message Queuing Protocol (AMQP) is not supported by a proxy in Confluent Platform.

Apache Kafka keeps getting better, and it's the work at Confluent that drives much of its momentum. Focusing on Enterprise-readiness for the Kafka stack is the right way to go. Setting up and running Kafka clusters is still no simple task, however. Improving this would seem to be the next frontier for Kafka adoption.

This post was updated at 5:33pm ET on Thursday, July 31 to remove the incorrect statement that Confluent Control Center contains a UI for the new disaster recovery functionality and to clarify the statement about the general availability of the KSQL REST API.