LinkedIn open sources monitoring system for Kafka

The move is likely to help Kafka users and engineers to better monitor the streaming data system.

LinkedIn has open sourced monitoring software to put analytics around the popular Apache Kafka messaging system for streaming data.

Kafka was initially developed by LinkedIn and later open sourced. LinkedIn also remains a big user of Kafka.

The problem, however, is that administrators running Kafka have had trouble pulling metrics from the system. Bugs have also been an issue.

LinkedIn's monitoring system, dubbed Kafka Monitor, is a framework for testing and checking deployments in clusters. Kafka Monitor also reports health metrics and runs validation tests to catch bugs before they are deployed.
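To give a feel for what reporting health metrics can involve, here is a minimal sketch of one common approach: send numbered, timestamped probe messages through a pipeline, read them back, and count what went missing. It is an illustration under stated assumptions, not Kafka Monitor's actual code. The `InMemoryTopic` class, the `run_probe` function and the metric names are all hypothetical, and an in-memory queue stands in for a real Kafka cluster so the example runs on its own.

```python
import time
from collections import deque


class InMemoryTopic:
    """Hypothetical stand-in for a Kafka topic; this is not a real Kafka client."""

    def __init__(self, drop_every=0):
        self._queue = deque()
        self._drop_every = drop_every  # drop every Nth message to simulate loss
        self._sent = 0

    def produce(self, message):
        self._sent += 1
        if self._drop_every and self._sent % self._drop_every == 0:
            return  # simulate a message lost in transit
        self._queue.append(message)

    def consume(self):
        # Return the oldest message, or None when the topic is empty.
        return self._queue.popleft() if self._queue else None


def run_probe(topic, count):
    """Send `count` sequenced probes, read them back, and report simple metrics."""
    for seq in range(count):
        topic.produce({"seq": seq, "sent_at": time.monotonic()})

    received = []
    latencies = []
    while (msg := topic.consume()) is not None:
        received.append(msg["seq"])
        latencies.append(time.monotonic() - msg["sent_at"])

    return {
        "produced": count,
        "consumed": len(received),
        "lost": count - len(received),
        "availability": len(received) / count if count else 0.0,
        "max_latency_s": max(latencies, default=0.0),
    }


healthy = run_probe(InMemoryTopic(), 10)
lossy = run_probe(InMemoryTopic(drop_every=5), 10)
print(healthy)
print(lossy)
```

In a real deployment the stand-in topic would be replaced by producer and consumer clients pointed at a live cluster, and the resulting numbers would be exported to whatever dashboard or alerting system the team already uses.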

In a post, LinkedIn explained that site reliability engineers (SREs) working with Kafka have struggled to pull data from a cluster. Monitor was cooked up by LinkedIn to work around that problem.

Here's a rough sketch of how it works.

[Figure: linkedin-kafka-cluster.jpg]

The code is available on GitHub.