Hard problems at scale, the future of application development, and building an open source business. If any of that is of interest, or if you want to know about Kafka, real-time data, and streaming APIs in the cloud and beyond, Jay Kreps has some thoughts to share.
I recently had the opportunity for a long talk with Jay Kreps, Confluent's CEO and co-founder, covering everything from the future of application development to the subtle differences among streaming APIs and paradigms.
Whether you're a streaming enthusiast or wondering what all the fuss is about, you will most likely find something of interest here.
Let's take it from the start. Kreps, together with co-founders Neha Narkhede and Jun Rao, started working on Kafka in 2008, while they were all LinkedIn employees. The problem they were trying to solve was dealing with continuous streams of data.
"Very little data is batch in nature," says Kreps. "Data in real life is not produced when the sun goes up or down -- when your business is digital, data keeps coming all the time."
Pretty much every business is digital today, but LinkedIn's core business was digital from the start, and its needs and scale were already an order of magnitude beyond those of ordinary businesses back then. So Kreps and his team hit the problem before others did.
LinkedIn had some infrastructure for transactional processing in place, plus infrastructure for analytical processing, consisting of standard components for those stacks, such as Oracle, Hadoop, and key-value stores.
"We could do processing in a few milliseconds and analysis as fast as we do today, but the in-between was missing," says Kreps.
LinkedIn had a messaging layer that allowed applications built on top of that infrastructure to communicate, and it wanted to make this the center of application development.
Kreps and his team spent some time trying to build on top of both proprietary and open-source messaging infrastructure, but at some point, they realized none of these options worked for them, so they had to bite the bullet and go for building their own system.
The Kafka edge
What was it that made existing solutions unfit for LinkedIn's purposes, and how did the Kafka team tackle the challenge? In other words, what makes Kafka special? Kreps says they focused on improving three key areas.
The first one was building Kafka as a modern distributed system. "You typically don't think of your messaging system as a cluster; you think of it as a broker system. Brokers can connect to each other, and it's kind of distributed. But truly distributed systems are better -- easier to expand, easier to operate as a service, and so on. We had messaging and distributed-systems backgrounds, so we understood how to operate such systems," says Kreps.
Another aspect Kreps emphasizes is storage. Messaging systems act as dispatchers of information, but what happens if some recipients cannot receive their messages?
"You can't expect every system to be running at every moment," says Kreps. And when systems are offline, their messages need to be kept in store until they can receive them.
"Things were not working really well in such situations at LinkedIn," Kreps continues. "Besides, it's not just a matter of getting back online smoothly; there are a number of architectural benefits that come with guaranteed storage and delivery."
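The architectural benefit Kreps alludes to comes from treating the message store as a replayable log: each consumer tracks its own position (offset), so a system that was offline simply resumes from where it left off. Here is a toy Python sketch of that idea -- the class and method names are hypothetical and this is not Kafka's actual API, just an illustration of the log-with-offsets model:

```python
class Log:
    """A toy append-only log, illustrating the replayable-log idea
    behind Kafka's storage model (not real Kafka code)."""

    def __init__(self):
        self._entries = []

    def append(self, message):
        self._entries.append(message)
        return len(self._entries) - 1  # offset of the new entry

    def read_from(self, offset):
        # A consumer that was offline catches up by replaying from its offset.
        return self._entries[offset:]


class Consumer:
    """Tracks its own position in the log independently of other consumers."""

    def __init__(self, log):
        self.log = log
        self.offset = 0

    def poll(self):
        messages = self.log.read_from(self.offset)
        self.offset += len(messages)
        return messages


log = Log()
log.append("profile-updated")
log.append("connection-added")

consumer = Consumer(log)
print(consumer.poll())  # both messages, even though it "joined" late

log.append("job-posted")
print(consumer.poll())  # only what arrived since the last poll
```

Because the log retains messages rather than deleting them on delivery, an offline consumer loses nothing -- which is the guaranteed-storage property Kreps describes.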
And then, of course, the streaming model -- the continuous flow of messages. Kreps' team believed in this model and wanted to support it, but they felt existing messaging systems were not very well set up for that.
Streaming for mainstream application development
That stream has grown to a river, and a lot of water has flowed under the bridge since then. Today, Kafka is a big part of real-time data architectures known as Lambda and Kappa. How big exactly? For Kreps there are two ways of measuring this: how many companies use Kafka, and how central it is for them.
Kreps claims that "Kafka has near 100 percent of early adopters. You can go to any tech conference and look at people's architecture diagrams, and you will find Kafka there as a key component."
But Kafka is moving beyond that according to Kreps:
"We're starting to see non-tech companies adopting Kafka and building their architecture around it, and that's very exciting. How much can the world move toward streaming architecture and what's the chance of that happening? One-hundred percent. The hard part is to get the ball rolling, and the ball has started rolling. The timeline is always longer than you think though.
"The current status is that there are big pluses and a couple of cons. Streaming is adopted most where it makes the most sense: in the financial sector, in IoT -- wherever you have big streams of data. New projects are going to be built that way once we reach a level of maturity, simplicity, convenience, and operability that brings us to the tipping point -- once you can build projects that are continuous and real-time in nature without big tradeoffs. We are still in the process of making that happen."
For Kreps, it's all about taking streaming out of the lab and making it as easy to use as, say, REST services. Part of the reason REST is so successful is that there are frameworks and methodologies in place that have put it on the mainstream application development map.
Kreps says there's still work to do to get there, but "we're on that trajectory." And since we're talking about REST, what would you say if you heard Kafka is the place to build your Microservices?
Not exactly the first place you would think of, probably. But for Kreps, this is exactly where they want Kafka to be: "If we look at how Microservices are deployed, there are actually two different types," he says. And he uses retail as an example.
In retail, there is a line of synchronous interaction taking place that has to do with client actions -- showing items, adding items to basket, and so on. But there are also actions taking place in the background, such as updating stock, prices, logistics, etc.
The first type of action is synchronous, while the second is asynchronous, and Kreps argues that for asynchronous services that are critical (you can't afford to drop updates or get them in the wrong order), a platform like Kafka is the right one to build on.
"We had Microservices at LinkedIn that take quick actions to support APIs using REST, and REST was a good technology to use for those. Kafka is not particularly suited for something like this. But then you have other Microservices that are asynchronous; they are triggered by some event and take some kind of action. What kind of technology should you use to build those?
"We believe the next generation of such services should be built on a company-wide platform rather than on a per-application basis, and the abstraction to use should be stream processing rather than a low-level messaging API."
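The retail example above can be made concrete. An asynchronous service like inventory management is just a function folded over a stream of events, and correctness depends on seeing every event exactly once and in order -- exactly the guarantees Kreps says such services need. The event shapes below are invented for illustration; in a real deployment the loop would be a Kafka consumer rather than a Python list:

```python
# A hypothetical event-driven inventory service: it reacts to a stream of
# order/restock events in sequence. Dropped or reordered events would
# corrupt the stock totals, which is why ordering guarantees matter here.
def apply_event(stock, event):
    kind, item, qty = event
    if kind == "order":
        stock[item] = stock.get(item, 0) - qty
    elif kind == "restock":
        stock[item] = stock.get(item, 0) + qty
    return stock


events = [
    ("restock", "book", 10),
    ("order", "book", 3),
    ("order", "book", 2),
]

stock = {}
for event in events:  # in practice, this loop would consume from a topic
    stock = apply_event(stock, event)

print(stock)  # {'book': 5}
```

The synchronous side of the shop (showing items, adding to basket) still fits a REST request/response model; it is this background, event-triggered kind of service that the streaming abstraction targets.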
Kafka vs the world
Is streaming set to take over the world then? And when we talk about streaming, is Kafka the only game in town?
Pro-streaming arguments sound compelling, and Kreps is not the only one making them. Flink, for example, is a big data platform whose people are equally passionate about real-time data applications and streaming, and their views and philosophy seem to come from the same place.
"If you talk to smart technologists in this space, the answers you will get today should be pretty consistent," says Kreps. "Maybe a few years ago you would hear from people things like 'streaming cannot get you correct answers,' 'it's not efficient,' 'it's lossy,' and so on.
"Today, we know that's not true. Yes, there are tradeoffs, but let's take efficiency, for example. Streaming may be 10 percent less efficient, but that does not make it inefficient."
It's interesting that Kreps sees Kafka primarily as a streaming platform to build services on, rather than infrastructure to dispatch messages. "That changed last year -- we added substantial stream processing capabilities to Kafka. We've been meaning to add this for years -- now we have it," Kreps notes.
So, streaming may be great, but why go specifically for Kafka then? There are other streaming platforms out there too, like Flink, Spark Streaming, or Storm -- all Apache open-source projects. How is Kafka's relationship with these? Complicated, according to Kreps.
Kafka's vision is to serve as a streaming platform connecting all platforms, and to do that, per Kreps, you need to be able to do three things: connect to and integrate streams and store them, process them, and transform them to build applications on top of them.
Kreps says the overlap exists only in the last point. "We can now not only read/write streams, but also do transformations on them, even complicated SQL processing, joining or aggregating them. The way we imagined and built it is a little different from other platforms."
How? In Kreps' words:
No cluster required. To build a stream application, you just do it like any other application -- no need for a Kafka Streams cluster. Kafka does not touch upon deployment, but delegates it to an external layer like Mesos or Kubernetes.
Full integration. Kafka should be like a database, where there is a processing layer and a storage layer and turning on features -- like security, for example -- works across them.
Database support. The majority of data lives in tables in relational databases, and Kafka supports stream integration with them to be able to have a holistic view.
In typical real-time data architectures, Kafka is the entry point for other streaming platforms, so now you see why things are getting complicated. Kreps says they're happy to work with people using Kafka either as a gateway or on its own, and they make sure the integration with other platforms works.
So, let's recap. Real-time data processing is on the rise. Kafka is a key component of real-time data architectures. And now it's expanding its reach to do what other parts of that architecture are doing, and it wants to establish itself as a mainstream development platform. And it's going cloud and taking on Amazon.
Does that sound like a plan for world domination to you?
Kreps says they have a big vision for Kafka -- to be the central nervous system for companies. "When we started working on it, we were not thinking about building a company or IPOs and the like, but world domination was on our list."
Were Kreps and his team simply the right people in the right place at the right time? Other organizations of LinkedIn's magnitude were facing similar challenges and using similar systems internally at the same time, so why them? Maybe it was a combination of framing, strategy, and vision.
"This problem did not look very sexy from the outside at that time," Kreps recalls. "Databases were cool, Hadoop was cool, but moving data and messaging was very uncool. People were asking why we were working on that stuff and not doing something else.
"Other organizations were looking at it as solving problems like how to aggregate log files or how to manage messaging systems. So they would end up using relatively low-end solutions, even if their scale was similar." Plus, they did not go open source. So, what now?
"When we started the company, there was a huge demand for a software offering," says Kreps. "We started with that, which allowed us to build many of the tools people needed to start using Kafka and to add features to drive adoption.
"We are now following up with a SaaS offering, which we have been using internally for a while, so we're really excited to make it available to the world. It's a better way for us to offer our service: people running Kafka in the cloud get licensing and support, and we manage their operations.

"The response has been overwhelming, and it's an essential step for any business -- if you're not able to make that transition, you may not exist 10 years from now. Plus, we can't be picky about our offering -- it's harder as a startup, but we just want to offer things the way people want to use them."