
How Apache Kafka Works and Why

Posted on Feb 19, 2018

I’m sure many of you know Jay Kreps, one of the creators of Apache Kafka. Considering how well known Kafka is now, it’s funny to hear Jay say that it wasn’t a popular open-source project at first. There were only a small number of enthusiastic fans (including me!), and for the most part, people weren’t sure what it was.

Jay and his team initially called it a “messaging system,” but that didn’t get much attention. The industry finally took notice when it needed solutions for data flow and stream processing. I was also really interested in hearing about Jay’s experience building Confluent. If you’d like to hear his thoughts on being a first-time entrepreneur, be sure to check out episode six.

How Apache Kafka Works

Episode 1: What is Apache Kafka and How Does It Work?

Learn why Jay Kreps created Kafka, what it is, and how it works. Back then, he and his team were focused on solving the problem of having data spread out over many systems. Fun fact: they thought it was going to be easy, and it wasn’t.

Episode 2: How Does Apache Kafka Work? [Diagram]

Jay and I whiteboard the design. Who came up with Kafka’s design, and what did they learn from it? Originally, the challenge was how to represent the data. It’s clear how to represent a file, so making a file distributed is straightforward. But how do you represent a stream?

Episode 3: What is Apache Kafka Used For?

So what are the use cases around Apache Kafka and the problems it’s solving? Jay talks about data pipelines, and how you don’t have to think ahead of time about where the data’s going. You can publish, and others can tap into the data. The other main use case is stream processing – building applications that respond to data in real time.
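To make the "publish, and others can tap into the data" idea concrete, here is a minimal sketch in Python. It is not Kafka's API: `ToyLog`, `publish`, and `poll` are hypothetical names for an in-memory stand-in, assuming a stream can be modeled as an append-only list where each reader tracks its own position.

```python
from collections import defaultdict


class ToyLog:
    """Hypothetical in-memory stand-in for a stream: an append-only list of records."""

    def __init__(self):
        self._records = []
        # Next position to read, tracked separately for each reader name.
        self._offsets = defaultdict(int)

    def publish(self, record):
        """Append a record and return its position in the log."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, reader, max_records=10):
        """Return unread records for this reader and advance only its position."""
        start = self._offsets[reader]
        batch = self._records[start:start + max_records]
        self._offsets[reader] = start + len(batch)
        return batch


log = ToyLog()
log.publish({"user": "a", "action": "click"})
log.publish({"user": "b", "action": "view"})

# Two readers the publisher never had to know about ahead of time.
dashboard_batch = log.poll("dashboard")
archive_batch = log.poll("archive", max_records=1)

# New data keeps arriving; each reader picks up where it left off.
log.publish({"user": "a", "action": "purchase"})
dashboard_next = log.poll("dashboard")
archive_next = log.poll("archive")
```

The point of the sketch is the decoupling: the publisher only appends, and a new reader can be added later without changing the publishing code. A reader that reacts to each polled batch as it arrives is the starting point for the stream processing use case.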

Episode 4: Where Do Apache Kafka and Internet of Things Connect?

Kafka often comes up in IoT conversations. According to Jay, IoT projects turn to Kafka for its stream processing capabilities and for fine-grained analytics around feedback loops and data-driven products.

Episode 5: Let’s Talk Endpoint Compression & Apache Spark

What’s Jay’s philosophy around endpoint compression, and what are the future conversations going to be around that?

Episode 6: What It’s Like as a First-Time Entrepreneur

It’s pretty difficult being an entrepreneur in Silicon Valley. Learn about Jay’s inspiration for founding Confluent and the future challenges he foresees.

Episode 7: What New Tech Are You Keeping an Eye On?

What gets Jay excited about what’s happening in the tech world? He talks about streaming data and stream processing. He also makes a new prediction for databases: he sees another generation of database companies on the way.

Posted in Big Data & Brews

