What Is Apache Kafka?

Apache Kafka® is a widely used open source platform for storing and processing event-based messages in the order they arrive. The software scales horizontally across multiple servers to handle high-velocity, high-volume data. Kafka follows the “publish/subscribe” (or “pub/sub”) messaging model: publishers (called producers in Kafka) are the data sources that write messages, and subscribers (called consumers) are the applications that read them.
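The publish/subscribe model can be pictured with a small in-memory sketch. This is a toy model written for illustration, not the Kafka client API: the class `InMemoryLog`, its `publish` and `poll` methods, and the topic name `page-views` are all hypothetical. It assumes one ordered list per topic and one read position (offset) per subscriber.

```python
from collections import defaultdict


class InMemoryLog:
    """Toy stand-in for a pub/sub broker: one append-only list per topic."""

    def __init__(self):
        self._topics = defaultdict(list)   # topic name -> ordered list of messages
        self._offsets = defaultdict(int)   # (subscriber, topic) -> next index to read

    def publish(self, topic, message):
        """Append a message to the end of the topic and return its offset."""
        log = self._topics[topic]
        log.append(message)
        return len(log) - 1

    def poll(self, subscriber, topic):
        """Return the messages this subscriber has not seen yet, in publish order."""
        start = self._offsets[(subscriber, topic)]
        log = self._topics[topic]
        self._offsets[(subscriber, topic)] = len(log)
        return log[start:]


broker = InMemoryLog()
broker.publish("page-views", {"user": "alice", "page": "/home"})
broker.publish("page-views", {"user": "bob", "page": "/pricing"})

# Two independent subscribers each receive every message, in order.
analytics_batch = broker.poll("analytics", "page-views")
audit_batch = broker.poll("audit", "page-views")

# A later poll returns only what was published since the previous poll.
broker.publish("page-views", {"user": "alice", "page": "/docs"})
analytics_next = broker.poll("analytics", "page-views")
```

Because each subscriber tracks its own position, a slow subscriber does not hold back a fast one, and publishers never need to know who is reading.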

Apache Kafka is a software platform for storing streaming data in a persistent, fault-tolerant manner. Application developers can use Kafka to build streaming applications that process and react to that data.

Why Is Kafka Used?

Apache Kafka is a streaming data storage system. It enables application developers to create streaming applications that process and respond to the data.

Kafka stores data in a persistent, fault-tolerant manner. Because it handles large volumes of data at high speed, it can replace traditional message brokers. It is also used for log aggregation and in stream processing data pipelines.
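One consequence of persistent storage is that reading a message does not remove it, so a consumer can go back and read it again. The sketch below contrasts that with a plain queue that forgets each message once it is handed out. It is a simplified illustration, not Kafka code: `RetainedLog`, `append`, and `read_from` are hypothetical names, and real Kafka keeps records only for as long as its configured retention policy allows.

```python
from collections import deque

# A plain queue hands each message out once and then forgets it.
queue = deque(["order-1", "order-2", "order-3"])
drained = [queue.popleft() for _ in range(len(queue))]
queue_after_read = list(queue)          # nothing left to re-read


class RetainedLog:
    """Toy append-only log: reads never remove records."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Store the record and return its offset in the log."""
        self._records.append(record)
        return len(self._records) - 1

    def read_from(self, offset):
        """Return every record at or after the given offset."""
        return list(self._records[offset:])


log = RetainedLog()
for order in ["order-1", "order-2", "order-3"]:
    log.append(order)

first_pass = log.read_from(0)   # normal consumption from the start
replay = log.read_from(1)       # rewind, e.g. after fixing a bug in the consumer
```

The same property is what makes a retained log a reasonable fit for log aggregation: several downstream tools can read the same records at different times without coordinating with each other.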

Kafka is often used with “stream processing engines.” These are frameworks for building applications that read data from Kafka and act on it. Examples of stream processing engines include Hazelcast Jet, Apache Flink, and Apache Spark Streaming.

In many cases, the stream processing application writes data to a final store such as a database, where additional applications can be built for running analytics. This architecture is sometimes referred to as the Kappa Architecture.
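The flow described above (read events, process them, write results to a final store that analytics applications query) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any particular engine works: the `events` list stands in for records read from Kafka, a `Counter` stands in for the database, and the `process` function and field names are hypothetical.

```python
from collections import Counter

# Stand-in for records read from a Kafka topic, in arrival order.
events = [
    {"user": "alice", "page": "/home"},
    {"user": "bob", "page": "/pricing"},
    {"user": "alice", "page": "/pricing"},
]


def process(stream, store):
    """Consume events one at a time and keep a running view count per page."""
    for event in stream:
        store[event["page"]] += 1
    return store


# Stand-in for the final store (a database table in a real deployment).
page_views = process(iter(events), Counter())

# An analytics application then queries the store, not the stream.
top_page, top_count = page_views.most_common(1)[0]
```

Since the store is derived entirely from the event stream, running `process` again over the same events with an empty `Counter` rebuilds the same result.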

Kafka Use Cases

Kafka is useful for a variety of purposes, such as messaging between microservices, website activity tracking, and operational metrics collection. It is also used to generate real-time financial alerts, forecast advertising budgets, and detect threats. New Kafka use cases are regularly being identified and deployed.

The open source community shares many technology-centric Kafka use cases at conferences and in academic papers.

Who Uses Apache Kafka?

Apache Kafka is used in nearly every industry. Adidas, Airbnb, Cisco, Etsy, Goldman Sachs, Netflix, The New York Times, Oracle, Square, Uber, and Yahoo! are a few of the companies using Kafka.