What Is Kafka?

Apache Kafka® (commonly just “Kafka”) is a popular open source platform for streaming data that stores event-based messages in sequential order. It scales horizontally across multiple servers to handle high-velocity, high-volume data. Kafka is often described as a “publish/subscribe” (or “pub/sub”) messaging system, made up of publishers (the data sources) and subscribers (the data consumers).
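
To make the publish/subscribe model concrete, here is a minimal in-memory sketch in Python. It is a toy model, not Kafka's API: the names `PartitionedLog`, `publish`, and `read`, and the use of a CRC32 hash to pick a partition, are assumptions made for illustration. It shows messages kept in sequential order within each partition, with the partitions standing in for the multiple servers across which the platform scales.

```python
import zlib
from typing import List, Tuple


class PartitionedLog:
    """Toy append-only log split into partitions (hypothetical, not Kafka's API)."""

    def __init__(self, num_partitions: int) -> None:
        # Each partition is an ordered list; a message's index is its offset.
        self.partitions: List[List[str]] = [[] for _ in range(num_partitions)]

    def publish(self, key: str, value: str) -> Tuple[int, int]:
        """Append a message and return (partition, offset)."""
        # The same key always maps to the same partition, so per-key order is kept.
        partition = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[partition].append(value)
        return partition, len(self.partitions[partition]) - 1

    def read(self, partition: int, offset: int) -> List[str]:
        """Return every message in a partition from the given offset onward."""
        return self.partitions[partition][offset:]


log = PartitionedLog(num_partitions=3)
p1, o1 = log.publish("user-1", "login")
p2, o2 = log.publish("user-1", "add-to-cart")
log.publish("user-2", "login")

# A subscriber reads user-1's partition starting at the first offset.
events = log.read(p1, o1)
```

Because both `user-1` messages share a key, they land in the same partition and are read back in the order they were published.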

Figure: Apache Kafka diagram.

Why Is Kafka Used?

Apache Kafka stores streaming data so that application developers can build streaming applications that process and react to that data. It stores data in a persistent, fault-tolerant manner. Because it handles large volumes of data at high speed, it can replace traditional message brokers. It is also used for log aggregation and in stream processing data pipelines.
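
One difference from many traditional message brokers is that reading a message does not remove it from the log; each subscriber tracks its own position instead. The Python sketch below is a simplified in-memory model of that idea. `RetainedLog`, `poll`, and the group names are hypothetical and do not correspond to a real client library.

```python
from typing import Dict, List


class RetainedLog:
    """Toy log that keeps messages after they are read (hypothetical model)."""

    def __init__(self) -> None:
        self.messages: List[str] = []
        # Next offset to read, tracked separately for each subscriber group.
        self.offsets: Dict[str, int] = {}

    def append(self, message: str) -> None:
        self.messages.append(message)

    def poll(self, group: str, max_messages: int = 10) -> List[str]:
        """Return unread messages for a group and advance only that group's offset."""
        start = self.offsets.get(group, 0)
        batch = self.messages[start:start + max_messages]
        self.offsets[group] = start + len(batch)
        return batch


log = RetainedLog()
for line in ["GET /home 200", "GET /cart 500", "POST /pay 200"]:
    log.append(line)

# Two independent subscribers read the same data at their own pace.
alerts_batch = log.poll("alerting", max_messages=2)
archive_batch = log.poll("archiver")
alerts_rest = log.poll("alerting")
```

Here the `archiver` group receives all three log lines even though the `alerting` group has already read two of them, which is the property that makes a shared log convenient for log aggregation.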

Kafka is commonly used together with “stream processing engines,” which are frameworks for building applications that read data from Kafka and act on it. Examples of stream processing engines include Hazelcast Jet, Apache Flink, and Apache Spark Streaming.
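
As a rough illustration of what such an application does, the Python sketch below consumes a stream of events one at a time and takes action when a condition is met. It uses a plain list in place of a Kafka topic and does not use any of the engines named above; the event shape, the `detect_repeated_failures` function, and the threshold are assumptions chosen for the example.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator


def detect_repeated_failures(
    events: Iterable[Dict[str, str]], threshold: int = 3
) -> Iterator[str]:
    """Yield an alert once a user's failed logins reach the threshold."""
    failures: Counter = Counter()
    for event in events:  # handle each event as it arrives
        if event["type"] != "login_failed":
            continue
        user = event["user"]
        failures[user] += 1
        if failures[user] == threshold:
            yield f"ALERT: {user} had {threshold} failed logins"


# Stand-in for events read from a topic.
stream = [
    {"user": "ana", "type": "login_failed"},
    {"user": "bo", "type": "login_ok"},
    {"user": "ana", "type": "login_failed"},
    {"user": "ana", "type": "login_failed"},
]

alerts = list(detect_repeated_failures(stream))
```

A real engine would add concerns this sketch leaves out, such as running the computation in parallel and recovering state after a failure.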

In many cases, the stream processing application writes data to a final store such as a database, where additional applications can be built for running analytics. This architecture is sometimes referred to as the Kappa Architecture.
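
The last step of that pipeline can be sketched in Python using the standard library's `sqlite3` module as the final store. This is a simplified illustration: the `page_views` table, its columns, and the sample events are hypothetical, and a list stands in for the output of the stream processing application.

```python
import sqlite3
from typing import Iterable, Tuple


def write_page_views(
    conn: sqlite3.Connection, events: Iterable[Tuple[str, str]]
) -> None:
    """Write each (user_id, page) event from the stream into the final store."""
    conn.execute("CREATE TABLE IF NOT EXISTS page_views (user_id TEXT, page TEXT)")
    for user_id, page in events:
        conn.execute(
            "INSERT INTO page_views (user_id, page) VALUES (?, ?)", (user_id, page)
        )
    conn.commit()


# Stand-in for processed events coming out of the stream processing application.
processed = [("ana", "/home"), ("bo", "/home"), ("ana", "/pricing")]

conn = sqlite3.connect(":memory:")
write_page_views(conn, processed)

# A separate analytics query runs against the store, not against the stream.
views_per_page = dict(
    conn.execute("SELECT page, COUNT(*) FROM page_views GROUP BY page").fetchall()
)
```

Keeping the analytics query separate from the write path mirrors the split described above: the stream processing application populates the store, and other applications query it.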

Kafka Use Cases

Use cases for Kafka include messaging for microservices, website activity tracking, operational metrics, sending and receiving messages, real-time financial alerts, predictive budgeting for advertising, and threat detection. New use cases continue to be identified and deployed.

The open source community shares many technology-centric Kafka use cases at conferences and in academic papers.

Who Uses Apache Kafka?

Apache Kafka is used across a wide range of industries. Adidas, Airbnb, Cisco, Etsy, Goldman Sachs, Netflix, The New York Times, Oracle, Square, Uber, and Yahoo! are a few of the companies using Kafka.

Related Topics

Stream Processing, Microservices, Event-Driven Architecture, Kappa Architecture, Data Pipeline