Introduction

Apache Kafka is an event streaming platform used to collect, process, store, and integrate data at scale. It has numerous use cases, including distributed streaming, stream processing, data integration, and pub/sub messaging. Data streaming is the continuous flow of high-volume data from different sources for processing and analysis. An event is any action, incident, or change that is identified or recorded by software or applications.

Kafka consists of these key components:

  • Producer: An application that writes data (events) to Kafka topics. A producer can send data to any broker in the Kafka cluster.
  • Consumer: An application that reads data from Kafka topics.
  • Brokers: Kafka servers that store and replicate messages.
  • Topic: A named stream of records that Kafka organizes data into.
  • Zookeeper: A distributed coordination service that manages metadata, leader election, and other critical tasks in a Kafka cluster.
  • Clusters: Groups of brokers working together to provide durability, low latency, and scalability.
  • Partitions: Division of topics for scalability and parallelism.
  • Connect: A framework for streaming data between Kafka and external systems such as databases and file systems.

Installation

Kafka works best on Linux. If you are on Windows, you can use the Windows Subsystem for Linux (WSL). To install Kafka, make sure you have Java (version 11 or 17) installed on your system.
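Before proceeding, you can confirm that a suitable Java version is on your PATH. A quick sanity check (assuming the `java` binary is what your shell will resolve):

```shell
# Verify Java is installed and print its version.
# Kafka 3.6 supports Java 11 and 17.
if command -v java >/dev/null 2>&1; then
  java -version
else
  echo "Java not found - install JDK 11 or 17 first"
fi
```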
Download Kafka from the official website and extract it using the following commands in a terminal:

wget https://archive.apache.org/dist/kafka/3.6.0/kafka_2.12-3.6.0.tgz
tar -xzf kafka_2.12-3.6.0.tgz
mv kafka_2.12-3.6.0 kafka

Start Kafka environment

Kafka traditionally requires ZooKeeper for coordination. Start ZooKeeper by running the following command inside the directory where you installed Kafka:

kafka/bin/zookeeper-server-start.sh kafka/config/zookeeper.properties

Once ZooKeeper is running, open another terminal window and start the Kafka broker service:

kafka/bin/kafka-server-start.sh kafka/config/server.properties

The Kafka environment is now running and ready to use.
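You can verify that the broker is actually reachable before moving on. One way to do this (assuming the broker is on the default listener, 127.0.0.1:9092) is to query its supported API versions:

```shell
# Query the broker's supported API versions; getting a response back
# means the broker is up and accepting connections.
kafka/bin/kafka-broker-api-versions.sh --bootstrap-server 127.0.0.1:9092
```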

Topics in Kafka

Topics are named streams of records that Kafka organizes data into. Producers publish messages to topics, and consumers subscribe to them.
Before you can write an event to Kafka, you need to create a topic using the following command:

kafka/bin/kafka-topics.sh --create --topic victor-topic --bootstrap-server 127.0.0.1:9092

By default, Kafka listens on localhost (127.0.0.1) on port 9092.
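You can also create a topic with an explicit partition count and replication factor. On a single-broker setup like this one, the replication factor can be at most 1. A sketch (the topic name here is just an illustration):

```shell
# Create a topic with 3 partitions for parallelism; the replication
# factor must not exceed the number of brokers (1 in this setup).
kafka/bin/kafka-topics.sh --create \
  --topic victor-topic-partitioned \
  --partitions 3 \
  --replication-factor 1 \
  --bootstrap-server 127.0.0.1:9092
```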
To list all the available topics, run:

kafka/bin/kafka-topics.sh --list --bootstrap-server 127.0.0.1:9092
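To inspect a topic's details, such as its partition count, leader, and replicas, you can use the `--describe` flag of the same tool:

```shell
# Show the partition layout, leader, and replica assignment of a topic.
kafka/bin/kafka-topics.sh --describe --topic victor-topic --bootstrap-server 127.0.0.1:9092
```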

Kafka Events

A Kafka client communicates with the Kafka brokers via the network for writing (or reading) events.
Once the brokers receive the events, they store them durably in the specified topic for as long as you need.

Run the console producer client to write events into your topic:

kafka/bin/kafka-console-producer.sh --topic victor-topic --bootstrap-server 127.0.0.1:9092
My first event in victor-topic

Run the console consumer client to read the events you just created:

kafka/bin/kafka-console-consumer.sh --topic victor-topic --from-beginning --bootstrap-server 127.0.0.1:9092
My first event in victor-topic

To stop the consumer client, press Ctrl + C.
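In practice, consumers usually read as part of a consumer group, so that the topic's partitions are shared among the group's members and Kafka tracks the group's read position. A sketch using the console consumer (the group name `my-group` is an arbitrary choice):

```shell
# Join a consumer group; Kafka records this group's offsets, so a
# restarted consumer resumes where it left off instead of re-reading
# from the beginning.
kafka/bin/kafka-console-consumer.sh --topic victor-topic \
  --group my-group \
  --bootstrap-server 127.0.0.1:9092
```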