Bare Metal Kafka Using KRaft

eric | May 16, 2023, 9:09 a.m.

A basic Apache Kafka test setup with two servers using KRaft. The recommended setup for production is at least three brokers and three controllers; the procedure is the same as below, just add more brokers.

Preparations

Get the Kafka Files

Download Kafka from the Kafka downloads page.

Move the files somewhere SELinux will allow you to run them:

$ mv ~/Downloads/kafka /usr/local/bin/
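
If you have not downloaded and unpacked Kafka yet, something along these lines works. This assumes version 3.3.2 (the version used throughout this post) and the Apache archive mirror; adjust the URL to the version you actually want:

$ wget https://archive.apache.org/dist/kafka/3.3.2/kafka_2.13-3.3.2.tgz -P ~/Downloads
$ mkdir -p ~/Downloads/kafka
$ tar -xzf ~/Downloads/kafka_2.13-3.3.2.tgz -C ~/Downloads/kafka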

Customize Broker Properties

First off, you need to edit the broker's properties file:

$ vim /usr/local/bin/kafka/kafka_2.13-3.3.2/config/kraft/server1.properties

For this example, we are using two brokers on two different servers with given IPv4 and ports:

- Broker 1: IP-address 10.0.0.20, listener on port 9092, controller on port 19092

- Broker 2: IP-address 10.0.0.22, listener on port 9093, controller on port 19093

The following must be specified in the properties file for Broker 1 (server1.properties):

node.id=1
controller.quorum.voters=1@10.0.0.20:19092,2@10.0.0.22:19093
listeners=PLAINTEXT://0.0.0.0:9092,CONTROLLER://0.0.0.0:19092
advertised.listeners=PLAINTEXT://10.0.0.20:9092

The following must be specified in the properties file for Broker 2 (server2.properties):

node.id=2
controller.quorum.voters=1@10.0.0.20:19092,2@10.0.0.22:19093
listeners=PLAINTEXT://0.0.0.0:9093,CONTROLLER://0.0.0.0:19093
advertised.listeners=PLAINTEXT://10.0.0.22:9093
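
Both snippets assume the rest of the settings are inherited from the sample KRaft config (config/kraft/server.properties) shipped with Kafka. In particular, each node here runs as a combined broker and controller, which requires lines like these (already present in the sample file):

process.roles=broker,controller
controller.listener.names=CONTROLLER
inter.broker.listener.name=PLAINTEXT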

For this to work, make sure that your firewalls are open for the ports above, both the listener port and the controller port on each host. For Ubuntu and the like, on Broker 1:

$ sudo ufw allow 9092
$ sudo ufw allow 19092

And on Broker 2:

$ sudo ufw allow 9093
$ sudo ufw allow 19093

Create a Kafka User

Add a user that will run Kafka on both brokers:

$ sudo adduser --system kafka-user

$ sudo usermod -a -G adm kafka-user

$ sudo chown -R kafka-user /usr/local/bin/kafka

Generate a Cluster UUID

$ cd /usr/local/bin/kafka/kafka_?.??-?.?.?
$ KAFKA_CLUSTER_ID="$(bin/kafka-storage.sh random-uuid)"
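
The cluster ID must be identical on all nodes in the cluster, so generate it once and copy the value to the other broker instead of generating a new one there:

$ echo "$KAFKA_CLUSTER_ID"

Then, on Broker 2 (the placeholder is whatever Broker 1 printed):

$ KAFKA_CLUSTER_ID="<value from Broker 1>"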

Format Log Directories

$ bin/kafka-storage.sh format -t $KAFKA_CLUSTER_ID -c config/kraft/server1.properties
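
On Broker 2, run the same command against that broker's properties file:

$ bin/kafka-storage.sh format -t $KAFKA_CLUSTER_ID -c config/kraft/server2.properties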

In addition, kafka-user needs write access to Kafka's data directory under /tmp (the sample KRaft config sets log.dirs=/tmp/kraft-combined-logs):

$ mkdir -p /tmp/kraft-combined-logs

$ sudo chown -R kafka-user /tmp/kraft-combined-logs

Kafka Service

To enable automatic restart in case of failures or system restarts, you can use a process manager like systemd (assuming you are using a Linux-based operating system). Create a systemd service file for Kafka on each server. For example, create a file named kafka.service in the /etc/systemd/system/ directory.

$ sudo touch /etc/systemd/system/kafka.service

Minimum Service File

Add the following contents to the kafka.service file for Broker 1, modifying the paths and options according to your setup:

[Unit]
Description=Apache Kafka Server
Wants=network.target
After=network.target

[Service]
Type=simple
Restart=always
RestartSec=1
User=kafka-user

ExecStart=/usr/local/bin/kafka/kafka_2.13-3.3.2/bin/kafka-server-start.sh /usr/local/bin/kafka/kafka_2.13-3.3.2/config/kraft/server1.properties

ExecStop=/usr/local/bin/kafka/kafka_2.13-3.3.2/bin/kafka-server-stop.sh

[Install]
WantedBy=multi-user.target

Repeat the process for Broker 2.
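
After creating or changing a unit file, reload systemd so it picks up the change:

$ sudo systemctl daemon-reload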

Enable & Start

Enable the service and start it on each broker:

$ sudo systemctl enable kafka
$ sudo systemctl start kafka
$ sudo systemctl status kafka

That's it!
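
To check that the two brokers actually form a cluster, run a quick smoke test with the bundled CLI tools from the Kafka directory (the topic name test is just an example):

$ bin/kafka-topics.sh --create --topic test --partitions 1 --replication-factor 2 --bootstrap-server 10.0.0.20:9092
$ bin/kafka-topics.sh --describe --topic test --bootstrap-server 10.0.0.20:9092

With a replication factor of 2, the describe output should list both broker IDs among the replicas.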

Elaborate Service File, Including Service Hardening

It is possible to harden the service significantly. The price is added complexity, including trickier fault-finding. The service file for Broker 1 could look like this:

[Unit]
Description=Apache Kafka Server
Wants=network.target
After=network.target

[Service]
Type=simple
User=kafka-user
Restart=always
RestartSec=1
ExecStart=/usr/local/bin/kafka/kafka_2.13-3.3.2/bin/kafka-server-start.sh /usr/local/bin/kafka/kafka_2.13-3.3.2/config/kraft/server1.properties
ExecStop=/usr/local/bin/kafka/kafka_2.13-3.3.2/bin/kafka-server-stop.sh
NoNewPrivileges=true
PrivateTmp=yes
RestrictNamespaces=uts ipc pid user cgroup
ProtectKernelTunables=yes
ProtectKernelModules=yes
ProtectControlGroups=yes
ProtectSystem=strict
ReadWritePaths=/usr/local/bin/kafka
CapabilityBoundingSet=CAP_NET_BIND_SERVICE CAP_DAC_READ_SEARCH

[Install]
WantedBy=multi-user.target

The hardening chiefly consists of privilege restrictions (User=, NoNewPrivileges=, CapabilityBoundingSet=), file system protections (ProtectSystem=strict makes the file system read-only for the service except for the paths listed in ReadWritePaths=, and PrivateTmp= gives it a private temporary directory inaccessible to other services), and kernel protections (ProtectKernelModules= prevents explicit kernel module loading, while ProtectKernelTunables= and ProtectControlGroups= make kernel tunables and control groups read-only). Note that PrivateTmp=yes means the service sees its own empty /tmp, so with this unit the data directory should be moved out of /tmp: point log.dirs in the properties file at a path owned by kafka-user and covered by ReadWritePaths=.

Create a similar file for Broker 2. Remember to change the properties file name.

Enable & Start

Enable the service and start it:

$ sudo systemctl enable kafka
$ sudo systemctl start kafka
$ sudo systemctl status kafka
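
systemd can also score the unit's sandboxing (a lower exposure score is better):

$ systemd-analyze security kafka.service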

Making Changes

If you introduce changes to the Kafka setup, reload the service daemon and restart the service. If a change affects the storage setup (for example a new cluster ID or data directory), you also need to reformat the Kafka storage first; note that formatting wipes existing data. On Broker 1:

$ bin/kafka-storage.sh format -t $KAFKA_CLUSTER_ID -c config/kraft/server1.properties
$ sudo systemctl daemon-reload
$ sudo systemctl restart kafka
$ sudo systemctl status kafka

Troubleshooting

Read-only file system

If you forget to format the storage after changes to the Kafka setup, you tend to get the following message in the error log:

Jul 21 16:50:41 broker02 kafka-server-start.sh[1975]: Could not rename log file '/usr/local/bin/kafka/kafka_2.13-3.3.2/bin/../logs/kafkaServer-gc.log' to '/usr/local/bin/kafka/kafka_2.13-3.3.2/bin/../logs/kafkaServer-gc.log.6' (Read-only file system).

Format the Kafka storage anew as set out above. The same message can also appear if the hardened service file leaves the Kafka directory read-only; in that case, make sure the directory is covered by ReadWritePaths=.
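
More generally, the service journal is the first place to look when a broker misbehaves:

$ journalctl -u kafka -e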

References

Apache Kafka® Quick Start - https://developer.confluent.io/quickstart/kafka-local/

Kafka Command-Line Interface (CLI) Tools - https://docs.confluent.io/kafka/operations-tools/kafka-tools.html

Console Producer and Consumer Basics - https://developer.confluent.io/tutorials/kafka-console-consumer-producer-basics/kafka.html

Running Kafka in Production - https://docs.confluent.io/platform/current/kafka/deployment.html

Running Apache Kafka in Production (Podcast) - https://developer.confluent.io/podcast/running-apache-kafka-in-production/
