Kafka on Kubernetes - Part 1: Introduction to Kubernetes

Lydtech

Introduction

Kubernetes is a popular container orchestration tool used for managing deployments across different environments. This series of articles provides an overview of Kubernetes, and shows how minikube can be used to run Kubernetes locally. It then steps through deploying Kafka and Zookeeper using minikube, followed by deploying a Spring Boot application. The application integrates with Kafka, and provides a REST API that a client calls to trigger sending events to Kafka.

This first part provides an overview of Kubernetes, covering the main components that should be understood in order to complete the deployment walkthroughs in the following parts. The second part describes the steps to deploy dockerised Kafka and Zookeeper to Kubernetes, and uses the command line to send events to Kafka and receive events from it. The third part deploys a dockerised Spring Boot application to Kubernetes, and describes the steps required to make it callable from an external source, both via REST calls and by using Kafka to send events to and receive events from the application.

The source code for the accompanying Spring Boot application is linked in the Source Code section at the end of this article.

Kubernetes Overview

Kubernetes is an open source container orchestration tool originally developed by Google. It manages OCI containers. Applications are deployed in these containers, and are therefore managed by Kubernetes. Deployments can be to different environments, such as physical machines, virtual machines, or the cloud. This makes Kubernetes a popular platform for microservice architectures, which can comprise many containers.

Kubernetes provides a number of capabilities. It provides the tools to help make high availability achievable, enabling applications to be available with no downtime (although this still requires thought and effort). It supports scalability, as application containers can be scaled horizontally and automatically based on application requirements. It also supports disaster recovery, allowing applications to recover automatically from failure scenarios.

Kubernetes Components

This section covers some of the main components that should be understood in order to follow the deployment steps for the demo in the later articles in this series.

Kubernetes uses a Pod as an abstraction over a container. The pod provides the running environment for the container. Usually a pod will run one container. Pods themselves run on a Node. The demo uses three pods, one each for the dockerised Kafka, Zookeeper, and Spring Boot application. All three run on a single node.

Pods are ephemeral, meaning that they can die easily, and Kubernetes takes care of replacing old pods with new pods. As each new pod is assigned a new IP address, a Kubernetes Service is placed in front of the pods to provide a static IP address through which they can be reached. The service also acts as a load balancer, forwarding each request to one of the application pods. Pods then communicate with each other using services. As pods and services do not share the same lifecycle, if a pod dies the service remains and keeps the same IP address, routing requests to the replacement pod.
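The relationship between a service and its pods can be illustrated with a small toy model. This is not Kubernetes code: the `Service` class, the `new_pod_ip` helper and the IP addresses are all hypothetical, and the round-robin routing is a simplification chosen for illustration rather than a statement of how Kubernetes selects a pod.

```python
import itertools
from dataclasses import dataclass, field

# Hypothetical toy model, not Kubernetes code. It only illustrates why a
# service gives clients a stable address while pod IPs change.

_ip_counter = itertools.count(1)


def new_pod_ip() -> str:
    """Hand out a fresh pod IP that is never reused (made-up address range)."""
    return f"10.244.0.{next(_ip_counter)}"


@dataclass
class Service:
    cluster_ip: str  # stays the same for the life of the service
    pod_ips: list = field(default_factory=list)
    _next: int = 0  # position of the next pod to receive a request

    def route(self) -> str:
        """Forward a request to one of the backing pods (simple round-robin)."""
        ip = self.pod_ips[self._next % len(self.pod_ips)]
        self._next += 1
        return ip

    def replace_pod(self, dead_ip: str) -> str:
        """Swap a dead pod for a new one; the service address is untouched."""
        replacement = new_pod_ip()
        self.pod_ips[self.pod_ips.index(dead_ip)] = replacement
        return replacement


service = Service(cluster_ip="10.96.0.10", pod_ips=[new_pod_ip(), new_pod_ip()])
first, second = service.route(), service.route()  # requests spread over both pods
new_ip = service.replace_pod(first)               # the first pod dies and is replaced
```

After the replacement, callers still address the service at `service.cluster_ip`, while the list of pod IPs behind it has changed.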


Figure 1: A two node Kubernetes deployment

In the diagram above two nodes are shown, each with two pods. Each pod contains a container running an instance of an application. The second node is a replica of the first, and each node runs on its own server. The Kubernetes Deployment defines the blueprint for the pods and the number of replicas to run, and so acts as an abstraction over pods. As the demo will illustrate, pods are not created directly. Instead a deployment is created, and Kubernetes creates the pods from it.
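As a rough illustration of a deployment acting as a blueprint, the sketch below builds a Deployment manifest as a Python dictionary and prints it as JSON, a format that kubectl accepts alongside YAML. The application name, image and port are hypothetical placeholders, not values taken from the demo.

```python
import json


def deployment_manifest(name: str, image: str, replicas: int, port: int) -> dict:
    """Build a minimal Deployment manifest: a pod template plus a replica count."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,                 # number of pod replicas to run
            "selector": {"matchLabels": labels},  # which pods this deployment owns
            "template": {                         # the blueprint for each pod
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


# Hypothetical values, chosen only for illustration.
manifest = deployment_manifest("demo-app", "example/demo-app:1.0.0", replicas=2, port=8080)
print(json.dumps(manifest, indent=2))
```

The selector labels must match the labels in the pod template, which is why both are derived from the same `labels` dictionary here.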

A distinction is made between external and internal services. External services are accessible to external sources, whereas internal services, such as a database or a messaging broker, are not. While external services can be called via their IP and port, a Kubernetes Ingress can be used to give clients a standard URL with a secure protocol (https) instead. The ingress maps the URL to the service, and routes traffic into the cluster.
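The mapping from a URL to a service might be sketched as follows, again building the manifest as a Python dictionary. The host name, service name, port and TLS secret name are hypothetical. An ingress only takes effect when an ingress controller is running in the cluster, which this sketch assumes.

```python
import json


def ingress_manifest(name: str, host: str, service_name: str,
                     service_port: int, tls_secret: str) -> dict:
    """Build a minimal Ingress that routes https traffic for a host to a service."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name},
        "spec": {
            # TLS certificate for the host, read from a Kubernetes secret.
            "tls": [{"hosts": [host], "secretName": tls_secret}],
            "rules": [
                {
                    "host": host,
                    "http": {
                        "paths": [
                            {
                                "path": "/",
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {
                                        "name": service_name,
                                        "port": {"number": service_port},
                                    }
                                },
                            }
                        ]
                    },
                }
            ],
        },
    }


# Hypothetical values, chosen only for illustration.
ingress = ingress_manifest(
    name="demo-ingress",
    host="demo.example.com",
    service_name="demo-app-service",
    service_port=8080,
    tls_secret="demo-tls",
)
print(json.dumps(ingress, indent=2))
```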

Minikube

Minikube is a lightweight open source tool that runs a Kubernetes cluster on a local machine, making it well suited to local testing. Whereas a production Kubernetes deployment would likely have multiple master and worker nodes, minikube runs both the master and worker processes on a single node, for example inside a VirtualBox virtual machine. This keeps resource requirements low, so that memory and CPU do not become a bottleneck when running the deployment locally.
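As a minimal sketch, the command to start minikube with explicit resource limits can be assembled as below. The `minikube_start_command` helper is hypothetical, and the driver, CPU and memory values are assumptions to adjust for the local machine. The command is only built and printed here, not executed.

```python
import shlex


def minikube_start_command(driver: str, cpus: int, memory_mb: int) -> list:
    """Assemble the arguments for starting a local single-node cluster."""
    return [
        "minikube",
        "start",
        f"--driver={driver}",     # where the node runs, e.g. a VirtualBox VM
        f"--cpus={cpus}",         # CPUs allocated to the node
        f"--memory={memory_mb}",  # memory allocated to the node, in MB
    ]


# Assumed values, chosen only for illustration.
command = minikube_start_command("virtualbox", cpus=2, memory_mb=4096)
print(shlex.join(command))
```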

Kubectl

The command line tool kubectl is provided for interacting with the Kubernetes cluster, for example to create and configure pods and services. Commands are sent to the API Server, which is one of the Kubernetes master processes and the entry point into the cluster. The commands are then executed by the worker processes on minikube.
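A few typical kubectl invocations could be wrapped as shown below. The `kubectl` and `run` helpers and the `deployment.json` file name are hypothetical. By default the sketch performs a dry run that only returns the command text, so it can be tried without a cluster. Passing `dry_run=False` assumes kubectl is installed and configured against a running cluster.

```python
import shlex
import subprocess


def kubectl(*args: str) -> list:
    """Build a kubectl command line as a list of arguments."""
    return ["kubectl", *args]


def run(command: list, dry_run: bool = True) -> str:
    """Return the command text, or execute it and return its output."""
    if dry_run:
        return shlex.join(command)
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout


commands = [
    kubectl("apply", "-f", "deployment.json"),  # create or update resources from a file
    kubectl("get", "pods"),                     # list the pods and their status
    kubectl("get", "services"),                 # list the services and their addresses
]

for command in commands:
    print(run(command))
```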

Summary

In this first part the key concepts and components of Kubernetes, minikube and kubectl have been outlined. This provides the foundation for the following parts in this series, which walk through deploying Kafka, Zookeeper and a Spring Boot application to Kubernetes.

Source Code

The source code for the accompanying Spring Boot demo application is available here:

https://github.com/lydtechconsulting/kafka-kubernetes-demo/tree/v1.0.0

More Articles in the Series

  • Kafka on Kubernetes - Part 2: Deploying Kafka: Walks through deploying Kafka and Zookeeper to Kubernetes, and explains the kubectl commands used to query the state of the deployment. Steps through sending and receiving events to the deployed Kafka instance using the Kafka command line tools.
  • Kafka on Kubernetes - Part 3: Spring Boot Demo: Walks through deploying a Spring Boot application to Kubernetes. The application connects to the deployed Kafka to consume and produce events. It provides a REST API enabling a client to trigger sending events to Kafka, and the steps to expose this to an external source are described.
