Understanding StatefulSets in Kubernetes

Much of the work involved in running a clustered environment of any type comes down to deploying and scaling consistently. In Kubernetes, one way to manage these actions is with StatefulSets. Here, we take a closer look at using this API object to manage stateful applications, and then walk through an example of a StatefulSet in use.

StatefulSets for applications

In Kubernetes, scaling is accomplished by managing a set of Pods. To do this in a standardized way, a resource is needed at the workload level. One common workload resource is the Deployment, which manages ReplicaSets. While this works well for stateless applications, it is a poor fit for situations where data must persist as the application is scaled or its Pods are rescheduled.

StatefulSets are intended for stateful applications in a Kubernetes cluster. A StatefulSet orchestrates replicated Pods based on the same container specification, but each Pod keeps an identity that stays with it through rescheduling. Because of this persistent identifier, the Pods are no longer interchangeable. In return, a replacement Pod can be matched to the existing volume of the Pod it replaces, which allows the state to persist.
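To make the idea of a persistent identity concrete, the following shell sketch prints the names a StatefulSet would give its Pods and their volume claims. It only formats strings and does not contact a cluster. The StatefulSet name web and the claim template name www are assumptions chosen for illustration; the naming pattern (a Pod is named after the StatefulSet plus an ordinal, and a claim is named after the claim template plus the Pod name) follows the Kubernetes documentation.

```shell
statefulset="web"   # hypothetical StatefulSet name
claim="www"         # hypothetical volumeClaimTemplate name

# Pod name: <statefulset>-<ordinal>
pod_name() { echo "$statefulset-$1"; }

# PersistentVolumeClaim name: <claim template>-<pod name>
pvc_name() { echo "$claim-$(pod_name "$1")"; }

# Print the Pod-to-claim pairing for the first two ordinals.
for ordinal in 0 1; do
  echo "$(pod_name "$ordinal") -> $(pvc_name "$ordinal")"
done
# prints:
# web-0 -> www-web-0
# web-1 -> www-web-1
```

Because the names are derived from the ordinal, a Pod that is rescheduled comes back under the same name and is matched to the same claim.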

Stability and persistence

As discussed, if your application does not need to take state into consideration, a ReplicaSet (usually managed through a Deployment) may be the better option for simplicity. Otherwise, StatefulSets help ensure that a number of requirements can be met for your stateful app. They provide a more graceful method of deployment and scaling, and they keep data available to each Pod through its unique identity.

A number of the features intrinsic to StatefulSets are guarantees around deployment and scaling. These guarantees contribute to the stability of an application running under this method of Pod management. There are several things to know about them:

For a StatefulSet with N replicas, when Pods are being deployed, they are created sequentially, in order from {0..N-1}.
When Pods are being deleted, they are terminated in reverse order, from {N-1..0}.
Before a scaling operation is applied to a Pod, all of its predecessors must be Running and Ready.
Before a Pod is terminated, all of its successors must be completely shut down.

Courtesy kubernetes.io Concepts Documentation
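The ordering rules above can be sketched with a few lines of shell. This is an illustration only: it prints Pod names in the order the controller would act on them and does not talk to a cluster. The StatefulSet name web and the replica count of 3 are assumptions.

```shell
name="web"    # hypothetical StatefulSet name
replicas=3    # hypothetical replica count N

# Pods are created in ascending ordinal order: 0 .. N-1.
creation_order() {
  i=0
  while [ "$i" -lt "$replicas" ]; do
    echo "$name-$i"
    i=$((i + 1))
  done
}

# Pods are terminated in descending ordinal order: N-1 .. 0.
termination_order() {
  i=$((replicas - 1))
  while [ "$i" -ge 0 ]; do
    echo "$name-$i"
    i=$((i - 1))
  done
}

creation_order | xargs echo "create:"        # prints: create: web-0 web-1 web-2
termination_order | xargs echo "terminate:"  # prints: terminate: web-2 web-1 web-0
```

In a real cluster the controller also waits between steps: web-1 is not started until web-0 is Running and Ready, and web-1 is not terminated until web-2 has completely shut down.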

This type of Pod management (OrderedReady) is the default when using StatefulSets. Alternatively, Parallel Pod management allows the controller to launch Pods without waiting for earlier ones to be Running and Ready, and to terminate Pods in parallel when scaling down. In other words, it skips the normal routine of waiting for predecessors to finish starting up, or for successors to finish shutting down, during deployment and scaling.
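As a rough sketch of how the Parallel policy is selected, the snippet below writes a minimal StatefulSet manifest with the podManagementPolicy field set. The file name web-parallel.yaml, the object name, the labels, and the nginx:stable image are all assumptions made for illustration, and the manifest assumes that a headless Service named nginx already exists.

```shell
# Write a hypothetical manifest that opts in to Parallel Pod management.
cat > web-parallel.yaml <<'EOF'
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web-parallel              # hypothetical name
spec:
  serviceName: nginx              # assumes a headless Service with this name
  replicas: 3
  podManagementPolicy: Parallel   # the default is OrderedReady
  selector:
    matchLabels:
      app: nginx-parallel
  template:
    metadata:
      labels:
        app: nginx-parallel
    spec:
      containers:
      - name: nginx
        image: nginx:stable       # assumed image tag
        ports:
        - containerPort: 80
EOF
```

Leaving the podManagementPolicy field out gives the default OrderedReady behavior.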

Application example using nginx

Taking a cue from the documentation available at kubernetes.io, we will create a web service using nginx. During this process, we will use kubectl commands that show the state of our Pods throughout their lifecycle.

The OS cross-compatibility of minikube makes it handy for demonstrating StatefulSets. We will use it to emulate a Kubernetes cluster and set up a stateful application example. This example installs minikube on Windows with choco, the CLI of the Chocolatey Package Manager. If you have not had the chance to use Chocolatey yet, it is worth a look.

Users of other operating systems can find installation instructions for their systems in the minikube documentation.

From an administrative command line:

choco install minikube

After the installation, it is a good idea to close and re-open your command shell. The next command will prepare and start the cluster.

minikube start

Look at this YAML example. It is meant to show one method of creating a headless Service, together with a StatefulSet, using nginx. Save this file as web.yaml for use in the upcoming steps:
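A minimal sketch of such a file, modeled on the StatefulSet example in the kubernetes.io tutorials rather than copied from it, can be written from the shell as shown below. The object names (nginx for the Service, web for the StatefulSet), the nginx:stable image, the mount path, and the 1Gi storage request are assumptions; the app=nginx label and the two replicas match the commands used in the following steps.

```shell
# Write web.yaml: a headless Service plus a two-replica StatefulSet.
cat > web.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  clusterIP: None            # "None" makes the Service headless
  selector:
    app: nginx
  ports:
  - port: 80
    name: web
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: nginx         # the headless Service defined above
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:stable  # assumed image tag
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:      # one PersistentVolumeClaim per Pod
  - metadata:
      name: www
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
EOF
```

The volumeClaimTemplates section is what gives each Pod its own volume, so the data stays tied to the Pod's identity when the Pod is rescheduled.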


So that we can watch the creation of the Pods, we open two command line windows. In the first window we run the following:

kubectl get pods -w -l app=nginx

In the second window, we create the Service and the StatefulSet from the web.yaml file saved above:

kubectl apply -f web.yaml

Now, in the first window, we can see the two instances of the application as they are instantiated.

We can also use kubectl to get additional information about our running containers, for example with kubectl describe pods -l app=nginx or kubectl get pvc.

Even from this short example, it is easy to see how much manual toil is required to monitor a real-world application. By using the various tools for Kubernetes management, you begin to surface these complexities. Technology for cloud and hybrid cloud management is moving toward a self-healing model, driven by the metrics available from the underlying infrastructure.

Important metrics include but should not be limited to:

Container Image Health
Job Failures
Control Plane Failures
Resource-related Latency
Node & Pod Load

The first step in a successful Kubernetes implementation is being able to look at the big picture. Putting it all together, it is clear that having a way to visualize applications in a real-world scenario is critical.

