Have you ever logged into an email account or your banking app? Sure you have! You probably do it every day. When you do, you’re interacting with a stateful application: one that stores data it needs again and again.
The question for this blog post is: can you run applications like that inside of Kubernetes?
Let’s figure it out together!
What Are StatefulSets
As mentioned in the opening paragraph of this blog post, a stateful application keeps data that you can come back to again and again. An example of that is email. Every time you log in, you return to the same data the application saved for you. If your email account wasn’t stateful, all of your emails and everything else (signatures, drafts, etc.) would be gone.
In Kubernetes, StatefulSets are trying to solve the problem of your data going away when a Kubernetes Pod is destroyed. In short, a StatefulSet is an API object in Kubernetes that handles stateful applications.
A StatefulSet object does that by providing:
- Unique network identifiers that stick with Pods as they come up and go down
- Persistent storage
- Graceful scaling. By default, a new replica won’t come up until the previous replica is up and running
- Rolling updates
Essentially, Kubernetes gives every Pod in a StatefulSet a unique, stable identity. When a Pod goes down and a replacement comes up, the replacement has the same network ID and storage, which helps the application know where its data lives.
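As a rough sketch, the four behaviors listed above map onto fields of the StatefulSet spec. The fragment below is not a complete manifest (the required selector and Pod template are left out to keep it short), and the names example, example-headless, and data are placeholders made up for illustration.

```yaml
# Fragment only: selector and template are omitted on purpose.
# "example", "example-headless", and "data" are hypothetical names.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: example
spec:
  serviceName: example-headless      # Service that gives each Pod a stable DNS name
  replicas: 3
  podManagementPolicy: OrderedReady  # the default: Pods start one at a time, in ordinal order
  updateStrategy:
    type: RollingUpdate              # the default: Pods are replaced one at a time
  volumeClaimTemplates:              # one PersistentVolumeClaim is created per Pod
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```

Because OrderedReady and RollingUpdate are the defaults, you would normally only write those two fields out when you want to change them.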
Stateful vs Stateless
There are, more or less, two types of applications:
- Stateful apps
- Stateless apps
You’ve already learned about stateful apps, so let’s talk about stateless apps.
When a stateless application is deployed, it’s essentially always being deployed from scratch. There’s no data attached to it. No running history of the application. No persistent storage is deployed with it.
A good example of a stateless interaction is searching for something on Google. Let’s say you open up Google and you search “how to make bread”. If you accidentally close the web browser, you simply open up a new one, go back to Google, and search “how to make bread” again. You start the process over again. That’s what stateless applications do. When a new stateless app is started and comes up, it’s starting from scratch.
Example StatefulSet Breakdown (Deployment Spec vs StatefulSet Spec)
Now that you know what stateless and stateful are, let’s take a look at a few examples via Kubernetes manifests.
The first example is of a stateless application.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment
    spec:
      selector:
        matchLabels:
          app: nginx
      replicas: 2
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:latest
            ports:
            - containerPort: 80
The Kubernetes manifest above deploys an Nginx application. It uses the nginx container image with the latest tag and exposes container port 80. Notice how there’s no storage or anything of the sort. When this application runs, it runs. When it goes away, it goes away. When it runs again, it starts fresh.
Notice how there’s a Deployment spec. You’ll see it on the second line: kind: Deployment. Keep this in mind for a minute.
Now let’s look at a stateful application.
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: web
    spec:
      serviceName: "nginx"
      replicas: 2
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:latest
            ports:
            - containerPort: 80
              name: web
            volumeMounts:
            - name: www
              mountPath: /usr/share/nginx/html
      volumeClaimTemplates:
      - metadata:
          name: www
        spec:
          accessModes: [ "ReadWriteOnce" ]
          resources:
            requests:
              storage: 1Gi
The Kubernetes manifest above also deploys an Nginx application, except this time you’ll see a few new pieces, including:
- The kind is now StatefulSet
- There is a volumeClaimTemplates section that requests a 1Gi volume for each Pod
- A volumeMounts entry mounts that volume at /usr/share/nginx/html to store the data
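One thing the StatefulSet manifest above takes for granted is serviceName: "nginx". That field assumes a headless Service with that name exists in the same namespace, because that Service is what gives the Pods their stable network identity. The post doesn’t show one, so here is a minimal sketch of what it could look like. The labels and port name are assumptions chosen to line up with the manifest above.

```yaml
# Headless Service assumed by serviceName: "nginx" in the StatefulSet.
apiVersion: v1
kind: Service
metadata:
  name: nginx        # must match serviceName in the StatefulSet
  labels:
    app: nginx
spec:
  clusterIP: None    # "None" is what makes the Service headless
  selector:
    app: nginx       # matches the Pod labels in the StatefulSet template
  ports:
    - port: 80
      name: web
```

With both objects applied, the two Pods would be named web-0 and web-1, and each would be reachable inside the cluster at a stable DNS name such as web-0.nginx.<namespace>.svc.cluster.local. Their volume claims would be named www-web-0 and www-web-1. Those names stay the same when a Pod is replaced.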
Let’s talk about the differences between the Deployment spec and the StatefulSet spec.
Deployment Spec vs StatefulSet Spec
Deployment specs and StatefulSet specs are quite similar. Both offer self-healing, scaling, and replica counts. The key difference between the two is that a StatefulSet spec maintains a sticky identity for each Pod. Remember learning earlier that with StatefulSets, Pods keep unique network IDs? This is what was being referred to.
You can still create volume mounts and store data with a stateless application that uses a Deployment spec. The key difference to remember is that a StatefulSet gives each Pod its own stable, unique network ID (and, through volumeClaimTemplates, its own volume claim), while the Pods in a Deployment are interchangeable.
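To make the Deployment side of that comparison concrete, here is a minimal sketch of a Deployment that mounts persistent storage. The names nginx-content and nginx-with-volume are hypothetical, and the sketch assumes the cluster has a default StorageClass that can provision the claim.

```yaml
# A standalone claim, created once and referenced by name.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nginx-content          # hypothetical claim name
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-with-volume      # hypothetical Deployment name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-with-volume
  template:
    metadata:
      labels:
        app: nginx-with-volume
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          ports:
            - containerPort: 80
          volumeMounts:
            - name: www
              mountPath: /usr/share/nginx/html
      volumes:
        - name: www
          persistentVolumeClaim:
            claimName: nginx-content   # every replica would mount this same claim
```

The data survives Pod restarts here too, but notice the trade-off: all replicas point at one shared claim, and the Pods get random name suffixes rather than stable ones. With a ReadWriteOnce volume, scaling past one replica may also leave Pods on other nodes unable to mount it, which is why the sketch keeps replicas at 1.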
StatefulSets For Production?
To answer this question, there are two ways you have to think about it:
- From a technical perspective
- From a user perspective
From a technical perspective, yes, StatefulSets are absolutely ready for production. The StatefulSet API is served from the stable apps/v1 API group (it has been generally available since Kubernetes 1.9), so it’s not in beta or anything like that, which means the API has been rigorously tested and works as expected. Pods keep their network IDs, so stateful apps remain stateful. Storing application data via Kubernetes Volumes works well for most scenarios. So yes, from a technical perspective, StatefulSets are ready for production.
From a user perspective, some folks still believe that StatefulSets aren’t ready for production. I could be wrong here, but the arguments that I hear about this topic stem a lot from the fact that, by definition, containers are meant to be ephemeral. They’re supposed to come up, go down, and that’s that. They aren’t supposed to store stateful data. That’s true of containers, but because StatefulSets keep each Pod’s network ID and use volumes, the containers inside of a Pod can go down all day, every day, and the application will still be stateful.