Have you ever grown tired of running the same kubectl commands again and again? Well, the good folks over at the Kubernetes team understand you. With the addition of custom resources and the Operator Pattern, you can now use extensions to the Kubernetes API, or addons as I like to call them, that help you manage applications and components.
Operators follow Kubernetes principles, including the control loop. The Operator Pattern sets out to help DevOps teams manage a service or set of services by automating repeatable tasks.
This article will show you the pros and cons of using the Operator Pattern versus StatefulSets, which I covered in our previous tutorial about Running and Deploying Elasticsearch on Kubernetes. It will also guide you through installing and running the Elasticsearch Operator on a Kubernetes cluster. I'll also explain how to quickly set up basic monitoring with the Sematext Elasticsearch monitoring integration, and you can peek at the Kubernetes monitoring integration on your own.
Keep in mind, there are no silver bullets. Both solutions are valid, but are useful for different scenarios. At Sematext we're using the StatefulSet approach, and it's working great for us.
The Elasticsearch Operator I'll be using in this tutorial is the official Operator from Elastic. It automates the deployment, provisioning, management, and orchestration of Elasticsearch on Kubernetes.
With that out of the way, let's jump into the tutorial!
What are Kubernetes Operators?
Operators are extensions to Kubernetes that use custom resources to manage applications. By using the CustomResourceDefinition (CRD) API resource, you can define custom resources. In this tutorial you'll learn how to create a custom resource in a separate namespace.
When you define a CRD object, it creates a new custom resource with a name and schema that you specify. What's so cool about this? Well, you don't have to write a custom configuration to handle the custom resource. The Kubernetes API does it all for you. It serves and handles the storage of your custom resource.
The point of using the Operator Pattern is to help you, the DevOps engineer, automate repeatable tasks. It captures how you can write code to automate a task beyond what Kubernetes itself provides.
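To make the control loop idea more concrete, here's a toy sketch of a single reconcile step in plain shell. It's only an illustration: the `reconcile` function and its messages are made up for this example, and a real Operator does this inside a controller program that watches the Kubernetes API rather than in a shell script.

```shell
# Toy reconcile step: compare the desired state with the observed state
# and print the action a controller would take to close the gap.
# (Hypothetical helper for illustration, not part of any Operator.)
reconcile() {
  desired="$1"
  actual="$2"
  if [ "$actual" -lt "$desired" ]; then
    echo "create $((desired - actual)) pod(s)"
  elif [ "$actual" -gt "$desired" ]; then
    echo "delete $((actual - desired)) pod(s)"
  else
    echo "nothing to do"
  fi
}

# A controller repeats this step every time the state changes.
reconcile 3 1   # desired: 3 Pods, observed: 1 Pod
reconcile 3 3   # desired and observed match
```

The point is that you only declare the desired state. The loop keeps comparing it with what's actually running and acts on the difference.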
You deploy an Operator by adding the Custom Resource Definition and Controller to your cluster. The Controller will normally run outside of the control plane, much as you would run any containerized application. More about that a bit further down. Let me explain what the Elasticsearch Operator is first.
What is the Elasticsearch Operator?
The Elasticsearch Operator automates the process of managing Elasticsearch on Kubernetes.
There are a few different Elasticsearch Operators you can choose from. Some of them are made by active open-source contributors, but only one is written and maintained by Elastic.
I won't go into details about any of them except for the official ECK Operator built by Elastic. For the rest of this tutorial, I'll demo how to manage and run this particular Elasticsearch Operator.
ECK simplifies deploying the whole Elastic stack on Kubernetes, giving you tools to automate and streamline critical operations. You can add, remove, and update resources with ease. Like playing with Lego bricks, changing things around is incredibly simple. It also makes it much easier to handle operational and cluster administration tasks. Here's what it streamlines:
- Managing multiple clusters
- Upgrading versions
- Scaling cluster capacity
- Changing cluster configuration
- Dynamically scaling storage
- Scheduling backups
Why Use the Elasticsearch Operator: Pros and Cons?
When I first learned about the Operator Pattern, I had an overwhelming feeling of hype. I wanted it to be better than the "old" way. I was hoping the added automation would make managing and deploying applications on Kubernetes much easier. I was literally hoping it would be the same breakthrough as Helm.
In the end, it's not. Well, at least not yet. If you compare the stars of the most popular Helm charts that configure Elasticsearch StatefulSets versus the official Elasticsearch Operator, they're neck-and-neck. We still seem to be a bit conflicted about what to use.
Elasticsearch Operator vs. StatefulSet
The Elasticsearch Operator essentially creates an additional namespace that houses tools to automate the process of creating Elasticsearch resources in your default namespace. It's literally an addon you add to your Kubernetes system to handle Elasticsearch-specific resources.
This gives you more automation but also abstracts away things you might need more fine-tuned control over. Configuring your own StatefulSets can often be the better approach because this is the way the community is used to configuring Elasticsearch clusters. It also gives you more control.
However, the Operator can do things that are not available with StatefulSets alone. It uses Kubernetes resources in the background to automate your work with some additional features:
- S3 snapshots of indexes
- Automatic TLS - the operator automatically generates secrets
- Spread loads across zones
- Support for Kibana and Cerebro
- Instrumentation with statsd
- Secure by default, with encryption enabled and password protected
- Official Operator maintained by Elastic
Why Use the Elasticsearch Operator?
If you want to get up and running quickly, choose the Operator. You'll get all of this out of the box:
- Elasticsearch, Kibana and APM Server deployments
- TLS certificates management
- Safe Elasticsearch cluster configuration & topology changes
- Persistent volumes usage
- Custom node configuration and attributes
- Secure settings keystore updates
However, keep in mind there are downsides.
Why Stay Away From the Elasticsearch Operator?
Like with any new and exciting tool, there are a few issues. The biggest one is that it's a totally new tool you need to learn. Here are my reasons for staying away from the Operator:
- An additional tool to learn
- Additional Kubernetes resources in a separate namespace to worry about
- Additional resources create overhead
- Less fine-tuned control
Most of what the Elasticsearch Operator offers is already available with prebuilt Helm charts.
With that out of the way, let's start building something!
How to Run and Deploy the Elasticsearch Operator on Kubernetes
Installing the Elasticsearch Operator is as simple as running one command. Don't believe me? Follow along and find out for yourself.
Prerequisites
To follow along with this tutorial you’ll need a few things first:
- A Kubernetes cluster with role-based access control (RBAC) enabled.
- Enough resources available in your cluster. If you're short, scale your cluster by adding more Kubernetes Nodes. You’ll deploy a 3-Pod Elasticsearch cluster, so I’d suggest you have 3 Kubernetes Nodes with at least 4 GB of RAM and 10 GB of storage.
- The kubectl command-line tool installed on your local machine, configured to connect to your cluster. You can read more about how to install kubectl in the official documentation.
Installing the Elasticsearch Operator
This command will install custom resource definitions and the Operator with RBAC rules:
kubectl apply -f https://download.elastic.co/downloads/eck/1.0.0/all-in-one.yaml
Once you've installed the Operator, you can check the resources by running this command:
kubectl -n elastic-system get all
[Output]
NAME READY STATUS RESTARTS AGE
pod/elastic-operator-0 1/1 Running 0 18s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/elastic-webhook-server ClusterIP 10.96.52.149 <none> 443/TCP 19s
NAME READY AGE
statefulset.apps/elastic-operator 1/1 19s
As you can see, the Operator lives in the elastic-system namespace. You can monitor the logs of the Operator's StatefulSet with this command:
kubectl -n elastic-system logs -f statefulset.apps/elastic-operator
A better way of monitoring logs at the cluster level is to add the Sematext Operator to collect these logs and send them to a central location, alongside performance metrics about your Elasticsearch cluster. It’s pretty straightforward.
kubectl apply -f https://raw.githubusercontent.com/sematext/sematext-operator/master/bundle.yaml
cat <<EOF | kubectl apply -f -
apiVersion: sematext.com/v1alpha1
kind: SematextAgent
metadata:
  name: sematext-agent
spec:
  region: <"US" or "EU">
  containerToken: YOUR_CONTAINER_TOKEN
  logsToken: YOUR_LOGS_TOKEN
  infraToken: YOUR_INFRA_TOKEN
EOF
All you need are the two commands above, and you’re set to go. Next up, let's take a look at the CRDs that were created as well.
kubectl get crd
[Output]
NAME CREATED AT
apmservers.apm.k8s.elastic.co 2020-02-05T15:46:33Z
elasticsearches.elasticsearch.k8s.elastic.co 2020-02-05T15:46:33Z
kibanas.kibana.k8s.elastic.co 2020-02-05T15:46:33Z
These are the APIs you'll have access to, in order to streamline the process of creating and managing Elasticsearch resources in your Kubernetes cluster. Next up, let's deploy an Elasticsearch cluster.
Deploying the Elasticsearch Cluster
Once the Operator is installed, you'll get access to the elasticsearch.k8s.elastic.co/v1 API. Now you can spin up an Elasticsearch server in no time. Run this command to create an Elasticsearch cluster with a single node:
cat <<EOF | kubectl apply -f -
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch
spec:
  version: 7.5.2
  nodeSets:
  - name: default
    count: 1
    config:
      node.master: true
      node.data: true
      node.ingest: true
      node.store.allow_mmap: false
EOF
Give it a minute to start. You can check the cluster health during the creation process:
kubectl get elasticsearch
[Output]
NAME HEALTH NODES VERSION PHASE AGE
elasticsearch green 1 7.5.2 Ready 61s
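If you'd rather not re-run that command by hand until the cluster turns green, you could poll for it. Here's a minimal sketch. The `wait_for_green` helper is hypothetical, and it assumes the HEALTH column is backed by the `.status.health` field of the Elasticsearch resource, so double-check that against your ECK version.

```shell
# Poll an Elasticsearch resource until it reports green health.
# Usage: wait_for_green <resource-name> [max-attempts] [sleep-seconds]
# (Hypothetical helper; .status.health is an assumption about the resource.)
wait_for_green() {
  name="$1"
  attempts="${2:-30}"
  delay="${3:-5}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Read only the health field; an empty result means "not ready yet".
    health="$(kubectl get elasticsearch "$name" -o jsonpath='{.status.health}' 2>/dev/null || true)"
    if [ "$health" = "green" ]; then
      echo "cluster $name is green"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "cluster $name is not green after $attempts attempts" >&2
  return 1
}
```

Call it with `wait_for_green elasticsearch` to wait for the cluster from this tutorial. It returns a non-zero exit code if the cluster never gets there, which makes it easy to use in scripts.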
You now have a running Elasticsearch Pod, which is tied to a StatefulSet in the default namespace. Alongside this, you also have two Services you can expose to access the Pod.
kubectl get all
[Output]
NAME READY STATUS RESTARTS AGE
pod/elasticsearch-es-default-0 1/1 Running 0 2m18s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/elasticsearch-es-default ClusterIP None <none> <none> 2m18s
service/elasticsearch-es-http ClusterIP 10.96.192.180 <none> 9200/TCP 2m19s
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 2d2h
NAME READY AGE
statefulset.apps/elasticsearch-es-default 1/1 2m18s
To make sure your Pod is working, check its logs:
kubectl logs elasticsearch-es-default-0
...
If you see logs streaming in, you know it's working. Both Services are of type ClusterIP, although elasticsearch-es-default is headless and has no cluster IP of its own, and you get credentials generated automatically.
First, open up another terminal window and expose the elasticsearch-es-http service there, so you can access it from your local machine:
kubectl port-forward service/elasticsearch-es-http 9200
A default user named elastic is automatically created with the password stored in a Kubernetes secret. Back in your initial terminal window, run this command to retrieve the password:
PASSWORD=$(kubectl get secret elasticsearch-es-elastic-user -o=jsonpath='{.data.elastic}' | base64 --decode)
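In case you're wondering why the command above pipes through base64: Kubernetes stores Secret values base64-encoded, so the jsonpath output has to be decoded before you can use it. Here's a tiny round trip you can run without a cluster. The value `example-password` is made up for this example and is not a real credential.

```shell
# Encode a made-up value the way it would sit inside a Secret's data field,
# then decode it again, just like the PASSWORD command does.
encoded="$(printf '%s' 'example-password' | base64)"
echo "stored in the Secret: $encoded"

decoded="$(printf '%s' "$encoded" | base64 --decode)"
echo "after decoding:       $decoded"
```

Keep in mind that base64 is an encoding, not encryption, so treat the encoded value as carefully as the password itself.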
Use curl to test the endpoint:
curl -u "elastic:$PASSWORD" -k "https://localhost:9200"
[Output]
{
"name" : "elasticsearch-es-default-0",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "7auDvcXLTwqLmXfBcAXIqg",
"version" : {
"number" : "7.5.2",
"build_flavor" : "default",
"build_type" : "docker",
"build_hash" : "8bec50e1e0ad29dad5653712cf3bb580cd1afcdf",
"build_date" : "2020-01-15T12:11:52.313576Z",
"build_snapshot" : false,
"lucene_version" : "8.3.0",
"minimum_wire_compatibility_version" : "6.8.0",
"minimum_index_compatibility_version" : "6.0.0-beta1"
},
"tagline" : "You Know, for Search"
}
Hey presto! It works. This might be good for starters, but the cluster only has one Pod. Let's spice things up a bit and add a few more.
Upgrade and Configure the Elasticsearch Cluster
Any edits you make to the configuration will automatically upgrade the cluster. The Operator will try to apply all the configuration changes you give it, except for changes to existing volume claims, which cannot be resized. Make sure your Kubernetes cluster has enough resources to handle any resizing you do.
If you want to have 3 Pods, run this command:
cat <<EOF | kubectl apply -f -
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch
spec:
  version: 7.5.2
  nodeSets:
  - name: default
    count: 3
    config:
      node.master: true
      node.data: true
      node.ingest: true
      node.store.allow_mmap: false
EOF
This will bump up the Pod count. Check out this sample to see all the configuration options. Let's check if our Pods have updated:
kubectl get all
[Output]
NAME READY STATUS RESTARTS AGE
pod/elasticsearch-es-default-0 1/1 Running 0 25m
pod/elasticsearch-es-default-1 1/1 Running 0 3m8s
pod/elasticsearch-es-default-2 1/1 Running 0 2m46s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/elasticsearch-es-default ClusterIP None <none> <none> 25m
service/elasticsearch-es-http ClusterIP 10.96.192.180 <none> 9200/TCP 25m
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 2d2h
NAME READY AGE
statefulset.apps/elasticsearch-es-default 3/3 25m
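Three running Pods don't automatically mean three nodes in one cluster, so it can be worth asking Elasticsearch itself. Here's a small sketch that counts the nodes through the `_cat/nodes` API. The `count_es_nodes` helper is hypothetical, and it assumes the port-forward to elasticsearch-es-http is still running and that PASSWORD is set as shown earlier.

```shell
# Count the nodes Elasticsearch reports as cluster members.
# Assumes a port-forward on localhost:9200 and PASSWORD set in this shell.
# (Hypothetical helper for illustration.)
count_es_nodes() {
  # h=name prints one node name per line; grep -c . counts non-empty lines.
  curl -s -k -u "elastic:$PASSWORD" "https://localhost:9200/_cat/nodes?h=name" | grep -c . || true
}
```

Running `count_es_nodes` should print the same number you set as `count` in the nodeSet once all Pods have joined.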
Awesome! Our cluster is starting to look nice! By default, the cluster you deployed allocates only a 1 GB persistent volume for storage, using the default storage class defined for the Kubernetes cluster.
Here's a sample of what adding more storage looks like:
cat <<EOF | kubectl apply -f -
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch
spec:
  version: 7.5.2
  nodeSets:
  - name: default
    count: 3
    config:
      node.master: true
      node.data: true
      node.ingest: true
      node.store.allow_mmap: false
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 4Gi
        storageClassName: standard
EOF
You'll most likely want to have more control over this for production workloads. Check out the Volume claim templates for more information.
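To check which claims a template like this produces, it helps to know how they're named. StatefulSets name each PersistentVolumeClaim after the claim template plus the Pod name. The sketch below prints the names you'd expect, assuming the StatefulSet is called elasticsearch-es-default as in the output earlier. The `pvc_names` helper is made up for this example.

```shell
# Print the PersistentVolumeClaim names a StatefulSet creates:
# <claim-template-name>-<statefulset-name>-<ordinal>
# (Hypothetical helper for illustration.)
pvc_names() {
  claim="$1"
  statefulset="$2"
  count="$3"
  i=0
  while [ "$i" -lt "$count" ]; do
    echo "${claim}-${statefulset}-${i}"
    i=$((i + 1))
  done
}

pvc_names elasticsearch-data elasticsearch-es-default 3
```

You can compare that list with what `kubectl get pvc` shows in your cluster.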
How to Run and Deploy Kibana with the Elasticsearch Operator
This Operator is called ECK, short for Elastic Cloud on Kubernetes, for a reason. It covers more than just Elasticsearch and comes packaged with Kibana. In one of the sections above we ran this command:
kubectl get crd
[Output]
NAME CREATED AT
apmservers.apm.k8s.elastic.co 2020-02-05T15:46:33Z
elasticsearches.elasticsearch.k8s.elastic.co 2020-02-05T15:46:33Z
kibanas.kibana.k8s.elastic.co 2020-02-05T15:46:33Z
Check it out. You have a kibana.k8s.elastic.co/v1 API as well. This is what you'll use to create your Kibana instance.
Go ahead and specify a Kibana instance and reference your Elasticsearch cluster:
cat <<EOF | kubectl apply -f -
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: elasticsearch
spec:
  version: 7.5.2
  count: 1
  elasticsearchRef:
    name: elasticsearch
EOF
Give it a second to spin up the Pod. Similar to Elasticsearch, you can retrieve details about Kibana instances with this simple command:
kubectl get kibana
[Output]
NAME HEALTH NODES VERSION AGE
elasticsearch green 1 7.5.2 2m31s
Wait until the health is green, then check the Pods:
kubectl get pod --selector='kibana.k8s.elastic.co/name=elasticsearch'
[Output]
NAME READY STATUS RESTARTS AGE
elasticsearch-kb-5f568dcdb6-xd55w 1/1 Running 0 3m19s
When the Pods are up and running as well, you can go ahead and set up access to Kibana. A ClusterIP Service is automatically created for Kibana:
kubectl get service elasticsearch-kb-http
[Output]
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
elasticsearch-kb-http ClusterIP 10.96.199.44 <none> 5601/TCP 4m24s
Once again, open up another terminal window, and use kubectl port-forward to access Kibana from your local machine:
kubectl port-forward service/elasticsearch-kb-http 5601
Open https://localhost:5601 in your browser. Log in as the elastic user. Get the password with this command:
kubectl get secret elasticsearch-es-elastic-user -o=jsonpath='{.data.elastic}' | base64 --decode; echo
Once you're signed in, you'll see the Kibana quickstart screen.
There you have it. You've added a Kibana instance to your Kubernetes cluster.
Cleaning Up and Deleting the Elasticsearch Operator
With all resources installed and working, you should see this when running kubectl get all:
NAME READY STATUS RESTARTS AGE
pod/elasticsearch-es-default-0 1/1 Running 0 13m
pod/elasticsearch-kb-5f568dcdb6-xd55w 1/1 Running 0 11m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/elasticsearch-es-default ClusterIP None <none> <none> 13m
service/elasticsearch-es-http ClusterIP 10.96.168.225 <none> 9200/TCP 13m
service/elasticsearch-kb-http ClusterIP 10.96.199.44 <none> 5601/TCP 11m
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 2d3h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/elasticsearch-kb 1/1 1 1 11m
NAME DESIRED CURRENT READY AGE
replicaset.apps/elasticsearch-kb-5f568dcdb6 1 1 1 11m
NAME READY AGE
statefulset.apps/elasticsearch-es-default 1/1 13m
Way to go, you've configured an Elasticsearch cluster with Kibana using the Elasticsearch Operator! But what if you need to delete resources? Easy. Run two commands and you're done.
First, delete all Elastic resources from all namespaces:
kubectl delete elastic --all --all-namespaces
Then, delete the Operator itself:
kubectl delete -f https://download.elastic.co/downloads/eck/1.0.0/all-in-one.yaml
That's it, all clean!
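If you want to double-check the cleanup, you could count the custom resource definitions that still belong to the Elastic API groups. This is a minimal sketch with a hypothetical `remaining_elastic_crds` helper. It only relies on `kubectl get crd -o name` and matches on the `k8s.elastic.co` suffix you saw in the CRD list earlier.

```shell
# Count CRDs whose names still end in the k8s.elastic.co API groups.
# (Hypothetical helper for illustration.)
remaining_elastic_crds() {
  # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
  kubectl get crd -o name | grep -c 'k8s\.elastic\.co' || true
}
```

Running `remaining_elastic_crds` should print 0 once the Operator and its CRDs are gone.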
Final Thoughts About the Elasticsearch Operator
In this tutorial you've learned about the Kubernetes Operator pattern, and how to run and deploy the Elasticsearch Operator on a Kubernetes cluster. You've also scaled up the number of Elasticsearch Pods on the cluster, and installed Kibana.
With this knowledge on top of what you learned in part 1 of this series, you can decide whether to use a Helm chart with StatefulSets or the Elasticsearch Operator.
Why bother learning Operators?
In the last year we've witnessed a huge increase in popularity for the Operator Pattern. Right now, the official Elasticsearch Operator has the same number of stars on GitHub as the most popular Elasticsearch Helm chart. This popularity will seemingly continue to grow.
What can you do now? Contribute! Learn even more about Kubernetes, and give back to the community. These projects are open-source for a reason. Help them grow!
Hope you guys and girls enjoyed reading this as much as I enjoyed writing it. If you liked it, feel free to hit the share button so more people will see this tutorial. Until next time, be curious and have fun.