
Kubernetes Configuration: Migrating Elasticsearch from EC2 Instances to Kubernetes

In our previous article, I explored the question: Why migrate to Kubernetes? I looked at the resources needed for this migration, and you might remember that I showed you the architecture of an Elasticsearch cluster running on Kubernetes.

In this article, I will describe the essential configuration the adjoe Cloud Engineering team needed to carry out in order to migrate Elasticsearch from EC2 instances to Kubernetes. We’re talking ECK operators, Helm charts – you name it.

Let’s dive deeper into the details.

Configuring an Elasticsearch Cluster via ECK Helm Chart

Elastic offers the ECK (Elastic Cloud on Kubernetes) operator, which not only makes deploying Elasticsearch and Kibana simple; it also goes further by handling most of the mundane tasks that would otherwise require human intervention. We’re talking about tasks such as upgrading or updating the cluster and adding or removing cluster nodes. And the ECK operator does all of this for us without any downtime.

We use Terraform to manage our Kubernetes cluster, which runs on AWS EKS. The ECK operator was installed on the Kubernetes cluster using the ECK Helm chart.

resource "helm_release" "es_operator" {
 name             = "elasticsearch-operator"
 repository       = "https://helm.elastic.co"
 chart            = "eck-operator"
 create_namespace = true
 namespace        = "elastic-system"
 version          = var.eck_operator_version
}
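
The chart version is pinned through a Terraform variable rather than hardcoded, so each environment can roll the operator forward deliberately. Here’s a minimal sketch of what that variable definition could look like; the default shown is only an example, not a recommendation.

# Example only: pin to whichever ECK release you have validated.
variable "eck_operator_version" {
  type        = string
  description = "Version of the ECK operator Helm chart to install"
  default     = "2.9.0"
}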

We then created a Helm chart to deploy the following custom resource, which instructs the operator to set up an Elasticsearch cluster. This is what it looks like.

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
 annotations:
   eck.k8s.elastic.co/downward-node-labels: "topology.kubernetes.io/zone"
 name: {{ .Values.cluster_name }}
spec:
 version: {{ .Values.es_version }}
 auth:
   fileRealm:
   - secretName: secret-basic-auth
 http:
   service:
     spec:
       type: NodePort
       ports:
       - name: http
         port: 9200
         targetPort: 9200
   tls:
     selfSignedCertificate:
       disabled: true
 nodeSets:
   - name: masters
     count: {{ .Values.master_count }}
     config:
       node.attr.zone: ${ZONE}
       cluster.routing.allocation.awareness.attributes: k8s_node_name,zone
       bootstrap.memory_lock: true
       node.roles: ["master"]
       xpack.ml.enabled: true
     podTemplate:
       spec:
        # Restricts the master pods so they are only scheduled on Kubernetes nodes whose instance type (node.kubernetes.io/instance-type label) matches one of the configured values.
         affinity:
           nodeAffinity:
             requiredDuringSchedulingIgnoredDuringExecution:
               nodeSelectorTerms:
               - matchExpressions:
                 - key: node.kubernetes.io/instance-type
                   operator: In
                   values: {{- range .Values.kube_es_master_instance_type }}
                     - {{ . }}
                     {{- end }}
         containers:
           - name: elasticsearch
             env:
               - name: ZONE
                 valueFrom:
                   fieldRef:
                     fieldPath: metadata.annotations['topology.kubernetes.io/zone']
             resources:
               requests:
                 memory: {{ .Values.master_memory_request }}
                 cpu: {{ .Values.master_cpu_request }}
               limits:
                 memory: {{ .Values.master_memory_limit }}
                 cpu: {{ .Values.master_cpu_limit }}
          # Pod topology spread constraints to spread the Pods across availability zones in the Kubernetes cluster.
         topologySpreadConstraints:
           - maxSkew: {{.Values.kube_es_master_maxSkew}}
             topologyKey: topology.kubernetes.io/zone
             whenUnsatisfiable: DoNotSchedule
             labelSelector:
               matchLabels:
                 elasticsearch.k8s.elastic.co/cluster-name: {{ .Values.cluster_name }}
     volumeClaimTemplates:
       - metadata:
           name: elasticsearch-data
         spec:
           accessModes:
             - ReadWriteOnce
           resources:
             requests:
               storage: {{ .Values.master_disk_size }}
           storageClassName: {{ .Values.storage_class }}
   - name: data
     count: {{ .Values.data_count }}
     config:
       node.attr.zone: ${ZONE}
       cluster.routing.allocation.awareness.attributes: k8s_node_name,zone
       bootstrap.memory_lock: true
       node.roles: ["data"]
     podTemplate:
       spec:
       # restricts Elasticsearch nodes so they are only scheduled on Kubernetes hosts tagged with any of the specified instance types.
         affinity:
           nodeAffinity:
             requiredDuringSchedulingIgnoredDuringExecution:
               nodeSelectorTerms:
               - matchExpressions:
                 - key: node.kubernetes.io/instance-type
                   operator: In
                   values: {{- range .Values.kube_es_data_instance_type }}
                     - {{ . }}
                     {{- end }}
         containers:
           - name: elasticsearch
             env:
               - name: ZONE
                 valueFrom:
                   fieldRef:
                     fieldPath: metadata.annotations['topology.kubernetes.io/zone']
             resources:
               requests:
                 memory: {{ .Values.data_memory_request }}
                 cpu: {{ .Values.data_cpu_request }}
               limits:
                 memory: {{ .Values.data_memory_limit }}
                 cpu: {{ .Values.data_cpu_limit }}
          # Pod topology spread constraints to spread the Pods across availability zones in the Kubernetes cluster.
         topologySpreadConstraints:
           - maxSkew: {{.Values.kube_es_data_maxSkew}}
             topologyKey: topology.kubernetes.io/zone
             whenUnsatisfiable: DoNotSchedule
             labelSelector:
               matchLabels:
                 elasticsearch.k8s.elastic.co/cluster-name: {{ .Values.cluster_name }}
     volumeClaimTemplates:
       - metadata:
           name: elasticsearch-data
         spec:
           accessModes:
             - ReadWriteOnce
           resources:
             requests:
               storage: {{ .Values.data_disk_size }}
           storageClassName: {{ .Values.storage_class }}

There are two nodeSets configured in the Elasticsearch resource above: one for the master nodes, the other for the data nodes. Both sections consist of a similar configuration.

One of the most important parts of the configuration is the topology spread constraints. These are specified so that the pods are spread across all availability zones, because we didn’t want all (or most) of the pods to be scheduled in a single availability zone. We also wanted the pods to be scheduled only on certain instance types, which is expressed in both nodeSets through node affinity.
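
To make the templating above more concrete, here’s a rough sketch of the kind of values file that backs these placeholders. The keys mirror the .Values references in the chart; the numbers themselves are illustrative, not our production settings.

# Illustrative values only; adjust per environment and workload.
cluster_name: logging
es_version: 8.9.0
master_count: 3
data_count: 6
kube_es_master_instance_type:
  - m5.2xlarge
kube_es_data_instance_type:
  - m5.2xlarge
  - m5.4xlarge
kube_es_master_maxSkew: 1
kube_es_data_maxSkew: 1
master_memory_request: 8Gi
master_memory_limit: 8Gi
master_cpu_request: "2"
master_cpu_limit: "2"
data_memory_request: 28Gi
data_memory_limit: 28Gi
data_cpu_request: "4"
data_cpu_limit: "8"
master_disk_size: 50Gi
data_disk_size: 1000Gi
storage_class: gp3
# ...plus the Kibana and OAuth2 Proxy values referenced further below (kibana_url, client_id, cookie_secret, and so on).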

How Do You Configure Kibana via Helm Chart?

Then we come to Kibana. We needed not only the Kibana pod but also a proxy container to handle all incoming traffic, authenticate the user with Google SSO, and then redirect the authenticated user to Kibana. SSO is also available as a feature in Kibana; however, we use the basic Elasticsearch license, which does not include this feature. Hence we used OAuth2 Proxy for Google authentication.

Here’s the Helm chart for Kibana.

apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
 name: kibana
spec:
 version: {{ .Values.es_version }}
 http:
   service:
     spec:
       type: NodePort
       ports:
       - name: http
         port: 80
         targetPort: 3000
   tls:
     selfSignedCertificate:
       disabled: true
 count: 1
 elasticsearchRef:
   name: {{ .Values.cluster_name }}
 config:
   server.publicBaseUrl: {{ .Values.kibana_url }}
   xpack.security.authc.providers:
     anonymous.anonymous1:
       order: 0
       credentials:
         username: "xxx"
         password: {{ .Values.es_readonly_password }}
     basic.basic1:
       order: 1
 podTemplate:
   spec:
     containers:
     - name: kibana
       resources:
         requests:
           memory: {{ .Values.kibana_memory_request }}
           cpu: {{ .Values.kibana_cpu_request }}
         limits:
           memory: {{ .Values.kibana_memory_limit }}
           cpu: {{ .Values.kibana_cpu_limit }}
       volumeMounts:
       - name: elasticsearch-templates
         mountPath: /etc/elasticsearch-templates
         readOnly: true
     - name: kibana-proxy
       image: 'quay.io/oauth2-proxy/oauth2-proxy:latest'
       imagePullPolicy: IfNotPresent
       args:
         - --cookie-secret={{ .Values.cookie_secret }}
         - --client-id={{ .Values.client_id }}
         - --client-secret={{ .Values.client_secret }}
         - --upstream=http://localhost:5601
         - --email-domain=example.com
         - --footer=-
         - --http-address=http://:3000
         - --redirect-url={{ .Values.redirect_url }}
         - --custom-sign-in-logo=https://path/to/logo
       ports:
         - containerPort: 3000
           name: http
           protocol: TCP
       resources:
         requests:
           memory: {{ .Values.proxy_memory_request }}
           cpu: {{ .Values.proxy_cpu_request }}
         limits:
           memory: {{ .Values.proxy_memory_limit }}
           cpu: {{ .Values.proxy_cpu_limit }}
     volumes:
       - name: elasticsearch-templates
         configMap:
           name: ilm-and-index-templates

As you might notice in the spec, a service of type NodePort is requested. An ingress load balancer (not shown in this configuration) has been configured to route HTTP and HTTPS traffic to port 3000 of the kibana-proxy container. The .Values.{variable} fields in these configurations are placeholders for the variables that are passed in from the various environments.
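
To illustrate the routing (the actual load balancer setup isn’t part of this configuration), an ingress pointing at the Kibana proxy could look roughly like the sketch below. The Service name follows ECK’s <kibana-name>-kb-http convention; the host name and the choice of ingress controller and its annotations are assumptions, not our real setup.

# Hypothetical example: host, annotations, and ingress class depend on your environment.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: kibana
spec:
  rules:
    - host: kibana.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: kibana-kb-http   # created by ECK; service port 80 forwards to the proxy's port 3000
                port:
                  number: 80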

Automating Post-Cluster Setup Configurations

Once the charts were deployed and the cluster was up and running, there were still some configurations to carry out that would normally be done manually: index templates, index lifecycle management (ILM) policies, roles, data view/index pattern creation, and so on.

To automate these tasks, we used Kubernetes resources – that is, ConfigMaps and CronJobs. We created all the necessary API request bodies as JSON files and mounted them into the pods as volumes via a ConfigMap. A shell script that issues the API requests using these JSON files is mounted alongside them. The script is executed daily by a CronJob, as well as right after the Kibana pod is spawned.

Here’s the CronJob.

apiVersion: batch/v1
kind: CronJob
metadata:
 name: script-execution
spec:
 schedule: "0 5 * * *"
 jobTemplate:
   spec:
     template:
       spec:
         containers:
         - name: script-execution
           image: alpine/curl:latest
           imagePullPolicy: IfNotPresent
           command:
           - /bin/sh
           - -c
           - sh /etc/elasticsearch-templates/execution-script.sh
           volumeMounts:
           - name: elasticsearch-templates
             mountPath: /etc/elasticsearch-templates
             readOnly: true
         restartPolicy: OnFailure
         volumes:
           - name: elasticsearch-templates
             configMap:
               name: ilm-and-index-templates

The ConfigMap that is mounted into these volumes is itself created using Terraform.

resource "kubernetes_config_map" "ilm-and-index-templates" {
 metadata {
   name      = "ilm-and-index-templates"
   namespace = var.namespace
 }
 data = {
    "application-indices.json" = templatefile("${path.module}/templates/application-indices.json",
     {
       ENV            = var.environment
       APP_LOG_PREFIX = var.app_log_prefix
     }
   )
   "execution-script.sh" = templatefile("${path.module}/templates/execution-script.sh",
     {
       ENV             = var.environment
       CLUSTER_NAME    = var.cluster_name
       ES_USER         = var.es_user
        ES_PASSWORD     = data.kubernetes_secret.pass.data["elastic"]
      }
    )
   "delete-old-indices-policy.json" = templatefile("${path.module}/templates/delete-old-indices-policy.json",
     {
       RETENTION_AGE = var.retention_age
     }
   )
 }
}
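
The elastic user’s password referenced above is read from the secret that ECK generates for the cluster. That data source isn’t shown in the article; assuming ECK’s default <cluster-name>-es-elastic-user secret, it would look something like this.

# Assumed data source: ECK stores the built-in "elastic" user's password
# under the key "elastic" in the <cluster-name>-es-elastic-user secret.
data "kubernetes_secret" "pass" {
  metadata {
    name      = "${var.cluster_name}-es-elastic-user"
    namespace = var.namespace
  }
}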

To give you an idea, here’s the shell script, too.

#!/bin/sh

# Lifecycle policy
curl -s -XPUT "http://${ES_USER}:${ES_PASSWORD}@${CLUSTER_NAME}-es-http.elasticsearch.svc:9200/_ilm/policy/Delete_app_indices" -H 'Content-Type: application/json' -d @/etc/elasticsearch-templates/delete-old-indices-policy.json


# Index templates
curl -s -XPUT "http://${ES_USER}:${ES_PASSWORD}@${CLUSTER_NAME}-es-http.elasticsearch.svc:9200/_index_template/application-indices" -H 'Content-Type: application/json' -d @/etc/elasticsearch-templates/application-indices.json
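
The JSON templates themselves aren’t shown here, but to give a flavor: an ILM policy that simply deletes indices once they reach the configured retention age would look roughly like the sketch below, with ${RETENTION_AGE} filled in by Terraform’s templatefile. This is an assumption about the file’s contents, not a copy of our template.

{
  "policy": {
    "phases": {
      "delete": {
        "min_age": "${RETENTION_AGE}",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}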

Following on from Our Kubernetes Configuration

In the next article, I’ll discuss the lessons we learned as a team while migrating Elasticsearch from EC2 instances to Kubernetes.

  • What could we have done from the start to shorten the time it took to find a working solution?
  • Was this migration worthwhile? Or was it just a fancy way to run Elasticsearch with no added benefit over a simple installation on EC2 (or a physical node)?

Stay tuned for my final article!
