adjoe Engineers’ Blog

Kubernetes Configuration: Migrating Elasticsearch from EC2 Instances to Kubernetes

In our previous article, I explored the question: Why migrate to Kubernetes? I looked at the resources needed for this migration, and you might remember that I showed you the architecture of an Elasticsearch cluster running on Kubernetes.

In this article, I will describe the essential configuration the adjoe Cloud Engineering team needed to carry out in order to migrate Elasticsearch from EC2 instances to Kubernetes. We’re talking ECK operators, Helm charts – you name it.

Let’s dive deeper into the details.

Configuring an Elasticsearch Cluster via ECK Helm Chart

Elastic offers the ECK (Elastic Cloud on Kubernetes) operator, which not only makes deploying Elasticsearch and Kibana simple but also handles most of the mundane tasks that would otherwise require human intervention: upgrading the cluster, updating its configuration, and adding or removing nodes. The operator does all of this for us without any downtime.

We use Terraform to manage our Kubernetes cluster, which runs on AWS EKS. The ECK operator was installed on the Kubernetes cluster using the ECK Helm chart.

resource "helm_release" "es_operator" {
 name             = "elasticsearch-operator"
 repository       = "https://helm.elastic.co"
 chart            = "eck-operator"
 create_namespace = true
 namespace        = "elastic-system"
 version          = var.eck_operator_version
}

We then created a Helm chart to deploy the following custom resource, which instructs the operator to create an Elasticsearch cluster. This is what it looks like.

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
 annotations:
   eck.k8s.elastic.co/downward-node-labels: "topology.kubernetes.io/zone"
 name: {{ .Values.cluster_name }}
spec:
 version: {{ .Values.es_version }}
 auth:
   fileRealm:
   - secretName: secret-basic-auth
 http:
   service:
     spec:
       type: NodePort
       ports:
       - name: http
         port: 9200
         targetPort: 9200
   tls:
     selfSignedCertificate:
       disabled: true
 nodeSets:
   - name: masters
     count: {{ .Values.master_count }}
     config:
       node.attr.zone: ${ZONE}
       cluster.routing.allocation.awareness.attributes: k8s_node_name,zone
       bootstrap.memory_lock: true
       node.roles: ["master"]
       xpack.ml.enabled: true
     podTemplate:
       spec:
          # restricts Elasticsearch master nodes so they are only scheduled on Kubernetes hosts with one of the specified instance types
         affinity:
           nodeAffinity:
             requiredDuringSchedulingIgnoredDuringExecution:
               nodeSelectorTerms:
               - matchExpressions:
                 - key: node.kubernetes.io/instance-type
                   operator: In
                   values: {{- range .Values.kube_es_master_instance_type }}
                     - {{ . }}
                     {{- end }}
         containers:
           - name: elasticsearch
             env:
               - name: ZONE
                 valueFrom:
                   fieldRef:
                     fieldPath: metadata.annotations['topology.kubernetes.io/zone']
             resources:
               requests:
                 memory: {{ .Values.master_memory_request }}
                 cpu: {{ .Values.master_cpu_request }}
               limits:
                 memory: {{ .Values.master_memory_limit }}
                 cpu: {{ .Values.master_cpu_limit }}
           # Pod topology spread constraints to spread the Pods across availability zones in the Kubernetes cluster.
         topologySpreadConstraints:
           - maxSkew: {{.Values.kube_es_master_maxSkew}}
             topologyKey: topology.kubernetes.io/zone
             whenUnsatisfiable: DoNotSchedule
             labelSelector:
               matchLabels:
                 elasticsearch.k8s.elastic.co/cluster-name: {{ .Values.cluster_name }}
     volumeClaimTemplates:
       - metadata:
           name: elasticsearch-data
         spec:
           accessModes:
             - ReadWriteOnce
           resources:
             requests:
               storage: {{ .Values.master_disk_size }}
           storageClassName: {{ .Values.storage_class }}
   - name: data
     count: {{ .Values.data_count }}
     config:
       node.attr.zone: ${ZONE}
       cluster.routing.allocation.awareness.attributes: k8s_node_name,zone
       bootstrap.memory_lock: true
       node.roles: ["data"]
     podTemplate:
       spec:
       # restricts Elasticsearch nodes so they are only scheduled on Kubernetes hosts tagged with any of the specified instance types.
         affinity:
           nodeAffinity:
             requiredDuringSchedulingIgnoredDuringExecution:
               nodeSelectorTerms:
               - matchExpressions:
                 - key: node.kubernetes.io/instance-type
                   operator: In
                   values: {{- range .Values.kube_es_data_instance_type }}
                     - {{ . }}
                     {{- end }}
         containers:
           - name: elasticsearch
             env:
               - name: ZONE
                 valueFrom:
                   fieldRef:
                     fieldPath: metadata.annotations['topology.kubernetes.io/zone']
             resources:
               requests:
                 memory: {{ .Values.data_memory_request }}
                 cpu: {{ .Values.data_cpu_request }}
               limits:
                 memory: {{ .Values.data_memory_limit }}
                 cpu: {{ .Values.data_cpu_limit }}
       # Pod topology spread constraints to spread the Pods across availability zones in the Kubernetes cluster.
         topologySpreadConstraints:
           - maxSkew: {{.Values.kube_es_data_maxSkew}}
             topologyKey: topology.kubernetes.io/zone
             whenUnsatisfiable: DoNotSchedule
             labelSelector:
               matchLabels:
                 elasticsearch.k8s.elastic.co/cluster-name: {{ .Values.cluster_name }}
     volumeClaimTemplates:
       - metadata:
           name: elasticsearch-data
         spec:
           accessModes:
             - ReadWriteOnce
           resources:
             requests:
               storage: {{ .Values.data_disk_size }}
           storageClassName: {{ .Values.storage_class }}

How Do You Configure Kibana via a Helm Chart?

Before we get to Kibana, a quick note on the Elasticsearch resource above: it defines two nodeSets, one for the master nodes and one for the data nodes. Both sections consist of largely similar configuration.

One of the most important parts of the configuration is the topology spread constraints, which spread the pods across all availability zones; we didn't want all or most of the pods to end up in a single zone. We also wanted the pods to be scheduled only on certain instance types, which the node affinity rules in both nodeSets take care of.
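
For reference, here is a minimal sketch of what the corresponding values file could look like. The keys match the placeholders in the chart above; the concrete values are purely illustrative and not the ones we run in production.

# Example values only; adjust counts, sizes, and instance types to your environment.
cluster_name: elasticsearch
es_version: 8.9.0
master_count: 3
data_count: 6
kube_es_master_instance_type:
  - m5.2xlarge
kube_es_data_instance_type:
  - r5.2xlarge
kube_es_master_maxSkew: 1
kube_es_data_maxSkew: 1
master_memory_request: 8Gi
master_memory_limit: 8Gi
master_cpu_request: "2"
master_cpu_limit: "2"
data_memory_request: 28Gi
data_memory_limit: 28Gi
data_cpu_request: "4"
data_cpu_limit: "4"
master_disk_size: 50Gi
data_disk_size: 1Ti
storage_class: gp3   # assumes a matching EBS CSI StorageClass exists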

Then we come to Kibana. We needed not only the Kibana pod but also a proxy container to handle all incoming traffic, authenticate the user with Google SSO, and then redirect the authenticated user to Kibana. SSO is also available as a feature in Kibana; however, we use the basic Elastic license, which does not include it. Hence we used OAuth2 Proxy for Google authentication.

Here’s the Helm chart for Kibana.

apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
 name: kibana
spec:
 version: {{ .Values.es_version }}
 http:
   service:
     spec:
       type: NodePort
       ports:
       - name: http
         port: 80
         targetPort: 3000
   tls:
     selfSignedCertificate:
       disabled: true
 count: 1
 elasticsearchRef:
   name: {{ .Values.cluster_name }}
 config:
   server.publicBaseUrl: {{ .Values.kibana_url }}
   xpack.security.authc.providers:
     anonymous.anonymous1:
       order: 0
       credentials:
         username: "xxx"
         password: {{ .Values.es_readonly_password }}
     basic.basic1:
       order: 1
 podTemplate:
   spec:
     containers:
     - name: kibana
       resources:
         requests:
           memory: {{ .Values.kibana_memory_request }}
           cpu: {{ .Values.kibana_cpu_request }}
         limits:
           memory: {{ .Values.kibana_memory_limit }}
           cpu: {{ .Values.kibana_cpu_limit }}
       volumeMounts:
       - name: elasticsearch-templates
         mountPath: /etc/elasticsearch-templates
         readOnly: true
     - name: kibana-proxy
       image: 'quay.io/oauth2-proxy/oauth2-proxy:latest'
       imagePullPolicy: IfNotPresent
       args:
         - --cookie-secret={{ .Values.cookie_secret }}
         - --client-id={{ .Values.client_id }}
         - --client-secret={{ .Values.client_secret }}
         - --upstream=http://localhost:5601
         - --email-domain=example.com
         - --footer=-
         - --http-address=http://:3000
         - --redirect-url={{ .Values.redirect_url }}
         - --custom-sign-in-logo=https://path/to/logo
       ports:
         - containerPort: 3000
           name: http
           protocol: TCP
       resources:
         requests:
           memory: {{ .Values.proxy_memory_request }}
           cpu: {{ .Values.proxy_cpu_request }}
         limits:
           memory: {{ .Values.proxy_memory_limit }}
           cpu: {{ .Values.proxy_cpu_limit }}
     volumes:
       - name: elasticsearch-templates
         configMap:
           name: ilm-and-index-templates

As you might notice in the spec, a service of type NodePort is requested. An ingress load balancer (not shown in this configuration) routes HTTP and HTTPS traffic to port 3000 of the kibana-proxy container. The .Values.{variable} fields in these configurations are placeholders for variables that are passed in from the various environments.
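
To make that more concrete, here is a rough sketch of what such an ingress could look like with the AWS Load Balancer Controller. It is not part of our chart; the hostname and annotations are only examples. ECK exposes Kibana through a service named <kibana-name>-kb-http (kibana-kb-http here), whose port 80 forwards to the proxy container on port 3000.

# Hypothetical Ingress sketch; assumes the AWS Load Balancer Controller is installed.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: kibana
  annotations:
    alb.ingress.kubernetes.io/scheme: internal
    alb.ingress.kubernetes.io/target-type: instance   # works with the NodePort service
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS": 443}]'
spec:
  ingressClassName: alb
  rules:
  - host: kibana.example.com   # placeholder hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: kibana-kb-http   # service created by the ECK operator
            port:
              number: 80           # forwards to targetPort 3000 (oauth2-proxy)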

Automating Post-Cluster Setup Configurations

Once the charts were deployed and the cluster was up and running, there was still some configuration to carry out that would otherwise be done manually: index templates, index lifecycle management (ILM) policies, roles, data view (index pattern) creation, and so on.

To automate these tasks, we used Kubernetes resources, namely ConfigMaps and CronJobs. We created the bodies of all the necessary API requests as JSON files and mounted them into the pods as volumes via a ConfigMap, together with a shell script that sends the requests. The script is executed daily by a CronJob, as well as once after the Kibana pod is spawned.
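
The Kibana chart above doesn't show how the run "after the Kibana pod is spawned" is triggered. One possible way to wire it up, sketched here under the assumption that the container image provides sh and curl, and not necessarily how we do it, is a postStart lifecycle hook on the Kibana container, which already mounts the ConfigMap at /etc/elasticsearch-templates:

# Hypothetical sketch: run the mounted setup script once after the Kibana container starts.
podTemplate:
  spec:
    containers:
    - name: kibana
      lifecycle:
        postStart:
          exec:
            command:
            - /bin/sh
            - -c
            # Assumes sh and curl are available in the image (curl is used inside the script).
            - sh /etc/elasticsearch-templates/execution-script.sh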

Here’s the CronJob.

apiVersion: batch/v1
kind: CronJob
metadata:
 name: script-execution
spec:
 schedule: "0 5 * * *"
 jobTemplate:
   spec:
     template:
       spec:
         containers:
         - name: script-execution
           image: alpine/curl:latest
           imagePullPolicy: IfNotPresent
           command:
           - /bin/sh
           - -c
           - sh /etc/elasticsearch-templates/execution-script.sh
           volumeMounts:
           - name: elasticsearch-templates
             mountPath: /etc/elasticsearch-templates
             readOnly: true
         restartPolicy: OnFailure
         volumes:
           - name: elasticsearch-templates
             configMap:
               name: ilm-and-index-templates

The ConfigMap itself is created with Terraform, which renders the JSON files and the script from templates.

resource "kubernetes_config_map" "ilm-and-index-templates" {
 metadata {
   name      = "ilm-and-index-templates"
   namespace = var.namespace
 }
 data = {
    "application-indices.json" = templatefile("${path.module}/templates/application-indices.json",
     {
       ENV            = var.environment
       APP_LOG_PREFIX = var.app_log_prefix
     }
   )
   "execution-script.sh" = templatefile("${path.module}/templates/execution-script.sh",
      {
        ENV          = var.environment
        CLUSTER_NAME = var.cluster_name
        ES_USER      = var.es_user
        ES_PASSWORD  = data.kubernetes_secret.pass.data["elastic"]
      }
    )
   "delete-old-indices-policy.json" = templatefile("${path.module}/templates/delete-old-indices-policy.json",
     {
       RETENTION_AGE = var.retention_age
     }
   )
 }
}

To give you an idea, here's the shell script, too.

# Life Cycle Policy
curl -s -XPUT "http://${ES_USER}:${ES_PASSWORD}@${CLUSTER_NAME}-es-http.elasticsearch.svc:9200/_ilm/policy/Delete_app_indices" -H 'Content-Type: application/json' -d @/etc/elasticsearch-templates/delete-old-indices-policy.json


# Index templates
curl -s -XPUT "http://${ES_USER}:${ES_PASSWORD}@${CLUSTER_NAME}-es-http.elasticsearch.svc:9200/_index_template/application-indices" -H 'Content-Type: application/json' -d @/etc/elasticsearch-templates/application-indices.json
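
The template files themselves aren't shown in this article, but to give you a concrete picture, a delete-only ILM policy like the delete-old-indices-policy.json referenced above typically looks something like the following sketch. The ${RETENTION_AGE} placeholder is substituted by Terraform's templatefile; the exact contents of our file may differ.

{
  "policy": {
    "phases": {
      "delete": {
        "min_age": "${RETENTION_AGE}",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}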

Following on from Our Kubernetes Configuration

In the next article, I'll discuss the lessons we learned as a team while migrating Elasticsearch from EC2 instances to Kubernetes.

  • What could we have done initially in order to optimize the time it took to find a working solution?
  • Was this migration worthwhile? Or was it just some fancy way to run Elasticsearch without any added benefit to the simple installation on EC2 (or a physical node)?

Stay tuned for my final article!
