
7 Lessons I Learned While Migrating Elasticsearch to Kubernetes

Welcome to the final article of the series on migrating Elasticsearch to Kubernetes!

I’ll give you a quick recap. In previous articles, I discussed various details from the implementation stage, including the Elasticsearch and Kibana deployments via Helm charts. I also discussed the post-deployment cluster setup, which we automated via a ConfigMap and a CronJob.

Following on from these steps, I’ve dedicated this last article to describing the lessons we learned when we migrated Elasticsearch to Kubernetes. I’ll also discuss all the migration’s benefits and the drawbacks we encountered.

Which Main Lessons Did We Learn?

First things first, let’s go through some of the most important lessons we learned while migrating Elasticsearch to Kubernetes.

1. Planning Is Key

There are many ways to deploy Elasticsearch to Kubernetes. Our DevOps team chose the Kubernetes operator approach, installed via Helm charts, as it is one of the most scalable and maintainable ways to deploy applications on Kubernetes. We also wanted everything to be provisioned by Terraform, so we used Terraform’s helm_release resource.

After reviewing enough documentation, we deployed a simple one-node Elasticsearch cluster. Once we had validated this initial setup, we scaled to a cluster with three master nodes and multiple data nodes. Then we worked through the various tasks covered in the previous articles. The entire stage was an iterative loop: plan, execute, validate, repeat – proceeding in small increments and treating each success as a checkpoint.
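To make this concrete, here is a minimal sketch of the kind of ECK manifest involved, assuming the operator is already installed; the cluster name, version, and node counts are illustrative, not our production values.

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: logging              # hypothetical cluster name
spec:
  version: 8.6.0             # illustrative version
  nodeSets:
    # We started with a single nodeSet of count: 1 to validate the setup,
    # then split into dedicated master and data nodeSets like these.
    - name: master
      count: 3
      config:
        node.roles: ["master"]
    - name: data
      count: 4
      config:
        node.roles: ["data", "ingest"]
```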

At the migration stage, we planned to route all of our logs from the different applications toward the new Elasticsearch setup. We also retained our previous cluster, which held the older logs, and only brought it down after our minimum log retention period had passed. This ensured two things: a) we still had access to the older application logs; b) we kept the old cluster as a backup in case anything went wrong with the new one.

2. StatefulSets Are Essential

Elasticsearch is a stateful application: it needs to maintain its state across restarts. In Kubernetes, StatefulSets ensure that pods are created and destroyed in a predictable order, keep stable network identities, and retain their storage across restarts.
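You don’t write these StatefulSets by hand when using ECK – the operator generates one per nodeSet – but a trimmed-down sketch (with hypothetical names) shows the properties that matter:

```yaml
# Simplified sketch: ECK generates one StatefulSet per nodeSet, and
# pods get stable names (logging-es-data-0, -1, ...) plus stable DNS
# via the headless service, so their identity survives rescheduling.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: logging-es-data
spec:
  serviceName: logging-es-data   # headless service backing stable per-pod DNS
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch-data    # simplified; ECK uses its own label scheme
  template:
    metadata:
      labels:
        app: elasticsearch-data
    spec:
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch:8.6.0
```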

3. PersistentVolumes Are Crucial

Elasticsearch stores its data on disk, so it’s important to use PersistentVolumes to ensure that data is not lost when pods are recreated. Our team uses PersistentVolumeClaims, which request a storage resource (a PersistentVolume) so that the same volume is reattached each time a pod is created or recreated.
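With ECK, this is expressed as a volumeClaimTemplate on the nodeSet: each pod gets its own PVC, and the matching PersistentVolume is reattached whenever the pod is recreated. A sketch with an illustrative size and storage class:

```yaml
nodeSets:
  - name: data
    count: 4
    volumeClaimTemplates:
      - metadata:
          name: elasticsearch-data   # ECK expects this exact claim name
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 500Gi         # placeholder size
          storageClassName: gp3      # illustrative AWS EBS storage class
```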

4. Plan the CPU and Memory Resources Required for Master/Data Pods

Resource planning is essential, and the right numbers vary depending on the amount of logs your cluster ingests, the number of indices, and how much and how fast those indices grow. You also need to take into consideration the resources that the JVM itself needs.

If you are migrating from an existing cluster, you might already have an idea of the resources needed and can set the pod resources accordingly. Still, it can take a few iterations of configuring and monitoring to settle on the final resource allocation.
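In ECK, these settings live in the nodeSet’s podTemplate. A sketch with placeholder numbers – tune them against your monitoring, and keep the JVM heap at roughly half the container’s memory limit:

```yaml
nodeSets:
  - name: data
    count: 4
    podTemplate:
      spec:
        containers:
          - name: elasticsearch
            env:
              - name: ES_JAVA_OPTS
                value: "-Xms4g -Xmx4g"   # JVM heap; ~50% of the memory limit
            resources:
              requests:
                cpu: "2"
                memory: 8Gi
              limits:
                memory: 8Gi
```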

5. Monitoring and Logging Are Important

Monitoring and logging are essential for troubleshooting and debugging issues. In Kubernetes, you can use tools like Prometheus and Fluentd to monitor and log your Elasticsearch cluster.
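As one hedged example (an assumption on our part, not a description of our exact setup): if your Prometheus scrape configuration honors the common prometheus.io/* annotation convention, you can advertise a metrics exporter sidecar through the nodeSet’s podTemplate:

```yaml
podTemplate:
  metadata:
    annotations:
      prometheus.io/scrape: "true"
      prometheus.io/port: "9114"     # default port of the community elasticsearch_exporter
      prometheus.io/path: "/metrics"
```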

6. Test Thoroughly

Before deploying to production, it’s important that you thoroughly test the Elasticsearch cluster in a development environment to ensure that it’s configured correctly and that there are no issues.

7. Specify Annotations

If you want AWS to manage the load balancer in front of the cluster, you need to create an Ingress object. Make sure the load balancer is private by specifying the annotation:

alb.ingress.kubernetes.io/scheme: internal
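Putting it together, here is a minimal Ingress sketch for the AWS Load Balancer Controller; the hostname, service name, and port are illustrative and assume ECK’s default Kibana service naming:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: kibana
  annotations:
    alb.ingress.kubernetes.io/scheme: internal   # private ALB, not internet-facing
    alb.ingress.kubernetes.io/target-type: ip
spec:
  ingressClassName: alb
  rules:
    - host: kibana.internal.example.com          # hypothetical internal hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: kibana-kb-http             # ECK names the service <name>-kb-http
                port:
                  number: 5601
```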

What Were the Pros of the Migration?

You can deploy Elasticsearch on Kubernetes in different ways; popular options include plain Helm charts, an operator, or custom controllers. Make sure you choose the solution that works best for your use case.

As you might remember, we wanted to migrate to Kubernetes to reduce hosting costs and ensure convenience and security for our developers here at adjoe. So, with this in mind, these are some of the pros we identified as a team when we migrated to Kubernetes.

  • The ECK operator brought us mostly benefits and automated various mundane tasks. For example: when you decrease the number of data nodes in a nodeSet, the ECK operator relocates all the shards away from the pods at the top of the StatefulSet’s stack and then terminates those pods. Without the ECK operator, removing nodes from the cluster would require draining them manually: telling the cluster to exclude the nodes from shard allocation, waiting for their shards to be reallocated to other nodes, and only then marking the nodes as safe to remove (see the sketch after this list).

  • We previously needed to upgrade Elasticsearch versions manually by bringing the nodes down one by one and upgrading each of them, which took considerable time and effort. The Elasticsearch StatefulSet, by contrast, performs a rolling update of the pods incrementally, and the operator sitting above the StatefulSet ensures the cluster is in a green state before proceeding to the next pod.

  • The ECK operator incorporates most of the security best practices by default into the cluster.

  • Kubernetes is a better platform for automating various tasks with the help of ConfigMaps and CronJobs.
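To illustrate the first point: with ECK, scaling data nodes down is just a change to the nodeSet’s count (continuing the hypothetical manifest from earlier), and the operator takes care of the shard relocation described above.

```yaml
nodeSets:
  - name: data
    count: 3   # was 4; ECK drains shards off logging-es-data-3 before deleting the pod
```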

What Were the Cons of the Migration?

Kubernetes adds an extra layer of abstraction between the Elasticsearch cluster and the underlying infrastructure. Kubernetes events do soften this by giving us more insight into issues, such as when a pod is not getting scheduled.

However, events in EKS have a retention period of only 60 minutes and are lost unless saved to some other location. Also, if a pod is crash-looping because of an application error, we might not be able to extract the logs of the crashed container if Kubernetes is not connected to a central logging system.

What to Know When Migrating to Kubernetes

In conclusion, migrating Elasticsearch to Kubernetes can offer various benefits, but it is not without its challenges. By planning carefully, using StatefulSets and PersistentVolumes, monitoring and logging, and thoroughly testing the Elasticsearch cluster, you can ensure a successful migration and reduce hosting costs like our DevOps team did.
