Zero Downtime Server Updates For Your Kubernetes Cluster
At some point during the lifetime of your Kubernetes cluster, you will need to perform maintenance on the underlying nodes. This may include package updates, kernel upgrades, or deploying new VM images. This is considered a "Voluntary Disruption" in Kubernetes.
This is part 1 of a 4-part blog series:
- This post
- Gracefully shutting down Pods
- Delaying Shutdown to Wait for Pod Deletion Propagation
- Avoiding Outages with PodDisruptionBudgets
In this series, we will be covering all the tools that Kubernetes provides to achieve a zero downtime update for the underlying worker nodes in your cluster.
Stating the problem
We will start with a naive approach, identify challenges and potential risks of the approach, and incrementally build up to solve each problem that we identify throughout the series. We will finish with a config that leverages lifecycle hooks, readiness probes, and Pod disruption budgets to achieve our zero downtime rollout.
To start our journey, let's look at a concrete example. Suppose we have a two-node Kubernetes cluster running an application with two Pods backing a Service resource:

We want to upgrade the kernel version of the two underlying worker nodes in our cluster. How will we do this? A naive approach would be to launch new nodes with the updated config and then shut down the old nodes once the new ones are up. While this works, there are a few issues with this approach:
- When you shut down the old nodes, you take down the running pods with them. What if the pods need to clean up for a graceful shutdown? The underlying VM technology might not wait for the cleanup process.
- What if you shut down all the nodes at the same time? You could have a brief outage while the pods are relaunched onto the new nodes.
What we want is a way to gracefully migrate the pods off of the old nodes, ensuring that none of our workloads are running while we make changes to the node. Or, if we are doing a full replacement of the cluster as in the example (e.g., replacing VM images), we want to move the workloads off of the old nodes onto the new nodes. In both cases, we want to prevent new pods from being scheduled on the old node and then evict all the running pods off of it. We can use the kubectl drain command to achieve this.
Rescheduling Pods off of a node
The drain operation achieves the goal of rescheduling all the Pods off of the node. During the drain operation, the node is marked as unschedulable (via the NoSchedule taint). This prevents new pods from being scheduled on the node. Afterwards, the drain operation starts evicting the pods from the node, shutting down the containers currently running on it by sending the TERM signal to the pods' underlying containers.
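To make this concrete, here is what draining one of our example nodes might look like. Note that node-1 is a placeholder name, and the exact flag set is an assumption that can vary between kubectl versions (for example, --delete-emptydir-data was called --delete-local-data in older releases):

```bash
# Mark the node unschedulable and evict its pods.
#   --ignore-daemonsets:    DaemonSet-managed pods cannot be rescheduled, so skip them.
#   --delete-emptydir-data: also evict pods using emptyDir volumes (their data is lost).
#   --grace-period:         seconds each pod gets to shut down before being force-killed.
# "node-1" is a placeholder; find your node names with `kubectl get nodes`.
kubectl drain node-1 --ignore-daemonsets --delete-emptydir-data --grace-period=60

# Once maintenance on the node is complete, mark it schedulable again.
kubectl uncordon node-1
```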
Although kubectl drain will gracefully handle pod eviction, there are still two factors that could cause service disruption during the drain operation:
- Your application needs to be able to gracefully handle the TERM signal. When a Pod is evicted, Kubernetes sends the TERM signal to its containers and then waits a configurable amount of time for the containers to shut down before forcefully terminating them (see the sketch after this list). However, if your containers do not handle the signal gracefully, you could still shut down pods uncleanly while they are in the middle of doing work (e.g., committing a database transaction).
- You lose all the pods servicing your application. Your service could experience downtime while the new containers are started on the new nodes, or, if you did not deploy your pods with controllers, they may never restart at all.
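For reference, that configurable wait is the Pod's terminationGracePeriodSeconds, which defaults to 30 seconds. A minimal sketch of where it fits in a Pod template (the 60-second value is just an illustrative assumption):

```yaml
# Fragment of a Pod template: give containers up to 60 seconds to exit
# cleanly after receiving TERM before they are forcefully killed.
spec:
  terminationGracePeriodSeconds: 60  # defaults to 30 if omitted
  containers:
  - name: nginx
    image: nginx:1.15
```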
Avoiding outages
To minimize downtime from a voluntary disruption like draining a node, Kubernetes provides the following disruption handling features:
- Lifecycle hooks, for running cleanup logic before a container is shut down
- Readiness probes, for controlling when a Pod should receive traffic from a Service
- PodDisruptionBudgets, for limiting how many Pods of an application can be down simultaneously during voluntary disruptions
In the rest of the series, we will use these features of Kubernetes to mitigate service disruption from an eviction event. To make it easier to follow along, we will use our example above with the following resource config:
```yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.15
        ports:
        - containerPort: 80
---
kind: Service
apiVersion: v1
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    targetPort: 80
    port: 80
```
This config is a minimal example of a Deployment resource that manages multiple nginx pods (in our case, two). This resource will work towards maintaining two nginx pods in the cluster. Additionally, the config will provision a Service resource that can be used to access the nginx pods within the cluster.
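If you want to follow along, you can apply this config to a cluster and confirm that both replicas come up. The file name nginx.yaml here is just an assumed name for illustration:

```bash
# Save the config above to a file (nginx.yaml is an assumed name) and apply it.
kubectl apply -f nginx.yaml

# Verify that the Deployment has brought up two nginx pods.
kubectl get pods -l app=nginx

# Verify that the Service exists.
kubectl get service nginx-service
```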
We will incrementally add to this throughout this series to build up to a final config that implements all of the features Kubernetes provides to minimize downtime during a maintenance operation. Here is our roadmap:
- Gracefully shutting down Pods
- Delaying Shutdown to Wait for Pod Deletion Propagation
- Avoiding outages with PodDisruptionBudgets
Head on over to the next post to learn how you can leverage lifecycle hooks to gracefully shut down your Pods.
To get a fully implemented and tested version of zero downtime Kubernetes cluster updates on AWS and more, check out Gruntwork.io.