Zero Downtime Server Updates For Your Kubernetes Cluster
At some point during the lifetime of your Kubernetes cluster, you will need to perform maintenance on the underlying nodes. This may include package updates, kernel upgrades, or deploying new VM images. This is considered a "Voluntary Disruption" in Kubernetes.
This is part 1 of a 4-part blog series:
- This post
- Gracefully shutting down Pods
- Delaying Shutdown to Wait for Pod Deletion Propagation
- Avoiding Outages with PodDisruptionBudgets
In this series, we will be covering all the tools that Kubernetes provides to achieve a zero downtime update for the underlying worker nodes in your cluster.
Stating the problem
We will start with a naive approach, identify challenges and potential risks of the approach, and incrementally build up to solve each problem that we identify throughout the series. We will finish with a config that leverages lifecycle hooks, readiness probes, and Pod disruption budgets to achieve our zero downtime rollout.
To start our journey, let's look at a concrete example. Suppose we have a two-node Kubernetes cluster running an application with two Pods backing a Service resource:

We want to upgrade the kernel version of the two underlying worker nodes in our cluster. How will we do this? A naive approach would be to launch new nodes with the updated config and then shut down the old nodes once the new ones are up. While this works, there are a few issues with this approach:
- When you shut down the old nodes, you take down the running pods with them. What if the pods need to clean up for a graceful shutdown? The underlying VM technology might not wait for the cleanup process.
- What if you shut down all the nodes at the same time? You could have a brief outage while the pods are relaunched onto the new nodes.
What we want is a way to gracefully migrate the pods off of the old nodes, ensuring that none of our workloads are running while we make changes to the node. Or, if we are doing a full replacement of the cluster as in the example (e.g., replacing VM images), we want to move the workloads off of the old nodes onto the new nodes. In both cases, we want to prevent new pods from being scheduled on the old node and then evict all the running pods off of it. We can use the kubectl drain command to achieve this.
Rescheduling Pods off of a node
The drain operation achieves the goal of rescheduling all the Pods off of the node. During the drain operation, the node is marked as unschedulable (via the NoSchedule taint). This prevents new pods from being scheduled on the node. Afterwards, the drain operation starts evicting the pods from the node, shutting down the containers currently running on it by sending the TERM signal to the pods' underlying containers.
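To make this concrete, here is what draining one of our example nodes might look like. Note that node-1 is a placeholder name, and the exact flag set is an assumption that can vary between kubectl versions (for example, --delete-emptydir-data was called --delete-local-data in older releases):

```bash
# Mark the node unschedulable and evict its pods.
#   --ignore-daemonsets:    DaemonSet-managed pods cannot be rescheduled, so skip them.
#   --delete-emptydir-data: also evict pods using emptyDir volumes (their data is lost).
#   --grace-period:         seconds each pod gets to shut down before being force-killed.
# "node-1" is a placeholder; find your node names with `kubectl get nodes`.
kubectl drain node-1 --ignore-daemonsets --delete-emptydir-data --grace-period=60

# Once maintenance on the node is complete, mark it schedulable again.
kubectl uncordon node-1
```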
Although kubectl drain will gracefully handle pod eviction, there are still two factors that could cause service disruption during the drain operation:
- Your application needs to be able to gracefully handle the TERM signal. When a Pod is evicted, Kubernetes sends the TERM signal to its containers and then waits a configurable amount of time for the containers to shut down before forcefully terminating them (see the sketch after this list). However, if your containers do not handle the signal gracefully, you could still shut down pods uncleanly while they are in the middle of doing work (e.g., committing a database transaction).
- You lose all the pods servicing your application. Your service could experience downtime while the new containers are started on the new nodes, or, if you did not deploy your pods with controllers, they may never restart at all.
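For reference, that configurable wait is the Pod's terminationGracePeriodSeconds, which defaults to 30 seconds. A minimal sketch of where it fits in a Pod template (the 60-second value is just an illustrative assumption):

```yaml
# Fragment of a Pod template: give containers up to 60 seconds to exit
# cleanly after receiving TERM before they are forcefully killed.
spec:
  terminationGracePeriodSeconds: 60  # defaults to 30 if omitted
  containers:
  - name: nginx
    image: nginx:1.15
```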
Avoiding outages
To minimize downtime from a voluntary disruption like draining a node, Kubernetes provides the following disruption handling features:
- Lifecycle hooks, for running cleanup logic before a container is shut down
- Readiness probes, for controlling when a Pod should receive traffic from a Service
- PodDisruptionBudgets, for limiting how many Pods of an application can be down simultaneously during voluntary disruptions
In the rest of the series, we will use these features of Kubernetes to mitigate service disruption from an eviction event. To make it easier to follow along, we will use our example above with the following resource config:
```yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.15
        ports:
        - containerPort: 80
---
kind: Service
apiVersion: v1
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    targetPort: 80
    port: 80
```
This config is a minimal example of a Deployment resource that manages multiple nginx pods (in our case, two). This resource will work towards maintaining two nginx pods in the cluster. Additionally, the config will provision a Service resource that can be used to access the nginx pods within the cluster.
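If you want to follow along, you can apply this config to a cluster and confirm that both replicas come up. The file name nginx.yaml here is just an assumed name for illustration:

```bash
# Save the config above to a file (nginx.yaml is an assumed name) and apply it.
kubectl apply -f nginx.yaml

# Verify that the Deployment has brought up two nginx pods.
kubectl get pods -l app=nginx

# Verify that the Service exists.
kubectl get service nginx-service
```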
We will incrementally add to this throughout this series to build up to a final config that implements all of the features Kubernetes provides to minimize downtime during a maintenance operation. Here is our roadmap:
- Gracefully shutting down Pods
- Delaying Shutdown to Wait for Pod Deletion Propagation
- Avoiding outages with PodDisruptionBudgets
Head on over to the next post to learn how you can leverage lifecycle hooks to gracefully shut down your Pods.
To get a fully implemented and tested version of zero downtime Kubernetes cluster updates on AWS and more, check out Gruntwork.io.