This is the second blog post in a series about increasing application availability on Azure Kubernetes Service / Kubernetes.
Today we cover the pod anti-affinity setting.
What is the pod anti-affinity?
In the first post of the series, I talked about the PodDisruptionBudget. The PDB guarantees that a certain number of your application pods remains available.
Defining a pod anti-affinity is the next step in increasing your application's availability. A pod anti-affinity spreads the pods of your application across different nodes in your Kubernetes cluster.
You can define a soft or a hard pod anti-affinity for your application.
The soft anti-affinity is best-effort and might lead to a situation where a node runs two replicas of your application instead of the replicas being spread across different nodes.
Using the hard anti-affinity guarantees the distribution across different nodes in your cluster. The only downside of the hard anti-affinity is that, under certain circumstances, the overall replica count of your deployment is reduced when one or several nodes have an outage.
Combined with a PDB, this can also lead to a deadlock.
So, I recommend using the soft anti-affinity.
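To sketch such a deadlock, assume three replicas with a hard anti-affinity on a three-node cluster and a PDB like the minimal example below (the name and values are assumptions, not the PDB from the first post). Draining the first node leaves one pod Pending, because no node without a replica is left. Draining a second node is then blocked indefinitely, since the eviction would violate the PDB while the Pending pod cannot start anywhere else.

# Minimal PDB sketch (name and values assumed).
# Use apiVersion: policy/v1beta1 on clusters older than Kubernetes 1.21.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: go-webapp
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: go-webapp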
Using the pod anti-affinity setting
Let us have a look at the following Kubernetes template which makes use of the pod anti-affinity.
...
  template:
    metadata:
      labels:
        app: go-webapp
        version: v1
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - go-webapp
              topologyKey: kubernetes.io/hostname
      containers:
...
In the template itself, I am using a soft anti-affinity, which is defined by the term preferredDuringSchedulingIgnoredDuringExecution, whereas a hard anti-affinity is defined by requiredDuringSchedulingIgnoredDuringExecution.
The soft anti-affinity has a special configuration setting called weight, which is added to the scheduler's score calculation and controls how likely it is that the pods are distributed across different nodes. 1 is the lowest value and 100 the highest. If you want the highest chance of distributing the pods across different nodes with the soft anti-affinity, use the value 100 here.
The labelSelector and topologyKey then define how the scheduling works. The definition above reads like this: a pod should not be scheduled onto a node if a pod with the label app=go-webapp is already running on it.
When we deploy our template on the AKS cluster, all our replicas run on different nodes.
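Assuming the complete manifest is saved as go-webapp.yaml (the file name is just my assumption), deploying it is a single command:

> kubectl apply -f go-webapp.yaml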
> kubectl get pods -o wide
NAME                         READY   STATUS    RESTARTS   AGE   IP             NODE                                NOMINATED NODE   READINESS GATES
go-webapp-75c66f85cf-984sk   2/2     Running   0          41s   10.240.0.28    aks-nodepool1-14987876-vmss00001m   <none>           <none>
go-webapp-75c66f85cf-plnk5   2/2     Running   0          26s   10.240.2.10    aks-nodepool1-14987876-vmss00001o   <none>           <none>
go-webapp-75c66f85cf-twck2   2/2     Running   0          41s   10.240.1.145   aks-nodepool1-14987876-vmss00001n   <none>           <none>
Frankly, Kubernetes always tries to distribute your application pods across different nodes on its own. But the pod anti-affinity gives you better control over this behavior.
Soft vs. hard anti-affinity
As mentioned previously, soft is best-effort and hard guarantees the distribution. To see the difference, let us deploy the Kubernetes template on a Docker for Mac single-node Kubernetes cluster, the first time with the soft anti-affinity setting and the second time with the hard anti-affinity setting.
...
  template:
    metadata:
      labels:
        app: go-webapp
        version: v1
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - go-webapp
            topologyKey: kubernetes.io/hostname
      containers:
...
Using the soft anti-affinity setting brings up all three replicas, compared to only one replica with the hard anti-affinity setting.
### Soft anti-affinity ###
> kubectl get pods -o wide
NAME                         READY   STATUS    RESTARTS   AGE   IP          NODE             NOMINATED NODE   READINESS GATES
go-webapp-666859f746-hnnrv   2/2     Running   0          59s   10.1.0.65   docker-desktop   <none>           <none>
go-webapp-666859f746-ltgvr   2/2     Running   0          82s   10.1.0.64   docker-desktop   <none>           <none>
go-webapp-666859f746-tjqqp   2/2     Running   0          38s   10.1.0.66   docker-desktop   <none>           <none>

### Hard anti-affinity ###
> kubectl get pods -o wide
NAME                         READY   STATUS    RESTARTS   AGE   IP          NODE             NOMINATED NODE   READINESS GATES
go-webapp-5748776476-cxq76   0/2     Pending   0          48s   <none>      <none>           <none>           <none>
go-webapp-5748776476-sdnkj   2/2     Running   0          74s   10.1.0.67   docker-desktop   <none>           <none>
go-webapp-5748776476-twwbt   0/2     Pending   0          48s   <none>      <none>           <none>           <none>
Also have a look at the following screenshots where I did the same on the AKS cluster and drained one of the nodes.
As you see, using the hard anti-affinity leads to a state where the overall replica count is reduced until a new node is available to host the pod.
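If you want to reproduce this yourself, draining and uncordoning a node looks roughly like this (the node name is taken from the earlier output; adjust it to your cluster):

# Evict all pods from the node; the flag is called --delete-local-data on older kubectl versions.
> kubectl drain aks-nodepool1-14987876-vmss00001m --ignore-daemonsets --delete-emptydir-data

# Make the node schedulable again afterwards.
> kubectl uncordon aks-nodepool1-14987876-vmss00001m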
What protection does the pod anti-affinity provide?
The pod anti-affinity provides protection against node failures and thus ensures a higher availability of your application.
Summary
Using the pod anti-affinity protects your application against node failures by distributing the pods across different nodes, either on a best-effort or on a guaranteed basis.
As mentioned earlier, Kubernetes always tries to distribute your application pods across different nodes, even without a specified pod anti-affinity. But the pod anti-affinity gives you better control over this behavior.
You can even go further and use another topologyKey like topology.kubernetes.io/zone to protect your application against zonal failures.
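Such a zonal variant only differs in the topologyKey; here is a sketch of the relevant part, assuming the same go-webapp labels as above:

      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - go-webapp
              topologyKey: topology.kubernetes.io/zone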
A better solution for this is pod topology spread constraints, which reached the stable feature state with Kubernetes 1.19.
I will cover pod topology spread constraints in the next blog post of this series. Stay tuned.