If you notice during an AKS Kubernetes version upgrade that only the control plane was upgraded, you are most certainly using the Terraform Azure provider in version 1.40.0 or higher.
-> https://github.com/terraform-providers/terraform-provider-azurerm/issues/5541
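To confirm the symptom, you can compare the control plane version with the node pool versions via the Azure CLI. For example, with placeholder cluster and resource group names:

az aks show --name myakscluster --resource-group aks-resource-group --query kubernetesVersion -o tsv
az aks nodepool list --cluster-name myakscluster --resource-group aks-resource-group --query '[].{name:name,version:orchestratorVersion}' -o table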
A current workaround is a null_resource with a trigger on the Kubernetes version that runs a Bash script via the local-exec provisioner. That way, you do not have to upgrade the node pools manually.
...
resource "null_resource" "aks" {
triggers = {
aks_kubernetes_version = azurerm_kubernetes_cluster.aks.kubernetes_version
}
provisioner "local-exec" {
command = "./cluster-upgrade-fix.sh ${var.name} ${azurerm_resource_group.aks.name}"
working_dir = path.module
}
}
...
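Keep in mind that the local-exec provisioner only runs when the trigger value changes. Should you need to re-run the script without a version change, for example after a failed node pool upgrade, you can taint the null_resource so Terraform recreates it on the next apply. A minimal sketch, assuming the resource address as defined above (inside a module, prefix it with module.<module name>):

terraform taint null_resource.aks
terraform apply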
The script compares the Kubernetes version of the control plane with the version of each node pool. If the versions differ, it upgrades the node pool accordingly.
#!/bin/bash
CLUSTER_NAME=$1
RESOURCE_GROUP=$2

# Kubernetes version of the control plane
CLUSTER_VERSION=$(az aks show --name "$CLUSTER_NAME" --resource-group "$RESOURCE_GROUP" | jq -r .kubernetesVersion)
NODE_POOLS=$(az aks nodepool list --cluster-name "$CLUSTER_NAME" --resource-group "$RESOURCE_GROUP" --query '[].name' -o tsv)

for NODE_POOL in $NODE_POOLS; do
  NODE_VERSION=$(az aks nodepool show --cluster-name "$CLUSTER_NAME" --resource-group "$RESOURCE_GROUP" --name "$NODE_POOL" | jq -r .orchestratorVersion)
  # Upgrade only node pools that lag behind the control plane version
  if [[ "$CLUSTER_VERSION" != "$NODE_VERSION" ]]; then
    az aks nodepool upgrade --kubernetes-version "$CLUSTER_VERSION" --cluster-name "$CLUSTER_NAME" --resource-group "$RESOURCE_GROUP" --name "$NODE_POOL" --verbose
  fi
done
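You can also run the script standalone against an already half-upgraded cluster, assuming you are logged in to the Azure CLI and have jq installed; the cluster and resource group names below are placeholders:

chmod +x cluster-upgrade-fix.sh
./cluster-upgrade-fix.sh myakscluster aks-resource-group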
You can find my updated Terraform AKS module on GitHub.
-> https://github.com/neumanndaniel/terraform/tree/master/modules/aks