Daniel's Tech Blog

Cloud Computing, Cloud Native & Kubernetes

Azure Kubernetes Fleet Manager – Advance your Kubernetes cluster update management on Azure

The Azure Kubernetes Fleet Manager comes in two configuration options: with and without a hub cluster.

In today’s blog post, we focus on the Azure Kubernetes Fleet Manager without a hub cluster. This configuration option only provides update management for Azure Kubernetes Service clusters.

Before we dive into the topic, let us step back and answer the question of why we need the Azure Kubernetes Fleet Manager in times of infrastructure as code.

Why do we need the Azure Kubernetes Fleet Manager?

Imagine you use Terraform for your infrastructure as code and use either GitHub Actions or Terraform Cloud to apply your definitions. You have a large number of Azure Kubernetes Service clusters with hundreds of nodes each. Depending on how you have configured the max surge setting, a Kubernetes version upgrade can take a long time.

This is a problem in two ways. First, the Terraform provider for Azure has a default 90-minute timeout configured for the azurerm_kubernetes_cluster resource. You might need to override the default value to avoid running into a timeout. Second, there are costs. For instance, GitHub Actions are billed per minute and have default timeouts as well.
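To get a feeling for the first problem, a rough back-of-the-envelope estimate helps. The following Python sketch is only an illustration under stated assumptions: it assumes a single node pool that is upgraded in batches of the max surge size, that a percentage-based max surge is rounded up to whole nodes, and a made-up average of 12 minutes per batch. The function name and the sample numbers are hypothetical; only the 90-minute default comes from the text above.

```python
# Default timeout mentioned above for the azurerm_kubernetes_cluster resource.
TERRAFORM_DEFAULT_TIMEOUT_MINUTES = 90


def ceil_div(a: int, b: int) -> int:
    """Integer division that rounds up."""
    return -(-a // b)


def estimate_upgrade_minutes(node_count: int, max_surge_percent: int,
                             minutes_per_batch: int) -> int:
    """Rough estimate, assuming nodes are replaced in batches of the surge size."""
    if node_count <= 0 or max_surge_percent <= 0:
        raise ValueError("node_count and max_surge_percent must be positive")
    # Assumption: a fractional surge node counts as one whole node.
    surge_nodes = ceil_div(node_count * max_surge_percent, 100)
    batches = ceil_div(node_count, surge_nodes)
    return batches * minutes_per_batch


# Hypothetical cluster: 300 nodes, 10 % max surge, 12 minutes per batch (assumed).
estimated = estimate_upgrade_minutes(300, 10, 12)
print(estimated, estimated > TERRAFORM_DEFAULT_TIMEOUT_MINUTES)  # 120 True
```

With these assumed numbers, the upgrade would take about two hours and exceed the default timeout. Real durations depend on pod disruption budgets, drain times, and node provisioning speed, so treat the result as an order of magnitude at best.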

Another issue arises with infrastructure as code when using the automated Kubernetes version upgrades of the Azure Kubernetes Service. You must then use the lifecycle ignore_changes setting to prevent changes to the Kubernetes version when you apply your infrastructure as code configuration. This makes a manual Kubernetes version upgrade a bit more work, as you need to remove the setting first and re-add it later.

This is where the Azure Kubernetes Fleet Manager comes into play. It allows better scheduling, control, and execution of Kubernetes version upgrades.

Fleet Manager Deployment

In the Azure portal, we create a new Azure Kubernetes Fleet Manager instance and select the hub cluster mode without hub cluster.

Azure Kubernetes Fleet Manager - Azure portal

Afterward, we onboard the existing Azure Kubernetes Service clusters, in this example aks-azst-1 and aks-azst-2, both running Kubernetes version 1.27.7.

Azure Kubernetes Fleet Manager - Azure portal

While onboarding an Azure Kubernetes Service cluster to the Fleet Manager instance, you can assign it to an update group. These groups become important at a later stage. In our case, aks-azst-1 is assigned to the canary update group and aks-azst-2 to the production one.

Azure Kubernetes Fleet Manager - Azure portal

The clusters are now members of the Fleet Manager instance and show up on the overview page reporting their Kubernetes and node OS image versions.

Azure Kubernetes Fleet Manager - Azure portal

Define an update strategy

Before we start a Kubernetes version upgrade of our Azure Kubernetes Service clusters, we define an update strategy to control the overall update process.

Azure Kubernetes Fleet Manager update strategy - Azure portal

Within a strategy, we can define multiple stages to control the update process. The first stage is called canary and targets the canary update group. After this stage, the update run pauses for an hour before it proceeds with the next stage and the remaining Azure Kubernetes Service clusters.

Azure Kubernetes Fleet Manager update strategy - Azure portal

Our second stage is called production and targets our production update group.

Azure Kubernetes Fleet Manager update strategy - Azure portal

The update strategy is now ready to be used within an update run.

Azure Kubernetes Fleet Manager update strategy - Azure portal

Update stages are executed sequentially during an update run, while all update groups within a stage are updated in parallel.

-> https://learn.microsoft.com/en-us/azure/kubernetes-fleet/architectural-overview?WT.mc_id=AZ-MVP-5000119#update-orchestration-across-multiple-clusters
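The ordering rule can be made concrete with a small model. The Python sketch below is not the Fleet Manager API. The class and function names are hypothetical and exist only to illustrate the semantics described above: stages run one after another, all update groups inside a stage are started together, and an optional wait follows a stage. The cluster, group, and stage names are the ones used in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One stage of a (modeled) update strategy."""
    name: str
    groups: tuple[str, ...]
    after_stage_wait_seconds: int = 0


def plan_update_run(stages, members):
    """Return the ordered steps of a run.

    `members` maps cluster name -> update group. Each "update" step lists
    the groups of one stage; these groups are meant to run in parallel.
    """
    plan = []
    for stage in stages:  # stages are processed strictly one after another
        per_group = {
            group: sorted(c for c, g in members.items() if g == group)
            for group in stage.groups
        }
        plan.append(("update", stage.name, per_group))
        if stage.after_stage_wait_seconds > 0:
            plan.append(("wait", stage.name, stage.after_stage_wait_seconds))
    return plan


# The strategy from this post: canary first, one hour pause, then production.
strategy = [
    Stage("canary", ("canary",), after_stage_wait_seconds=3600),
    Stage("production", ("production",)),
]
members = {"aks-azst-1": "canary", "aks-azst-2": "production"}

for step in plan_update_run(strategy, members):
    print(step)
```

For the two clusters in this example, the plan contains three steps: update aks-azst-1 in the canary stage, wait 3600 seconds, then update aks-azst-2 in the production stage.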

Execute an update run

Defining an update run does not execute the Kubernetes upgrade immediately. It has to be triggered manually by starting the defined update run.

For our update run, we set the update sequence to stages and copy the stages from our previously created update strategy. We set the upgrade scope to Kubernetes version and select 1.28.3 as the target version. Furthermore, we want to use the latest available node OS image in the respective Azure regions.

Azure Kubernetes Fleet Manager update run - Azure portal

As mentioned before, an update run does not start automatically. Hence, we select the update run and hit Start, which kicks off the run.

Azure Kubernetes Fleet Manager update run - Azure portal

By clicking on the update run and its stages, we see the execution state of each stage and cluster while the update is in progress.

Azure Kubernetes Fleet Manager update run - Azure portal

Once the update run has succeeded, we can check the clusters’ Kubernetes and node OS image versions on the Fleet Manager’s overview page.

Azure Kubernetes Fleet Manager update run - Azure portal

Both Azure Kubernetes Service clusters are now running Kubernetes version 1.28.3.

Summary

The Azure Kubernetes Fleet Manager is a valuable addition to Azure Kubernetes Service cluster management and makes Kubernetes version upgrades a breeze, even in combination with existing infrastructure as code configurations.

Besides the Kubernetes version upgrade capabilities, the Fleet Manager can also replicate and maintain Kubernetes resource objects when running the Fleet Manager with the hub cluster configuration.

-> https://learn.microsoft.com/en-us/azure/kubernetes-fleet/resource-propagation?WT.mc_id=AZ-MVP-5000119

You can use Terraform to define the update configuration described above as infrastructure as code.

-> Create Azure Kubernetes Fleet Manager: https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/kubernetes_fleet_manager
-> Add member clusters: https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/kubernetes_fleet_member
-> Define an update strategy: https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/kubernetes_fleet_update_strategy

Right now, the only thing that needs to be defined and executed manually is the actual update run.

For further information about the Azure Kubernetes Fleet Manager, have a look at the Azure documentation.

-> https://learn.microsoft.com/en-us/azure/kubernetes-fleet/?WT.mc_id=AZ-MVP-5000119
