Back in May, Microsoft released the public preview of Windows Server support for Azure Kubernetes Service.
When you are starting with Windows Server node pools in AKS, you should at least be aware of some prerequisites and limitations.
- Windows Server node pools require Azure CNI, also known as AKS advanced networking.
- The first node pool is a Linux-based one hosting the Kubernetes system services and thus cannot be deleted. This node pool should have at least two nodes.
- Kubernetes network policies (Azure NPM and Calico) are not supported.
- Windows Server node pool names are restricted to 6 characters max.
- Azure Monitor for containers does not fully support Windows Server node pools.
We will cover some of these points later in this article.
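The 6-character limit on Windows Server node pool names from the list above is easy to check before a deployment. The following is a minimal sketch, not an official validation routine: the function name is hypothetical, and the rule that names consist of lowercase letters and digits and start with a letter is an assumption that is not stated in the list above.

```python
import re

# From the list above: Windows Server node pool names have at most 6 characters.
WINDOWS_MAX_LENGTH = 6

# Assumption (not from the list above): lowercase letters and digits only,
# starting with a letter.
_NAME_PATTERN = re.compile(r"[a-z][a-z0-9]*")


def is_valid_windows_pool_name(name: str) -> bool:
    """Return True if the name respects the length limit and the assumed pattern."""
    return (
        len(name) <= WINDOWS_MAX_LENGTH
        and _NAME_PATTERN.fullmatch(name) is not None
    )


for candidate in ("pool2", "winpool", "Win1"):
    print(candidate, is_valid_windows_pool_name(candidate))
```

Running such a check in a deployment pipeline catches a name that is too long before the deployment is even submitted.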
First, I would like to highlight an important setting you should configure before deploying an AKS cluster with a Windows Server node pool: the option to specify one or several taints per node pool.
As we do not want to schedule pods accidentally on Windows Server nodes, we must set a taint. For instance, the following one, shown first in Terraform and then in an Azure Resource Manager template.
... taints = [ "kubernetes.io/os=windows:NoSchedule" ] ...
... "taints": [ "kubernetes.io/os=windows:NoSchedule" ] ...
Thus, we ensure that only pods tolerating the taint are scheduled on Windows Server nodes. If we had not specified the taint, Linux-based pods could be scheduled on the Windows Server nodes and fail to start. I have seen this, for instance, when using Helm 2 with Tiller through the Helm Terraform provider. For God's sake, Helm 3 is available now and we do not need Tiller anymore.
Long story short, always set a taint for Windows Server node pools.
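To make the effect of the taint more tangible, here is a small sketch of how a taint and a toleration are matched. It is a simplified model and not the Kubernetes scheduler code: the class and function names are hypothetical, only the NoSchedule effect is considered, and special cases such as a toleration with an empty key are left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str


@dataclass(frozen=True)
class Toleration:
    key: str
    operator: str = "Equal"        # "Equal" or "Exists"
    value: Optional[str] = None
    effect: Optional[str] = None   # None tolerates every effect


def parse_taint(text: str) -> Taint:
    """Parse the 'key=value:effect' notation used in the node pool settings."""
    key_value, effect = text.rsplit(":", 1)
    key, _, value = key_value.partition("=")
    return Taint(key=key, value=value, effect=effect)


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """Simplified matching: the effect and key must fit, then the operator decides."""
    if toleration.effect is not None and toleration.effect != taint.effect:
        return False
    if toleration.key != taint.key:
        return False
    if toleration.operator == "Exists":
        return True
    return toleration.value == taint.value


def can_schedule(tolerations: List[Toleration], taints: List[Taint]) -> bool:
    """A pod fits on a node only if every NoSchedule taint is tolerated."""
    return all(
        any(tolerates(toleration, taint) for toleration in tolerations)
        for taint in taints
        if taint.effect == "NoSchedule"
    )


windows_taint = parse_taint("kubernetes.io/os=windows:NoSchedule")
windows_toleration = Toleration("kubernetes.io/os", "Equal", "windows", "NoSchedule")

print(can_schedule([], [windows_taint]))                    # pod without a toleration
print(can_schedule([windows_toleration], [windows_taint]))  # pod with the toleration
```

A pod without the toleration, such as Tiller, is rejected by the tainted Windows Server node in this model, while it still fits on the untainted Linux nodes.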
> kubectl get nodes -o wide
NAME                            STATUS   ROLES   AGE   VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION      CONTAINER-RUNTIME
aks-pool1-33037761-vmss000000   Ready    agent   25m   v1.14.8   10.240.0.4     <none>        Ubuntu 16.04.6 LTS               4.15.0-1060-azure   docker://3.0.7
aks-pool1-33037761-vmss000001   Ready    agent   26m   v1.14.8   10.240.0.255   <none>        Ubuntu 16.04.6 LTS               4.15.0-1060-azure   docker://3.0.7
akspool2000000                  Ready    agent   22m   v1.14.8   10.240.1.250   <none>        Windows Server 2019 Datacenter   10.0.17763.737      docker://19.3.2

> kubectl describe nodes | grep -e "Name:" -e "Taints"
Name:    aks-pool1-33037761-vmss000000
Taints:  <none>
Name:    aks-pool1-33037761-vmss000001
Taints:  <none>
Name:    akspool2000000
Taints:  kubernetes.io/os=windows:NoSchedule
Setting the taint does not break the AKS cluster, because all Kubernetes system services are running on the first node pool.
> kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                                  READY   STATUS    RESTARTS   AGE     IP             NODE                            NOMINATED NODE   READINESS GATES
kube-system   calico-node-jsnwk                                     1/1     Running   0          7m30s   10.240.0.4     aks-pool1-33037761-vmss000000   <none>           <none>
kube-system   calico-node-kdbz9                                     1/1     Running   0          8m35s   10.240.0.255   aks-pool1-33037761-vmss000001   <none>           <none>
kube-system   calico-typha-85664b5b66-4fj9d                         1/1     Running   0          6m26s   10.240.0.255   aks-pool1-33037761-vmss000001   <none>           <none>
kube-system   calico-typha-horizontal-autoscaler-77df4784d7-rfpd6   1/1     Running   0          6m29s   10.240.1.126   aks-pool1-33037761-vmss000001   <none>           <none>
kube-system   coredns-7fc597cc45-g8wkf                              1/1     Running   0          6m29s   10.240.0.156   aks-pool1-33037761-vmss000000   <none>           <none>
kube-system   coredns-7fc597cc45-gpq86                              1/1     Running   0          6m30s   10.240.1.120   aks-pool1-33037761-vmss000001   <none>           <none>
kube-system   coredns-autoscaler-7ccc76bfbd-6qt9m                   1/1     Running   0          6m25s   10.240.1.123   aks-pool1-33037761-vmss000001   <none>           <none>
kube-system   kube-proxy-9dngq                                      1/1     Running   0          3m5s    10.240.0.255   aks-pool1-33037761-vmss000001   <none>           <none>
kube-system   kube-proxy-tqncv                                      1/1     Running   0          2m30s   10.240.0.4     aks-pool1-33037761-vmss000000   <none>           <none>
kube-system   metrics-server-58b6fcfd54-hvghz                       1/1     Running   0          6m29s   10.240.1.241   aks-pool1-33037761-vmss000001   <none>           <none>
kube-system   omsagent-ktjc8                                        1/1     Running   1          8m35s   10.240.1.226   aks-pool1-33037761-vmss000001   <none>           <none>
kube-system   omsagent-nd66v                                        1/1     Running   0          7m30s   10.240.0.10    aks-pool1-33037761-vmss000000   <none>           <none>
kube-system   omsagent-rs-649477b4c8-4f4b9                          1/1     Running   1          6m28s   10.240.1.70    aks-pool1-33037761-vmss000001   <none>           <none>
kube-system   tunnelfront-dbd5b5b9b-bxhvq                           1/1     Running   0          6m26s   10.240.1.106   aks-pool1-33037761-vmss000001   <none>           <none>
Here is an example of the node pool configuration for an AKS cluster with a Windows Server node pool, first as a Terraform module call and then as Azure Resource Manager template parameters.
module "aks" {
  source = "../modules/aks-windows"
  ...
  agent_pool_configuration = [
    {
      agent_count = 2
      vm_size     = "Standard_D2_v3"
      zones       = ["1", "2"]
      agent_os    = "Linux"
      taints      = null
    },
    {
      agent_count = 1
      vm_size     = "Standard_D4_v3"
      zones       = ["1", "2"]
      agent_os    = "Windows"
      taints = [
        "kubernetes.io/os=windows:NoSchedule"
      ]
    }
  ]
}
...
"agentPoolProfiles": {
  "value": [
    {
      "nodeCount": 2,
      "nodeVmSize": "Standard_D2_v3",
      "nodeOsType": "Linux",
      "availabilityZones": [
        "1",
        "2"
      ],
      "enableAutoScaling": false,
      "taints": null
    },
    {
      "nodeCount": 1,
      "nodeVmSize": "Standard_D4_v3",
      "nodeOsType": "Windows",
      "availabilityZones": [
        "1",
        "2"
      ],
      "enableAutoScaling": false,
      "taints": [
        "kubernetes.io/os=windows:NoSchedule"
      ]
    }
  ]
},
...
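Because a forgotten taint is easy to overlook in a longer parameter file, a small check over the node pool profiles can help. The sketch below assumes the profiles have already been loaded into a list of dictionaries that uses the key names from the Azure Resource Manager example above; the function name is hypothetical.

```python
from typing import Any, Dict, List

WINDOWS_TAINT = "kubernetes.io/os=windows:NoSchedule"


def untainted_windows_pools(profiles: List[Dict[str, Any]]) -> List[int]:
    """Return the indexes of Windows pools that lack the expected taint."""
    missing = []
    for index, profile in enumerate(profiles):
        if profile.get("nodeOsType") != "Windows":
            continue
        # "taints" may be absent or null, so fall back to an empty list.
        taints = profile.get("taints") or []
        if WINDOWS_TAINT not in taints:
            missing.append(index)
    return missing


profiles = [
    {"nodeCount": 2, "nodeOsType": "Linux", "taints": None},
    {"nodeCount": 1, "nodeOsType": "Windows", "taints": [WINDOWS_TAINT]},
]

print(untainted_windows_pools(profiles))
```

An empty result means every Windows Server node pool in the list carries the taint.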
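Our next step is to adjust the Kubernetes templates so that our Windows containers get scheduled on the correct nodes. All we need to do is define the tolerations and the nodeSelector options. Both settings are mandatory in our case: the toleration allows the pod onto the tainted Windows Server nodes, and the nodeSelector keeps it off the Linux nodes.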
...
    spec:
      tolerations:
      - key: kubernetes.io/os
        operator: Equal
        value: windows
        effect: NoSchedule
      nodeSelector:
        "kubernetes.io/os": windows
      containers:
...
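If you generate your Kubernetes templates programmatically, both settings can be added in one step. The following sketch works on a pod spec held as a plain dictionary; the function name, the container name, and the image name are hypothetical and only serve as an illustration.

```python
import copy
from typing import Any, Dict

WINDOWS_TOLERATION = {
    "key": "kubernetes.io/os",
    "operator": "Equal",
    "value": "windows",
    "effect": "NoSchedule",
}


def with_windows_scheduling(pod_spec: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the pod spec with the toleration and the nodeSelector set."""
    spec = copy.deepcopy(pod_spec)
    tolerations = spec.setdefault("tolerations", [])
    if WINDOWS_TOLERATION not in tolerations:
        tolerations.append(dict(WINDOWS_TOLERATION))
    spec.setdefault("nodeSelector", {})["kubernetes.io/os"] = "windows"
    return spec


# Hypothetical container definition, for illustration only.
pod_spec = {
    "containers": [
        {"name": "helloworld", "image": "example.invalid/helloworld:latest"}
    ]
}

windows_pod_spec = with_windows_scheduling(pod_spec)
print(windows_pod_spec["tolerations"])
print(windows_pod_spec["nodeSelector"])
```

The function returns a copy, so the original spec stays untouched, and applying it a second time does not duplicate the toleration.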
> kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                                  READY   STATUS    RESTARTS   AGE   IP             NODE                            NOMINATED NODE   READINESS GATES
ambassador    ambassador-f69b8b555-7v8tb                            1/1     Running   0          17m   10.240.1.144   aks-pool1-33037761-vmss000001   <none>           <none>
ambassador    ambassador-f69b8b555-gmf2v                            1/1     Running   0          19m   10.240.0.125   aks-pool1-33037761-vmss000000   <none>           <none>
default       helloworld-655645ffc4-b9g4j                           1/1     Running   0          11m   10.240.2.119   akspool2000000                  <none>           <none>
kube-system   calico-node-jsnwk                                     1/1     Running   0          36m   10.240.0.4     aks-pool1-33037761-vmss000000   <none>           <none>
kube-system   calico-node-kdbz9                                     1/1     Running   0          37m   10.240.0.255   aks-pool1-33037761-vmss000001   <none>           <none>
kube-system   calico-typha-85664b5b66-4fj9d                         1/1     Running   0          35m   10.240.0.255   aks-pool1-33037761-vmss000001   <none>           <none>
kube-system   calico-typha-horizontal-autoscaler-77df4784d7-rfpd6   1/1     Running   0          35m   10.240.1.126   aks-pool1-33037761-vmss000001   <none>           <none>
kube-system   coredns-7fc597cc45-g8wkf                              1/1     Running   0          35m   10.240.0.156   aks-pool1-33037761-vmss000000   <none>           <none>
kube-system   coredns-7fc597cc45-gpq86                              1/1     Running   0          35m   10.240.1.120   aks-pool1-33037761-vmss000001   <none>           <none>
kube-system   coredns-autoscaler-7ccc76bfbd-6qt9m                   1/1     Running   0          35m   10.240.1.123   aks-pool1-33037761-vmss000001   <none>           <none>
kube-system   kube-proxy-9dngq                                      1/1     Running   0          32m   10.240.0.255   aks-pool1-33037761-vmss000001   <none>           <none>
kube-system   kube-proxy-tqncv                                      1/1     Running   0          31m   10.240.0.4     aks-pool1-33037761-vmss000000   <none>           <none>
kube-system   metrics-server-58b6fcfd54-hvghz                       1/1     Running   0          35m   10.240.1.241   aks-pool1-33037761-vmss000001   <none>           <none>
kube-system   omsagent-ktjc8                                        1/1     Running   1          37m   10.240.1.226   aks-pool1-33037761-vmss000001   <none>           <none>
kube-system   omsagent-nd66v                                        1/1     Running   0          36m   10.240.0.10    aks-pool1-33037761-vmss000000   <none>           <none>
kube-system   omsagent-rs-649477b4c8-4f4b9                          1/1     Running   1          35m   10.240.1.70    aks-pool1-33037761-vmss000001   <none>           <none>
kube-system   tunnelfront-dbd5b5b9b-bxhvq                           1/1     Running   0          35m   10.240.1.106   aks-pool1-33037761-vmss000001   <none>           <none>
Last but not least, there is the Azure Monitor for containers support. Windows Server is not fully supported, because Windows Server nodes do not have the omsagent pod running. As a result, no logs are gathered for Windows containers at the moment. The container live logging functionality does work, though, so we do not have to live completely without logging right now.
I hope you got some useful information about working with Windows Server node pools in Azure Kubernetes Service.
To summarize, these are the important steps you need to take care of:
- Specify taints for Windows Server node pools.
- Specify the nodeSelector option and tolerations in your Kubernetes templates.
You can find the code samples for Terraform and Azure Resource Manager templates, as well as the example Kubernetes template, in my GitHub repositories.
-> https://github.com/neumanndaniel/terraform/tree/master/modules/aks-windows
-> https://github.com/neumanndaniel/armtemplates/tree/master/container
-> https://github.com/neumanndaniel/kubernetes/blob/master/windows/hello-world.yaml