Note: As of the writing of this blog, Windows Server 2022 is in Public Preview on Azure Kubernetes Service (AKS).
If you’ve been playing with containers, upgrading to a new OS version might seem almost too simple: Isn’t it just changing the FROM line in my Dockerfile? You’d think so, right?
In reality, moving from one version of Windows to the next on a managed Kubernetes cluster, such as Azure Kubernetes Service, requires you to look at many other aspects. From a Kubernetes standpoint, you need to ensure your Dockerfile, node pool, node selector, and YAML files are correctly configured. From an application lifecycle standpoint, however, there’s much more to look at: Active Directory (gMSA) integration, node pool access to other resources, and Azure Key Vault integration, just to name a few.
In this blog post we will cover some of these aspects so you can properly plan your upgrade from Windows Server 2019 to Windows Server 2022 on AKS.
Updating your Dockerfile
Most likely, the very first step in upgrading a containerized application from Windows Server 2019 to Windows Server 2022 is to update the FROM statement in your Dockerfile. The point here is that your application installation should not change just because you moved from one OS version to another.
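As a minimal sketch, assuming an IIS-based application, the change might look like this (the image name and tags are illustrative; use whichever base image and tag your application is built on):

```dockerfile
# Before: image based on the Windows Server 2019 (LTSC 2019) release
# FROM mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019

# After: the same image, now based on Windows Server 2022 (LTSC 2022)
FROM mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2022

# The application installation steps stay exactly the same
WORKDIR /inetpub/wwwroot
COPY ./site/ .
```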
Your next step, before deploying to AKS, is to ensure the app still works as expected in your development or testing environment. Granted, your app might depend on other components that are currently deployed with your cluster, but making sure the app doesn’t break because of the OS version change is a good first step.
Adding a Windows Server 2022 node pool to an existing cluster
Windows Server 2019 and 2022 nodes cannot co-exist in the same node pool on AKS. To upgrade your application, you need a separate node pool for Windows Server 2022. Since Windows Server 2022 is in Public Preview on AKS, adding a new node pool with that OS version requires two things:
- That you enable AKS-Preview for your subscription. This is a one-time configuration for your subscription.
- That you use Azure CLI. Azure PowerShell is usually updated only after a feature goes GA.
Check out the AKS documentation on how to add a new Windows Server 2022 node pool to an existing AKS cluster.
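As a sketch, the Azure CLI commands might look like the following. The resource group and cluster name below match the example environment in this post; the node pool name is illustrative (Windows pool names are limited to six characters), and the exact preview feature name may differ, so check the AKS documentation for your subscription:

```shell
# One-time setup: install the aks-preview Azure CLI extension
az extension add --name aks-preview

# Add a Windows Server 2022 node pool to an existing cluster.
# --os-sku Windows2022 is what selects the new OS version.
az aks nodepool add \
    --resource-group AKSUpgrade \
    --cluster-name viniap-cluster \
    --name poolws \
    --os-type Windows \
    --os-sku Windows2022 \
    --node-count 3
```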
Updating your YAML file
Currently, there are a few options to deploy Windows applications on AKS and ensure the Windows pods will run on the Windows nodes, such as Node Selector and Taints and Tolerations. Node Selector is the most common option as it enforces the placement of Windows pods on Windows nodes.
Currently, the recommendation is to enforce that placement by adding the following node selector to your YAML files:
nodeSelector:
  "kubernetes.io/os": windows
What the above node selector does is find *any* available Windows node and place the pod on that node (following all other scheduling rules, of course). When upgrading from Windows Server 2019 to Windows Server 2022, you need to enforce not only placement on a Windows node, but on a node running the newer OS version. One way to accomplish this is to select on a more specific label. Let’s look at the details of both OS versions as AKS nodes. Here’s the list of nodes in my AKS cluster environment after I added the new node pool for Windows Server 2022:
PS C:\> kubectl get nodes -o wide
NAME                                STATUS   ROLES   AGE     VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION     CONTAINER-RUNTIME
aks-agentpool-18877473-vmss000000   Ready    agent   5h40m   v1.23.8   10.240.0.4     <none>        Ubuntu 18.04.6 LTS               5.4.0-1085-azure   containerd://1.5.11+azure-2
akspoolws000000                     Ready    agent   3h15m   v1.23.8   10.240.0.208   <none>        Windows Server 2022 Datacenter   10.0.20348.825     containerd://1.6.6+azure
akspoolws000001                     Ready    agent   3h17m   v1.23.8   10.240.0.239   <none>        Windows Server 2022 Datacenter   10.0.20348.825     containerd://1.6.6+azure
akspoolws000002                     Ready    agent   3h17m   v1.23.8   10.240.1.14    <none>        Windows Server 2022 Datacenter   10.0.20348.825     containerd://1.6.6+azure
akswspool000000                     Ready    agent   5h37m   v1.23.8   10.240.0.115   <none>        Windows Server 2019 Datacenter   10.0.17763.3165    containerd://1.6.6+azure
akswspool000001                     Ready    agent   5h37m   v1.23.8   10.240.0.146   <none>        Windows Server 2019 Datacenter   10.0.17763.3165    containerd://1.6.6+azure
akswspool000002                     Ready    agent   5h37m   v1.23.8   10.240.0.177   <none>        Windows Server 2019 Datacenter   10.0.17763.3165    containerd://1.6.6+azure
When checking on one of the nodes, I get:
PS C:\> kubectl describe node akspoolws000000
Name:   akspoolws000000
Roles:  agent
Labels: agentpool=poolws
        beta.kubernetes.io/arch=amd64
        beta.kubernetes.io/instance-type=Standard_D2s_v3
        beta.kubernetes.io/os=windows
        failure-domain.beta.kubernetes.io/region=westus
        failure-domain.beta.kubernetes.io/zone=0
        kubernetes.azure.com/agentpool=poolws
        kubernetes.azure.com/cluster=MC_AKSUpgrade_viniap-cluster_westus
        kubernetes.azure.com/kubelet-identity-client-id=2cf7d300-53a6-4adf-a6d9-ba841a39c0c5
        kubernetes.azure.com/mode=user
        kubernetes.azure.com/node-image-version=AKSWindows-2022-containerd-20348.825.220713
        kubernetes.azure.com/os-sku=Windows2022
        kubernetes.azure.com/role=agent
        kubernetes.azure.com/storageprofile=managed
        kubernetes.azure.com/storagetier=Premium_LRS
        kubernetes.io/arch=amd64
        kubernetes.io/hostname=akspoolws000000
        kubernetes.io/os=windows
        kubernetes.io/role=agent
        node-role.kubernetes.io/agent=
        node.kubernetes.io/instance-type=Standard_D2s_v3
        node.kubernetes.io/windows-build=10.0.20348
        storageprofile=managed
        storagetier=Premium_LRS
        topology.disk.csi.azure.com/zone=
        topology.kubernetes.io/region=westus
        topology.kubernetes.io/zone=0
<redacted>
Notice in the above that the label traditionally used for the node selector is there: kubernetes.io/os. However, there are two other labels that better identify the nodes we need to schedule our existing pods on: kubernetes.azure.com/os-sku and node.kubernetes.io/windows-build. The first is specific to AKS, while the second is a generic Kubernetes label.
Changing your node selector to use one of these labels will ensure your pods are scheduled on the right nodes. Here’s what I have in my environment before moving the application:
PS C:\> kubectl get pods -o wide
NAME                          READY   STATUS    RESTARTS   AGE   IP             NODE              NOMINATED NODE   READINESS GATES
iis-sample-845488fbcc-67s4f   1/1     Running   0          5h    10.240.0.119   akswspool000002   <none>           <none>
iis-sample-845488fbcc-fp9xv   1/1     Running   0          5h    10.240.0.118   akswspool000000   <none>           <none>
iis-sample-845488fbcc-vnsvz   1/1     Running   0          5h    10.240.0.120   akswspool000001   <none>           <none>
All 3 replicas that host my application are running on Windows Server 2019 nodes, because they were deployed before the Windows Server 2022 node pool existed. Now that I have the new node pool, I’ll change the YAML file that deploys my application to use:
nodeSelector:
  "kubernetes.azure.com/os-sku": Windows2022
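In context, the deployment spec might look like this sketch (the deployment name matches the sample above; the image tag is illustrative, and you would use your own application image here):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: iis-sample
spec:
  replicas: 3
  selector:
    matchLabels:
      app: iis-sample
  template:
    metadata:
      labels:
        app: iis-sample
    spec:
      nodeSelector:
        # AKS-specific label: schedule only on Windows Server 2022 nodes
        "kubernetes.azure.com/os-sku": Windows2022
      containers:
      - name: iis-sample
        # Illustrative 2022-based image; must match the host OS version
        image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2022
```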
Then I can run the following to update my application deployment:
PS C:\> kubectl apply -f .\IISSample.yaml
deployment.apps/iis-sample configured
service/iis-sample unchanged
PS C:\> kubectl get pods -o wide
NAME                          READY   STATUS    RESTARTS   AGE     IP             NODE              NOMINATED NODE   READINESS GATES
iis-sample-7794bfcc4c-k62cq   1/1     Running   0          2m49s   10.240.0.238   akspoolws000000   <none>           <none>
iis-sample-7794bfcc4c-rswq9   1/1     Running   0          2m49s   10.240.1.10    akspoolws000001   <none>           <none>
iis-sample-7794bfcc4c-sh78c   1/1     Running   0          2m49s   10.240.0.228   akspoolws000000   <none>           <none>
Notice that now none of the pods are running on the previous node pool with Windows Server 2019. The previous pods were terminated, and new ones are running on the right OS version. Also notice that I changed the container image being deployed from the 2019-based image to the 2022-based one. If I tried to deploy the 2019-based container image to a Windows Server 2022 host, the deployment would fail, as host and container versions need to match. Finally, no changes were made to the service (load balancer) that serves my application, other than new endpoints, which are configured automatically by the Kubernetes service.
Active Directory and gMSA impact
When you introduce a new client version to Active Directory, you don’t necessarily need to change anything on your Active Directory Domain Controllers. If these were regular Windows or Windows Server clients, you’d probably want to update your schema and GPOs, raise your domain and forest functional levels (if they are not on the latest already), and add at least one new Windows Server 2022 Domain Controller to ensure compatibility. However, Windows containers only use the Domain Controller for Kerberos authentication, and no changes to the domain or Domain Controllers are necessary.
However, if you are leveraging Group Managed Service Accounts (gMSA), you will need to update the Managed Identity configuration for the new node pool. To better explain: gMSA uses a secret (user account and password) so the node on which the Windows pod is running can authenticate the container against Active Directory. To access that secret in Azure Key Vault, the node uses a Managed Identity. Since Managed Identities are configured per node pool, and the pods now reside on a new node pool, you need to update that configuration. Luckily, we have plenty of documentation on how to do that. In addition, the gMSA on AKS PowerShell module can help you identify which node pools have access to which Azure Key Vaults.
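For example, assuming the node pool’s kubelet identity is what reads the gMSA secret, granting it access to the vault might look like the following sketch. The vault name is hypothetical; the client ID shown is the kubernetes.azure.com/kubelet-identity-client-id label value from the node output above:

```shell
# Client ID of the kubelet identity used by the new node pool
IDENTITY_CLIENT_ID="2cf7d300-53a6-4adf-a6d9-ba841a39c0c5"

# Grant that identity permission to read secrets in the Key Vault
# that stores the gMSA credential spec secret
az keyvault set-policy \
    --name my-gmsa-vault \
    --spn "$IDENTITY_CLIENT_ID" \
    --secret-permissions get
```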
Managed Identity concerns
The concern here is the same as explained in the previous section: any access to other Azure resources granted via Managed Identity needs to be updated to reflect the new node pool. gMSA is simply the most visible case for Windows workloads, but you might be using Managed Identities for access to other Azure resources as well.
The question then is how to know which Managed Identities are being used to authenticate your AKS cluster against other Azure resources. For that, you can use the documentation on how to view update and sign-in activities for Managed Identities. Granted, you might need to force applications to authenticate, or trigger some cluster operations, to ensure all attempts are caught and identified, but it’s a start.
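As a starting point, you can also list the identities your cluster itself uses directly from Azure CLI (the resource group and cluster name below match the example environment in this post; a sketch, not an exhaustive inventory):

```shell
# Show the control plane identity and the kubelet identity of the cluster
az aks show \
    --resource-group AKSUpgrade \
    --name viniap-cluster \
    --query "{identity: identity, kubeletIdentity: identityProfile.kubeletidentity}"
```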
Upgrading from Windows Server 2019 to 2022 could be as simple as updating your Dockerfile and YAML files, or a more complex process in which multiple services and authentication flows need to be thought through. Depending on your environment, you might need to check for Managed Identity changes, Active Directory integration, and more.
Hopefully, this article helps you frame the upgrade process for your scenario. As always, let us know your thoughts in the comments section below.