Upgrading your container app from Windows Server 2019 to 2022 on Azure Kubernetes Service

Note: As of the writing of this blog, Windows Server 2022 is in Public Preview on Azure Kubernetes Service (AKS).

If you've been playing with containers, the thought of upgrading to a new OS version might seem way too simple: isn't it just a matter of changing the FROM statement in my Dockerfile? You'd think so, right?

In reality, moving from one version of Windows to the next on a managed Kubernetes cluster, such as Azure Kubernetes Service, requires you to look at many other aspects. From a Kubernetes standpoint, you need to ensure that your Dockerfile, node pool, node selector, and YAML files are correctly configured. From an application lifecycle standpoint, there's much more to look at: Group Managed Service Accounts (gMSA) integration, node pool access to other resources, and Azure Key Vault integration, just to name a few.

In this blog post we will cover some of these aspects so you can properly plan your upgrade from Windows Server 2019 to Windows Server 2022 on AKS.

Updating your Dockerfile

Most likely, the very first step in upgrading a containerized application from Windows Server 2019 to Windows Server 2022 is to update the FROM statement in your Dockerfile. The point here is that your application installation should not change because you moved from one OS version to another.

Your next step before deploying it on AKS is to ensure the app still works as expected in your development or testing environment. Granted, your app might require other components that are currently deployed with your cluster, but making sure the app does not break because of the OS version change is a good first step.

We have plenty of documentation covering tips and tricks for Dockerfiles with Windows containers. You can check some of the documentation on how to write and optimize a Dockerfile on our Docs page.

Adding a Windows Server 2022 node to an existing cluster

Windows Server 2019 and Windows Server 2022 cannot co-exist on the same node pool on AKS. To upgrade your application, you need a separate node pool for Windows Server 2022. Since Windows Server 2022 is in Public Preview on AKS, adding a new node pool with that OS version requires two things:

  • That you enable AKS-Preview for your subscription. This is a one-time configuration for your subscription.
  • That you use Azure CLI. Azure PowerShell is usually updated after the feature goes GA.

Check out the AKS documentation on how to add a new Windows Server 2022 node pool to an existing AKS cluster.

Updating your YAML file

Currently, there are a few options to deploy Windows applications on AKS and ensure the Windows pods will run on the Windows nodes, such as Node Selector and Taints and Tolerations. Node Selector is the most common option as it enforces the placement of Windows pods on Windows nodes.

Currently, the recommended way to enforce that placement is to add the following node selector to your YAML files:

      nodeSelector:
        "kubernetes.io/os": windows

The node selector above finds *any* available Windows node and places the pod on that node (following all other scheduling rules, of course). When upgrading from Windows Server 2019 to Windows Server 2022, you need to enforce not only the placement on a Windows node, but also on a node that is running the latest OS version. To accomplish this, one option is to select on a different node label. Let's look at the details of both OS versions as AKS nodes. Here's the list of nodes on my AKS cluster environment after I added the new node pool for Windows Server 2022:

PS C:\> kubectl get nodes -o wide
NAME                                STATUS   ROLES   AGE     VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION     CONTAINER-RUNTIME
aks-agentpool-18877473-vmss000000   Ready    agent   5h40m   v1.23.8   10.240.0.4     <none>        Ubuntu 18.04.6 LTS               5.4.0-1085-azure   containerd://1.5.11+azure-2
akspoolws000000                     Ready    agent   3h15m   v1.23.8   10.240.0.208   <none>        Windows Server 2022 Datacenter   10.0.20348.825     containerd://1.6.6+azure
akspoolws000001                     Ready    agent   3h17m   v1.23.8   10.240.0.239   <none>        Windows Server 2022 Datacenter   10.0.20348.825     containerd://1.6.6+azure
akspoolws000002                     Ready    agent   3h17m   v1.23.8   10.240.1.14    <none>        Windows Server 2022 Datacenter   10.0.20348.825     containerd://1.6.6+azure
akswspool000000                     Ready    agent   5h37m   v1.23.8   10.240.0.115   <none>        Windows Server 2019 Datacenter   10.0.17763.3165    containerd://1.6.6+azure
akswspool000001                     Ready    agent   5h37m   v1.23.8   10.240.0.146   <none>        Windows Server 2019 Datacenter   10.0.17763.3165    containerd://1.6.6+azure
akswspool000002                     Ready    agent   5h37m   v1.23.8   10.240.0.177   <none>        Windows Server 2019 Datacenter   10.0.17763.3165    containerd://1.6.6+azure

When checking on one of the nodes, I get:

PS C:\> kubectl describe node akspoolws000000
Name:               akspoolws000000
Roles:              agent
Labels:             agentpool=poolws
                    beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=Standard_D2s_v3
                    beta.kubernetes.io/os=windows
                    failure-domain.beta.kubernetes.io/region=westus
                    failure-domain.beta.kubernetes.io/zone=0
                    kubernetes.azure.com/agentpool=poolws
                    kubernetes.azure.com/cluster=MC_AKSUpgrade_viniap-cluster_westus
                    kubernetes.azure.com/kubelet-identity-client-id=2cf7d300-53a6-4adf-a6d9-ba841a39c0c5
                    kubernetes.azure.com/mode=user
                    kubernetes.azure.com/node-image-version=AKSWindows-2022-containerd-20348.825.220713
                    kubernetes.azure.com/os-sku=Windows2022
                    kubernetes.azure.com/role=agent
                    kubernetes.azure.com/storageprofile=managed
                    kubernetes.azure.com/storagetier=Premium_LRS
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=akspoolws000000
                    kubernetes.io/os=windows
                    kubernetes.io/role=agent
                    node-role.kubernetes.io/agent=
                    node.kubernetes.io/instance-type=Standard_D2s_v3
                    node.kubernetes.io/windows-build=10.0.20348
                    storageprofile=managed
                    storagetier=Premium_LRS
                    topology.disk.csi.azure.com/zone=
                    topology.kubernetes.io/region=westus
                    topology.kubernetes.io/zone=0
<redacted>

Notice in the output above that the label traditionally used for the node selector is there: kubernetes.io/os. However, there are two other labels that better identify the nodes on which we need to schedule our existing pods: kubernetes.azure.com/os-sku and node.kubernetes.io/windows-build. The first one is specific to AKS, while the second is a more generic Kubernetes label.
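
As a minimal sketch of the more generic option, the node selector below targets node.kubernetes.io/windows-build instead of the AKS-specific label. The value 10.0.20348 is taken from the node description above and is quoted so that it is always read as a string. Kubernetes requires all entries under nodeSelector to match, so keeping kubernetes.io/os alongside it is optional. Treat this as an illustration, not as the recommended configuration:

```yaml
# Fragment of a pod template spec (illustrative only)
nodeSelector:
  "kubernetes.io/os": windows
  # Build number of Windows Server 2022, as reported by the node label
  "node.kubernetes.io/windows-build": "10.0.20348"
```

This variant should also work on Kubernetes clusters outside AKS, where the kubernetes.azure.com/os-sku label is not available.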

Changing your node selector will ensure your pods are scheduled on the right nodes. Here's what I have in my environment before moving the application:

PS C:\> kubectl get pods -o wide
NAME                          READY   STATUS              RESTARTS   AGE   IP             NODE              NOMINATED NODE   READINESS GATES
iis-sample-845488fbcc-67s4f   1/1     Running             0          5h    10.240.0.119   akswspool000002   <none>           <none>
iis-sample-845488fbcc-fp9xv   1/1     Running             0          5h    10.240.0.118   akswspool000000   <none>           <none>
iis-sample-845488fbcc-vnsvz   1/1     Running             0          5h    10.240.0.120   akswspool000001   <none>           <none>

All 3 replicas that host my application are running on Windows Server 2019 nodes, because they were running before I deployed the 2022 node pool. Now that I have another node pool, I'll change the YAML file that deploys my application to use:

      nodeSelector:
        "kubernetes.azure.com/os-sku": Windows2022

Then I can run the following to update my application deployment:

PS C:\> kubectl apply -f .\IISSample.yaml
deployment.apps/iis-sample configured
service/iis-sample unchanged
PS C:\> kubectl get pods -o wide
NAME                          READY   STATUS    RESTARTS   AGE     IP             NODE              NOMINATED NODE   READINESS GATES
iis-sample-7794bfcc4c-k62cq   1/1     Running   0          2m49s   10.240.0.238   akspoolws000000   <none>           <none>
iis-sample-7794bfcc4c-rswq9   1/1     Running   0          2m49s   10.240.1.10    akspoolws000001   <none>           <none>
iis-sample-7794bfcc4c-sh78c   1/1     Running   0          2m49s   10.240.0.228   akspoolws000000   <none>           <none>

Notice that none of the pods are now running on the previous node pool with Windows Server 2019. The previous pods were terminated, and new ones are running on the right OS version. Also, notice that I changed the container image used by the deployment from the 2019 version to the 2022 version. If I try to deploy the same container image running the 2019 version to a 2022 host, the deployment will fail, as both host and container versions need to match. Finally, no changes were made to the service (Load Balancer) that serves my application, other than new endpoints, which are configured automatically for me by the Kubernetes service.
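
To tie the pieces together, here is a sketch of what a manifest like IISSample.yaml could look like after the change. The original file is not shown in this post, so everything other than the iis-sample name, the three replicas, the Load Balancer service, and the node selector is an assumption, including the app label, the port, and the image tag:

```yaml
# Hypothetical manifest; labels, port, and image are assumptions
apiVersion: apps/v1
kind: Deployment
metadata:
  name: iis-sample
  labels:
    app: iis-sample
spec:
  replicas: 3
  selector:
    matchLabels:
      app: iis-sample
  template:
    metadata:
      labels:
        app: iis-sample
    spec:
      # Schedule only on the Windows Server 2022 node pool
      nodeSelector:
        "kubernetes.azure.com/os-sku": Windows2022
      containers:
        - name: iis-sample
          # Image tag moved from an ltsc2019 base to an ltsc2022 base
          image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2022
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: iis-sample
spec:
  type: LoadBalancer
  selector:
    app: iis-sample
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
```

Only the node selector and the image line differ from a 2019 version of such a file, which is why kubectl reports the deployment as configured and the service as unchanged.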

Active Directory and gMSA impact

When you introduce a new client version to Active Directory, you don't necessarily need to change anything on your Active Directory Domain Controllers. If these were regular Windows or Windows Server clients, you'd probably want to update your schema and GPOs, upgrade the domain and forest functional levels (if they are not on the latest already), and add at least one new Windows Server 2022 Domain Controller to ensure you have the necessary compatibility. However, Windows containers only use the Domain Controller for Kerberos authentication, so no changes to the domain or Domain Controllers are necessary.

That said, if you are leveraging Group Managed Service Accounts (gMSA), you will need to update the Managed Identity configuration for the new node pool. To better explain: gMSA uses a secret (user account and password) so the node on which the Windows pod is running can authenticate the container against AD. To access that secret on Azure Key Vault, the node uses a Managed Identity that allows the node to access the resource. Since Managed Identities are configured per node pool, and the pod now resides on a new node pool, you need to update that configuration. Luckily, we have plenty of documentation on how to do that. In addition, the gMSA on AKS PowerShell module can help you identify which node pools have access to which Azure Key Vaults.

Managed Identity concerns

The concern here is the same as explained in the previous section. Any access provided to other Azure resources via Managed Identity needs to be updated to reflect the new node pool. gMSA is simply the most evident case for Windows workloads, but you might be using Managed Identities for other resource access in Azure.

The question here is how you would know what Managed Identities are being used for your AKS cluster against other Azure resources. For that, you can use the documentation on how to view update and sign-in activities for Managed Identities. Granted, you might need to force applications to authenticate or some cluster operations to ensure all attempts are caught and identified, but it's a start.

Conclusion

Upgrading from Windows Server 2019 to 2022 could be a simple process of just updating your Dockerfile and YAML files, or a more complex one where multiple services need to be thought through. Depending on your environment, you might need to check for Managed Identity changes, Active Directory integration, and more.

Hopefully, this article will help you frame the upgrade process for your scenario. As always, let us know your thoughts in the comments section below.

 

This article was originally published by Microsoft's Core Infrastructure and Security Blog. You can find the original article here.