This article focuses on network capacity planning for AKS clusters. Keep in mind, however, that complete capacity planning must also consider compute and storage resources.
- How many nodes do I need in my AKS cluster?
- How many application pods could run in my cluster?
- Do I need to modify the maximum pods per node parameter? If yes, what should it be?
If you have ever deployed an AKS cluster, you have probably asked yourself these questions, found some answers, and eventually decided to go with the cluster default settings, planning to make changes down the road as you gain visibility into the workloads running on your clusters or start facing issues.
I know this strategy speaks to most of us.
The main issue with this approach, which we all know is not the best, is that some configuration settings directly or indirectly linked to your AKS clusters cannot be changed once set. These include, but are not limited to, the AKS subnet IP range and the max pods per node value.
Going through proper capacity planning gives architects, IT and application operators answers to most of the resource capacity questions that arise at some point during the scoping, design or implementation phase of any project, big or small.
This will be achieved by taking into consideration the impact of your AKS cluster design as well as the technical requirements and limitations of the service.
The process consists of the following steps, to be completed in this order since the output of each step is used as an input to the next one:
- AKS network plugin
- AKS subnet size
- AKS cluster size
- AKS cluster applications pods capacity
Let’s now go over them one by one with some examples.
- Network plugins
The first networking decision you have to make for Kubernetes clusters in general is the choice of the network plugin, which is used on all the nodes for communication within and between pods and nodes.
AKS gives you 2 options: Kubenet (also referred to as ‘Basic’ in the Azure portal) and Azure CNI (also referred to as ‘Advanced’ in the Azure portal).
This is one of the settings that cannot be changed once configured, unless the entire cluster is recreated.
Network plugins are worth their own article, so I will not dive too deep into the topic. I just want to give you some insights to help you decide which one to choose.
Kubenet is considered a basic network plugin for Kubernetes and is implemented on Linux machines only.
It offers limited functionality and, due to its design, routing rules must be implemented on top of it to enable communication between cluster nodes, which increases the complexity of cluster network management. See link below for more details:
It has one major advantage though: it saves IP addresses by having each cluster node run its own NAT-ed /24 CIDR subnet for its pods. This network plugin is fully supported.
However, to better understand the impact of Kubenet’s limitations, I have listed some of them below (as of this writing). This is not an exhaustive list.
- Kubenet does not support:
  - Azure Virtual Nodes
  - Azure Network Policies
  - Windows node pools
  - Shared AKS cluster subnets (having multiple AKS clusters running in the same subnet)
- Max nodes per cluster with Kubenet is 400 since UDRs do not support more than 400 routes
- Cannot query DNS Private Zones from a Pod in Kubenet
Moreover, most new AKS features are first developed for Azure CNI and then, when technically compatible, adapted to Kubenet. Below are some examples of those features:
- Application Gateway Ingress Controller
- Allocating a separate subnet per node pool
- Outbound type of userDefinedRouting (AKS does not automatically provision a public IP address for the Standard Load Balancer frontend)
For the rest of this article, we will choose the Azure CNI network plugin. The first thing I would recommend to anyone designing an AKS cluster with Azure CNI is to read the following documentation:
- AKS subnet size
The first implication of choosing Azure CNI as the network plugin for your AKS clusters is that all user* and system** pod IPs are reserved upfront in your AKS subnet. The number of reserved IPs is the max pods per node value multiplied by the number of nodes in your cluster at any time (example below).
*user pods are any pods that your business/infrastructure applications are composed of.
**system pods are any pods required by Kubernetes/AKS to function properly, such as CoreDNS and tunnelfront.
See the link below for the notion of user and system node pools:
For instance, if you set the number of nodes in your cluster to 3 and the max pods per node to 50 (the default value is 30), Azure CNI will reserve 3 x 50 = 150 IPs in your AKS subnet regardless of the actual consumption of IP addresses by user or system pods inside your cluster.
Cluster deployment fails if not all IPs can be reserved.
The same applies during the upgrade of your AKS node pools, where a temporary additional node is created before the cordon and drain process of any existing node starts. If there are not enough IPs available in your AKS subnet, the upgrade fails right away (a pre-check is performed).
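The reservation rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic described in this article; the function names are hypothetical and are not part of any Azure SDK or CLI.

```python
def reserved_pod_ips(nodes: int, max_pods_per_node: int = 30) -> int:
    """Pod IPs reserved upfront by Azure CNI, whether or not pods use them."""
    return nodes * max_pods_per_node


def upgrade_precheck(free_ips: int, max_pods_per_node: int = 30) -> bool:
    """True if the subnet can take the temporary upgrade node:
    1 IP for its network interface plus its reserved pod IPs."""
    return free_ips >= 1 + max_pods_per_node


print(reserved_pod_ips(nodes=3, max_pods_per_node=50))      # 150
print(upgrade_precheck(free_ips=40, max_pods_per_node=50))  # False
```

With a max pods per node of 50, the temporary node needs 51 free IPs, so a subnet with only 40 free IPs would not pass the check in this sketch.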
Now let’s go over a simple use case to learn how to properly size your AKS subnet and cluster.
What is the minimum size required for a dedicated AKS subnet to fit a single-node AKS cluster with a max pods per node of 30 (default), using Azure CNI as the network plugin?
Let’s do the math!
- 5 IPs reserved by Azure VNet within each subnet (https://docs.microsoft.com/en-us/azure/virtual-network/virtual-networks-faq)
  - x.x.x.0: Network address
  - x.x.x.1: Reserved by Azure for the default gateway
  - x.x.x.2, x.x.x.3: Reserved by Azure to map the Azure DNS IPs to the VNet space
  - x.x.x.255: Network broadcast address
- 1 IP assigned to the network interface of the single node in your cluster
- 30 IPs reserved by Azure CNI, which correspond to the max pods per node
  - x.x.x.5 – x.x.x.34 (1 node)
The total number of required IPs is the sum of the above: 5 + 1 + 30 = 36.
At this point you already know that a /27 subnet with 32 IPs is not enough, so you will need at least a /26 subnet with 64 IPs. Is that it? Almost…
Remember that the AKS cluster upgrade process creates a temporary additional node which, like any other node, has 1 IP assigned to its network interface and 30 IPs reserved for the max pods per node. Taking it into account, the new total of required IPs is: 36 + 1 + 30 = 67.
Now you know that even a /26 subnet with 64 IPs is not sufficient and at least a /25 subnet with 128 IPs is required.
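The same math can be written as a small Python sketch that returns the smallest subnet prefix for a given cluster. The function names and the `surge_nodes` parameter (the temporary upgrade node) are illustrative assumptions, and the sketch only covers the terms counted in this article.

```python
AZURE_RESERVED_IPS = 5  # reserved by Azure VNet in every subnet


def required_ips(nodes: int, max_pods: int = 30, surge_nodes: int = 1) -> int:
    """Reserved IPs + one IP per node + max_pods pod IPs per node."""
    total_nodes = nodes + surge_nodes
    return AZURE_RESERVED_IPS + total_nodes + total_nodes * max_pods


def smallest_prefix(ip_count: int) -> int:
    """Longest IPv4 prefix whose total address count is >= ip_count."""
    host_bits = (ip_count - 1).bit_length()
    return 32 - host_bits


without_upgrade = required_ips(nodes=1, surge_nodes=0)
with_upgrade = required_ips(nodes=1)
print(without_upgrade, f"/{smallest_prefix(without_upgrade)}")  # 36 /26
print(with_upgrade, f"/{smallest_prefix(with_upgrade)}")        # 67 /25
```

Using integer bit lengths rather than floating-point logarithms keeps the result exact when the IP count is itself a power of two.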
- AKS cluster size
Following along with the previous use case, a /25 subnet with a max pods per node of 30 can hold a 2-node cluster but not a 3-node one. The reasoning below will help you determine the maximum number of nodes in your clusters for each network size:
+1 corresponds to the additional node brought in during AKS node pool upgrades.
- 2-node cluster: 5 + (2+1) + (30 x (2+1)) = 98, which is less than or equal to 128 (/25) -> OK
- 3-node cluster: 5 + (3+1) + (30 x (3+1)) = 129, which is greater than 128 (/25) -> not OK
Now, as you might have figured out already, if you lower the max pods per node, you will be able to fit more nodes into your AKS subnet.
For instance, a max pods per node of 29 instead of 30 (the default) would allow you to fit 3 nodes in a /25 subnet:
- 3-node cluster: 5 + (3+1) + (29 x (3+1)) = 125, which is less than 128 (/25) -> OK
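Solving the inequality 5 + (n+1) + max_pods x (n+1) <= subnet size for n gives the maximum node count directly. Here is a minimal Python sketch of that rearrangement; the function name and parameters are hypothetical and chosen for illustration only.

```python
def max_nodes(prefix: int, max_pods: int = 30, surge_nodes: int = 1) -> int:
    """Largest node count n such that
    5 + (n + surge_nodes) * (1 + max_pods) <= number of IPs in the subnet."""
    subnet_ips = 2 ** (32 - prefix)
    usable_ips = subnet_ips - 5          # 5 IPs are reserved by Azure
    ips_per_node = 1 + max_pods          # node interface + reserved pod IPs
    return max(usable_ips // ips_per_node - surge_nodes, 0)


print(max_nodes(25, max_pods=30))  # 2
print(max_nodes(25, max_pods=29))  # 3
print(max_nodes(26, max_pods=30))  # 0
```

The last line matches the earlier finding that a /26 subnet cannot hold even a single-node cluster once the temporary upgrade node is counted.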
You could make use of the following very simple Excel table to help you with your capacity planning.
An additional reserved IP has been added in column B which represents the private endpoint of a private AKS cluster.
Highlighted in yellow are the formulas used in the indicated columns with the red arrows (C and E), the other columns hold static values.
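If you prefer a script to a spreadsheet, a comparable table can be generated with a short Python sketch. The column layout here is an assumption and does not reproduce the exact columns of the Excel table; the default of 6 reserved IPs corresponds to the 5 Azure-reserved addresses plus the private endpoint of a private AKS cluster mentioned above.

```python
def capacity_table(prefixes, max_pods=30, reserved_ips=6, surge_nodes=1):
    """Return (prefix, subnet IPs, max nodes) rows for the given prefixes.

    reserved_ips=6 assumes 5 Azure-reserved IPs + 1 private endpoint IP;
    use 5 for a cluster without a private endpoint.
    """
    rows = []
    for prefix in prefixes:
        subnet_ips = 2 ** (32 - prefix)
        # Each node consumes 1 IP for itself plus max_pods reserved pod IPs.
        fitting = (subnet_ips - reserved_ips) // (max_pods + 1)
        rows.append((prefix, subnet_ips, max(fitting - surge_nodes, 0)))
    return rows


print(f"{'subnet':<8}{'IPs':>6}{'max nodes':>11}")
for prefix, subnet_ips, nodes in capacity_table(range(27, 21, -1)):
    print(f"/{prefix:<7}{subnet_ips:>6}{nodes:>11}")
```

Changing `max_pods` or `reserved_ips` lets you compare scenarios quickly before committing to a subnet size.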
If you are looking to fit more nodes in your cluster, keep an eye on the size/SKU of your nodes/VMs: you might not need as much compute on individual nodes, since you can achieve the same total amount of memory and CPU with more nodes of a smaller size. The Kubernetes scheduler will then do the job of spreading the workload evenly across all the nodes.
- AKS cluster pods capacity
As seen earlier, in order for pods from your applications (user pods) to get IPs assigned in the cluster, those IPs must be reserved by Azure CNI at the creation of each node and must not already be assigned to system pods.
Let’s try to determine how many user pods can be created in a 3-node cluster configured with the default max pods per node (30).
In AKS 1.17.4, running the command ‘kubectl get pods -A -o wide’ right after the deployment of your AKS cluster shows that there are 16 system pods running (no add-ons) using 5 different IP addresses (some pods such as kube-proxy or azure-ip-masq-agent share the same IP as the node they are hosted on, hence the difference).
In the end, 3 nodes x 30 max pods per node – 16 system pods = 74 user pods can be created.
N.B: keep in mind the existence of Kubernetes system daemonSet resources to understand the spread of the system pods across all the nodes of the cluster.
It does not matter here that more IPs are reserved by Azure CNI and available in the cluster (3 x 30 – 5 = 85), since the max pods per node is a hard limit that cannot be exceeded.
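As a quick check of the numbers above, here is a minimal Python sketch. The defaults of 16 system pods and 5 system pod IPs are the values observed in this article on AKS 1.17.4 without add-ons and will differ on other versions or configurations; the function names are illustrative.

```python
def user_pod_capacity(nodes: int, max_pods: int = 30, system_pods: int = 16) -> int:
    """Pod slots left for user pods: every pod, system or user,
    counts against the max pods per node limit."""
    return nodes * max_pods - system_pods


def unused_reserved_ips(nodes: int, max_pods: int = 30, system_pod_ips: int = 5) -> int:
    """Reserved pod IPs not consumed by system pods."""
    return nodes * max_pods - system_pod_ips


print(user_pod_capacity(3))    # 74
print(unused_reserved_ips(3))  # 85
```

The first number is the one that matters: it is a cluster-wide upper bound, and each individual node still cannot run more than its own max pods per node.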
I hope you now feel much more confident in designing and configuring your AKS clusters to meet all your requirements and needs.
Senior Premier Field Engineer, Cloud and Infrastructure, France