Azure Kubernetes Service – Start & Stop Your AKS Cluster on a Schedule Using Azure Automation

Hi everybody, here I am again to show you a possible way to start and stop your AKS cluster on a schedule.

This can be important if you're aiming to save money and are in the middle of a Microsoft Azure Well-Architected Cost Optimization review. Say, for example, that you have a dev environment whose resources don't need to be up & running during the night or outside of normal working hours.

Helping customers save money or, even better, spend it wisely is part of the mission we are all called to. If you help customers save money, they will be more inclined to invest those savings in other Azure services or in additional resources. So, in the end, it's not a bad idea but a great example of customer care. I came across this scenario during a customer engagement and, since I am not an AKS (or container) expert, I asked my colleague Michele Ferracin for some help. So credit goes to Michele :smile:

So far, the Azure portal does not offer a way to start or stop your AKS cluster on a schedule from the service blade:

BrunoGabrielli_0-1627375454096.png

But the goal can be reached either by using the Azure CLI, as documented in the Stop and Start an Azure Kubernetes Service (AKS) cluster page, or by calling the corresponding REST APIs.
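For the Azure CLI route, the documented commands are az aks stop and az aks start, each taking the cluster name and its resource group. As a minimal sketch, the helper below builds the argument list for either operation so it can be passed to az from PowerShell. The function name and the sample cluster and resource group names are hypothetical, introduced only for illustration.

```powershell
# Hypothetical helper: builds the arguments for "az aks start" or "az aks stop".
function New-AksCliArguments {
    param(
        [Parameter(Mandatory = $true)]
        [ValidateSet('Start', 'Stop')]
        [string]$Operation,

        [Parameter(Mandatory = $true)]
        [string]$ClusterName,

        [Parameter(Mandatory = $true)]
        [string]$ResourceGroupName
    )

    # The CLI sub-command is lowercase: "start" or "stop".
    return @('aks', $Operation.ToLower(), '--name', $ClusterName, '--resource-group', $ResourceGroupName)
}

# Sample values (assumptions): replace with your own cluster and resource group.
$azArguments = New-AksCliArguments -Operation Stop -ClusterName 'myAKSCluster' -ResourceGroupName 'myResourceGroup'

# Show the command that would be run.
"az $($azArguments -join ' ')"

# Uncomment to actually run it (requires the Azure CLI and a prior "az login"):
# az @azArguments
```

Keeping the real call commented out lets you review the generated command before anything touches the cluster.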

Of course, performing this kind of operation on a cluster has some limitations, which are well explained in the Microsoft documentation and briefly reported below:

BrunoGabrielli_1-1627375454101.png

But the question remains: how can you get this done? You can take advantage of the great integration offered by Azure. Azure Automation, in this case, is your friend and, since at the time of writing there are no PowerShell modules or cmdlets available for this purpose, we are forced to work with the REST APIs. I proposed this solution to a customer who was asking exactly this question: how can I stop the AKS cluster in my dev environment at night to save money?

Given that, what should you do to put this solution in place? All you need is to create a new Automation runbook in a new or existing Automation account. The technical prerequisite is that the Az.Accounts module must be added to the Modules shared resource in your Automation account:

BrunoGabrielli_2-1627375454102.png

Assuming that you are all set (the AKS cluster is in place, the authentication mechanism is working perfectly, and your permissions are set), you can import the PowerShell code below into a new runbook and schedule it as required.


<#
.SYNOPSIS
This sample runbook is designed to manage the start and stop of AKS clusters on a given schedule.

.DESCRIPTION
This sample runbook is designed to manage the start and stop of AKS clusters on a given schedule. You need to provide some parameters. This runbook also requires
the following modules to be imported in the Modules section of the Automation Account in the Azure portal:

- Az.Accounts

.PARAMETER aksClusterName
This REQUIRED string parameter represents the name of the cluster on which to perform the requested operation.

.PARAMETER resourceGroupName
This REQUIRED string parameter represents the resource group that contains the AKS cluster in question.

.PARAMETER operation
This REQUIRED string parameter represents the operation to be performed on the AKS cluster. It can only contain 2 values: Start or Stop

.EXAMPLE
.\StartStop-AKS-Cluster

.NOTES
AUTHOR: Bruno Gabrielli
LASTEDIT: July 12th, 2021
CHANGELOG:

VERSION: 1.1
- Added support to the initial code for AzureRunAsConnection authentication
- Fixed the RestAPI call to work
- Added verification of the current AKS cluster state before performing the requested operation

VERSION: 1.0
- Initial version
#>

Param(
    [Parameter(Mandatory=$true,
        ValueFromPipelineByPropertyName=$false,
        HelpMessage='Specify the AKS cluster name.',
        Position=1)]
    [String]
    $aksClusterName,

    [Parameter(Mandatory=$true,
        ValueFromPipelineByPropertyName=$false,
        HelpMessage='Specify the name of the resource group containing the AKS cluster.',
        Position=2)]
    [String]
    $resourceGroupName,

    [Parameter(Mandatory=$true,
        ValueFromPipelineByPropertyName=$false,
        HelpMessage='Specify the operation to be performed on the AKS cluster (Start/Stop).',
        Position=3)]
    [ValidateSet('Start','Stop')]
    [String]
    $operation
)

#Initializing connection to the Automation account
[String]$connectionName = "AzureRunAsConnection"
try
{
    #Get the connection "AzureRunAsConnection"
    $servicePrincipalConnection = Get-AutomationConnection -Name $connectionName

    "Logging in to Azure..."
    Connect-AzAccount `
        -ServicePrincipal `
        -Tenant $servicePrincipalConnection.TenantId `
        -ApplicationId $servicePrincipalConnection.ApplicationId `
        -CertificateThumbprint $servicePrincipalConnection.CertificateThumbprint

    "Setting context to a specific subscription"
    Set-AzContext -Subscription $servicePrincipalConnection.SubscriptionId

    #Start/Stop cluster
    #az aks $operation --name $aksClusterName --resource-group $resourceGroupName
    #POST https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.ContainerService/managedClusters/{resourceName}/stop?api-version=2021-05-01

    #Setting REST API Authentication token
    $accToken = Get-AzAccessToken | Select-Object -Property Token
    $AccessToken = $accToken.Token
    $headers_Auth = @{'Authorization'="Bearer $AccessToken"}

    #Setting GET RestAPI Uri
    $getRestUri = "https://management.azure.com/subscriptions/$($servicePrincipalConnection.SubscriptionId)/resourceGroups/$resourceGroupName/providers/Microsoft.ContainerService/managedClusters/$($aksClusterName)?api-version=2021-05-01"

    #Setting POST RestAPI Uri
    $postRestUri = "https://management.azure.com/subscriptions/$($servicePrincipalConnection.SubscriptionId)/resourceGroups/$resourceGroupName/providers/Microsoft.ContainerService/managedClusters/$aksClusterName/$($operation.ToLower())?api-version=2021-05-01"

    #Tracks whether the cluster was already in the requested state (no REST call needed)
    $noActionNeeded = $false

    try
    {
        #Getting the cluster state
        Write-Output "Invoking RestAPI method to get the cluster state. The request Uri is ==$getRestUri==."
        $getResponse = Invoke-WebRequest -UseBasicParsing -Method Get -Headers $headers_Auth -Uri $getRestUri
        $getResponseJson = $getResponse.Content | ConvertFrom-Json
        $clusterState = $getResponseJson.properties.powerState.code
        Write-Output "AKS Cluster ==$aksClusterName== is currently ==$clusterState=="

        #Checking if the requested operation can be performed based on the current state
        Switch ($operation)
        {
            "Start"
            {
                If ($clusterState -eq "Running")
                {
                    Write-Output "The AKS Cluster ==$aksClusterName== is already ==$clusterState== and cannot be started again."
                    $noActionNeeded = $true
                }
                else
                {
                    Write-Output "Invoking RestAPI method to perform the requested ==$operation== operation on AKS Cluster ==$aksClusterName==. The request Uri is ==$postRestUri==."
                    $postResponse = Invoke-WebRequest -UseBasicParsing -Method Post -Headers $headers_Auth -Uri $postRestUri
                    $StatusCode = $postResponse.StatusCode
                }
            }

            "Stop"
            {
                If ($clusterState -eq "Stopped")
                {
                    Write-Output "The AKS Cluster ==$aksClusterName== is already ==$clusterState== and cannot be stopped again."
                    $noActionNeeded = $true
                }
                else
                {
                    Write-Output "Invoking RestAPI method to perform the requested ==$operation== operation on AKS Cluster ==$aksClusterName==. The request Uri is ==$postRestUri==."
                    $postResponse = Invoke-WebRequest -UseBasicParsing -Method Post -Headers $headers_Auth -Uri $postRestUri
                    $StatusCode = $postResponse.StatusCode
                }
            }

            Default
            {
                Write-Output "Unexpected scenario. The requested operation ==$operation== was not matching any of the managed cases."
            }
        }
    }
    catch
    {
        $StatusCode = $_.Exception.Response.StatusCode.value__
        $exMsg = $_.Exception.Message
        Write-Output "Response Code == $StatusCode"
        Write-Output "Exception Message == $exMsg"
    }

    if (($StatusCode -ge 200) -and ($StatusCode -lt 300))
    {
        Write-Output "The ==$operation== operation on AKS Cluster ==$aksClusterName== has been completed successfully."
    }
    elseif (-not $noActionNeeded)
    {
        #Report a failure only when a request was actually attempted
        Write-Output "The ==$operation== operation on AKS Cluster ==$aksClusterName== was not completed successfully."
    }

}
catch
{
    if (!$servicePrincipalConnection)
    {
        $ErrorMessage = "Connection $connectionName not found."
        throw $ErrorMessage
    }
    else
    {
        Write-Error -Message $_.Exception
        throw $_.Exception
    }
}

As per the parameter section in the script, it requires some inputs in order to run. When you run the runbook (on demand or on a schedule), you'll need to enter the following information:

BrunoGabrielli_0-1627379014612.png
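If you prefer to create the schedule from PowerShell rather than from the portal, the Az.Automation module provides New-AzAutomationSchedule and Register-AzAutomationScheduledRunbook. The sketch below is only an illustration: every name in it (Automation account, resource group, runbook, schedule, cluster) is a placeholder assumption, and it presumes that the Az.Automation module is installed and that you are already signed in with Connect-AzAccount.

```powershell
# All names below are placeholders (assumptions): replace them with your own.
$automationAccount = @{
    ResourceGroupName     = 'myAutomationResourceGroup'
    AutomationAccountName = 'myAutomationAccount'
}

# Weekly schedule: stop the cluster at 19:00 on working days, starting tomorrow.
$stopSchedule = @{
    Name         = 'Stop-AKS-Weekdays'
    StartTime    = (Get-Date).Date.AddDays(1).AddHours(19)
    WeekInterval = 1
    DaysOfWeek   = 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'
}

# Values passed to the runbook's Param() block on every scheduled run.
$runbookParameters = @{
    aksClusterName    = 'myAKSCluster'
    resourceGroupName = 'myResourceGroup'
    operation         = 'Stop'
}

function Register-AksStopSchedule {
    # Create the schedule, then link it to the runbook together with its parameters.
    New-AzAutomationSchedule @automationAccount @stopSchedule | Out-Null
    Register-AzAutomationScheduledRunbook @automationAccount `
        -RunbookName 'StartStop-AKS-Cluster' `
        -ScheduleName $stopSchedule.Name `
        -Parameters $runbookParameters
}

# Call Register-AksStopSchedule once you have replaced the placeholders.
```

A second schedule with operation set to Start (for example early in the morning) would bring the cluster back before working hours.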

The runbook will first check if the requested operation on the given cluster can be performed. For instance, if you requested to stop the cluster and the cluster is already stopped, the runbook produces some log entries similar to those below:

BrunoGabrielli_4-1627375454118.png

If the cluster was in the Stopped state and your request is to start it, then the runbook will go ahead and you will see logs similar to the screenshot below:

BrunoGabrielli_5-1627375454122.png
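The check described above boils down to comparing the cluster's current power state with the state the requested operation would produce. Here is a minimal sketch of that decision, kept separate from the runbook. The function name Test-AksOperationNeeded is hypothetical; the state values Running and Stopped are the ones the script compares against.

```powershell
function Test-AksOperationNeeded {
    param(
        [Parameter(Mandatory = $true)]
        [ValidateSet('Start', 'Stop')]
        [string]$Operation,

        [Parameter(Mandatory = $true)]
        [string]$ClusterState
    )

    # State the cluster is expected to reach after each operation.
    $targetState = @{ Start = 'Running'; Stop = 'Stopped' }

    # Act only if the cluster is not already in the target state.
    return ($ClusterState -ne $targetState[$Operation])
}

Test-AksOperationNeeded -Operation Stop  -ClusterState 'Stopped'   # False: nothing to do
Test-AksOperationNeeded -Operation Start -ClusterState 'Stopped'   # True: go ahead and start
```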

As I have done in all of my posts, I strongly recommend that you TEST, TEST, TEST before using it in production.

Thanks for reading, as always :stareyes:

Disclaimer

The samples are not supported under any Microsoft standard support program or service. The samples are provided AS IS without warranty of any kind. Microsoft further disclaims all implied warranties including, without limitation, any implied warranties of merchantability or of fitness for a particular purpose. The entire risk arising out of the use or performance of the samples and documentation remains with you. In no event shall Microsoft, its authors, or anyone else involved in the creation, production, or delivery of the scripts be liable for any damages whatsoever (including, without limitation, damages for loss of business profits, business interruption, loss of business information, or other pecuniary loss) arising out of the use of or inability to use the sample scripts or documentation, even if Microsoft has been advised of the possibility of such damages.

This article was originally published by Microsoft’s Server Storage at Microsoft Blog. You can find the original article here.