Setting up Sentinel for Kubernetes Monitoring

A guide to using Microsoft Sentinel for monitoring the security of your containerized applications and orchestration platforms.

Part 3 of a 3-part series on security monitoring of your Kubernetes clusters and CI/CD pipelines, by @singhabhi and @Umesh_Nagdev, Security GBB

Introduction 

In part 1 and part 2 of this series, we discussed the types of log sources to consider when monitoring the security of your Kubernetes environment, the most pertinent risks (and corresponding use cases) in your AKS environment, and how to ingest the log data. This post demonstrates how to configure Microsoft Sentinel to identify those risks.

More specifically we will show: 

  1. Mapping of container security risks with Microsoft Defender for Cloud  
  2. Data connectors to ingest AKS data 
  3. Container Security Workbooks in Sentinel 
  4. Search queries to mine specific log tables for more pressing risks

How Microsoft Defender for Cloud addresses container risks 

Microsoft Defender for Containers is a security solution designed specifically for containerized environments. It provides several key capabilities to enhance container security:

  • Real-time protection: Defender for Containers offers real-time protection by continuously monitoring container activities, network traffic, and system events. It uses machine learning algorithms and heuristics to detect and respond to potential threats promptly. 
  • Vulnerability management: The solution helps identify vulnerabilities in container images and runtime environments. It can scan container images for known vulnerabilities, misconfigurations, and insecure dependencies, allowing organizations to address these issues before deploying containers into production.
  • Malware detection: Defender for Containers can detect and block malicious code, malware, and suspicious activities within containers. It leverages signature-based detection, behavioral analysis, and sandboxing techniques to identify and mitigate threats. 
  • File integrity monitoring: The solution includes file integrity monitoring capabilities to detect unauthorized changes or tampering within containerized environments. It monitors critical system files, configuration files, and application binaries for any suspicious modifications.
  • Network security: Defender for Containers helps secure container network traffic by enforcing access controls, segmenting network traffic, and detecting anomalies or suspicious network behavior. It can detect and block malicious network activities, such as port scanning, denial-of-service attacks, and lateral movement attempts.
  • Integration with Microsoft Defender for Cloud (formerly Azure Security Center): Defender for Containers is part of Microsoft Defender for Cloud, which provides centralized visibility, monitoring, and management of container security across hybrid and multi-cloud environments. It leverages Azure's cloud-native security capabilities and threat intelligence to enhance container security posture.
  • Incident response and remediation: In case of security incidents or suspicious activities, Defender for Containers provides tools for incident investigation, threat hunting, and automated remediation actions. It helps security teams quickly respond to security incidents and mitigate potential risks.

There are several alerts https://learn.microsoft.com/en-us/azure/defender-for-cloud/alerts-reference#alerts-for-containers—… that you get out of the box.

Additionally, you can secure the Kubernetes Control Plane using Azure Policy (https://learn.microsoft.com/en-us/azure/aks/use-azure-policy).

This resource https://techcommunity.microsoft.com/t5/microsoft-defender-for-cloud/leveraging-defender-for-containe… provides additional background on how Azure Policies work to protect your Kubernetes Control Plane.

Several out-of-the-box policies can get you started quickly: https://learn.microsoft.com/en-us/azure/aks/policy-reference

The table below maps container risks to MDC alerts and Azure Policies. It shows that Defender for Containers covers many of the risks we discussed in part 2 of this series.

Each entry below gives the risk number (#) and Kubernetes Risk, followed by its Risk Area, MDC Alert(s), and Azure Policy(s).

1. Detect unauthorized or suspicious pods running in the cluster.
   Risk Area: Pod Security Monitoring
   MDC Alert(s): Attempt to create a new Linux namespace from a container detected; Anomalous pod deployment (Preview); Anomalous secret access (Preview); Behavior similar to common Linux bots detected (Preview); Command within a container running with high privileges; Container running in privileged mode; Container with a sensitive volume mount detected
   Azure Policy(s): Kubernetes cluster containers should only pull images when image pull secrets are present; Kubernetes cluster containers CPU and memory resource limits should not exceed the specified limits; Kubernetes cluster containers should only use allowed capabilities; Kubernetes cluster containers should only use allowed images; Kubernetes cluster containers should only use allowed pull policy; Kubernetes cluster containers should run with a read only root file system; Kubernetes cluster pod hostPath volumes should only use allowed host paths; Kubernetes cluster pods and containers should only run with approved user and group IDs

2. Monitor for privilege escalation attempts within pods.
   Risk Area: Pod Security Monitoring
   MDC Alert(s): Attempt to create a new Linux namespace from a container detected; Anomalous secret access (Preview); Command within a container running with high privileges; Container running in privileged mode; Container with a sensitive volume mount detected; Detected file download from a known malicious source; Excessive role permissions assigned in Kubernetes cluster (Preview); Privileged container detected
   Azure Policy(s): Kubernetes cluster containers should only pull images when image pull secrets are present; Kubernetes cluster containers CPU and memory resource limits should not exceed the specified limits; Kubernetes cluster containers should only use allowed capabilities; Kubernetes cluster containers should only use allowed images; Kubernetes cluster containers should only use allowed pull policy; Kubernetes cluster containers should run with a read only root file system; Kubernetes cluster pod hostPath volumes should only use allowed host paths; Kubernetes cluster pods and containers should only run with approved user and group IDs

3. Track and alert on changes to pod security policies.
   Risk Area: Pod Security Monitoring
   MDC Alert(s): None
   Azure Policy(s): None

4. Identify and alert on unexpected network traffic patterns.
   Risk Area: Network Security Monitoring
   MDC Alert(s): An uncommon connection attempt detected; CoreDNS modification in Kubernetes detected; K8S API requests from proxy IP address detected; Potential reverse shell detected; Potential port forwarding to external IP address; Suspicious use of DNS over HTTPS
   Azure Policy(s): None

5. Monitor for unauthorized ingress and egress traffic.
   Risk Area: Network Security Monitoring
   MDC Alert(s): An uncommon connection attempt detected; CoreDNS modification in Kubernetes detected; K8S API requests from proxy IP address detected; Potential reverse shell detected; Potential port forwarding to external IP address; Suspicious use of DNS over HTTPS
   Azure Policy(s): None

6. Detect and investigate potential denial-of-service (DoS) attacks.
   Risk Area: Network Security Monitoring
   MDC Alert(s): Indicators associated with DDOS toolkit detected
   Azure Policy(s): None

7. Scan container images for vulnerabilities before deployment.
   Risk Area: Container Image Security
   MDC Alert(s): None (Gating when released); Scanning is automatically done as part of ACR
   Azure Policy(s): Running container images should have vulnerability findings resolved; Running container images should have vulnerability findings resolved (powered by Microsoft Defender Vulnerability Management)

8. Monitor for unauthorized or unsigned images.
   Risk Area: Container Image Security
   MDC Alert(s): None
   Azure Policy(s): Kubernetes cluster containers should only use allowed images; Kubernetes clusters should only use images signed by notation; [Preview]: Deploy Image Integrity on Azure Kubernetes Service

9. Track changes to container image repositories.
   Risk Area: Container Image Security
   MDC Alert(s): None
   Azure Policy(s): Configure container registries to disable anonymous authentication; Configure container registries to disable ARM audience token authentication; Configure container registries to disable local admin account; Container registries should be encrypted with a customer-managed key; Container registries should have anonymous authentication disabled

10. Monitor kubelet logs for signs of compromise or unauthorized access.
   Risk Area: Kubelet Activity Monitoring
   MDC Alert(s): Possible credential access tool detected; Abnormal Kubernetes service account operation detected; Anomalous secret access (Preview); Creation of admission webhook configuration detected
   Azure Policy(s): None

11. Detect abnormal activities related to node management.
   Risk Area: Kubelet Activity Monitoring
   MDC Alert(s): "K8S.NODE_" Alerts
   Azure Policy(s): [Preview]: Cannot Edit Individual Nodes; [Preview]: Kubernetes clusters should restrict creation of given resource type; Azure Kubernetes Clusters should enable Container Storage Interface(CSI); Azure Kubernetes Clusters should enable Key Management Service (KMS); Azure Kubernetes Clusters should use Azure CNI; Azure Kubernetes Service Clusters should enable node os auto-upgrade; Azure Role-Based Access Control (RBAC) should be used on Kubernetes Services; Both operating systems and data disks in Azure Kubernetes Service clusters should be encrypted by customer-managed keys; Configure Node OS Auto upgrade on Azure Kubernetes Cluster; Deploy Azure Policy Add-on to Azure Kubernetes Service clusters; Kubernetes cluster containers CPU and memory resource limits should not exceed the specified limits; Kubernetes clusters should not allow container privilege escalation; Kubernetes clusters should not allow endpoint edit permissions of ClusterRole/system:aggregate-to-edit; Kubernetes clusters should not grant CAP_SYS_ADMIN security capabilities; Kubernetes clusters should use internal load balancers

12. Monitor Kubernetes API server logs for suspicious activities.
   Risk Area: API Server Security
   MDC Alert(s): "K8S_*" Alerts
   Azure Policy(s): None

13. Track and alert on failed authentication attempts.
   Risk Area: API Server Security
   MDC Alert(s): (none listed)
   Azure Policy(s): AKS clusters should be set up to use Authentication; Azure Kubernetes Service: RBAC options in practice; Azure Kubernetes Service Clusters should enable Microsoft Entra ID integration

14. Detect unusual API server request patterns.
   Risk Area: API Server Security
   MDC Alert(s): "K8S_*" Alerts
   Azure Policy(s): None

15. Monitor changes to RBAC policies and roles.
   Risk Area: RBAC (Role-Based Access Control) Monitoring
   MDC Alert(s): Role binding to the cluster-admin role detected
   Azure Policy(s): AKS clusters should be set up to use Authentication; Azure Kubernetes Service: RBAC options in practice; Azure Kubernetes Service Clusters should enable Microsoft Entra ID integration

16. Detect and alert on unauthorized access attempts.
   Risk Area: RBAC (Role-Based Access Control) Monitoring
   MDC Alert(s): Role binding to the cluster-admin role detected
   Azure Policy(s): AKS clusters should be set up to use Authentication; Azure Kubernetes Service: RBAC options in practice; Azure Kubernetes Service Clusters should enable Microsoft Entra ID integration

17. Track role binding changes and escalations.
   Risk Area: RBAC (Role-Based Access Control) Monitoring
   MDC Alert(s): Role binding to the cluster-admin role detected
   Azure Policy(s): Kubernetes clusters should minimize wildcard use in role and cluster role

18. Monitor for unauthorized access to Kubernetes secrets and ConfigMaps.
   Risk Area: Secrets and ConfigMap Access Monitoring
   MDC Alert(s): Anomalous secret access (Preview); Process seen accessing the SSH authorized keys file in an unusual way; Defender for Key Vault also detects unusual access patterns
   Azure Policy(s): None

19. Detect changes to sensitive configuration data.
   Risk Area: Secrets and ConfigMap Access Monitoring
   MDC Alert(s): None; Should be using Key Vault for Sensitive Data; Defender for Key Vault also detects unusual access patterns
   Azure Policy(s): None

20. Track usage patterns of sensitive information.
   Risk Area: Secrets and ConfigMap Access Monitoring
   MDC Alert(s): None; Should be using Key Vault for Sensitive Data; Defender for Key Vault also detects unusual access patterns
   Azure Policy(s): None

21. Enable and monitor Kubernetes audit logs for cluster-wide activities.
   Risk Area: Audit Logging
   MDC Alert(s): A history file has been cleared; Kubernetes events deleted; Possible Log Tampering Activity Detected
   Azure Policy(s): Deploy – Configure diagnostic settings for Azure Kubernetes Service to Log Analytics workspace

22. Correlate audit logs to identify security events and policy violations.
   Risk Area: Audit Logging
   MDC Alert(s): Done natively by Defender for Containers alerts
   Azure Policy(s): Deploy – Configure diagnostic settings for Azure Kubernetes Service to Log Analytics workspace

23. Regularly review audit logs for anomalies and potential threats.
   Risk Area: Audit Logging
   MDC Alert(s): Not applicable
   Azure Policy(s): Not applicable

24. Ensure compliance with security standards and policies.
   Risk Area: Compliance Monitoring
   MDC Alert(s): Defender for CSPM Regulatory Recommendations
   Azure Policy(s): None

25. Monitor for deviations from security best practices.
   Risk Area: Compliance Monitoring
   MDC Alert(s): Defender for CSPM Regulatory Recommendations
   Azure Policy(s): None

26. Generate reports on compliance status and potential risks.
   Risk Area: Compliance Monitoring
   MDC Alert(s): Defender for CSPM Regulatory Recommendations
   Azure Policy(s): None

27. Monitor runtime activities of containers for abnormal behavior.
   Risk Area: Container Runtime Security
   MDC Alert(s): "K8S_*" Alerts
   Azure Policy(s): None

28. Detect and alert on suspicious system calls within containers.
   Risk Area: Container Runtime Security
   MDC Alert(s): "K8S_*" Alerts
   Azure Policy(s): None

29. Integrate with container runtime security tools for enhanced monitoring.
   Risk Area: Container Runtime Security
   MDC Alert(s): "K8S_*" Alerts
   Azure Policy(s): None

30. Develop and test incident response plans for Kubernetes security incidents.
   Risk Area: Incident Response and Forensics
   MDC Alert(s): Not applicable
   Azure Policy(s): Not applicable

31. Monitor for indicators of compromise (IoCs) and initiate investigations.
   Risk Area: Incident Response and Forensics
   MDC Alert(s): Done natively by Defender for Containers alerts
   Azure Policy(s): Not applicable

32. Collect and analyze forensics data in the event of a security incident.
   Risk Area: Incident Response and Forensics
   MDC Alert(s): Done natively by Defender for Containers alerts
   Azure Policy(s): Not applicable
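
If your Defender for Cloud alerts are also ingested into your Sentinel workspace, a query along the following lines can show which of the container alerts from the table are actually firing. This is only a sketch: it assumes the alerts land in the SecurityAlert table and that the container alert types carry the K8S prefix referenced in the table, so adjust the filter to what you see in your own workspace.

```kusto
// Count container-related Defender for Cloud alerts by name and severity.
// Assumption: alerts are ingested into SecurityAlert and AlertType starts with "K8S"
// (which would cover both the "K8S_*" and "K8S.NODE_" families).
SecurityAlert
| where TimeGenerated > ago(7d)
| where AlertType startswith "K8S"
| summarize AlertCount = count(), LastSeen = max(TimeGenerated) by AlertName, AlertSeverity
| order by AlertCount desc
```

The 7-day window is arbitrary; widen or narrow it to match your review cadence.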

AKS Security Workbook 

There is an out-of-the-box workbook that you can enable once you deploy the AKS Connector from Content Hub. If you are not familiar with Sentinel workbooks, please refer to Azure Sentinel Workbooks 101 (with sample Workbook) to learn how to enable and use them.

Let's do a walkthrough of the AKS Security Workbook

Understanding the security coverage  

As we showed above, Defender for Cloud covers many of your container security risks.

The first step is to understand where you might have blind spots, that is, AKS clusters that are not currently monitored by Defender for Cloud. The workbook lets you look at the data across different subscriptions, and the coverage table shows where coverage is lacking.

Note: There is a drop-down for selecting the Time Range so you can see items such as alerts and recommendations for that time period.


Looking at the Overview of security alerts 

For the time period you selected, you will see the security state of your clusters. This dashboard helps you understand where you have the most exposure across container image repos and clusters.

You can also see how alerts have changed over time.


Diving deeper into specific alerts 

Now that we understand our AKS cluster coverage and areas of risk exposure, the next section of the workbook shows the security alerts that exist in your environment. This lets you quickly prioritize where to start in order to close the most gaps.


Search queries for security monitoring

In this section we will talk about building custom content to search for common use cases.

1. Identifying the Users and IPs that have the most denies

You would want to understand who is trying to get situational awareness of your AKS environment. This can be an existing user or a script that's trying to query your cluster. 

//API Authorization Deny by User, Source IP
AKSAudit
| where TimeGenerated > ago(1h)
| extend authorizationDecision = parse_json(Annotations)
| extend user = parse_json(User)
| extend sourceIps = parse_json(SourceIps)
| project TimeGenerated, PodName, Verb, SourceIP=sourceIps[0], Username=user.username, AuthorizationDecision=authorizationDecision["authorization.k8s.io/decision"], RequestUri
| order by TimeGenerated asc
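
The query above lists every request together with its authorization decision. To rank users and source IPs by number of denies, a variant along these lines may help. It is a sketch, not a tested detection: it assumes denied requests carry the value "forbid" in the authorization.k8s.io/decision annotation, and the cut-off of 20 rows is arbitrary.

```kusto
// Top users and source IPs by number of denied API requests.
// Assumption: a denied request is annotated with decision == "forbid".
AKSAudit
| where TimeGenerated > ago(1h)
| extend Decision = tostring(Annotations["authorization.k8s.io/decision"])
| where Decision == "forbid"
| extend Username = tostring(User.username), SourceIP = tostring(SourceIps[0])
| summarize DenyCount = count() by Username, SourceIP
| top 20 by DenyCount desc
```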

2. Users that are assigned to Cluster Roles 

A ClusterRole can be used to grant the same permissions as a Role. Because ClusterRoles are cluster-scoped, you can also use them to grant access to: 

  • cluster-scoped resources (like nodes)
  • non-resource endpoints (like /healthz)
  • namespaced resources (like Pods), across all namespaces 

You should avoid assigning users to cluster roles. The following query shows the users that are bound to a ClusterRole:

AKSAudit 
| where TimeGenerated > ago (1h)
| where RequestObject.kind == "ClusterRoleBinding"
| where ResponseObject.subjects[0].name != "aks-support"
| project TimeGenerated, PodName, Verb, SourceIP=SourceIps[0],User=ResponseObject.subjects[0].name,ClusterRoleBinding=ResponseObject.metadata.name,ClusterRole=RequestObject.roleRef.name
| order by TimeGenerated asc

3. Users that have Admin Access 

Users should not be part of system:masters or system:accounts. Any user who is a member of system:masters bypasses all RBAC rights checks and will always have unrestricted superuser access, which cannot be revoked by removing RoleBindings or ClusterRoleBindings.

//Users part of system:masters or Users that are part of system:accounts 
AKSAudit
| where TimeGenerated > ago (1h)
| extend authorizationDecision = parse_json(Annotations)
| extend user = parse_json(User)
| extend sourceIps = parse_json(SourceIps)
| where user.groups[0] in ("system:masters","system:accounts") or user.groups[1] in ("system:masters","system:accounts")
| where user.username !in ("aksService","system:apiserver","aksProblemDetector")
| project TimeGenerated, PodName, Verb, SourceIP=sourceIps[0],Username=user.username, AuthorizationDecision=authorizationDecision["authorization.k8s.io/decision"], AuthorizationDecisionReason=authorizationDecision["authorization.k8s.io/reason"], RequestUri
| order by TimeGenerated asc
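
The query above checks only the first two entries of the groups array. If a user belongs to several groups, the group of interest may sit further down the list. The following sketch uses set_has_element to match the group at any position and then summarizes activity per user; the column aliases (Username, Groups, Requests, Verbs, SourceIP) are illustrative names, not part of the table schema.

```kusto
// Match system:masters anywhere in the user's group list, not only at index 0 or 1.
AKSAudit
| where TimeGenerated > ago(1h)
| extend Username = tostring(User.username), Groups = User.groups
| where set_has_element(Groups, "system:masters")
// Same service-identity exclusions as the query above
| where Username !in ("aksService", "system:apiserver", "aksProblemDetector")
| summarize Requests = count(), Verbs = make_set(Verb) by Username, SourceIP = tostring(SourceIps[0])
| order by Requests desc
```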

4. Resources that can run unauthenticated and anonymous API calls 

You will want to know who is part of the system:unauthenticated group and remove them where possible, because this group gives access to anyone who can reach the API server at the network level.

//Bindings of system:unauthenticated and system:anonymous group
AKSAudit
| where TimeGenerated > ago (1h)
| where PodName !startswith "kube-apiserver"
| extend authorizationDecision = parse_json(Annotations)
| extend user = parse_json(User)
| extend sourceIps = parse_json(SourceIps)
| extend ObjectRef = parse_json(ObjectRef)
| where user.groups[0] in ("system:unauthenticated","system:anonymous") or user.groups[1] in ("system:unauthenticated","system:anonymous")
| project TimeGenerated, PodName, Verb, Resource=ObjectRef.resource, SourceIP=sourceIps[0],Username=user.username, AuthorizationDecision=authorizationDecision["authorization.k8s.io/decision"], AuthorizationDecisionReason=authorizationDecision["authorization.k8s.io/reason"], RequestUri
| order by TimeGenerated asc

5. Users that are trying to get secrets  

There are few common use cases in which a user needs to read Kubernetes secrets directly, so these requests are worth reviewing.

//Users that have executed 'kubectl get secrets' 
AKSAudit
| where TimeGenerated > ago (1h)
| where Verb == "get"
| extend authorizationDecision = parse_json(Annotations)
| extend user = parse_json(User)
| extend sourceIps = parse_json(SourceIps)
| extend ObjectRef = parse_json(ObjectRef)
| where ObjectRef.resource == "secrets"
| where user.username !startswith "system:serviceaccount"
| project TimeGenerated, PodName, Verb, SourceIP=sourceIps[0],Username=user.username, AuthorizationDecision=authorizationDecision["authorization.k8s.io/decision"], AuthorizationDecisionReason=authorizationDecision["authorization.k8s.io/reason"], RequestUri
| order by TimeGenerated asc
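
One caveat about the query above: running 'kubectl get secrets' without a secret name reaches the API server as a list request rather than a get, so filtering on Verb == "get" alone can miss it. A broader variant might look like the sketch below, which also groups the results per user and namespace. The aliases (Username, Requests, Namespace) are illustrative.

```kusto
// Secret reads by non-service-account users, covering get, list, and watch.
AKSAudit
| where TimeGenerated > ago(1h)
| where Verb in ("get", "list", "watch")
| where tostring(ObjectRef.resource) == "secrets"
| extend Username = tostring(User.username)
// Same exclusion as the query above
| where Username !startswith "system:serviceaccount"
| summarize Requests = count() by Username, Verb, Namespace = tostring(ObjectRef.namespace)
| order by Requests desc
```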

6. Users that are trying to bind to Cluster Admin roles 

As we discussed above, Cluster Admin roles should be used sparingly. You will want to know who is trying to bind to cluster roles.

//Users that have executed Cluster Role Binding on cluster admin 
AKSAudit
| where TimeGenerated > ago (1h)
| where Verb == "create"
| extend authorizationDecision = parse_json(Annotations)
| extend user = parse_json(User)
| extend sourceIps = parse_json(SourceIps)
| extend ObjectRef = parse_json(ObjectRef)
| where user.username != "aksService"
| where ObjectRef.resource startswith "clusterrole"
| project TimeGenerated, PodName, Verb, Resource=ObjectRef.resource, SourceIP=sourceIps[0],Username=user.username, AuthorizationDecision=authorizationDecision["authorization.k8s.io/decision"], AuthorizationDecisionReason=authorizationDecision["authorization.k8s.io/reason"], RequestUri
| order by TimeGenerated asc
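
The query above matches any created resource whose type starts with "clusterrole", which includes ClusterRoles and bindings to any role. To narrow the results to bindings that reference cluster-admin specifically, something like the following sketch could be used. It assumes the audit events include the request body, so that RequestObject.roleRef.name is populated as in the ClusterRoleBinding query in section 2; if that field is empty in your workspace, this filter will return nothing.

```kusto
// ClusterRoleBindings created or updated against the cluster-admin role.
// Assumption: RequestObject contains the request body (roleRef, subjects).
AKSAudit
| where TimeGenerated > ago(1h)
| where Verb in ("create", "update")
| where tostring(ObjectRef.resource) == "clusterrolebindings"
| where tostring(RequestObject.roleRef.name) == "cluster-admin"
| extend Username = tostring(User.username)
| where Username != "aksService"
| project TimeGenerated, Username, SourceIP = tostring(SourceIps[0]), Binding = tostring(ObjectRef.name), Subjects = RequestObject.subjects
| order by TimeGenerated asc
```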

7. Pods that are in default namespace  

In Kubernetes, namespaces provide a mechanism for isolating groups of resources within a single cluster. They are a way to divide cluster resources between multiple users. Kubernetes includes the default namespace so that you can start using your new cluster without first creating a namespace. Unless a namespace is specified when creating a resource, Kubernetes assumes the default namespace. As a result, users may deploy their resources in the default namespace, where anyone with access to it can make potentially malicious changes to the running applications.

ContainerLogV2 
| where TimeGenerated > ago (1h)
| where LogSource == "stdout"
| where PodNamespace == "default"
| extend ContainerHostName1 = tostring(LogMessage.hostname)
| join ContainerInventory on $left.ContainerHostName1 == $right.ContainerHostname
| summarize dcount(Image) by ContainerHostname, Repository, Image
| project ContainerHostname, Repository, Image
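
The query above depends on the container writing JSON logs that include a hostname field. If your workloads do not log that way, the pod inventory collected by Container Insights offers another route. The sketch below assumes Container Insights is enabled and populating the KubePodInventory table; LastSeen is an illustrative alias.

```kusto
// Pods reported in the default namespace during the last hour.
// Assumption: Container Insights is enabled and writing to KubePodInventory.
KubePodInventory
| where TimeGenerated > ago(1h)
| where Namespace == "default"
| summarize LastSeen = max(TimeGenerated) by ClusterName, Name, ControllerName, PodStatus
| order by LastSeen desc
```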

Conclusion

In this post we:

  • Showed how to set up Microsoft Sentinel to monitor security risks in Azure Kubernetes Service (AKS) clusters, and discussed the container security risks and how Microsoft Defender for Cloud addresses them.

  • Provided a map of 32 container risks to the Microsoft Defender for Cloud alerts and Azure Policies that cover them. The map spans pod security, network security, container image security, kubelet activity, API server security, RBAC monitoring, secrets and ConfigMap access, audit logging, compliance monitoring, container runtime security, and incident response and forensics.

  • Showed how to configure your environment to monitor AKS: the data connectors that ingest AKS data, the diagnostic settings for the AKS cluster, and Container Insights for pod-level data. We also showed how to enable the AKS Connector from Content Hub, which provides a workbook, hunting queries, and a data connector.

  • Walked through the out-of-the-box workbook that comes with the AKS Connector. The workbook helps you understand the AKS cluster coverage, the security state, the security alerts, and the security recommendations. It also provides filters and drill-downs for further analysis.

  • Provided examples of custom search queries that can be used to mine the log tables for more pressing risks. The queries can help identify unauthorized or suspicious pods, privilege escalation attempts, changes to pod security policies, unexpected network traffic patterns, unauthorized access to secrets, and more.

 

This article was originally published by Microsoft's Sentinel Blog. You can find the original article here.