Deep dive into Microsoft Sentinel’s new Overview dashboard

Special thanks to @OriLicht and @edilahav for the collaboration

Microsoft Sentinel's Overview dashboard provides operational and health insights from each of the main function domains of Microsoft Sentinel and also gives an idea of SOC efficiency.

The new overview experience consists of widgets that surface data on the core components of Microsoft Sentinel, such as incidents, data connectors, TI, analytics, etc.


In this article, we will take a deeper look into the widgets and the KQL queries that are being used (where applicable) by each widget.

Widgets data refresh

At the top of the overview page, you can find a refresh button that refreshes the data in the entire dashboard.


Each widget's data is pre-calculated for improved performance, and each widget's refresh time is shown at the top of the widget.

For example, the Incidents widget displays its last refresh time at the top of the widget. For the Data widget, the data is recalculated every 60 minutes.

Incidents widget

The incidents widget summarizes the incidents created during the last 24 hours by status and by severity, along with closed incidents by closing classification. It also shows meters for mean time to acknowledge an incident and mean time to close, with a link to the Security Operations Efficiency workbook.


Queries used to fetch data for the incidents widget:

  1. Incidents by status (last 24 hours)

SecurityIncident
| summarize arg_max(LastModifiedTime, Status) by IncidentName
| summarize Count = count() by Status

  2. Incidents by severity (last 24 hours)

SecurityIncident
| summarize arg_max(LastModifiedTime, Severity) by IncidentName
| summarize Count = count() by Severity

  3. Incidents by closed classification (last 24 hours)

SecurityIncident
| where Status == 'Closed'
| where isnotempty(Classification)
| summarize arg_max(LastModifiedTime, Classification) by IncidentName
| summarize Count = count() by Classification

  4. Incidents status by creation time (last 24 hours)

SecurityIncident
| where CreatedTime > ago(1d)
| summarize arg_max(LastModifiedTime, Status, CreatedTime) by IncidentName
| summarize Count = count() by Status, bin(CreatedTime, 4h)
| extend StatusCount = pack(Status, Count)
| summarize StatusCountArray = make_bag(StatusCount) by CreatedTime
| evaluate bag_unpack(StatusCountArray)
| project Result = pack_all()

  5. Mean time to acknowledge (last 48 hours)

let MeanTimeToAck = SecurityIncident
| where Status == 'Active'
| summarize arg_min(LastModifiedTime, CreatedTime, TimeGenerated) by IncidentName
| extend timeToAck = datetime_diff('Minute', LastModifiedTime, CreatedTime)
| summarize MeanTime = percentiles(timeToAck, 50) by HalfQueryPeriodTime = bin_at(TimeGenerated, 24h, ago(48h))
| order by HalfQueryPeriodTime asc;
MeanTimeToAck
| serialize HalfQueryPeriodTime
| extend MeanTime = MeanTime/todouble(60)
| extend Trend = (MeanTime - prev(MeanTime))/todouble(60)
| order by HalfQueryPeriodTime desc
| project MeanTime, Trend

  6. Mean time to close (last 48 hours)

let MeanTimeToClose = SecurityIncident
| where Status == 'Closed'
| summarize arg_min(LastModifiedTime, ClosedTime, CreatedTime, TimeGenerated) by IncidentName
| extend timeToClose = datetime_diff('Minute', ClosedTime, CreatedTime)
| summarize MeanTime = percentiles(timeToClose, 50)
    by HalfQueryPeriodTime = bin_at(TimeGenerated, 24h, ago(48h))
| order by HalfQueryPeriodTime asc;
MeanTimeToClose
| serialize HalfQueryPeriodTime
| extend MeanTime = MeanTime/todouble(60)
| extend Trend = (MeanTime - prev(MeanTime))/todouble(60)
| order by HalfQueryPeriodTime desc
| project MeanTime, Trend
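Most of the incident queries above begin with summarize arg_max(LastModifiedTime, ...) by IncidentName. The SecurityIncident table records a new row each time an incident is updated, so this step keeps only the latest row per incident before counting. The following is a minimal sketch of that pattern on made-up data; the incident names and timestamps are hypothetical and only illustrate the deduplication step.

```kusto
// Hypothetical sample rows: inc-001 was updated three times, inc-002 only once.
datatable(IncidentName:string, LastModifiedTime:datetime, Status:string)
[
    "inc-001", datetime(2023-06-30 08:00:00), "New",
    "inc-001", datetime(2023-06-30 08:20:00), "Active",
    "inc-001", datetime(2023-06-30 09:45:00), "Closed",
    "inc-002", datetime(2023-06-30 09:00:00), "New"
]
// Keep only the most recent row for each incident.
| summarize arg_max(LastModifiedTime, Status) by IncidentName
// Count incidents by their latest status.
| summarize Count = count() by Status
```

Without the arg_max step, inc-001 would be counted once for every status it passed through. With it, each incident is counted once, under its most recent status.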

Automation widget

The widget includes a summary of the automation rules activity, including incidents closed by automation, the time saved by using automation, and a graph showing the different types of automation actions that were performed.

It also includes a link to view the Playbooks health workbook.

At the bottom, you can find a count of the active automation rules with a link to the Automation blade.


Queries used to fetch data for the automation widget (last 24 hours):

Please note: The queries below do not include the “active automation rules” count, because the widget gets it by calling the Automation Rules API to list all the automation rules and performing a count in the backend.

  1. Closed incidents

SecurityIncident
| where Status == 'Closed'
| summarize arg_min(LastModifiedTime, ModifiedBy) by IncidentName
| where ModifiedBy has 'Automation rule'
| summarize Count = count()

  2. Time saved

SecurityIncident
| where Status == 'Closed'
| summarize arg_min(LastModifiedTime, ClosedTime, CreatedTime, ModifiedBy) by IncidentName
| extend timeToClose = datetime_diff('Minute', ClosedTime, CreatedTime)
| extend IsClosedByAutomation = iff(ModifiedBy has 'Automation rule', 'ClosedByAutomation', 'NotClosedByAutomation')
| summarize MeanTimeToClose = percentiles(timeToClose, 50) by IsClosedByAutomation

  3. Actions performed

let IncidentsData = materialize(SecurityIncident
| order by IncidentName asc, todatetime(TimeGenerated) asc
| extend rowNumber = row_number(0, IncidentName != prev(IncidentName))
| extend prevRowNumber = rowNumber - 1
| project TimeGenerated, IncidentName,
    ModifiedBy, rowNumber, prevRowNumber, Severity, Status, Comments, Owner);
IncidentsData
| where ModifiedBy has 'Automation rule' or ModifiedBy startswith 'playbook'
| join IncidentsData on $left.prevRowNumber == $right.rowNumber and
    $left.IncidentName == $right.IncidentName
| project TimeGenerated, ModifiedBy,
    IncidentName,
    Severity, PrevSeverity = Severity1,
    Status, PrevStatus = Status1,
    Comments, PrevComments = Comments1,
    Owner, PrevOwner = Owner1
| extend isSeverityChanged = Severity != PrevSeverity
| extend isStatusChanged = Status != PrevStatus
| extend isCommentsChanged = tostring(Comments) != tostring(PrevComments)
| extend isOwnerChanged = tostring(Owner.objectId) != tostring(PrevOwner.objectId)
| summarize Severity = dcountif(IncidentName, isSeverityChanged),
    Status = dcountif(IncidentName, isStatusChanged),
    Comments = dcountif(IncidentName, isCommentsChanged),
    Owner = dcountif(IncidentName, isOwnerChanged)

Data widget

The data widget provides visibility into ingestion volumes, the health telemetry of the data connectors, and threat intelligence records by type of IoC.


Let's break this into each of the components: Data received graph, Data connectors and TI by type.

Data received graph:

The graph shows the number of records collected by Microsoft Sentinel in the last 24 hours compared to the previous 24 hours, along with the anomalies detected in that time period.


Queries used to fetch the data for the data received graph:

Total volume of events

let endTime = now();
let startTime = ago(2d);
search *
| where TimeGenerated between (startTime..endTime)
| make-series RecordCount = count() default=0 on TimeGenerated from startTime to endTime step 4h
| mv-expand RecordCount to typeof(int), TimeGenerated to typeof(datetime)
| project Result = pack_all()
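If you adapt the total volume query for your own reports, it can help to see which tables contribute most of the records. The following is a minimal sketch, not the query used by the widget; the SourceTable column name and the top 10 cutoff are assumptions chosen for illustration.

```kusto
// Assumption: a per-table breakdown of the last 24 hours, not the widget's own query.
union withsource = SourceTable *
| where TimeGenerated > ago(1d)
// Count records per originating table.
| summarize RecordCount = count() by SourceTable
// Keep the ten largest contributors.
| top 10 by RecordCount desc
```

Like search *, this scans every table in the workspace, so it can be slow on large workspaces.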

Anomalies

let endTime = now();
let startTime = ago(1d);
let emptyTableAnomaliesVolume = datatable(TimeGenerated:datetime, RecordCount:int)[];
emptyTableAnomaliesVolume
| union isfuzzy=true
    (Anomalies
    | where TimeGenerated between (startTime..endTime)
    | make-series RecordCount = count() default=0 on TimeGenerated from startTime to endTime step 4h
    | mv-expand RecordCount to typeof(int), TimeGenerated to typeof(datetime))
| project Result = pack_all()

Data Connectors

On the top right, you see a summary of the status of the data connectors, divided into unhealthy and active connectors. Unhealthy connectors indicates how many connectors have errors. Active connectors are connectors with data streaming into Microsoft Sentinel, as measured by the queries below.


Queries used to fetch the data for the data connectors:

Unhealthy connectors

Unhealthy connectors are connectors with errors, identified by the “Data fetch status change” operation with a status of “Failure”. Because a failure can have several causes, check the “Description” field in the SentinelHealth table to find the reason, then take the necessary actions to bring the connector back to a “healthy” state.

SentinelHealth
| where OperationName == 'Data fetch status change'
| summarize arg_max(TimeGenerated, *) by SentinelResourceId, SentinelResourceKind, SentinelResourceName
| where Status == "Failure"
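To follow up on a failure, the same query can be extended to show the “Description” field next to each failing connector. This is a minimal sketch that builds on the unhealthy connectors query; the choice of projected columns and the sort order are assumptions for readability, not part of the widget.

```kusto
SentinelHealth
| where OperationName == 'Data fetch status change'
// Keep the latest health record for each connector.
| summarize arg_max(TimeGenerated, *) by SentinelResourceId, SentinelResourceKind, SentinelResourceName
| where Status == 'Failure'
// Assumption: these columns are enough to start troubleshooting.
| project TimeGenerated, SentinelResourceName, SentinelResourceKind, Description
| order by TimeGenerated desc
```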

Active Connectors

Active connectors are connectors with data streaming into Microsoft Sentinel. This is measured by a union of queries, one per connector. For example, if 10 connectors are receiving data in the workspace (10 active connectors), the widget runs a union of 10 queries, one for each connector, and performs a count in the backend to display the 10 active connectors.

The following is a sample query that is used to define an active connector:

(<table name>
| <filter> // if more than one connector can ingest data to this table
| summarize LastLogReceived = max(TimeGenerated)
| project IsConnected = LastLogReceived > ago(<time period>)
| extend Key = <connector key>, Group = 0)

An example of this union with two active connectors, “Syslog” (receiving Sysmon data) and “Okta Single Sign-On”, would be:

let emptyTable = datatable(TimeGenerated:datetime)[];
union isfuzzy=true
(emptyTable),
(Syslog | where ProcessName == 'sysmon'
| summarize LastLogReceived = max(TimeGenerated)
| project IsConnected = LastLogReceived > ago(30d)
| extend Key = "GUID for Sysmon For ", Group = 0),
(Okta_CL
| summarize LastLogReceived = max(TimeGenerated)
| project IsConnected = LastLogReceived > ago(30d)
| extend Key = "GUID for OktaSSO", Group = 0)
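The article states that the count of active connectors is performed in the backend. If you want a comparable number directly in KQL, one possible approach is to append a countif to the union. This is a sketch under that assumption, not the widget's implementation; the Key values are hypothetical labels standing in for the connector GUIDs.

```kusto
// Empty fallback so the union still resolves if a connector table does not exist.
let emptyTable = datatable(IsConnected:bool, Key:string, Group:int)[];
union isfuzzy=true
    (emptyTable),
    (Syslog
    | where ProcessName == 'sysmon'
    | summarize LastLogReceived = max(TimeGenerated)
    | project IsConnected = LastLogReceived > ago(30d)
    | extend Key = 'Syslog (Sysmon)', Group = 0),      // hypothetical label
    (Okta_CL
    | summarize LastLogReceived = max(TimeGenerated)
    | project IsConnected = LastLogReceived > ago(30d)
    | extend Key = 'Okta Single Sign-On', Group = 0)   // hypothetical label
// Count only the connectors that received data within the window.
| summarize ActiveConnectors = countif(IsConnected)
```

Dropping the final summarize returns one row per connector instead, which is useful for seeing which connectors are not counted as active.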

TI by type

In addition, the widget includes a view of the threat intelligence records in Microsoft Sentinel, by type of indicator of compromise (IoC). You can view the number of active IoCs imported into Microsoft Sentinel in the last 24 hours, by type: URL, IP address, File, Email, Domain, etc.


Query used for TI by type

ThreatIntelligenceIndicator
| where ExpirationDateTime > now()
| summarize arg_max(TimeGenerated, *) by IndicatorId
| where Active == true
| extend IndicatorType =
    iff(isnotempty(EmailSourceIpAddress) or isnotempty(NetworkDestinationIP) or isnotempty(NetworkIP) or isnotempty(NetworkSourceIP) or isnotempty(NetworkCidrBlock), 'IP',
    iff(isnotempty(Url), 'URL',
    iff(isnotempty(EmailRecipient) or isnotempty(EmailSenderAddress), 'Email',
    iff(isnotempty(FileHashValue), 'File',
    iff(isnotempty(DomainName) or isnotempty(EmailSourceDomain), 'Domain',
    'Other')))))
| summarize IndicatorCount = count() by IndicatorType

Learn more about working with threat indicators: Work with threat indicators in Microsoft Sentinel | Microsoft Learn
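If you re-use the TI by type query, the nested iff calls can be written as a single case() expression, which some readers find easier to maintain. The following sketch is intended to be equivalent to the query above: case() evaluates its conditions in order and returns the value paired with the first one that is true, so the precedence (IP, then URL, Email, File, Domain) is assumed to stay the same.

```kusto
ThreatIntelligenceIndicator
| where ExpirationDateTime > now()
| summarize arg_max(TimeGenerated, *) by IndicatorId
| where Active == true
// Conditions are checked top to bottom; the last argument is the fallback.
| extend IndicatorType = case(
    isnotempty(EmailSourceIpAddress) or isnotempty(NetworkDestinationIP) or isnotempty(NetworkIP)
        or isnotempty(NetworkSourceIP) or isnotempty(NetworkCidrBlock), 'IP',
    isnotempty(Url), 'URL',
    isnotempty(EmailRecipient) or isnotempty(EmailSenderAddress), 'Email',
    isnotempty(FileHashValue), 'File',
    isnotempty(DomainName) or isnotempty(EmailSourceDomain), 'Domain',
    'Other')
| summarize IndicatorCount = count() by IndicatorType
```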

Analytics widget

The analytics widget provides operational insights on the analytics rules status in the environment. It shows which rules are enabled or disabled in the workspace and also, which ones were causing constant failures and were automatically disabled (or auto disabled). The widget also allows you to access the MITRE blade to map the analytics rules coverage.


Unlike the widgets mentioned above, this one does not use a KQL query to present the data. Instead, it uses the Alert Rules API and logic that counts the disabled, auto-disabled, and enabled rules.

Learn more about the Alert Rules API: Alert Rules – REST API (Azure Sentinel) | Microsoft Learn

Using the Sentinel Overview dashboard queries

Now that we have looked at the queries that run behind the widgets in the Microsoft Sentinel Overview dashboard, there are many ways in which you can adapt and re-use these queries.

Some examples are:

  • Re-create the visuals using workbooks to create a more “personalized” dashboard.

    The Overview dashboard currently shows data only for the last 24 hours and only for one workspace.

    By using workbooks, you can create interactive visual reports that provide the flexibility to choose any time frame for querying your data. If you have multiple workspaces and would like insights across all of them, you can use cross-workspace queries in your workbooks.

  • Create custom detection rules.

    The queries can be added to your custom analytics rules or hunting queries to get additional insights.

    While the Overview page provides visibility, analytics rules create alerts and incidents as another way to surface health-related issues to SOC members.
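As an illustration of the first idea, the incidents by status query can be adapted to a longer time frame and to several workspaces. This is a minimal sketch: the workspace names and the seven-day lookback are hypothetical, and it assumes you have read access to every workspace referenced.

```kusto
// Hypothetical lookback; the Overview dashboard itself is fixed to 24 hours.
let lookback = 7d;
union
    workspace("contoso-soc-emea").SecurityIncident,   // hypothetical workspace name
    workspace("contoso-soc-amer").SecurityIncident    // hypothetical workspace name
| where TimeGenerated > ago(lookback)
// TenantId holds the workspace ID, so incidents stay separated per workspace.
| summarize arg_max(LastModifiedTime, Status) by TenantId, IncidentName
| summarize Count = count() by TenantId, Status
```

In a workbook, the lookback value would typically come from a time range parameter rather than a hard-coded let statement.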

Microsoft Sentinel Overview dashboard reference

Visualize collected data | Microsoft Learn

Microsoft Sentinel Health & Monitoring

Turn on auditing and health monitoring in Microsoft Sentinel | Microsoft Learn

Monitor the health of your Microsoft Sentinel data connectors | Microsoft Learn

Monitor the health of your Microsoft Sentinel automation rules and playbooks | Microsoft Learn

Monitor the health and audit the integrity of your Microsoft Sentinel analytics rules | Microsoft Le…

Monitor the health and role of your Microsoft Sentinel SAP systems | Microsoft Learn

 

This article was originally published by Microsoft's Sentinel Blog. You can find the original article here.