Special thanks to @romarsia for the collaboration and ideas.
Analytics rules in Microsoft Sentinel play a crucial role in helping SOC teams protect the organization against cyberattacks by detecting potential threats so that teams can analyze and respond quickly to security incidents. It is therefore important for SOC engineers to ensure their detection rules are functioning correctly and producing relevant, actionable information. SOC engineers also need to be aware of any planned or unplanned changes made to the rules, to maintain compliance and the integrity of their defenses.
Having the ability to monitor the health of analytics rules and changes helps to improve the accuracy and efficiency of security operations.
We are pleased to announce the new health and auditing monitoring capabilities for Analytics Rules.
With Analytics Health Monitoring, organizations can gain insight into the health and run status of their rules. SOC teams can also use analytics health monitoring during detection rule creation, in both production and pre-production environments. For instance, during development, the health information can be useful for testing and validation before deployment to production.
In addition, Sentinel's audit monitoring feature gives organizations a comprehensive view of what changes were made to an analytics rule (by whom, from where, and when). This helps organizations detect any unauthorized changes that may compromise security.
Before we get started, here is a quick overview of what the new health and auditing monitoring capabilities for Analytics Rules offer.
- Microsoft Sentinel analytics rule health logs:
  - This log captures events that record each run of an analytics rule and its outcome: whether it succeeded or failed and, if it failed, why.
  - The log also records how many events were captured by the query, and whether that number passed the threshold and caused an alert to fire.
  - These logs are collected in the SentinelHealth table in Log Analytics.
- Microsoft Sentinel analytics rule audit logs:
  - This log captures events that record changes made to any analytics rule, including which rule was changed, what the change was, the state of the rule settings before and after the change, the user or identity that made the change, the source IP and date/time of the change, and more.
  - These logs are collected in the SentinelAudit table in Log Analytics.
How to enable health and auditing monitoring
To get health and auditing data from the tables described above, you must first turn on the Microsoft Sentinel health feature for your workspace. For more information, see Turn on auditing and health monitoring for Microsoft Sentinel.
Understanding SentinelHealth and SentinelAudit table events
The following types of analytics rule health events are logged in the SentinelHealth table:
- Scheduled analytics rule run.
- NRT analytics rule run.
For more information, see SentinelHealth table columns schema.
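As a rough sketch of what a rule-run event carries, the query below pulls the result count, trigger threshold, and alert count out of recent health events. It uses the _SentinelHealth() function covered later in this post. The ExtendedProperties field names (QueryResultAmount, TriggerThreshold, AlertsGeneratedAmount) are assumptions based on the schema reference, so confirm them against the records in your own workspace before relying on them.

```kusto
// Sketch only: the ExtendedProperties field names are assumptions, verify them in your workspace.
_SentinelHealth()
| where TimeGenerated > ago(1d)
| where SentinelResourceType == "Analytics Rule"
| extend Results   = toint(ExtendedProperties.QueryResultAmount),
         Threshold = toint(ExtendedProperties.TriggerThreshold),
         Alerts    = toint(ExtendedProperties.AlertsGeneratedAmount)
| project TimeGenerated, SentinelResourceName, SentinelResourceKind, Status, Results, Threshold, Alerts
| sort by TimeGenerated desc
```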
The following types of analytics rule audit events are logged in the SentinelAudit table:
- Create or update analytics rule.
- Analytics rule deleted.
For more information, see SentinelAudit table columns schema.
Visit Monitor the health and audit the integrity of your Microsoft Sentinel analytics rules to get a list of statuses and suggested steps for errors.
Using health and auditing data
Instead of querying the tables directly, you can use the pre-built functions _SentinelHealth() and _SentinelAudit(). These functions keep your queries backward compatible if the schema of the underlying tables changes. To view data related to analytics rules, filter the records by SentinelResourceType or SentinelResourceKind.
Below are some sample queries for your reference:
Rules without a "Success" running status:
_SentinelHealth()
| where SentinelResourceType == "Analytics Rule"
| where Status != "Success"
Rules that have been "Auto-disabled":
_SentinelHealth()
| where SentinelResourceType == "Analytics Rule"
| where Reason == "The analytics rule is disabled and was not executed."
Rule running status by reason:
_SentinelHealth()
| where SentinelResourceType == "Analytics Rule"
| summarize Occurrence = count(), Unique_Rule = dcount(SentinelResourceId) by Status, Reason
Rule deletion activity:
_SentinelAudit()
| where SentinelResourceType == "Analytic Rule"
| where Description == "Analytics rule deleted"
Rule activity by rule name and activity name:
_SentinelAudit()
| where SentinelResourceType == "Analytic Rule"
| summarize Count = count() by RuleName = SentinelResourceName, Activity = Description
Rule activity by caller name:
_SentinelAudit()
| where SentinelResourceType == "Analytic Rule"
| extend Caller = tostring(ExtendedProperties.CallerName)
| summarize Count = count() by Caller, Activity = Description
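The samples above all filter on SentinelResourceType. As a minimal sketch of the SentinelResourceKind option mentioned earlier, the following query counts runs and non-successful runs per rule kind. It assumes that SentinelResourceKind separates scheduled rules from NRT rules; the exact values it holds are not stated in this post, so check them in your workspace.

```kusto
// Sketch: run and failure counts per rule kind.
// Assumes SentinelResourceKind separates scheduled and NRT rules.
_SentinelHealth()
| where SentinelResourceType == "Analytics Rule"
| summarize Runs = count(), NonSuccessRuns = countif(Status != "Success") by SentinelResourceKind
```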
We have also provided an Analytics Health & Audit workbook to help you turn your health and audit data into insights quickly.
Sample use case
Let's walk through a sample use case for analytics health and audit data.
In my environment, I discovered a rule that failed to run with the reason “The analytics rule execution encountered an issue and could not be completed.”
To analyze the run history of this rule, I filtered _SentinelHealth() on SentinelResourceName for the impacted rule and widened the time range. The results show that the rule was running fine until recently.
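A query along these lines could produce that history. This is a sketch rather than the exact query used in this walkthrough: the rule name is a placeholder and the 30-day window is an arbitrary choice.

```kusto
// Sketch: daily run status for one rule over a wider time range.
// "<impacted rule name>" is a placeholder; 30d is an assumed window.
let RuleName = "<impacted rule name>";
_SentinelHealth()
| where TimeGenerated > ago(30d)
| where SentinelResourceType == "Analytics Rule"
| where SentinelResourceName == RuleName
| summarize Runs = count() by bin(TimeGenerated, 1d), Status
| sort by TimeGenerated asc
```

Grouping by day and status makes it easier to spot the date on which successful runs stop and failures begin.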
Next, I checked whether any changes had been made to this rule by querying _SentinelAudit(). The output shows that the impacted rule had been changed.
In each record, I can drill down into the ExtendedProperties column, where the OriginalResourceState and UpdatedResourceState fields hold the original and updated values of all the rule's properties. I also get the context of who made the change, when, and from which IP address, which is useful for the investigation.
By comparing the OriginalResourceState and UpdatedResourceState values, I was able to identify what had changed in the rule. In this use case, the query had been updated to reference another workspace (a cross-workspace query).
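The audit steps above can be sketched as a single query that lists each change to the rule with the caller and the before and after states side by side. CallerName, OriginalResourceState, and UpdatedResourceState appear earlier in this post. CallerIpAddress and ResourceDiffMemberNames are assumed field names taken from the schema reference, and the rule name is a placeholder.

```kusto
// Sketch: change history for one rule, with caller context and before/after state.
// CallerIpAddress and ResourceDiffMemberNames are assumed field names; verify in your workspace.
let RuleName = "<impacted rule name>";
_SentinelAudit()
| where TimeGenerated > ago(30d)
| where SentinelResourceType == "Analytic Rule"
| where SentinelResourceName == RuleName
| extend Caller            = tostring(ExtendedProperties.CallerName),
         CallerIp          = tostring(ExtendedProperties.CallerIpAddress),
         ChangedProperties = ExtendedProperties.ResourceDiffMemberNames,
         OriginalState     = ExtendedProperties.OriginalResourceState,
         UpdatedState      = ExtendedProperties.UpdatedResourceState
| project TimeGenerated, Description, Caller, CallerIp, ChangedProperties, OriginalState, UpdatedState
| sort by TimeGenerated desc
```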
Upon investigation, the root cause was that the workspace referenced in the rule no longer existed.
I hope you found this sample use case helpful in understanding the usage and process of analyzing analytics health and audit data. Understanding how to utilize health and audit data can be crucial in identifying and resolving analytics issues that may arise.
More information can be found in the following documentation:
SentinelHealth table schema: Microsoft Sentinel health tables reference | Microsoft Learn
SentinelAudit table schema: Microsoft Sentinel audit tables reference | Microsoft Learn