Continuously Monitor the Performance of your AzureML Models in Production

We are thrilled to announce the public preview of AzureML model monitoring, which lets you monitor the overall health of your deployed models. Model monitoring is an essential part of the cyclical machine learning lifecycle, encompassing both the data science and the operational aspects of tracking model performance in production. Changes in data and consumer behavior can influence your model, causing it to become outdated. This may result in reduced model performance in production, adversely affecting business outcomes and potentially leading to compliance concerns in highly regulated environments. With AzureML model monitoring, you can receive timely alerts about critical issues, analyze results for model enhancement, and reduce the risks associated with deploying ML models.

Capabilities of AzureML model monitoring

AzureML model monitoring provides the following capabilities:

  • Simple model monitoring configuration with AzureML online endpoints. If you deploy your model to production with AzureML online endpoints, AzureML collects production inference data automatically and uses it for continuous model monitoring, providing you with an easy configuration process.
  • Pre-configured and customizable monitoring signals. Model monitoring supports a variety of configurable monitoring signals for tabular datasets, including data drift, prediction drift, data quality, and feature attribution drift. You can choose your preferred metric(s) and adjust alert thresholds for each signal. If the pre-configured signals don't suit your needs, create a custom monitoring signal component tailored to your business scenario.
  • Use of recent past production data or training data as comparison baseline dataset. For monitoring signals and metrics, AzureML lets you set either of these datasets as the comparison baseline, so you can monitor for both drift (production data changing over time) and skew (production data differing from training data).
  • Monitoring of data drift or data quality based on feature importance explanations. If you use training data as your comparison baseline dataset, you can define data drift or data quality signals and monitor only the most important features for your predictions, saving costs.
  • Analyze monitoring metrics from a comprehensive UI. View change in drift metrics over time, see which features are violating defined thresholds, and analyze your baseline and production feature distributions side-by-side within a comprehensive monitoring UI.

AzureML model monitoring signals

Evaluating the performance of a production ML system requires examining various signals, including data drift, model prediction drift, data quality, and feature attribution drift. A shift in any of these signals can leave a model outdated. By identifying such shifts early, organizations can proactively take measures like model retraining to maintain model performance and minimize the risks associated with outdated or mismatched data.

  • Data drift: Monitoring data drift is vital for maintaining the accuracy and performance of machine learning models in production. AzureML allows you to detect changes in data distributions, mitigating risks associated with outdated or mismatched data.
  • Prediction drift: Significant changes in a model's prediction distribution may indicate prediction drift, which can result from shifts in data or code. AzureML's proactive monitoring of model outputs aids you in identifying issues within the model as it responds to these data shifts.
  • Data quality: Maintaining data quality is essential, as errors in upstream data processing can lead to unexpected model behavior. Changes in data sources, schemas, logging, or upstream features generated by other ML models can impact your model significantly. AzureML detects data issues such as null values, range violations, or type mismatches, ensuring optimal performance and enabling you to proactively fix issues.
  • Feature attribution drift: Changes in feature importance distributions between training and production may signify feature attribution drift, potentially indicating unexpected model behavior. AzureML helps you evaluate each feature's influence on predictions by tracking their contributions over time and detecting shifts in feature importance, which helps identify unexpected behavior and potential accuracy impacts.
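To make the idea of a drift metric concrete, here is a minimal sketch of one common way to quantify a change between a baseline and a production distribution: the Jensen-Shannon distance over binned values. This is an illustration only, not AzureML's implementation; the function names, the bin edges, and the sample data are all assumptions made for the example.

```python
import math
from typing import Sequence


def histogram(values: Sequence[float], edges: Sequence[float]) -> list[float]:
    """Return the fraction of values in each bin; edges must be ascending."""
    if not values:
        raise ValueError("values must not be empty")
    counts = [0] * (len(edges) - 1)
    for v in values:
        idx = 0
        # Advance to the bin containing v; out-of-range values are clamped
        # into the first or last bin.
        while idx < len(counts) - 1 and v >= edges[idx + 1]:
            idx += 1
        counts[idx] += 1
    return [c / len(values) for c in counts]


def js_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon distance (log base 2): 0 = identical, 1 = disjoint."""

    def kl(a: Sequence[float], b: Sequence[float]) -> float:
        # Kullback-Leibler divergence; terms with a zero numerator contribute 0.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    # Mixture distribution: strictly positive wherever p or q is positive.
    m = [(x + y) / 2 for x, y in zip(p, q)]
    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(0.0, divergence))


# Hypothetical feature values: the baseline (e.g. training data) and a
# recent production window.
edges = [0, 1, 2, 3]
baseline = [0.5, 1.5, 2.5, 2.5]
production = [0.2, 0.4, 0.6, 2.5]

drift = js_distance(histogram(baseline, edges), histogram(production, edges))
print(f"Jensen-Shannon distance: {drift:.3f}")
```

The same calculation applied to model outputs instead of input features gives a simple prediction drift measure. An alert threshold is then just a cut-off on this value.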

For a complete overview of AzureML model monitoring signals and metrics, take a look at this document.
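As an illustration of the data quality signal described above, the following sketch computes null, type mismatch, and out-of-range rates for a single feature. It is a simplified stand-in, not AzureML's implementation; `FeatureSpec`, `data_quality_rates`, and the example expectations are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FeatureSpec:
    """Hypothetical expectations for one feature, e.g. derived from training data."""
    expected_type: type
    min_value: Optional[float] = None
    max_value: Optional[float] = None


def data_quality_rates(values: list[Any], spec: FeatureSpec) -> dict[str, float]:
    """Return the share of null, wrongly typed, and out-of-range values."""
    if not values:
        raise ValueError("values must not be empty")
    nulls = mismatches = out_of_range = 0
    for v in values:
        if v is None:
            nulls += 1
        elif type(v) is not spec.expected_type:
            # Strict type check for simplicity: an int where a float is
            # expected also counts as a mismatch in this sketch.
            mismatches += 1
        elif (spec.min_value is not None and v < spec.min_value) or (
            spec.max_value is not None and v > spec.max_value
        ):
            out_of_range += 1
    n = len(values)
    return {
        "null_rate": nulls / n,
        "type_mismatch_rate": mismatches / n,
        "out_of_range_rate": out_of_range / n,
    }


# Hypothetical production values for an "age" feature.
age_spec = FeatureSpec(expected_type=int, min_value=0, max_value=120)
rates = data_quality_rates([10, None, 200, "x", 35], age_spec)
print(rates)
```

Comparing these rates against the same rates computed on the baseline dataset shows whether data quality has degraded, rather than merely whether it is imperfect.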

How to enable AzureML model monitoring

Take the following steps to enable model monitoring in AzureML:

  1. Enable production inference data collection. If you deploy a model to an AzureML online endpoint, you can enable production inference data collection by using AzureML Model Data Collector. If you deploy your model to an AzureML batch endpoint or outside of AzureML, you're responsible for collecting your own production inference data, which can then be used for AzureML model monitoring.
  2. Configure model monitoring. You can use AzureML's SDK, CLI, or the Studio UI to easily set up model monitoring. During setup, you can specify your preferred monitoring signals, configure your desired metrics, and set the respective alert threshold for each metric.
  3. View and analyze model monitoring results. Once model monitoring is configured, a monitoring job is scheduled, which calculates and evaluates metrics for all selected monitoring signals, and triggers alert notifications whenever a specified threshold is exceeded. You can follow the link in the alert notification to your AzureML workspace to view and analyze monitoring results. 
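The evaluation described in step 3 can be pictured with a small sketch: each computed metric is compared against the alert threshold configured for its signal, and any value above the threshold is flagged. This is a conceptual illustration, not the AzureML SDK; `MetricResult`, `find_violations`, the metric names, and the threshold values are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricResult:
    """One metric value computed by a (hypothetical) monitoring run."""
    signal: str
    metric: str
    feature: str
    value: float


def find_violations(
    results: list[MetricResult],
    thresholds: dict[tuple[str, str], float],
) -> list[MetricResult]:
    """Return the results whose value exceeds the configured threshold.

    Results without a configured threshold are ignored.
    """
    violations = []
    for r in results:
        limit = thresholds.get((r.signal, r.metric))
        if limit is not None and r.value > limit:
            violations.append(r)
    return violations


# Hypothetical thresholds, keyed by (signal, metric).
thresholds = {
    ("data_drift", "jensen_shannon_distance"): 0.10,
    ("data_quality", "null_rate"): 0.05,
}

# Hypothetical output of one monitoring run.
results = [
    MetricResult("data_drift", "jensen_shannon_distance", "age", 0.27),
    MetricResult("data_drift", "jensen_shannon_distance", "income", 0.03),
    MetricResult("data_quality", "null_rate", "age", 0.01),
]

for v in find_violations(results, thresholds):
    print(f"ALERT: {v.signal}/{v.metric} on '{v.feature}' = {v.value}")
```

Keeping thresholds per signal and metric, rather than one global cut-off, mirrors the setup step in which an alert threshold is chosen for each metric.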

Step 1: During AzureML Model Monitoring set-up, users can configure the signals and metrics to monitor the performance of their model in production.

Step 2: After model monitoring is configured, users can view a comprehensive overview of signals, metrics, and alerts in AzureML's Monitoring UI.

Step 3: For a specific drift signal, users can view the metric change over time in addition to a histogram displaying the baseline distribution compared to the production distribution.

AzureML model monitoring best practices

Each machine learning model and its use cases are unique, so model monitoring is unique for each situation. The following is a list of recommended best practices for model monitoring:

  • Start monitoring your model as soon as it is deployed to production. The sooner you begin monitoring your production model, the sooner you will be able to identify issues and resolve them.
  • Work with data scientists who are familiar with the model to set up model monitoring. These data scientists have insight into the model and its use cases. They are best positioned to recommend the monitoring signals, metrics, and alert thresholds to use, thereby reducing alert fatigue.
  • Include multiple monitoring signals in your monitoring setup. With multiple monitoring signals, you get both a broad view of your model's health and granular insights into model performance. For example, you can combine data drift and feature attribution drift signals to get an early warning about a model performance issue.
  • Use model training data as the baseline dataset. For comparison based on the baseline dataset, AzureML allows you to use the recent past production data or historical data (such as training data or validation data). For a meaningful comparison, we recommend that you use the training data as the comparison baseline for data drift and data quality. For prediction drift, we recommend using the validation data as the comparison baseline.
  • Specify the monitoring frequency based on how your production data will change over time. For example, if your production model has a large amount of daily traffic, and the daily data accumulation is sufficient for you to monitor, then you can configure your model monitor to run on a daily basis. Otherwise, you can consider a weekly or monthly monitoring frequency, based on the growth of your production data over time.
  • Monitor the top N important features or a subset of features. If you use training data as your comparison baseline, AzureML by default monitors data drift or data quality for the top 10 important features. For models that have a large number of features, consider monitoring a subset of those features to reduce both computation costs and monitoring noise.
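The last recommendation, monitoring only the top N important features, amounts to ranking features by importance and keeping the first N. A minimal sketch follows, assuming you already have a per-feature importance score (for example from an explainability tool); the function name and the scores are hypothetical, and this is not how AzureML computes importance internally.

```python
def top_n_features(importances: dict[str, float], n: int = 10) -> list[str]:
    """Return the n feature names with the largest absolute importance.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(importances.items(), key=lambda kv: (-abs(kv[1]), kv[0]))
    return [name for name, _ in ranked[:n]]


# Hypothetical importance scores for a small model.
importances = {"age": 0.40, "income": 0.35, "zip_code": 0.05, "tenure": 0.20}
monitored = top_n_features(importances, n=2)
print(monitored)
```

Drift and data quality metrics would then be computed only for the returned features, which is where the savings in computation cost and alert noise come from.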

Get started with AzureML model monitoring today

Get started with AzureML model monitoring today! You can find more information about AzureML Model Monitoring below:

To learn more about AzureML model monitoring, watch these Microsoft Build 2023 breakout sessions:

 

This article was originally published by Microsoft's AI - Machine Learning Blog. You can find the original article here.