How to explain and interpret a model using Responsible AI (Part 7)

Assessing a model is not just about understanding how accurately it can make a prediction, but also why it made the prediction. Understanding a model's behavior is a critical part of debugging and helps drive responsible outputs. By evaluating which data features are driving a model's prediction, you can determine whether they are acceptable features, sensitive or non-sensitive, to base a decision on. For instance, if a model is using race or gender to predict a diabetic patient's time in the hospital, then that's a red flag to investigate the model. In addition, being able to explain a model's outcome provides shared understanding for data scientists, decision-makers, end-users and auditors. Some industries have compliance regulations that require organizations to provide an explanation for how and why a model made the prediction it did. If an AI system is driving the decision-making, then data scientists need to specify the data features driving the model to make a prediction.

That is where Azure's Responsible AI (RAI) dashboard is beneficial. The Feature Importance component of the RAI dashboard provides an interactive user interface (UI) that enables data scientists or AI developers to see the top features in their dataset that influence their model's prediction. In addition, it provides both global explanations and local explanations. With global explanations, the dashboard displays the top features that affect the model's overall predictions. For local explanations, it shows which features most influenced a prediction for an individual data point. In our diabetes hospital readmission use case, every patient is different, so the features that drove the model's prediction for one patient may not be as important for another patient.

The Feature Importance component has built-in model explainability and interpretability capabilities to help users answer questions in scenarios such as:

  • Model debugging: Why did my model make this mistake? How can I improve my model?
  • Human-AI collaboration: How can I understand and trust the model's decisions?
  • Regulatory compliance: Does my model satisfy legal requirements?

In this tutorial, we will explore how to use the Feature Importance section of the RAI dashboard. This is a continuation of the Diabetes Hospital use case we've been using in this tutorial series. In the prior tutorial, we discovered gaps in our dataset distribution where the data was over- or underrepresented, causing our model to have fairness, inclusivity, and reliability issues. We used the Data Analysis feature of the RAI dashboard to investigate where the model was performing poorly. Now, let us explore why it performed poorly, using Feature Importance.


Aggregate feature importance

The Feature Importance component of the RAI dashboard enables users to get a comprehensive understanding of why and how a model made a prediction. The RAI dashboard displays the top data features that drove a model's overall predictions in the Feature Importance section of the dashboard. This is also known as the global explanation.


A user can toggle the slider back and forth on top of the chart to display all the features, which are ordered in descending order of importance on the x-axis. The y-axis shows how much weight a feature has in driving a model's prediction compared to the other features. The color of the bar(s) on the chart corresponds to the cohorts created on the dashboard. In our case, we have “All data”, which is the default cohort with the test dataset, as well as the cohorts with the highest and lowest number of errors. It looks like prior hospitalization (prior_Inpatient), age, number of other health diagnoses (number_diagnoses), prior emergency admission (prior_emergency), and the number of medications an individual is taking (num_medications) are the top 5 features driving our diabetic hospital readmission classification model's predictions.
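To make the idea of a global explanation more concrete, here is a minimal sketch of one common technique, permutation importance, run on synthetic data. It is not the dashboard's own implementation. The toy model, its weights, and the generated patient rows are assumptions made purely for illustration; only the feature names echo the tutorial.

```python
import random

FEATURES = ["prior_inpatient", "age", "number_diagnoses", "gender"]

def toy_model(row):
    """Hypothetical classifier: 1 = Readmitted, 0 = Not Readmitted."""
    score = (0.9 * row["prior_inpatient"]
             + 0.03 * row["age"]
             + 0.3 * row["number_diagnoses"])  # gender is deliberately unused
    return 1 if score > 4.5 else 0

def accuracy(model, rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(model, rows, labels, feature, rng, repeats=10):
    """Average drop in accuracy when one feature's column is shuffled."""
    baseline = accuracy(model, rows, labels)
    drops = []
    for _ in range(repeats):
        shuffled = [r[feature] for r in rows]
        rng.shuffle(shuffled)
        permuted = [dict(r, **{feature: v}) for r, v in zip(rows, shuffled)]
        drops.append(baseline - accuracy(model, permuted, labels))
    return sum(drops) / repeats

rng = random.Random(0)  # fixed seed so the sketch is repeatable
rows = [
    {"prior_inpatient": rng.randint(0, 5), "age": rng.randint(20, 90),
     "number_diagnoses": rng.randint(1, 9), "gender": rng.randint(0, 1)}
    for _ in range(500)
]
labels = [toy_model(r) for r in rows]  # synthetic labels produced by the toy model

importances = {f: permutation_importance(toy_model, rows, labels, f, rng)
               for f in FEATURES}
ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
for name, drop in ranked:
    print(f"{name:18s} {drop:.3f}")
```

Because the toy model never reads gender, shuffling that column cannot change any prediction, so its importance is zero. The features the model does rely on lose accuracy when shuffled, which is the kind of ranking the aggregate chart visualizes.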

Feature Influence on a Model Prediction

In the debugging process, users have the ability to evaluate features to see how their values positively or negatively influence a model's outcome. This can help pinpoint any anomalies in the model's decision making. To put this into practice, let's start by:

  1. Selecting the “Class: Not Readmitted” option under the Class importance weights drop-down menu on the right-hand side of the dashboard.
  2. The dashboard gives you the ability to double-click on any bar or box on the chart to get details. In our case, we'll double-click on the “number_diagnoses” bar from the “All data” cohort (in blue color).
  3. This generates another chart below the Aggregate feature importance chart.
  4. The x-axis displays the number of diagnoses that were entered into the system for the diabetic patient.
  5. The y-axis displays the level of contribution to the model making a prediction of Not Readmitted.
    • Numbers above 0 show the level of positive contribution to the model's prediction of Not Readmitted.
    • Numbers below 0 show the level of negative contribution, which pushes the model's prediction away from Not Readmitted and toward Readmitted.


As you can see from the graph, as “number_diagnoses” progresses from 1 to 9, the model's confidence that a patient will not be readmitted to the hospital within 30 days decreases. The chart shows that once the number of diagnoses is 7 or greater, the feature starts to negatively impact the model's prediction of a “Not Readmitted” classification by falling below the 0 axis. This pushes the prediction toward a “Readmitted” classification. This makes intuitive sense because a diabetic patient with additional medical conditions is more likely to get sick and return to the hospital again.
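The sign flip around 7 diagnoses can be illustrated with a toy additive attribution. The sketch below assumes a hypothetical linear score for the Not Readmitted class, where the contribution of number_diagnoses is its weight times its distance from a baseline value. The weight and baseline are invented so that the crossing point matches the chart. The dashboard's own explainer is not reproduced here.

```python
# Hypothetical weight and baseline, chosen only so the sign flips at 7 diagnoses.
WEIGHT = -0.05    # each extra diagnosis lowers the Not Readmitted score
BASELINE = 6.5    # reference value at which the contribution is zero

def contribution(number_diagnoses):
    """Contribution of number_diagnoses toward the Not Readmitted class."""
    return WEIGHT * (number_diagnoses - BASELINE)

contributions = {n: contribution(n) for n in range(1, 10)}
for n, c in contributions.items():
    direction = "toward Not Readmitted" if c > 0 else "toward Readmitted"
    print(f"number_diagnoses={n}: {c:+.3f} ({direction})")

# Smallest value at which the feature starts pushing against Not Readmitted.
first_negative = min(n for n, c in contributions.items() if c < 0)
print("first negative contribution at", first_negative)
```

Values above zero support the selected class and values below zero work against it, which is how the y-axis of the dashboard chart is read.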

Individual feature importance

The RAI dashboard Feature Importance section has a table view that enables users to see for which records the model made a correct vs. incorrect prediction. You can use each individual patient's record to see which features positively or negatively drove that individual outcome. This is especially useful when debugging to see where the model is performing erroneously for a specific patient, and which features are positive or negative contributors.


If we need to look at specific record(s) and see what features the model used in making an incorrect prediction, we can use that for our investigation.
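As a rough illustration of what the table view and the cohort switch are doing, the sketch below computes an error rate per cohort and lists the misclassified records in the worst one. The records and cohort names are hypothetical. Only the outcomes for indexes 882 and 865 mirror the ones discussed in this tutorial.

```python
from collections import defaultdict

# Hypothetical records; only the outcomes for 882 and 865 mirror the tutorial.
records = [
    {"index": 101, "cohort": "low_error", "actual": "Not Readmitted", "predicted": "Not Readmitted"},
    {"index": 102, "cohort": "low_error", "actual": "Readmitted", "predicted": "Readmitted"},
    {"index": 103, "cohort": "low_error", "actual": "Not Readmitted", "predicted": "Readmitted"},
    {"index": 104, "cohort": "low_error", "actual": "Not Readmitted", "predicted": "Not Readmitted"},
    {"index": 865, "cohort": "high_error", "actual": "Not Readmitted", "predicted": "Readmitted"},
    {"index": 882, "cohort": "high_error", "actual": "Readmitted", "predicted": "Not Readmitted"},
    {"index": 883, "cohort": "high_error", "actual": "Readmitted", "predicted": "Readmitted"},
]

def is_error(record):
    return record["actual"] != record["predicted"]

def error_rate_by_cohort(rows):
    """Fraction of misclassified records in each cohort."""
    totals, errors = defaultdict(int), defaultdict(int)
    for r in rows:
        totals[r["cohort"]] += 1
        errors[r["cohort"]] += int(is_error(r))
    return {c: errors[c] / totals[c] for c in totals}

rates = error_rate_by_cohort(records)
worst_cohort = max(rates, key=rates.get)
incorrect = [r["index"] for r in records
             if r["cohort"] == worst_cohort and is_error(r)]
print(worst_cohort, incorrect)
```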

In our case, we're going to:

  1. Click on “Switch cohort” to select the cohort where the model has the highest error rate.
  2. Next, we'll select record index #882. This will generate a Feature Importance plot under the Table view. Here we see that “Prior_Inpatient”, “Age”, “Max_Glu_Serum” and “num_medications” are the top 4 negative contributors for this record, where the model incorrectly predicted that the selected patient will not be readmitted within 30 days (the outcome should be Readmitted).


We can see that the 4 features from record index #882 are different from the top 4 features we saw with the model's overall predictions from the Aggregate feature importance tab.

Next, the Individual feature importance graph shows that “Admission_source”, “prior_emergency”, “gender” and “insulin” positively contributed to the model's outcome (Not Readmitted). Since the model incorrectly predicted record index #882 as Not Readmitted, these positively contributing features played a significant role in skewing the model's output for this data point.

Now we'll add record index #865, another record where the model was wrong, this time in the opposite direction: it incorrectly predicted that the patient would be readmitted to the hospital.


Here, we can see that the key features positively contributing to the model's prediction are “Prior_Emergency” and “Insulin”. Once again, we see that the top features (in blue color) that drove the model's prediction have changed. In this case, “Prior_Emergency” was the top positive contributor. That means it had a major impact on the model's incorrect prediction in our selected cohort. When trying to debug why a model's prediction is erroneous for a given data point, this chart provides ML professionals with an explanation of which features positively influenced the poor outcome.
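The per-record breakdown described in this section can be sketched with a toy additive model, where each feature's contribution is its weight times its distance from a baseline value. All weights, baselines, and patient values below are hypothetical and are not taken from the real model or from records #882 and #865. The dashboard's own explainer may compute contributions differently.

```python
# Hypothetical weights toward the Not Readmitted class and hypothetical
# baseline (average) feature values; none of these come from the real model.
WEIGHTS = {"prior_inpatient": -0.8, "age": -0.02, "num_medications": -0.03,
           "prior_emergency": 0.5, "insulin": 0.4}
BASELINE = {"prior_inpatient": 1.0, "age": 60.0, "num_medications": 15.0,
            "prior_emergency": 0.5, "insulin": 0.5}

def local_contributions(record):
    """Per-feature contribution for one record: weight * (value - baseline)."""
    return {f: WEIGHTS[f] * (record[f] - BASELINE[f]) for f in WEIGHTS}

def split_by_sign(contribs):
    """Rank features by magnitude, then separate positive from negative."""
    ranked = sorted(contribs.items(), key=lambda kv: abs(kv[1]), reverse=True)
    positive = [f for f, c in ranked if c > 0]
    negative = [f for f, c in ranked if c < 0]
    return positive, negative

# A hypothetical patient record.
patient = {"prior_inpatient": 3, "age": 75, "num_medications": 22,
           "prior_emergency": 2, "insulin": 1}
contribs = local_contributions(patient)
positive, negative = split_by_sign(contribs)
print("pushing toward Not Readmitted:", positive)
print("pushing toward Readmitted:", negative)
```

When a prediction is wrong, the features in the list that supports the predicted class are the first place to look, which is the reasoning applied to both records above.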

Individual Conditional Expectation (using ICE plot)

Finally, you have the ability to select a feature and see the model's prediction for the different values in that feature using the Individual Conditional Expectation (ICE) plot. To do this, we need to select the “Individual Conditional Expectation (ICE) plot” radio button. Next, in the Feature drop-down menu, we'll select “A1CResult”. The orange dots represent record index #882 that the model incorrectly predicted as Not Readmitted. The blue dots represent record index #865 that the model incorrectly predicted as Readmitted.


The ICE plot, in this case, shows the model's predictions for the different “A1CResult” values: “Norm”, “>7”, “>8” and “None” (“None” is used when there is no A1C test). As we can see in the chart, a patient has the lowest chance of not being readmitted when the “A1CResult” value is “>8” for both record index #882 and #865. Both the orange and blue dots are at their lowest point at the “>8” mark on the axis, where the model's predicted probability of “not readmitted” is low for both records. This makes sense because an A1C result greater than 8 is considered a very high level and is serious for diabetic patients, hence the need to be readmitted in combination with other factors. As you can see, this is a good way to see how a feature's value impacts a model's prediction.
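Conceptually, an ICE curve is built by holding one record fixed, sweeping a single feature across its possible values, and recording the model's prediction each time. The sketch below does this for “A1CResult” with a toy logistic model. The model, its coefficients, and the two records' feature values are hypothetical stand-ins, not the real values behind record index #882 and #865.

```python
import math

A1C_VALUES = ["None", "Norm", ">7", ">8"]

# Hypothetical effect of each A1CResult category on the Not Readmitted log-odds.
A1C_EFFECT = {"None": 0.2, "Norm": 0.4, ">7": -0.1, ">8": -0.6}

def predict_not_readmitted(record):
    """Toy model: probability of Not Readmitted from a logistic score."""
    score = 0.5 - 0.4 * record["prior_inpatient"] + A1C_EFFECT[record["A1CResult"]]
    return 1.0 / (1.0 + math.exp(-score))

def ice_curve(model, record, feature, values):
    """Predictions for one record as a single feature is swept over values."""
    return {v: model(dict(record, **{feature: v})) for v in values}

record_a = {"prior_inpatient": 0, "A1CResult": "None"}  # hypothetical values
record_b = {"prior_inpatient": 3, "A1CResult": "Norm"}  # hypothetical values

curves = {name: ice_curve(predict_not_readmitted, rec, "A1CResult", A1C_VALUES)
          for name, rec in [("record_a", record_a), ("record_b", record_b)]}
lowest = {name: min(curve, key=curve.get) for name, curve in curves.items()}
for name, curve in curves.items():
    print(name, {v: round(p, 3) for v, p in curve.items()})
```

Each record gets its own curve because the other feature values differ, but in this toy setup both curves bottom out at the same category, similar to the pattern in the dashboard's plot.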

This tutorial shows how the Feature Importance component opens up the black box of how the model went about making a prediction. It provides a global understanding of what key features the model uses to make a prediction. In addition, for each feature, the user has the ability to explore and visualize how it is positively or negatively influencing the model's prediction for one class vs. another for each value in the feature. This exposes the thresholds the model has to produce a certain outcome. We saw this in the “Number_Diagnoses” feature.

Next, when we are debugging a model or just want to know why the model made an incorrect prediction for a specific data point, the dashboard shows the key feature contributors that positively and negatively influenced a model's output. All of this provides transparency to understand a model's behavior. This transparency helps uncover whether the model is using inappropriate features, sensitive or non-sensitive, to make predictions, so data scientists or stakeholders can mitigate the error. This level of detail is essential to help hospitals provide transparency during auditing to determine why the model made an incorrect or correct prediction that a diabetic patient was at risk of being readmitted within 30 days. It also makes it easier to evaluate if the key features influencing a model's prediction will lead to fairness issues.

Now we are ready to learn how to generate desired model predictions from minimum data feature changes using Counterfactuals. Stay tuned for tutorial 8.

DISCLAIMER: Microsoft products and services (1) are not designed, intended or made available as a medical device, and (2) are not designed or intended to be a substitute for professional medical advice, diagnosis, treatment, or judgment and should not be used to replace or as a substitute for professional medical advice, diagnosis, treatment, or judgment. Customers/partners are responsible for ensuring their solutions comply with applicable laws and regulations. Customers/partners also will need to thoroughly test and evaluate whether an AI tool is fit for purpose and identify and mitigate any risks or harms to end users associated with its use.


This article was originally published by Microsoft's AI - Machine Learning Blog. You can find the original article here.