Feature attributions for forecasting

Introduction

This page provides a brief conceptual overview of the feature attribution methods available with Vertex AI. For an in-depth technical discussion, see our AI Explanations Whitepaper.

Global feature importance (model feature attributions) shows you how much each feature impacts a model. The values are provided as a percentage for each feature: the higher the percentage, the more impact the feature had on model training. For example, after reviewing the global feature importance for your model, you may come to the following conclusion: "The model sees that the previous month's sales are usually the strongest predictor of the next month's sales. Factors such as customer count and promotions are important, but they are less important than the sales figures."

To view the global feature importance for your model, examine the evaluation metrics.
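
For example, the following minimal sketch uses the Vertex AI SDK for Python to list a model's evaluations, which contain the global feature importance for AutoML forecasting models. The project, region, and model ID values are placeholders.

from google.cloud import aiplatform

# Placeholder values; replace with your own project, region, and model ID.
aiplatform.init(project="my-project", location="us-central1")
model = aiplatform.Model(model_name="1234567890")

# Evaluations for AutoML forecasting models report global feature
# importance alongside the other evaluation metrics.
for evaluation in model.list_model_evaluations():
    print(evaluation.to_dict())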

Local feature attributions for time series models indicate how much each feature in a model contributed to a prediction. They measure a feature's contribution to a prediction relative to an input baseline. For numerical features such as sales, the baseline input is the median sales. For categorical features such as the product name, the baseline input is the most common product name. The sum of all attributions is not equal to the prediction. Instead, the sum indicates how much the prediction differs from the baseline prediction, which is the prediction made when every input is set to its baseline value.
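
As a purely illustrative example (these numbers are not output from Vertex AI), the following sketch shows how local attributions relate to the baseline prediction:

# Illustrative numbers only: a forecast of 1,250 units against a baseline
# forecast of 1,000 units (the prediction when every input is its baseline).
baseline_prediction = 1000.0
prediction = 1250.0

# Hypothetical local attributions for three features.
attributions = {
    "previous_month_sales": 180.0,
    "customer_count": 50.0,
    "promotions": 20.0,
}

# The attributions sum to the difference from the baseline prediction,
# not to the prediction itself.
assert sum(attributions.values()) == prediction - baseline_prediction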

Feature attributions are determined from forecasts made for counterfactuals. An example counterfactual is as follows: What would the forecast be if the advertisement value of TRUE on 2020-11-21 were replaced with FALSE, the most common value? The required number of counterfactuals scales with the number of columns and the number of paths, which the service generates. The resulting number of predictions can be orders of magnitude larger than in a normal prediction task, and the expected run time scales accordingly.
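
The following back-of-the-envelope sketch is illustrative only; the service decides the exact set of counterfactual forecasts it generates.

# Hypothetical sizes; replace with your own values.
rows_to_explain = 1_000        # forecasts that you request explanations for
attributed_columns = 20        # columns that receive attributions
path_count = 6                 # paths used by the sampled Shapley method

# Each path perturbs each attributed column for each explained row,
# so the number of counterfactual forecasts grows multiplicatively.
approx_counterfactual_forecasts = rows_to_explain * attributed_columns * path_count
print(approx_counterfactual_forecasts)  # 120000, versus 1000 ordinary forecasts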

You can use Forecasting with AutoML or Tabular Workflow for Forecasting to generate and query local feature attributions. Forecasting with AutoML supports batch predictions only. Tabular Workflow for Forecasting supports both batch predictions and online predictions.
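
For example, the following minimal sketch requests a batch prediction with feature attributions through the Vertex AI SDK for Python. The model ID and Cloud Storage paths are placeholders.

from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")
model = aiplatform.Model(model_name="1234567890")

# generate_explanation=True asks the service to attach local feature
# attributions to each forecast in the batch prediction output.
batch_job = model.batch_predict(
    job_display_name="forecast-with-attributions",
    gcs_source="gs://my-bucket/forecast_input.csv",
    instances_format="csv",
    gcs_destination_prefix="gs://my-bucket/forecast_output",
    predictions_format="jsonl",
    generate_explanation=True,
)
batch_job.wait()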

Advantages

If you inspect specific instances, and also aggregate feature attributions across your training dataset, you can get deeper insight into how your model works. Consider the following advantages:

  • Debugging models: Feature attributions can help detect issues in the data that standard model evaluation techniques would usually miss.

  • Optimizing models: You can identify and remove features that are less important, which can result in more efficient models (see the sketch following this list).
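
For example, the following small sketch (with hypothetical importance values) shortlists the features to keep when retraining a leaner model:

# Hypothetical global feature importance percentages; replace these with
# the values shown in your model's evaluation metrics.
global_importance = {
    "previous_month_sales": 62.0,
    "customer_count": 18.0,
    "promotions": 12.0,
    "store_color": 0.4,
}

# Keep only the features whose importance clears a chosen threshold.
threshold = 1.0
features_to_keep = [name for name, pct in global_importance.items() if pct >= threshold]
print(features_to_keep)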

Conceptual limitations

Consider the following limitations of feature attributions:

  • Feature attributions, including local feature importance for AutoML, are specific to individual predictions. Inspecting the feature attributions for an individual prediction may provide good insight, but that insight may not generalize to other instances of the same class or to the model as a whole.

    To get more generalizable insight for AutoML models, refer to the model feature importance. To get more generalizable insight for other models, aggregate attributions over subsets of your dataset, or over the entire dataset (see the sketch after this list).

  • Each attribution only shows how much the feature affected the prediction for that particular example. A single attribution might not reflect the overall behavior of the model. To understand approximate model behavior on an entire dataset, aggregate attributions over the entire dataset.

  • Although feature attributions can help with model debugging, they do not always indicate clearly whether an issue arises from the model or from the data that the model is trained on. Use your best judgment, and diagnose common data issues to narrow the space of potential causes.

  • The attributions depend entirely on the model and data used to train the model. They can only reveal the patterns the model found in the data, and can't detect any fundamental relationships in the data. The presence or absence of a strong attribution to a certain feature doesn't mean there is or is not a relationship between that feature and the target. The attribution merely shows that the model is or is not using the feature in its predictions.

  • Attributions alone cannot tell if your model is fair, unbiased, or of sound quality. Carefully evaluate your training data and evaluation metrics in addition to the attributions.
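
The following minimal sketch aggregates per-prediction attributions into an approximate dataset-level view. It assumes you have already collected the local attributions, for example by parsing batch prediction output; the structure and values shown are illustrative only.

from collections import defaultdict

# Hypothetical local attributions for two predictions.
local_attributions = [
    {"previous_month_sales": 180.0, "customer_count": 50.0, "promotions": 20.0},
    {"previous_month_sales": -90.0, "customer_count": 15.0, "promotions": 5.0},
]

# The mean absolute attribution per feature gives a rough, dataset-level
# view of how strongly the model relies on each feature.
totals = defaultdict(float)
for attribution in local_attributions:
    for feature, value in attribution.items():
        totals[feature] += abs(value)

mean_abs_attribution = {name: total / len(local_attributions) for name, total in totals.items()}
for name, score in sorted(mean_abs_attribution.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:.1f}")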

For more information about limitations, see the AI Explanations Whitepaper.

Improving feature attributions

The following factors have the highest impact on feature attributions:

  • The attribution methods approximate the Shapley value. You can increase the precision of the approximation by increasing the number of paths for the sampled Shapley method. Changing the path count can change the resulting attributions significantly.
  • The attributions only express how much the feature affected the change in prediction value, relative to the baseline value. Be sure to choose a meaningful baseline, relevant to the question you're asking of the model. Attribution values and their interpretation might change significantly as you switch baselines.

You can view the path count and the baselines in the Explanation Parameters and Metadata.

View explanation metadata and parameters

The Explanation Parameters and Metadata contain the following:

  • static_value: The baselines used to generate explanations.
  • pathCount: The number of paths, a factor in the amount of time it takes to generate feature attributions.
  • historical_values, prediction_values: Columns that are available at forecast time include both historical values and values for the forecast period.
  • historical_values: Columns that are unavailable at forecast time include only historical values.

You can view the model, including its explanation spec, by using the Vertex AI REST API.

REST

Before using any of the request data, make the following replacements:

  • LOCATION: The region where your model is stored.
  • PROJECT: Your project ID.
  • MODEL_ID: The ID of the model resource.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/models/MODEL_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/models/MODEL_ID "

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/models/MODEL_ID " | Select-Object -Expand Content

You should see output similar to the following for a trained AutoML model.
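
As an abbreviated illustration (not literal output), the fields relevant to feature attributions might look similar to this, where the path count and the contents of the inputs section are placeholders:

{
  "name": "projects/PROJECT/locations/LOCATION/models/MODEL_ID",
  "displayName": "my_forecasting_model",
  "explanationSpec": {
    "parameters": {
      "sampledShapleyAttribution": {
        "pathCount": 6
      }
    },
    "metadata": {
      "inputs": {
        ...
      }
    }
  },
  ...
}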

Algorithm

Vertex AI provides feature attributions using Shapley Values, a cooperative game theory algorithm that assigns credit to each player in a game for a particular outcome. Applied to machine learning models, this means that each model feature is treated as a "player" in the game, and credit is assigned to each feature in proportion to its contribution to the outcome of a particular prediction. For structured data models, Vertex AI uses a sampling approximation of exact Shapley Values called Sampled Shapley.

For in-depth information about how the sampled Shapley method works, read the paper Bounding the Estimation Error of Sampling-based Shapley Value Approximation.
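
To build intuition for the approach (this is not the service's implementation), the following sketch approximates Shapley values for a toy model by sampling random feature orderings, or "paths", and measuring each feature's marginal contribution when it is switched from its baseline value to its actual value:

import random

def sampled_shapley(predict_fn, instance, baseline, path_count=6, seed=0):
    """Approximate Shapley attributions by sampling feature orderings."""
    rng = random.Random(seed)
    features = list(instance)
    attributions = {name: 0.0 for name in features}

    for _ in range(path_count):
        order = features[:]
        rng.shuffle(order)
        current = dict(baseline)                  # start from the baseline input
        previous = predict_fn(current)
        for feature in order:
            current[feature] = instance[feature]  # switch one feature to its actual value
            new_prediction = predict_fn(current)
            attributions[feature] += new_prediction - previous
            previous = new_prediction

    return {name: total / path_count for name, total in attributions.items()}

# Toy linear "model" with illustrative placeholder inputs and baselines.
def toy_model(x):
    return 2.0 * x["previous_month_sales"] + 0.5 * x["customer_count"]

instance = {"previous_month_sales": 120.0, "customer_count": 300.0}
baseline = {"previous_month_sales": 100.0, "customer_count": 250.0}

# The attributions sum to toy_model(instance) - toy_model(baseline).
print(sampled_shapley(toy_model, instance, baseline))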

What's next

The following resources provide further useful educational material: