BigQuery Explainable AI overview
This document describes how BigQuery ML supports Explainable artificial intelligence (AI), sometimes called XAI.
Explainable AI helps you understand the results that your predictive machine learning model generates for classification and regression tasks by defining how each feature in a row of data contributed to the predicted result. This information is often referred to as feature attribution. You can use this information to verify that the model is behaving as expected, to recognize biases in your models, and to inform ways to improve your model and your training data.
For information about the supported SQL statements and functions for each model type, see End-to-end user journey for each model.
Local versus global explainability
There are two types of explainability: local explainability and global explainability. These are also known respectively as local feature importance and global feature importance.
- Local explainability returns feature attribution values for each explained example. These values describe how much a particular feature affected the prediction relative to the baseline prediction.
- Global explainability returns the feature's overall influence on the model and is often obtained by aggregating the feature attributions over the entire dataset. A higher absolute value indicates the feature had a greater influence on the model's predictions.
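As a toy illustration (not BigQuery ML code), the relationship between the two can be sketched directly: given hypothetical per-row local attributions, the global importance of a feature is the mean of the absolute attribution values across rows.

```python
# Toy illustration: deriving global feature importance from local
# feature attributions. The attribution values below are made up;
# they are not output from a real BigQuery ML model.

# Local explainability: one attribution per feature, per explained row.
local_attributions = [
    {"age": 0.8, "income": -1.2, "tenure": 0.1},
    {"age": -0.5, "income": 2.0, "tenure": 0.0},
    {"age": 0.3, "income": -0.6, "tenure": 0.2},
]

# Global explainability: mean absolute attribution per feature over all rows.
def global_importance(rows):
    features = rows[0].keys()
    return {
        f: sum(abs(r[f]) for r in rows) / len(rows)
        for f in features
    }

print(global_importance(local_attributions))
```

Here `income` ends up with the largest mean absolute attribution, so it has the greatest overall influence on the model's predictions, even though its per-row attributions vary in sign.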
Explainable AI offerings in BigQuery ML
Explainable AI in BigQuery ML supports a variety of machine learning models, including both time series and non-time series models. Each model type uses a different explainability method.
If you want to use Explainable AI on BigQuery ML models you've registered to the Model Registry, there are separate requirements to follow. To learn more, see Apply Explainable AI on BigQuery ML models.
Model category | Model types | Explainability method | Basic explanation of the method | Local explain functions | Global explain functions
---|---|---|---|---|---
Supervised models | Linear & logistic regression | Shapley values | Shapley values for linear models are equal to `model weight * feature value`, where feature values are standardized and model weights are trained with the standardized feature values. | ML.EXPLAIN_PREDICT ¹ | ML.GLOBAL_EXPLAIN ²
Supervised models | Linear & logistic regression | Standard errors and p-values | Standard errors and p-values are used for significance testing against the model weights. | NA | ML.ADVANCED_WEIGHTS ⁴
Supervised models | Boosted trees, random forest | Tree SHAP | Tree SHAP is an algorithm that computes exact SHAP values for decision tree-based models. | ML.EXPLAIN_PREDICT ¹ | ML.GLOBAL_EXPLAIN ²
Supervised models | Boosted trees, random forest | Approximate feature contribution | Approximates the feature contribution values. It is faster and simpler than Tree SHAP. | ML.EXPLAIN_PREDICT ¹ | ML.GLOBAL_EXPLAIN ²
Supervised models | Boosted trees, random forest | Gini index-based feature importance | A global feature importance score that indicates how useful or valuable each feature was in the construction of the boosted tree or random forest model during training. | NA | ML.FEATURE_IMPORTANCE
Supervised models | Deep neural network (DNN), Wide-and-Deep | Integrated gradients | A gradients-based method that efficiently computes feature attributions with the same axiomatic properties as the Shapley value. It provides a sampling approximation of exact feature attributions; its accuracy is controlled by the `integrated_gradients_num_steps` parameter. | ML.EXPLAIN_PREDICT ¹ | ML.GLOBAL_EXPLAIN ²
Supervised models | AutoML Tables | Sampled Shapley | Sampled Shapley assigns credit for the model's outcome to each feature and considers different permutations of the features. This method provides a sampling approximation of exact Shapley values. | NA | ML.GLOBAL_EXPLAIN ²
Time series models | ARIMA_PLUS | Time series decomposition | Decomposes the time series into multiple components if those components are present in the time series. The components include trend, seasonal, holiday, step changes, and spikes and dips. See the ARIMA_PLUS modeling pipeline for more details. | ML.EXPLAIN_FORECAST ³ | NA
Time series models | ARIMA_PLUS_XREG | Time series decomposition and Shapley values | Decomposes the time series into multiple components, including trend, seasonal, holiday, step changes, and spikes and dips (similar to ARIMA_PLUS). The attribution of each external regressor is calculated with Shapley values, which are equal to `model weight * feature value`. | ML.EXPLAIN_FORECAST ³ | NA
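For linear models, the Shapley values described in the table can be reproduced by hand: standardize each feature and multiply by its trained weight. The attributions then sum to the prediction's offset from the baseline prediction (the prediction at the feature means, where every standardized value is zero). A minimal sketch with made-up weights and data, not a trained BigQuery ML model:

```python
# Sketch of Shapley values for a linear model:
# attribution = model weight * standardized feature value.
# Weights, intercept, and rows are hypothetical, for illustration only.

rows = [
    [2.0, 10.0],
    [4.0, 20.0],
    [6.0, 60.0],
]
weights = [1.5, -0.4]   # assumed trained on standardized features
intercept = 3.0

# Standardize each column: z = (x - mean) / stddev (population stddev).
def standardize(rows):
    cols = list(zip(*rows))
    stats = []
    for col in cols:
        mean = sum(col) / len(col)
        var = sum((v - mean) ** 2 for v in col) / len(col)
        stats.append((mean, var ** 0.5))
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]

z_rows = standardize(rows)

# Local attribution for each row: weight * standardized feature value.
attributions = [[w * z for w, z in zip(weights, zr)] for zr in z_rows]

# Completeness check: attributions sum to prediction - baseline prediction,
# where the baseline prediction is the intercept (all z = 0 at the mean).
for zr, attr in zip(z_rows, attributions):
    pred = intercept + sum(w * z for w, z in zip(weights, zr))
    assert abs(sum(attr) - (pred - intercept)) < 1e-9
```

A row whose feature value sits exactly at the column mean gets a standardized value of zero, and therefore an attribution of zero for that feature: it moved the prediction no distance from the baseline.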
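The integrated gradients method used for DNN and Wide-and-Deep models can also be sketched on a toy differentiable function: accumulate the gradient at `num_steps` points along the straight path from a baseline to the input, then scale by the input's offset from the baseline. This mirrors the role of the `integrated_gradients_num_steps` parameter; the function and values below are made up for illustration.

```python
# Sketch of integrated gradients for a toy differentiable function:
#   IG_i(x) = (x_i - b_i) * integral over a in [0, 1] of
#             dF/dx_i evaluated at b + a * (x - b)
# approximated with num_steps gradient evaluations (midpoint rule).
# The function f below is hypothetical, not a trained model.

def f(x):
    return x[0] ** 2 + 3.0 * x[1]

def grad(func, x, eps=1e-6):
    # Central-difference numerical gradient.
    g = []
    for i in range(len(x)):
        hi, lo = list(x), list(x)
        hi[i] += eps
        lo[i] -= eps
        g.append((func(hi) - func(lo)) / (2 * eps))
    return g

def integrated_gradients(func, x, baseline, num_steps=100):
    accum = [0.0] * len(x)
    for step in range(1, num_steps + 1):
        alpha = (step - 0.5) / num_steps  # midpoint of each sub-interval
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        g = grad(func, point)
        for i in range(len(x)):
            accum[i] += g[i]
    return [(xi - b) * a / num_steps
            for xi, b, a in zip(x, baseline, accum)]

x, baseline = [2.0, 1.0], [0.0, 0.0]
attrs = integrated_gradients(f, x, baseline)

# Completeness: attributions sum to f(x) - f(baseline).
assert abs(sum(attrs) - (f(x) - f(baseline))) < 1e-3
```

Increasing `num_steps` tightens the approximation of the path integral, which is why the table describes the parameter as controlling accuracy; in a real DNN the analytic gradients come from backpropagation rather than finite differences.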
¹ ML.EXPLAIN_PREDICT is an extended version of ML.PREDICT.
² ML.GLOBAL_EXPLAIN returns the global explainability obtained by taking the mean absolute attribution that each feature receives across all rows in the evaluation dataset.
³ ML.EXPLAIN_FORECAST is an extended version of ML.FORECAST.
⁴ ML.ADVANCED_WEIGHTS is an extended version of ML.WEIGHTS.