The ML.FEATURE_IMPORTANCE function

The ML.FEATURE_IMPORTANCE function lets you see the feature importance score, which indicates how useful or valuable each feature was in the construction of the boosted tree model during training. For more information about this type of feature importance, see the feature importance definition in the XGBoost library.

For information about Explainable AI, see Explainable AI Overview.

For information about the model types that each SQL statement and function supports, and about all of the SQL statements and functions that each model type supports, see End-to-end user journey for each model.

ML.FEATURE_IMPORTANCE syntax

ML.FEATURE_IMPORTANCE(MODEL `project_id.dataset.model`)

Where:

  • project_id is your project ID.
  • dataset is the BigQuery dataset that contains the model.
  • model is the name of the model.
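
The model that you reference must be a boosted tree model. As a minimal sketch of how such a model might be trained, the following CREATE MODEL statement assumes a hypothetical training table mydataset.mytable with a label column named label; substitute your own table and column names:

CREATE MODEL `mydataset.mymodel`
OPTIONS (
  model_type = 'BOOSTED_TREE_CLASSIFIER',
  input_label_cols = ['label']
) AS
SELECT * FROM `mydataset.mytable`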

ML.FEATURE_IMPORTANCE output

ML.FEATURE_IMPORTANCE returns the following columns:

  • feature: The name of the feature column in the input training data.
  • importance_weight: The number of times a feature is used to split the data across all trees.
  • importance_gain: The average gain across all splits the feature is used in.
  • importance_cover: The average coverage across all splits the feature is used in.

If the CREATE MODEL statement that created model included a TRANSFORM clause, ML.FEATURE_IMPORTANCE returns importance information for the pre-transform columns from the query_statement.
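
For example, in the following sketch of a CREATE MODEL statement with a TRANSFORM clause, the table and column names (mydataset.mytable, f1, f2, label) are placeholders. Because f1 is transformed before training, ML.FEATURE_IMPORTANCE reports importance for the pre-transform column f1 rather than for the derived bucketized_f1 column:

CREATE MODEL `mydataset.mymodel`
TRANSFORM (
  ML.QUANTILE_BUCKETIZE(f1, 5) OVER () AS bucketized_f1,
  f2,
  label
)
OPTIONS (
  model_type = 'BOOSTED_TREE_CLASSIFIER',
  input_label_cols = ['label']
) AS
SELECT f1, f2, label FROM `mydataset.mytable`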

ML.FEATURE_IMPORTANCE permissions

You need both the bigquery.models.create and bigquery.models.getData permissions to run ML.FEATURE_IMPORTANCE.

ML.FEATURE_IMPORTANCE example

This example retrieves feature importance from mymodel in mydataset. The dataset is in your default project.

SELECT
  *
FROM
  ML.FEATURE_IMPORTANCE(MODEL `mydataset.mymodel`)
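
Because the function returns one row per feature, you can sort the output to surface the most informative features first. This variant is a sketch under the same assumptions as the example above (a boosted tree model named mymodel in mydataset):

SELECT
  feature,
  importance_gain
FROM
  ML.FEATURE_IMPORTANCE(MODEL `mydataset.mymodel`)
ORDER BY
  importance_gain DESC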

ML.FEATURE_IMPORTANCE limitations

ML.FEATURE_IMPORTANCE is supported only with boosted tree models.