The ML.FEATURE_IMPORTANCE function

The ML.FEATURE_IMPORTANCE function lets you see feature importance scores, which indicate how useful or valuable each feature was in the construction of a boosted tree model during training. For background, see the discussion of feature importance in the XGBoost library.

ML.FEATURE_IMPORTANCE returns the following columns:

  • feature: The name of the feature column in the input training data.
  • importance_weight: The number of times a feature is used to split the data across all trees.
  • importance_gain: The average gain across all splits the feature is used in.
  • importance_cover: The average coverage across all splits the feature is used in.
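Because these scores are per-feature aggregates rather than normalized shares, it can be useful to express each feature's gain as a fraction of the total. The following sketch assumes a trained model `mydataset.mymodel` and uses a standard window aggregate; it is an illustration, not part of the function itself:

```sql
SELECT
  feature,
  importance_gain / SUM(importance_gain) OVER () AS gain_fraction
FROM
  ML.FEATURE_IMPORTANCE(MODEL `mydataset.mymodel`)
ORDER BY gain_fraction DESC
```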

If the TRANSFORM clause was present in the CREATE MODEL statement that created the model, ML.FEATURE_IMPORTANCE reports the importance of the pre-transform columns from query_statement.
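For example, given a model trained with a TRANSFORM clause like the following sketch (the table `mydataset.mytable` and columns f1, f2, and label are hypothetical), ML.FEATURE_IMPORTANCE would report importance for the pre-transform columns f1 and f2, not for the derived bucketized_f1 column:

```sql
CREATE OR REPLACE MODEL `mydataset.mymodel`
  TRANSFORM(
    ML.QUANTILE_BUCKETIZE(f1, 5) OVER () AS bucketized_f1,
    f2,
    label
  )
  OPTIONS(model_type = 'BOOSTED_TREE_CLASSIFIER',
          input_label_cols = ['label'])
AS
SELECT f1, f2, label FROM `mydataset.mytable`
```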

ML.FEATURE_IMPORTANCE permissions

You need both the bigquery.models.create and bigquery.models.getData permissions to run ML.FEATURE_IMPORTANCE.

ML.FEATURE_IMPORTANCE syntax

ML.FEATURE_IMPORTANCE(MODEL `project_id.dataset.model`)

Where:

  • project_id is your project ID.
  • dataset is the BigQuery dataset that contains the model.
  • model is the name of the model.

ML.FEATURE_IMPORTANCE example

The following example retrieves feature importance from mymodel in mydataset. The dataset is in your default project.

SELECT
  *
FROM
  ML.FEATURE_IMPORTANCE(MODEL `mydataset.mymodel`)
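Since the output is an ordinary result set, you can sort and filter it like any other query. The following sketch, again assuming `mydataset.mymodel`, ranks features by average gain and keeps the top ten:

```sql
SELECT
  feature,
  importance_weight,
  importance_gain,
  importance_cover
FROM
  ML.FEATURE_IMPORTANCE(MODEL `mydataset.mymodel`)
ORDER BY importance_gain DESC
LIMIT 10
```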

ML.FEATURE_IMPORTANCE limitations

The ML.FEATURE_IMPORTANCE function is subject to the following limitations: