The ML.PRINCIPAL_COMPONENTS function

The ML.PRINCIPAL_COMPONENTS function returns the principal components of a principal component analysis (PCA) model. In a PCA model, principal components and eigenvectors are the same concept.

For information about model weights support in BigQuery ML, see Model weights overview.

For information about the model types that each SQL statement and function supports, and for a list of all supported SQL statements and functions for each model type, see End-to-end user journey for each model.

ML.PRINCIPAL_COMPONENTS syntax

ML.PRINCIPAL_COMPONENTS(MODEL `project_id.dataset.model`)

Where:

  • project_id: your project ID
  • dataset: the BigQuery dataset that contains the model
  • model: the name of the model

ML.PRINCIPAL_COMPONENTS output

The ML.PRINCIPAL_COMPONENTS function returns the following columns:

  • principal_component_id. An integer that identifies the principal component.
  • feature. The name of the feature column.
  • numerical_value. If feature is numerical, the value of feature for the principal component that principal_component_id identifies. If feature is not numerical, the value is NULL.
  • categorical_value. An ARRAY of STRUCTs containing information about categorical features. Each STRUCT contains the following fields:
    • categorical_value.category. The name of each category.
    • categorical_value.value. The value of categorical_value.category for the principal component that principal_component_id identifies.

The principal components are ordered by descending eigenvalue. You can retrieve the eigenvalues by using the ML.PRINCIPAL_COMPONENT_INFO function.
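To see the ordering alongside the component values, you can join the two functions on principal_component_id. The following query is a minimal sketch, not a definitive recipe: it assumes a PCA model named mymodel in a dataset named mydataset (placeholder names), and it assumes that ML.PRINCIPAL_COMPONENT_INFO returns principal_component_id and eigenvalue columns.

```sql
-- Sketch: attach each component's eigenvalue to its per-feature values.
-- `mydataset.mymodel` is a placeholder; replace it with your own PCA model.
-- Assumes ML.PRINCIPAL_COMPONENT_INFO exposes principal_component_id and eigenvalue.
SELECT
  pc.principal_component_id,
  pc.feature,
  pc.numerical_value,
  info.eigenvalue
FROM
  ML.PRINCIPAL_COMPONENTS(MODEL `mydataset.mymodel`) AS pc
JOIN
  ML.PRINCIPAL_COMPONENT_INFO(MODEL `mydataset.mymodel`) AS info
ON
  pc.principal_component_id = info.principal_component_id
ORDER BY
  info.eigenvalue DESC,
  pc.feature
```

Rows for categorical features have a NULL numerical_value in this result, because their values are stored in categorical_value instead.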

ML.PRINCIPAL_COMPONENTS examples

The following example retrieves the principal components from mymodel in mydataset. The dataset is in your default project.

SELECT
  *
FROM
  ML.PRINCIPAL_COMPONENTS(MODEL `mydataset.mymodel`)
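
Because categorical_value is an ARRAY of STRUCTs, it can be easier to read once flattened. The following query is a sketch under the same assumptions as the previous example (a PCA model named mymodel in mydataset, in your default project) that produces one row per component, feature, and category.

```sql
-- Sketch: flatten categorical_value into one row per category.
-- The comma join with UNNEST drops rows whose array is NULL or empty,
-- so numerical features do not appear in this result.
SELECT
  pc.principal_component_id,
  pc.feature,
  cv.category,
  cv.value
FROM
  ML.PRINCIPAL_COMPONENTS(MODEL `mydataset.mymodel`) AS pc,
  UNNEST(pc.categorical_value) AS cv
ORDER BY
  pc.principal_component_id,
  pc.feature,
  cv.category
```

To keep numerical features in the same result, one option is to replace the comma join with LEFT JOIN UNNEST(pc.categorical_value) AS cv and add pc.numerical_value to the select list.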