The ML.PRINCIPAL_COMPONENTS function

This document describes the ML.PRINCIPAL_COMPONENTS function, which lets you see the principal components of a principal component analysis (PCA) model. Principal components and eigenvectors are the same concepts in PCA models.

Syntax

ML.PRINCIPAL_COMPONENTS(MODEL `project_id.dataset.model`)

Arguments

ML.PRINCIPAL_COMPONENT takes the following arguments:

  • project_id: Your project ID.
  • dataset: The BigQuery dataset that contains the model.
  • model: The name of the model.

Output

ML.PRINCIPAL_COMPONENTS returns the following columns:

  • principal_component_id: an INT64 that contains the principal component ID.
  • feature: a STRING value that contains the feature column name.
  • numerical_value: a FLOAT64 value that contains the feature value for the principal component that principal_component_id identifies if the column identified by the feature value is numeric. Otherwise, numerical_value is NULL.
  • categorical_value: an ARRAY<STRUCT> value that contains information about categorical features. Each struct contains the following fields:

    • categorical_value.category: a STRING value that contains the name of each category.
    • categorical_value.value: a FLOAT64 value that contains the value of categorical_value.category for the principal component that principal_component_id identifies.

The output is in descending order by the eigenvalues of the principal components, which you can get by using the ML.PRINCIPAL_COMPONENT_INFO function.

Example

The following example retrieves the principal components from the model mydataset.mymodel in your default project:

SELECT
  *
FROM
  ML.PRINCIPAL_COMPONENTS(MODEL `mydataset.mymodel`)

What's next