The ML.CENTROIDS function


The ML.CENTROIDS function returns information about the centroids in a k-means model.

For information about model weights support in BigQuery ML, see Model weights overview.

For information about the model types that each SQL statement and function supports, and the SQL statements and functions that each model type supports, see End-to-end user journey for each model.

ML.CENTROIDS syntax

ML.CENTROIDS(MODEL `project_id.dataset.model`
             [, STRUCT(<T> as standardize)])

Replace the following:

  • project_id is your project ID.
  • dataset is the BigQuery dataset that contains the model.
  • model is the name of the model.
  • standardize is an optional BOOL value that determines whether the centroid features are standardized so that all features are assumed to have a mean of zero and a standard deviation of one. Standardizing puts the features on a common scale, so the magnitudes of the centroid values can be compared across features. The default value is false. The supplied value must be the only field in the STRUCT.

ML.CENTROIDS output

ML.CENTROIDS returns the following columns:

  • centroid_id. An integer that identifies the centroid.
  • feature. The name of the feature column.
  • numerical_value. If feature is numeric, the value of feature for the centroid that centroid_id identifies. If feature is not numeric, the value is NULL.
  • categorical_value. An ARRAY of STRUCTs containing information about categorical features. Each STRUCT contains the following fields:
    • categorical_value.category. The name of each category.
    • categorical_value.value. The value of categorical_value.category for the centroid that centroid_id identifies.
  • geography_value. If feature is of type GEOGRAPHY, the value of feature for the centroid that centroid_id identifies. If not, the value is NULL.

The output contains one row per feature per centroid.
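Because categorical_value is an ARRAY of STRUCTs, it is often easier to read after it is flattened to one row per category. The following is a minimal sketch that assumes the field names category and value described in the list above; mydataset.my_kmeans_model is a placeholder model name and the alias c is illustrative.

```sql
-- Flatten categorical_value to one row per centroid, feature, and category.
-- Rows whose categorical_value array is empty (numerical features) drop out
-- of the result, because the correlated UNNEST produces no rows for them.
SELECT
  centroid_id,
  feature,
  c.category,
  c.value
FROM
  ML.CENTROIDS(MODEL `mydataset.my_kmeans_model`),
  UNNEST(categorical_value) AS c
ORDER BY
  centroid_id,
  feature,
  c.category;
```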

ML.CENTROIDS examples

The following examples demonstrate the use of ML.CENTROIDS in a query.

ML.CENTROIDS without standardization

The following example retrieves centroid information from the k-means model my_kmeans_model in mydataset. This model contains only numerical features.

SELECT
  *
FROM
  ML.CENTROIDS(MODEL `mydataset.my_kmeans_model`);

This query returns results like the following:

+-------------+-------------------+----------------------+---------------------+
| centroid_id | feature           | numerical_value      | categorical_value   |
+-------------+-------------------+----------------------+---------------------+
|           3 | x_coordinate      |            3095929.0 |                  [] |
|           3 | y_coordinate      | 1.0089726307692308E7 |                  [] |
|           2 | x_coordinate      |        3117072.65625 |                  [] |
|           2 | y_coordinate      | 1.0083220745833334E7 |                  [] |
|           1 | x_coordinate      |    3259947.096227731 |                  [] |
|           1 | y_coordinate      | 1.0105690227895036E7 |                  [] |
|           4 | x_coordinate      |   3109887.9056603773 |                  [] |
|           4 | y_coordinate      | 1.0057112358490566E7 |                  [] |
+-------------+-------------------+----------------------+---------------------+
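Because the output has one row per feature per centroid, it can be convenient to pivot it into one row per centroid. The following is a minimal sketch that uses conditional aggregation. It assumes the model has exactly the two numerical features shown in the preceding results, x_coordinate and y_coordinate; the output column aliases are illustrative.

```sql
-- Pivot the long-format output into one row per centroid.
-- Each (centroid_id, feature) pair appears once, so MAX simply picks
-- the single non-NULL value for that feature.
SELECT
  centroid_id,
  MAX(IF(feature = 'x_coordinate', numerical_value, NULL)) AS x_coordinate,
  MAX(IF(feature = 'y_coordinate', numerical_value, NULL)) AS y_coordinate
FROM
  ML.CENTROIDS(MODEL `mydataset.my_kmeans_model`)
GROUP BY
  centroid_id
ORDER BY
  centroid_id;
```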

The following example retrieves centroid information from the k-means model my_kmeans_model in mydataset. This model contains categorical features.

SELECT
  *
FROM
  ML.CENTROIDS(MODEL `mydataset.my_kmeans_model`)
ORDER BY
  centroid_id;

This query returns results like the following:

+-------------+-------------------+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| centroid_id | feature           |numerical_value| categorical_value                                                                                                                                                                                                                                              |
+-------------+-------------------+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|           1 | department        |          NULL | [{"category":"Medieval Art","feature_value":"1.0"}]                                                                                                                                                                                                            |
|           1 | medium            |          NULL | [{"category":"Iron","feature_value":"0.21602160216021601"},{"category":"Glass, ceramic","feature_value":"0.3933393339333933"},{"category":"Copper alloy","feature_value":"0.39063906390639064"}]                                                               |
|           2 | medium            |          NULL | [{"category":"Wood, gesso, paint","feature_value":"0.15"},{"category":"Carnelian","feature_value":"0.2692307692307692"},{"category":"Papyrus, ink","feature_value":"0.2653846153846154"},{"category":"Steatite, glazed","feature_value":"0.3153846153846154"}] |
|           2 | department        |          NULL | [{"category":"Egyptian Art","feature_value":"1.0"}]                                                                                                                                                                                                            |
|           3 | medium            |          NULL | [{"category":"Faience","feature_value":"1.0"}]                                                                                                                                                                                                                 |
|           3 | department        |          NULL | [{"category":"Egyptian Art","feature_value":"1.0"}]                                                                                                                                                                                                            |
|           4 | medium            |          NULL | [{"category":"Steatite","feature_value":"1.0"}]                                                                                                                                                                                                                |
|           4 | department        |          NULL | [{"category":"Egyptian Art","feature_value":"1.0"}]                                                                                                                                                                                                            |
|           5 | medium            |          NULL | [{"category":"Red quartzite","feature_value":"0.20316027088036118"},{"category":"Bronze or copper alloy","feature_value":"0.3476297968397291"},{"category":"Gold","feature_value":"0.4492099322799097"}]                                                       |
|           5 | department        |          NULL | [{"category":"Egyptian Art","feature_value":"1.0"}]                                                                                                                                                                                                            |
+-------------+-------------------+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

The following are the results from the same query against a k-means model with both numerical and categorical features.

+-------------+--------------------+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| centroid_id |      feature       |  numerical_value  | categorical_value                                                                                                                                                                                                                                                                 |
+-------------+--------------------+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|           1 | start_station_name |              NULL | [{"category":"Toomey Rd @ South Lamar","value":"0.5714285714285714"},{"category":"State Capitol @ 14th & Colorado","value":"0.42857142857142855"}]                                                                                                                                |
|           1 | duration_minutes   | 9.142857142857142 | []                                                                                                                                                                                                                                                                                |
|           2 | duration_minutes   |               9.0 | []                                                                                                                                                                                                                                                                                |
|           2 | start_station_name |              NULL | [{"category":"Rainey @ River St","value":"0.14285714285714285"},{"category":"11th & San Jacinto","value":"0.42857142857142855"},{"category":"ACC - West & 12th Street","value":"0.14285714285714285"},{"category":"East 11th St. at Victory Grill","value":"0.2857142857142857"}] |
+-------------+--------------------+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
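To summarize results like these, one option is to keep only the highest-valued category for each categorical feature of each centroid. The following sketch assumes the STRUCT fields are named category and value, as in the preceding results. Because the values are displayed as quoted strings, the sketch applies SAFE_CAST to FLOAT64 before ordering; that cast is a precaution rather than a documented requirement. The alias top_category is illustrative.

```sql
-- For each centroid and categorical feature, return the category with the
-- largest value. Numerical features have an empty categorical_value array
-- and are filtered out.
SELECT
  centroid_id,
  feature,
  (
    SELECT AS STRUCT
      c.category,
      c.value
    FROM
      UNNEST(categorical_value) AS c
    ORDER BY
      SAFE_CAST(c.value AS FLOAT64) DESC
    LIMIT 1
  ) AS top_category
FROM
  ML.CENTROIDS(MODEL `mydataset.my_kmeans_model`)
WHERE
  ARRAY_LENGTH(categorical_value) > 0
ORDER BY
  centroid_id,
  feature;
```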

ML.CENTROIDS with standardization

The following example retrieves standardized centroid information from the k-means model my_kmeans_model in mydataset. Because standardize is set to true, the returned centroid values assume that all features have a mean of zero and a standard deviation of one.

SELECT
  *
FROM
  ML.CENTROIDS(MODEL `mydataset.my_kmeans_model`,
               STRUCT(TRUE AS standardize));
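To see the effect of standardization, the standardized and unstandardized values can be placed side by side. The following is a minimal sketch that calls ML.CENTROIDS twice on the same model and joins the results on centroid_id and feature. The aliases unstd and std and the output column names are illustrative, and the filter limits the comparison to numerical features.

```sql
-- Compare unstandardized and standardized centroid values for each
-- numerical feature of each centroid.
SELECT
  unstd.centroid_id,
  unstd.feature,
  unstd.numerical_value AS unstandardized_value,
  std.numerical_value AS standardized_value
FROM
  ML.CENTROIDS(MODEL `mydataset.my_kmeans_model`) AS unstd
INNER JOIN
  ML.CENTROIDS(MODEL `mydataset.my_kmeans_model`,
               STRUCT(TRUE AS standardize)) AS std
  ON unstd.centroid_id = std.centroid_id
  AND unstd.feature = std.feature
WHERE
  unstd.numerical_value IS NOT NULL
ORDER BY
  unstd.centroid_id,
  unstd.feature;
```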