The ML.CENTROIDS function
The ML.CENTROIDS function returns information about the centroids in a k-means model.
For information about model weights support in BigQuery ML, see Model weights overview.
For information about supported model types of each SQL statement and function, and all supported SQL statements and functions for each model type, read End-to-end user journey for each model.
ML.CENTROIDS syntax
ML.CENTROIDS(MODEL `project_id.dataset.model` [, STRUCT(<T> as standardize)])
Replace the following:
- project_id is your project ID.
- dataset is the BigQuery dataset that contains the model.
- model is the name of the model.
- standardize is an optional parameter that determines whether the centroid features should be standardized to assume that all features have a mean of zero and a standard deviation of one. Standardizing the features allows the absolute magnitudes of the values to be compared to each other. The default value is false. The value that is supplied must be the only field in a STRUCT.
ML.CENTROIDS output
ML.CENTROIDS returns the following columns:
- centroid_id. An integer that identifies the centroid.
- feature. The column name that contains the feature.
- numerical_value. If feature is numeric, the value of feature for the centroid that centroid_id identifies. If feature is not numeric, the value is NULL.
- categorical_value. An ARRAY of STRUCTs containing information about categorical features. Each STRUCT contains the following fields:
  - categorical_value.category. The name of each category.
  - categorical_value.value. The value of categorical_value.category for the centroid that centroid_id identifies.
- geography_value. If feature is of type GEOGRAPHY, the value of feature for the centroid that centroid_id identifies. If not, the value is NULL.
The output contains one row per feature per centroid.
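Because categorical_value is an ARRAY of STRUCTs, it can be convenient to flatten it so that each category gets its own row. The following query is a minimal sketch of one way to do that. It assumes a k-means model named mydataset.my_kmeans_model (a placeholder) with at least one categorical feature, and it relies on the category and value field names listed above; the alias c is arbitrary.

```sql
-- Sketch: one row per centroid, feature, and category.
-- `mydataset.my_kmeans_model` is a placeholder model name.
SELECT
  centroid_id,
  feature,
  c.category,
  c.value
FROM
  ML.CENTROIDS(MODEL `mydataset.my_kmeans_model`),
  UNNEST(categorical_value) AS c  -- correlated join; skips rows with an empty array
ORDER BY
  centroid_id,
  feature,
  c.category;
```

The comma join with UNNEST drops rows whose categorical_value array is empty, so numerical features do not appear in this result.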
ML.CENTROIDS examples
The following examples demonstrate the use of ML.CENTROIDS in a query.
ML.CENTROIDS without standardization
The following example retrieves centroid information from the k-means model my_kmeans_model in mydataset. This model only contains numerical features.
SELECT * FROM ML.CENTROIDS(MODEL `mydataset.my_kmeans_model`)
This query returns results like the following:
+-------------+-------------------+----------------------+---------------------+
| centroid_id | feature           | numerical_value      | categorical_value   |
+-------------+-------------------+----------------------+---------------------+
| 3           | x_coordinate      | 3095929.0            | []                  |
| 3           | y_coordinate      | 1.0089726307692308E7 | []                  |
| 2           | x_coordinate      | 3117072.65625        | []                  |
| 2           | y_coordinate      | 1.0083220745833334E7 | []                  |
| 1           | x_coordinate      | 3259947.096227731    | []                  |
| 1           | y_coordinate      | 1.0105690227895036E7 | []                  |
| 4           | x_coordinate      | 3109887.9056603773   | []                  |
| 4           | y_coordinate      | 1.0057112358490566E7 | []                  |
+-------------+-------------------+----------------------+---------------------+
The following example retrieves centroid information from the k-means model my_kmeans_model in mydataset. This model contains categorical features.
SELECT * FROM ML.CENTROIDS(MODEL `mydataset.my_kmeans_model`) ORDER BY centroid_id;
This query returns results like the following:
+-------------+------------+-----------------+------------------------------------------------------------------------------+
| centroid_id | feature    | numerical_value | categorical_value                                                            |
+-------------+------------+-----------------+------------------------------------------------------------------------------+
| 1           | department | NULL            | [{"category":"Medieval Art","feature_value":"1.0"}]                          |
| 1           | medium     | NULL            | [{"category":"Iron","feature_value":"0.21602160216021601"},                  |
|             |            |                 |  {"category":"Glass, ceramic","feature_value":"0.3933393339333933"},         |
|             |            |                 |  {"category":"Copper alloy","feature_value":"0.39063906390639064"}]          |
| 2           | medium     | NULL            | [{"category":"Wood, gesso, paint","feature_value":"0.15"},                   |
|             |            |                 |  {"category":"Carnelian","feature_value":"0.2692307692307692"},              |
|             |            |                 |  {"category":"Papyrus, ink","feature_value":"0.2653846153846154"},           |
|             |            |                 |  {"category":"Steatite, glazed","feature_value":"0.3153846153846154"}]       |
| 2           | department | NULL            | [{"category":"Egyptian Art","feature_value":"1.0"}]                          |
| 3           | medium     | NULL            | [{"category":"Faience","feature_value":"1.0"}]                               |
| 3           | department | NULL            | [{"category":"Egyptian Art","feature_value":"1.0"}]                          |
| 4           | medium     | NULL            | [{"category":"Steatite","feature_value":"1.0"}]                              |
| 4           | department | NULL            | [{"category":"Egyptian Art","feature_value":"1.0"}]                          |
| 5           | medium     | NULL            | [{"category":"Red quartzite","feature_value":"0.20316027088036118"},         |
|             |            |                 |  {"category":"Bronze or copper alloy","feature_value":"0.3476297968397291"}, |
|             |            |                 |  {"category":"Gold","feature_value":"0.4492099322799097"}]                   |
| 5           | department | NULL            | [{"category":"Egyptian Art","feature_value":"1.0"}]                          |
+-------------+------------+-----------------+------------------------------------------------------------------------------+
The following are the results from the same query against a k-means model with both numerical and categorical features.
+-------------+--------------------+-------------------+--------------------------------------------------------------------------------+
| centroid_id | feature            | numerical_value   | categorical_value                                                              |
+-------------+--------------------+-------------------+--------------------------------------------------------------------------------+
| 1           | start_station_name | NULL              | [{"category":"Toomey Rd @ South Lamar","value":"0.5714285714285714"},          |
|             |                    |                   |  {"category":"State Capitol @ 14th & Colorado","value":"0.42857142857142855"}] |
| 1           | duration_minutes   | 9.142857142857142 | []                                                                             |
| 2           | duration_minutes   | 9.0               | []                                                                             |
| 2           | start_station_name | NULL              | [{"category":"Rainey @ River St","value":"0.14285714285714285"},               |
|             |                    |                   |  {"category":"11th & San Jacinto","value":"0.42857142857142855"},              |
|             |                    |                   |  {"category":"ACC - West & 12th Street","value":"0.14285714285714285"},        |
|             |                    |                   |  {"category":"East 11th St. at Victory Grill","value":"0.2857142857142857"}]   |
+-------------+--------------------+-------------------+--------------------------------------------------------------------------------+
ML.CENTROIDS with standardization
The following example retrieves centroid information from the k-means model my_kmeans_model in mydataset. The query in this example assumes all features have a mean of zero and a standard deviation of one.
SELECT * FROM ML.CENTROIDS(MODEL `mydataset.my_kmeans_model`, STRUCT(true AS standardize))
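Since standardized values share a common scale, one possible follow-up is to rank the numerical features within each centroid by absolute magnitude. The following query is a sketch of that idea rather than a prescribed workflow. It assumes the same placeholder model, mydataset.my_kmeans_model, and the column alias standardized_value is chosen here for illustration only.

```sql
-- Sketch: list numerical features per centroid, largest standardized magnitude first.
-- `mydataset.my_kmeans_model` is a placeholder model name.
SELECT
  centroid_id,
  feature,
  numerical_value AS standardized_value
FROM
  ML.CENTROIDS(MODEL `mydataset.my_kmeans_model`, STRUCT(TRUE AS standardize))
WHERE
  numerical_value IS NOT NULL  -- keep numerical features only
ORDER BY
  centroid_id,
  ABS(numerical_value) DESC;
```

Under the stated assumption of zero mean and unit standard deviation, features near the top of each centroid's list are the ones whose centroid values are largest in magnitude on the standardized scale.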