Supported input feature types
BigQuery ML supports different input feature types for different model types. Supported input feature types are listed in the following table:
Model Category | Model Types | Numeric types (INT64, NUMERIC, BIGNUMERIC, FLOAT64) | Categorical types (BOOL, STRING, BYTES, DATE, DATETIME) | TIMESTAMP | STRUCT | GEOGRAPHY | ARRAY<Numeric types> | ARRAY<Categorical types> | ARRAY<STRUCT<INT64, Numeric types>> |
---|---|---|---|---|---|---|---|---|---|
Supervised Learning | Linear & Logistic Regression | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | |
Supervised Learning | Deep Neural Networks | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | | |
Supervised Learning | Wide-and-Deep | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | | |
Supervised Learning | Boosted trees | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | | |
Supervised Learning | AutoML Tables | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | | |
Unsupervised Learning | K-means | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | |
Unsupervised Learning | PCA | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | | |
Unsupervised Learning | Autoencoder | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | |
Time Series Models | ARIMA_PLUS_XREG | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | | |
Dense vector input
BigQuery ML supports ARRAY<numerical> as dense vector input during model training. The embedding feature is a special type of dense vector. See the ML.GENERATE_EMBEDDING function for more information.
Sparse input
BigQuery ML supports ARRAY<STRUCT> as sparse input during model training. Each struct contains an INT64 value that represents a zero-based index, and a value of a numeric type that represents the element at that index.
The following example shows a sparse tensor input for the integer array [0,1,0,0,0,0,1]:
ARRAY<STRUCT<k INT64, v INT64>>[(1, 1), (6, 1)] AS f1
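To make the index/value encoding concrete, here is a minimal Python sketch that derives the (index, value) pairs for the array above and formats them as the same typed array literal. It is an illustration only and not part of BigQuery ML: the helper names to_sparse, to_dense, and as_sql_literal are hypothetical, the field names k and v are copied from the example, and the formatter assumes INT64 values.

```python
from typing import List, Tuple


def to_sparse(dense: List[int]) -> List[Tuple[int, int]]:
    """Return (zero-based index, value) pairs for the non-zero entries."""
    return [(k, v) for k, v in enumerate(dense) if v != 0]


def to_dense(sparse: List[Tuple[int, int]], length: int) -> List[int]:
    """Rebuild the dense array; the length must be supplied by the caller
    because zero entries are not stored in the sparse pairs."""
    dense = [0] * length
    for k, v in sparse:
        dense[k] = v
    return dense


def as_sql_literal(sparse: List[Tuple[int, int]]) -> str:
    """Format the pairs as a typed array literal (hypothetical helper;
    assumes INT64 values and the field names k and v from the example)."""
    body = ", ".join(f"({k}, {v})" for k, v in sparse)
    return f"ARRAY<STRUCT<k INT64, v INT64>>[{body}]"


pairs = to_sparse([0, 1, 0, 0, 0, 0, 1])
print(pairs)                  # [(1, 1), (6, 1)]
print(as_sql_literal(pairs))  # ARRAY<STRUCT<k INT64, v INT64>>[(1, 1), (6, 1)]
print(to_dense(pairs, 7))     # [0, 1, 0, 0, 0, 0, 1]
```

Only the two non-zero positions (indexes 1 and 6) are stored, which is why the sparse form stays small when most entries are zero.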