The ML.PREDICT Function

The ML.PREDICT function predicts outcomes using a trained model. You can run prediction during model creation, after model creation, or after a failure, as long as at least one iteration has finished. ML.PREDICT always uses the model weights from the last successful iteration.

The output of the ML.PREDICT function has as many rows as the input table, and it includes all columns from the input table plus all output columns from the model. The model's output columns are named predicted_<label_column_name> and (for logistic regression models) predicted_<label_column_name>_probs. In both names, label_column_name is the name of the input label column used during training.

  • For binary logistic regression models:

    • The predicted_<label_column_name>_probs output column is an array of STRUCTs of type [<label, prob>] that contains the predicted probability of each label.
    • The predicted_<label_column_name> output column is one of the two input labels, depending on which label has the higher predicted probability.
  • For multiclass logistic regression models:

    • The predicted_<label_column_name>_probs output column is the probability for each class label calculated using a softmax function.
    • The predicted_<label_column_name> output column is the label with the highest predicted probability score.
  • For linear regression models:

    • The predicted_<label_column_name> output column is the predicted value of the label.
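The relationship between the two logistic regression output columns can be illustrated with a short Python sketch. This is not BigQuery ML's implementation: the raw per-class scores, the helper names softmax and predict_row, and the dictionary layout are assumptions made for illustration. Only the softmax step, the "highest probability wins" rule, and the output column names come from the description above.

```python
import math

def softmax(scores):
    """Convert raw per-class scores into probabilities that sum to 1."""
    # Subtract the maximum score first for numerical stability.
    peak = max(scores.values())
    exps = {label: math.exp(s - peak) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

def predict_row(scores, label_column_name="label"):
    """Build the two output columns for one input row (illustrative only)."""
    probs = softmax(scores)
    return {
        # One {label, prob} entry per class label.
        f"predicted_{label_column_name}_probs": [
            {"label": label, "prob": prob} for label, prob in probs.items()
        ],
        # The label with the highest predicted probability.
        f"predicted_{label_column_name}": max(probs, key=probs.get),
    }

# Hypothetical raw scores for a three-class model.
row = predict_row({"cat": 2.0, "dog": 1.0, "bird": 0.1})
print(row["predicted_label"])  # prints: cat
```

Because the label column here is named label, the output keys are predicted_label and predicted_label_probs. A model trained with a different label column name would produce correspondingly named columns.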

ML.PREDICT syntax

ML.PREDICT(MODEL model_name,
          {TABLE table_name | (query_statement)})

model_name

model_name is the name of the model you're using for prediction. If you do not have a default project configured, prepend the project ID to the model name in the following format: `[PROJECT_ID].[DATASET].[MODEL]` (including the backticks); for example, `myproject.mydataset.mymodel`.

table_name

table_name is the name of the input table that contains the data to predict on. If you do not have a default project configured, prepend the project ID to the table name in the following format: `[PROJECT_ID].[DATASET].[TABLE]` (including the backticks); for example, `myproject.mydataset.mytable`.

The input column names and data types in the model must match the column names and data types in the table. The input must have a column that matches the label column name provided during training. This value is provided using the input_label_cols option. If input_label_cols is unspecified, the column named "label" in the training data is used.
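As a rough illustration of the matching rule, the following Python sketch compares a model's expected columns against a table's columns before a prediction is attempted. The function name find_mismatches and the {column_name: data_type} dictionaries are hypothetical; how you obtain the two schemas is outside the scope of this sketch.

```python
def find_mismatches(model_columns, table_columns):
    """Return human-readable problems; an empty list means the schemas match."""
    problems = []
    for name, expected_type in model_columns.items():
        if name not in table_columns:
            problems.append(f"missing column: {name}")
        elif table_columns[name] != expected_type:
            problems.append(
                f"type mismatch for {name}: expected {expected_type}, "
                f"found {table_columns[name]}"
            )
    return problems

# Hypothetical schemas, written as {column_name: data_type}.
model_columns = {"label": "FLOAT64", "column1": "INT64", "column2": "STRING"}
table_columns = {"label": "FLOAT64", "column1": "STRING"}

for problem in find_mismatches(model_columns, table_columns):
    print(problem)
```

With the hypothetical schemas above, the check reports a type mismatch for column1 and a missing column2.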

query_statement

The query_statement clause specifies the standard SQL query that is used to generate the input data for prediction. See the Standard SQL Query Syntax page for the supported SQL syntax of the query_statement clause.

All columns referenced by the query_statement are used as inputs to the model.
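The two input forms in the syntax above (TABLE table_name or a parenthesized query_statement) can be sketched as a small Python helper that assembles the query text. The helper names quote_path and build_predict_query are hypothetical, and the sketch only builds a string; it does not send anything to BigQuery and should only be given trusted identifiers.

```python
def quote_path(*parts):
    """Join path parts with dots and wrap the result in backticks."""
    return "`" + ".".join(parts) + "`"

def build_predict_query(model, table=None, query_statement=None):
    """Return ML.PREDICT query text for exactly one kind of input."""
    # The syntax accepts either a table or a query, never both.
    if (table is None) == (query_statement is None):
        raise ValueError("pass exactly one of table or query_statement")
    if table is not None:
        source = f"TABLE {table}"
    else:
        source = f"({query_statement})"
    return f"SELECT * FROM ML.PREDICT(MODEL {model}, {source})"

# Fully qualified names, as needed when no default project is configured.
model = quote_path("myproject", "mydataset", "mymodel")
table = quote_path("myproject", "mydataset", "mytable")

print(build_predict_query(model, table=table))
print(build_predict_query(
    model, query_statement=f"SELECT column1, column2 FROM {table}"))
```

The first call produces the TABLE form and the second produces the subquery form, in which only the columns selected by the query are passed to the model.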

ML.PREDICT examples

The following examples assume your model and input table are in your default project.

Predicting an outcome

The following query uses the ML.PREDICT function to predict an outcome. The query returns these columns:

  • predicted_label
  • label
  • column1
  • column2

SELECT
  *
FROM
  ML.PREDICT(MODEL `mydataset.mymodel`,
    (
    SELECT
      label,
      column1,
      column2
    FROM
      `mydataset.mytable`))

Comparing predictions from two different models

In this example, the following query is used to create the first model.

CREATE MODEL
  `mydataset.mymodel1`
OPTIONS
  (model_type='linear_reg',
    input_label_cols=['label']
  ) AS
SELECT
  label,
  input_column1
FROM
  `mydataset.mytable`

The following query is used to create the second model.

CREATE MODEL
  `mydataset.mymodel2`
OPTIONS
  (model_type='linear_reg',
    input_label_cols=['label']
  ) AS
SELECT
  label,
  input_column2
FROM
  `mydataset.mytable`

The following query uses the ML.PREDICT function to compare the output of the two models.

SELECT
  label,
  predicted_label1,
  predicted_label AS predicted_label2
FROM
  ML.PREDICT(MODEL `mydataset.mymodel2`,
    (
    SELECT
      * EXCEPT (predicted_label),
      predicted_label AS predicted_label1
    FROM
      ML.PREDICT(MODEL `mydataset.mymodel1`,
        TABLE `mydataset.mytable`)))