The ML.TRAINING_INFO function

The ML.TRAINING_INFO function returns information about the training iterations of a model. You can run ML.TRAINING_INFO while the CREATE MODEL query is still running or after it has finished. If you run a query that contains ML.TRAINING_INFO before the first training iteration is complete, the query returns a Not found error.

For the model types that each SQL statement and function supports, and the SQL statements and functions that are available for each model type, read End-to-end user journey for each model.

ML.TRAINING_INFO syntax

ML.TRAINING_INFO(MODEL `project_id.dataset.model`)

Where:

  • project_id is your project ID.
  • dataset is the BigQuery dataset that contains the model.
  • model is the name of the model.
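As a concrete instance of the syntax above, the following sketch fills in all three parts of the model path. The names myproject, mydataset, and mymodel are hypothetical placeholders; substitute your own project, dataset, and model.

```sql
-- Hypothetical names: project "myproject", dataset "mydataset", model "mymodel".
SELECT
  *
FROM
  ML.TRAINING_INFO(MODEL `myproject.mydataset.mymodel`)
```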

ML.TRAINING_INFO output

ML.TRAINING_INFO returns the following columns:

  • training_run: The value in this column is zero for a newly created model. If you retrain the model using warm_start, this value is incremented.
  • iteration: The iteration number within the training run. The value for the first iteration is zero. This value is incremented for each subsequent iteration.
  • loss: The loss metric calculated after an iteration on the training data. Loss is log loss for a logistic regression and mean squared error for a linear regression. For multiclass logistic regressions, loss is the cross-entropy log loss. For explicit matrix factorization models, the loss is mean squared error calculated over the seen input ratings. For implicit matrix factorization models, the loss is calculated using the following formula:

    \( Loss = \sum_{u, i} c_{ui}(p_{ui} - x^T_uy_i)^2 + \lambda(\sum_u||x_u||^2 + \sum_i||y_i||^2) \)

    For more information about what the variables mean, see Additional information about feedback types.

  • eval_loss: The loss metric calculated on the holdout data. For k-means models, ML.TRAINING_INFO does not return an eval_loss column. If DATA_SPLIT_METHOD is 'NO_SPLIT', then all entries in the eval_loss column are NULL.

  • learning_rate: The learning rate in this iteration.

  • duration_ms: How long the iteration took, in milliseconds.

  • cluster_info: An ARRAY of STRUCTs, which contain the fields centroid_id, cluster_radius, and cluster_size. ML.TRAINING_INFO computes cluster_radius and cluster_size with standardized features. Only returned for k-means models.
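To compare training loss and holdout loss across iterations, you can select the columns listed above and sort them by run and iteration. This is a minimal sketch that assumes a model type that returns both loss and eval_loss (for example, a linear or logistic regression); the model name is a placeholder.

```sql
-- Assumes `mydataset.mymodel` is a model that reports loss and eval_loss.
-- eval_loss is NULL in every row if the model was trained with
-- DATA_SPLIT_METHOD = 'NO_SPLIT'.
SELECT
  training_run,
  iteration,
  loss,
  eval_loss,
  learning_rate,
  duration_ms
FROM
  ML.TRAINING_INFO(MODEL `mydataset.mymodel`)
ORDER BY
  training_run,
  iteration
```

Sorting by training_run first keeps the iterations of a warm-started retraining grouped separately from those of the original run.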

ML.TRAINING_INFO permissions

Running ML.TRAINING_INFO requires both the bigquery.models.create and bigquery.models.getData permissions.

ML.TRAINING_INFO example

The following example retrieves training information from mymodel in mydataset. The dataset is in your default project.

SELECT
  *
FROM
  ML.TRAINING_INFO(MODEL `mydataset.mymodel`)
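For a k-means model, the cluster_info column is an ARRAY of STRUCTs, so it is often easier to read when flattened into one row per centroid. The following sketch assumes a hypothetical k-means model named my_kmeans_model in mydataset; the field names come from the output description above.

```sql
-- Hypothetical model name: `mydataset.my_kmeans_model` (a k-means model).
-- UNNEST expands cluster_info into one row per centroid per iteration.
SELECT
  t.iteration,
  c.centroid_id,
  c.cluster_radius,
  c.cluster_size
FROM
  ML.TRAINING_INFO(MODEL `mydataset.my_kmeans_model`) AS t,
  UNNEST(t.cluster_info) AS c
ORDER BY
  t.iteration,
  c.centroid_id
```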

ML.TRAINING_INFO limitations

The ML.TRAINING_INFO function is subject to the following limitations:

  • Imported TensorFlow models are not supported.
  • For time series models, this function returns only three columns: training_run, iteration, and duration_ms. It doesn't expose training information per iteration, or per time series if multiple time series are forecasted at once. The duration_ms value is the total time taken by the entire training process.