The ML.RECOMMEND function

The ML.RECOMMEND function generates a predicted rating for every user-item combination for a matrix factorization model. Because the input data for a matrix factorization model tends to be a sparse matrix with missing values, ML.RECOMMEND can return predictions for those missing values without requiring you to specify each entry.

If the model was trained with feedback_type=EXPLICIT, a user column called user, and an item column called item, then the output columns of ML.RECOMMEND are user, item, and predicted_rating. The predicted_rating output column is the predicted rating for each user-item pair.

If the model was trained with feedback_type=IMPLICIT, a user column called user, and an item column called item, then the output columns of ML.RECOMMEND are user, item, and predicted_rating_confidence. The predicted_rating_confidence output column is the relative confidence for each user-item pair.

For information about model inference in BigQuery ML, see Model inference overview.

For information about supported model types of each SQL statement and function, and all supported SQL statements and functions for each model type, read End-to-end user journey for each model.

ML.RECOMMEND syntax

ML.RECOMMEND(MODEL model_name
            [, {TABLE table_name
                | (query_statement)}])

model_name

model_name is the name of the model you're using to generate recommendations. If you do not have a default project configured, prepend the project ID to the model name in the following format: `project-id.dataset.model` (including the backticks); for example, `myproject.mydataset.mymodel`.

table_name

(Optional) table_name is the name of the input table that contains the user data, the item data, or both. If you do not have a default project configured, prepend the project ID to the table name in the following format: `project-id.dataset.table` (including the backticks); for example, `myproject.mydataset.mytable`.

query_statement

(Optional) The query_statement clause specifies the GoogleSQL query that is used to generate the input data. For the supported SQL syntax of the query_statement clause, see GoogleSQL query syntax.

ML.RECOMMEND output

If table_name or query_statement is specified, the user column, the item column, or both must match the user and item columns in the model, and their types must be compatible according to BigQuery implicit coercion rules. If the input table does not contain both the user and item columns, it must contain exactly one column. If the table contains both the user and item columns, then any other columns are passed through to the output and are available for use in the query.
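
As a sketch of the pass-through behavior, the following query assumes that mydataset.mytable contains the user and item columns plus one extra column. The name session_id is hypothetical and is used only for illustration. Because the input contains both the user and item columns, the extra column should appear in the output next to the prediction.

```sql
SELECT
  *
FROM
  ML.RECOMMEND(MODEL `mydataset.mymodel`,
      (
      SELECT
        user,
        item,
        session_id  -- hypothetical extra column; expected to be passed through
      FROM
        `mydataset.mytable`))
```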

ML.RECOMMEND outputs at least three columns in all cases: the user column, the item column, and a column for the predicted recommendations.

The predicted recommendation column name for explicit matrix factorization models is predicted_<rating_col_name>. Because the input ratings from training are assumed to be explicit feedback, the predicted ratings fall approximately within the range of the original input, although predictions outside that range are also normal.

The predicted recommendation column name for implicit matrix factorization models will be predicted_<rating_col_name>_confidence. The input ratings from training are assumed to be a proxy for user confidence. Therefore, if the model has converged, the predicted confidences lie between approximately 0 and 1 (but can lie just outside that range). If the model hasn't converged, the predicted confidences can be any value. If your model isn't converging and your ratings are very large, try decreasing WALS_ALPHA. If your model isn't converging and your ratings are very small, try increasing WALS_ALPHA.
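
For reference, WALS_ALPHA is set in the OPTIONS clause of the CREATE MODEL statement when you train an implicit matrix factorization model. The following is a minimal sketch, not a recommended configuration: the model name, the source table mydataset.implicit_ratings, the rating column name, and the value 10 are all assumptions made for illustration.

```sql
-- Sketch only: model name, table name, and WALS_ALPHA value are hypothetical.
CREATE OR REPLACE MODEL `mydataset.my_implicit_model`
  OPTIONS (
    MODEL_TYPE = 'MATRIX_FACTORIZATION',
    FEEDBACK_TYPE = 'IMPLICIT',
    USER_COL = 'user',
    ITEM_COL = 'item',
    RATING_COL = 'rating',
    WALS_ALPHA = 10  -- illustrative value; adjust up or down as described above
  ) AS
SELECT
  user,
  item,
  rating
FROM
  `mydataset.implicit_ratings`
```

With a rating column named rating, the prediction column returned by ML.RECOMMEND for this model would be predicted_rating_confidence.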

The output of ML.RECOMMEND is computed as follows:

  • If both the user and item columns are in table_name or query_statement, then ML.RECOMMEND returns a rating for each user-item pair.
  • If only the user column or only the item column is specified (for example, table_name contains only the user column), then ML.RECOMMEND outputs the ratings of all items for every user in the table.
  • If either the user or the item was not in the training dataset, the returned rating is the intercept of the seen item or user plus the global__intercept__.
  • If table_name or query_statement is specified but contains neither the user column nor the item column, ML.RECOMMEND returns an error.
  • If neither table_name nor query_statement is specified, ML.RECOMMEND outputs the ratings for every user and item combination seen during training.
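
The single-column case from the list above can be sketched as follows. The query assumes that mydataset.mymodel was trained with a user column called user and that mydataset.mytable contains that column. The DISTINCT keyword is an added precaution so that each user appears only once in the input; it is not required by ML.RECOMMEND.

```sql
SELECT
  *
FROM
  ML.RECOMMEND(MODEL `mydataset.mymodel`,
      (
      SELECT DISTINCT
        user  -- input has exactly one column, so all items are rated per user
      FROM
        `mydataset.mytable`))
```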

ML.RECOMMEND examples

ML.RECOMMEND with no input data specified

The following query uses ML.RECOMMEND to generate predicted ratings for every user-item pair seen during the training of mydataset.mymodel, because no input data is specified.

SELECT
  *
FROM
  ML.RECOMMEND(MODEL `mydataset.mymodel`)

ML.RECOMMEND with input data

The following query generates predicted ratings for each user-item row in mydataset.mytable, assuming that mydataset.mymodel was trained using the user column user and the item column item.

SELECT
  *
FROM
  ML.RECOMMEND(MODEL `mydataset.mymodel`,
      (
      SELECT
        user,
        item
      FROM
        `mydataset.mytable`))
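
Top recommendations per user

A common follow-up is to keep only the highest-rated items for each user. The following is a sketch under these assumptions: mydataset.mymodel is an explicit feedback model trained with a user column called user, an item column called item, and a rating column called rating, so that the prediction column is predicted_rating. The alias rank_for_user and the cutoff of 5 are illustrative choices.

```sql
SELECT
  user,
  item,
  predicted_rating
FROM (
  SELECT
    user,
    item,
    predicted_rating,
    -- Rank each user's items from highest to lowest predicted rating.
    ROW_NUMBER() OVER (PARTITION BY user ORDER BY predicted_rating DESC) AS rank_for_user
  FROM
    ML.RECOMMEND(MODEL `mydataset.mymodel`))
WHERE
  rank_for_user <= 5  -- illustrative cutoff: keep five items per user
```

This query scores every user-item combination seen during training before filtering, so for large models it may be preferable to pass an input table that limits the users of interest.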