The ML.RECOMMEND function
The ML.RECOMMEND function generates a predicted rating for every user-item combination for a matrix factorization model. Because the input data for a matrix factorization model tends to be a sparse matrix with missing values, ML.RECOMMEND can return the predictions for those missing values without requiring you to specify each entry.
If the model was trained with feedback_type=EXPLICIT, a user column called user, an item column called item, and a rating column called rating, then the output columns of ML.RECOMMEND are user, item, and predicted_rating.
The predicted_rating output column contains the predicted rating for each user-item pair.
If the model was trained with feedback_type=IMPLICIT, a user column called user, an item column called item, and a rating column called rating, then the output columns of ML.RECOMMEND are user, item, and predicted_rating_confidence.
The predicted_rating_confidence output column contains the relative confidence for each user-item pair.
For information about model inference in BigQuery ML, see Model inference overview.
For information about supported model types of each SQL statement and function, and all supported SQL statements and functions for each model type, read End-to-end user journey for each model.
ML.RECOMMEND syntax
ML.RECOMMEND(MODEL model_name [, {TABLE table_name | (query_statement)}])
model_name
model_name is the name of the model you're using to generate recommendations. If you do not have a default project configured, prepend the project ID to the model name in the following format: `project-id.dataset.model` (including the backticks); for example, `myproject.mydataset.mymodel`.
table_name
(Optional) table_name is the name of the input table that contains the user and/or item data. If you do not have a default project configured, prepend the project ID to the table name in the following format: `project-id.dataset.table` (including the backticks); for example, `myproject.mydataset.mytable`.
query_statement
(Optional) The query_statement clause specifies the GoogleSQL query that is used to generate the input data. For the supported SQL syntax of the query_statement clause, see GoogleSQL query syntax.
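As a rough sketch of the two optional input forms in the syntax above, the following queries pass the same data first with the TABLE keyword and then as a query statement. The names `mydataset.mymodel` and `mydataset.mytable` are placeholders, and the sketch assumes the table contains the user and item columns that the model was trained with.

```sql
-- Form 1: pass an existing table directly with the TABLE keyword.
-- Assumes `mydataset.mytable` (placeholder) has the model's user and item columns.
SELECT *
FROM ML.RECOMMEND(MODEL `mydataset.mymodel`, TABLE `mydataset.mytable`);

-- Form 2: pass a parenthesized query statement instead of a table.
SELECT *
FROM ML.RECOMMEND(
  MODEL `mydataset.mymodel`,
  (SELECT user, item FROM `mydataset.mytable`));
```

The query form is likely the more convenient choice when the input needs filtering or column selection before it reaches the function.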
ML.RECOMMEND output
If table_name or query_statement is specified, the user and/or item columns must match the user and item columns in the model, and their types should be compatible according to BigQuery implicit coercion rules.
If the input table does not contain both the user and item columns, it must contain exactly one column. If the table contains both the user and item columns, any other columns are passed through to the output and are available for use in the query.
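To illustrate the pass-through behavior, here is a minimal sketch. It assumes an explicit-feedback model whose rating column is named rating, and an input table with a hypothetical extra column named event_date that is not part of the model.

```sql
-- event_date is a hypothetical column used only for illustration.
-- Because the input contains both the user and item columns, the extra
-- column is passed through and can be selected next to the prediction.
SELECT
  user,
  item,
  predicted_rating,
  event_date
FROM ML.RECOMMEND(
  MODEL `mydataset.mymodel`,
  (SELECT user, item, event_date FROM `mydataset.mytable`));
```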
ML.RECOMMEND outputs at least 3 columns in all cases: the user column, the item column, and a column for predicted recommendations.
The predicted recommendation column name for explicit matrix factorization models will be predicted_<rating_col_name>. Because the input ratings from training are assumed to be explicit feedback, the predicted ratings will be approximately in the range of the original input, although ratings outside the range are also normal.
The predicted recommendation column name for implicit matrix factorization models will be predicted_<rating_col_name>_confidence. The input ratings from training are assumed to be a proxy for user confidence. Therefore, if the model has converged, the predicted confidences lie between approximately 0 and 1 (but can lie just outside that range). If the model hasn't converged, the predicted confidences can be any value. If your model isn't converging and your ratings are very large, try decreasing WALS_ALPHA. If your model isn't converging and your ratings are very small, try increasing WALS_ALPHA.
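One way to act on the WALS_ALPHA advice is to retrain the model with a different value. The statement below is only a sketch: it assumes a training table `mydataset.mytable` with columns named user, item, and rating (placeholder names), and the WALS_ALPHA value shown is illustrative, not a recommendation.

```sql
-- Retrain an implicit-feedback matrix factorization model with an
-- explicitly chosen WALS_ALPHA. All names are placeholders.
CREATE OR REPLACE MODEL `mydataset.mymodel`
OPTIONS (
  MODEL_TYPE = 'MATRIX_FACTORIZATION',
  FEEDBACK_TYPE = 'IMPLICIT',
  USER_COL = 'user',
  ITEM_COL = 'item',
  RATING_COL = 'rating',
  WALS_ALPHA = 10  -- illustrative value; lower it for very large ratings, raise it for very small ones
) AS
SELECT user, item, rating
FROM `mydataset.mytable`;
```

After retraining, rerunning ML.RECOMMEND and checking whether the predicted confidences fall roughly between 0 and 1 gives a quick signal of whether the change helped.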
The output of ML.RECOMMEND is computed as follows:
- If both the user and item columns are in table_name or query_statement, then ML.RECOMMEND returns a rating for each user-item pair.
- If only the user or only the item is specified (for example, table_name only contains the user column), then all the item ratings for every user in the table are returned.
- If either the user or item feature was not in the training dataset, the rating that is returned is the intercept of the seen item or user added to the global__intercept__.
- If either table_name or query_statement is specified but does not use either the user or item column, ML.RECOMMEND returns an error.
- If neither table_name nor query_statement is specified, ML.RECOMMEND outputs the ratings for every user and item combination seen during training.
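The user-only case in the list above is a common basis for top-N recommendations. The following sketch assumes an explicit-feedback model with the output column predicted_rating and uses placeholder names throughout. The limit of 5 items per user and the rating_rank alias are arbitrary choices for illustration.

```sql
-- Supply only the user column, so ML.RECOMMEND returns a rating for every
-- item for each listed user, then keep the 5 highest-rated items per user.
SELECT user, item, predicted_rating, rating_rank
FROM (
  SELECT
    user,
    item,
    predicted_rating,
    ROW_NUMBER() OVER (
      PARTITION BY user
      ORDER BY predicted_rating DESC
    ) AS rating_rank
  FROM ML.RECOMMEND(
    MODEL `mydataset.mymodel`,
    (SELECT DISTINCT user FROM `mydataset.mytable`))
)
WHERE rating_rank <= 5
ORDER BY user, rating_rank;
```

For an implicit-feedback model, the same pattern should apply with predicted_rating_confidence in place of predicted_rating.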
ML.RECOMMEND example
ML.RECOMMEND with no input data specified
The following query uses ML.RECOMMEND to generate predicted ratings for every user-item pair in the training inputs of mymodel, because no input data is specified.
SELECT * FROM ML.RECOMMEND(MODEL `mydataset.mymodel`)
ML.RECOMMEND with input data
The following query generates predicted ratings for each user-item row in mydataset.mytable, assuming that mydataset.mymodel was trained using the user column user and item column item.
SELECT *
FROM ML.RECOMMEND(MODEL `mydataset.mymodel`,
  (SELECT user, item FROM `mydataset.mytable`))