The ML.FORECAST function

The ML.FORECAST function forecasts future values of a time series by using a trained ARIMA_PLUS or ARIMA_PLUS_XREG time series model.

For information about model inference in BigQuery ML, see Model inference overview.

For information about the model types that each SQL statement and function supports, and the SQL statements and functions that each model type supports, see End-to-end user journey for each model.

ML.FORECAST syntax

# `ARIMA_PLUS` models:
ML.FORECAST(MODEL model_name
           [, STRUCT<horizon INT64, confidence_level FLOAT64> settings])

# `ARIMA_PLUS_XREG` models:
ML.FORECAST(MODEL model_name
           , STRUCT<horizon INT64, confidence_level FLOAT64> settings
           , {TABLE table_name | (query_statement)})

model_name

model_name is the name of the model that you're using for forecasting. If you do not have a default project configured, prepend the project ID to the model name in the following format: `[PROJECT_ID].[DATASET].[MODEL]` (including the backticks); for example, `myproject.mydataset.mymodel`.

horizon

(Optional) The number of time points to forecast. The horizon value is of type INT64 and is part of the settings STRUCT. The default value is 3. The maximum value is the horizon value specified in the CREATE MODEL statement for time series models, or 1000 if no horizon was specified there. When forecasting multiple time series at the same time, this parameter applies to each time series.

confidence_level

(Optional) The fraction of future values that are expected to fall within the prediction interval. The confidence_level value is of type FLOAT64 and is part of the settings STRUCT. The default value is 0.95. The valid input range is [0, 1).
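As a minimal sketch of the defaults described above: because the settings STRUCT is optional for ARIMA_PLUS models, omitting it should produce 3 forecast points per time series at a confidence level of 0.95. The fully qualified name `myproject.mydataset.mymodel` is the placeholder used on this page, not a real model.

```sql
-- Sketch: forecast with the default settings
-- (horizon = 3, confidence_level = 0.95).
-- `myproject.mydataset.mymodel` is a placeholder for a trained ARIMA_PLUS model.
SELECT
  *
FROM
  ML.FORECAST(MODEL `myproject.mydataset.mymodel`)
```

This form applies only to ARIMA_PLUS models; per the syntax above, ARIMA_PLUS_XREG models take the settings STRUCT and a table or query of future features.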

table_name

table_name is the name of the input table that contains the future features. If you do not have a default project configured, prepend the project ID to the table name in the following format: `[PROJECT_ID].[DATASET].[TABLE]` (including the backticks); for example, `myproject.mydataset.mytable`.

The input column names in the table must contain the column names in the model, and their types should be compatible according to BigQuery implicit coercion rules.

If there are unused columns from the table, they are ignored.

query_statement

The query_statement clause specifies the GoogleSQL query that is used to generate the future features. See the GoogleSQL Query Syntax page for the supported SQL syntax of the query_statement clause.

The input column names from the query must contain the column names in the model, and their types should be compatible according to BigQuery implicit coercion rules.

If there are unused columns from the query, they are ignored.
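The following sketch shows how a query_statement might restrict the future features to the columns that the model expects and cast one of them to a compatible type. The table `mydataset.future_features` and the columns `date`, `temperature`, and `is_holiday` are hypothetical, as is the assumption that `is_holiday` is stored as BOOL; substitute the timestamp and feature columns from your own CREATE MODEL statement.

```sql
-- Sketch: supply future features through a query instead of a table.
-- All table and column names in the subquery are hypothetical.
SELECT
  *
FROM
  ML.FORECAST(MODEL `mydataset.mymodel`,
              STRUCT(7 AS horizon, 0.9 AS confidence_level),
              (
                SELECT
                  date,
                  temperature,
                  -- Explicit cast, assuming the model was trained on an INT64 flag.
                  CAST(is_holiday AS INT64) AS is_holiday
                FROM
                  `mydataset.future_features`
              ))
```

Selecting columns explicitly is optional, because unused columns are ignored, but it keeps the required inputs visible in the query.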

ML.FORECAST output

ML.FORECAST returns the following columns:

  • time_series_id_col or time_series_id_cols: the identifiers of a time series. Only present when forecasting multiple time series at once. The column names and types are inherited from the TIME_SERIES_ID_COL option as specified in the model creation query.
  • forecast_timestamp (TIMESTAMP)
  • forecast_value (FLOAT64)
  • standard_error (FLOAT64)
  • confidence_level (FLOAT64)
  • prediction_interval_lower_bound (FLOAT64)
  • prediction_interval_upper_bound (FLOAT64)
  • confidence_interval_lower_bound (FLOAT64) (soon to be deprecated)
  • confidence_interval_upper_bound (FLOAT64) (soon to be deprecated)

The output of ML.FORECAST has the following properties:

  • For each time series, the output rows are sorted in the chronological order of forecast_timestamp.
  • forecast_timestamp always has a type of TIMESTAMP, regardless of the type of the input time_series_timestamp_col.
  • forecast_value is the average of prediction_interval_lower_bound and prediction_interval_upper_bound.
  • confidence_level is the user-specified value, or the default value if unspecified. It is the same across all rows.
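To illustrate the output columns listed above, the following sketch selects a subset of them, converts forecast_timestamp to a DATE, and derives the width of the prediction interval. The aliases `forecast_date` and `interval_width` are illustrative names, and the DATE conversion assumes a model trained on daily data.

```sql
-- Sketch: work with a subset of the ML.FORECAST output columns.
SELECT
  -- forecast_timestamp is always TIMESTAMP; convert it if a DATE is easier to use.
  DATE(forecast_timestamp) AS forecast_date,
  forecast_value,
  prediction_interval_lower_bound,
  prediction_interval_upper_bound,
  -- Width of the prediction interval at the requested confidence level.
  prediction_interval_upper_bound - prediction_interval_lower_bound AS interval_width
FROM
  ML.FORECAST(MODEL `mydataset.mymodel`,
              STRUCT(30 AS horizon, 0.8 AS confidence_level))
ORDER BY
  forecast_timestamp
```

For a model that forecasts multiple time series, the time series ID columns would also appear in the output and could be added to the SELECT list and the ORDER BY clause.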

ML.FORECAST for ARIMA_PLUS example

The following query uses ML.FORECAST to forecast 30 time points with a confidence level of 0.8.

SELECT
  *
FROM
  ML.FORECAST(MODEL `mydataset.mymodel`,
              STRUCT(30 AS horizon, 0.8 AS confidence_level))

ML.FORECAST for ARIMA_PLUS_XREG example

The following query uses ML.FORECAST to forecast 30 time points with a confidence level of 0.8, using future features that are supplied by a query.

SELECT
  *
FROM
  ML.FORECAST(MODEL `mydataset.mymodel`,
              STRUCT(30 AS horizon, 0.8 AS confidence_level),
              (SELECT * FROM `mydataset.mytable`))