The ML.EXPLAIN_FORECAST function generates forecasts based on a trained ARIMA_PLUS time-series model. It works only on ARIMA_PLUS models that were trained with the decompose_time_series option enabled. The ML.EXPLAIN_FORECAST function encompasses ML.FORECAST because its output is a superset of the results of ML.FORECAST.
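Because decomposition has to be enabled at training time, a model intended for ML.EXPLAIN_FORECAST might be created along the following lines. This is a minimal sketch, not a prescribed recipe: the model name mydataset.mymodel, the table mydataset.sales, and the columns sale_date and total_sales are hypothetical placeholders that do not come from this page.

```sql
-- Sketch only: the table and column names below are hypothetical.
CREATE OR REPLACE MODEL `mydataset.mymodel`
OPTIONS (
  model_type = 'ARIMA_PLUS',
  time_series_timestamp_col = 'sale_date',
  time_series_data_col = 'total_sales',
  -- ML.EXPLAIN_FORECAST only works when this option is enabled.
  decompose_time_series = TRUE
) AS
SELECT
  sale_date,
  total_sales
FROM
  `mydataset.sales`;
```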
For information about Explainable AI, see Explainable AI Overview.
For information about supported model types of each SQL statement and function, and all supported SQL statements and functions for each model type, read End-to-end user journey for each model.
ML.EXPLAIN_FORECAST syntax
ML.EXPLAIN_FORECAST(MODEL model_name [, STRUCT<horizon INT64, confidence_level FLOAT64> settings])
model_name
model_name is the name of the model that you're using for forecasting. If you do not have a default project configured, then prepend the project ID to the model name in the following format: `[project_id].[dataset].[model]` (including the backticks); for example, `myproject.mydataset.mymodel`.
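As an illustration of the fully qualified form, the call below passes a project-qualified model name and omits the optional settings STRUCT, so the defaults apply. The names myproject, mydataset, and mymodel are placeholders.

```sql
-- Sketch only: myproject, mydataset, and mymodel are placeholder names.
SELECT
  *
FROM
  ML.EXPLAIN_FORECAST(MODEL `myproject.mydataset.mymodel`);
```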
horizon
(Optional) The number of time points to forecast. The horizon value is type INT64 and is part of the settings STRUCT. The default value is 3, and the maximum value is the horizon value that's specified in the CREATE MODEL statement for time-series models, or 1000 if not specified. When forecasting multiple time series at the same time, this parameter applies to each time series.
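A short sketch of setting only the horizon follows. It assumes the settings STRUCT can carry horizon on its own, in which case confidence_level would fall back to its default; the model name is a placeholder, and the value 7 must not exceed the horizon the model was trained with.

```sql
-- Sketch only: `mydataset.mymodel` is a placeholder model name.
-- Requests 7 forecast points per time series.
SELECT
  time_series_timestamp,
  time_series_type,
  time_series_data
FROM
  ML.EXPLAIN_FORECAST(MODEL `mydataset.mymodel`,
                      STRUCT(7 AS horizon))
ORDER BY
  time_series_timestamp;
```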
confidence_level
(Optional) The fraction of future values expected to fall within the prediction interval; for example, 0.95 corresponds to 95%. The confidence_level value is type FLOAT64 and is part of the settings STRUCT. The default value is 0.95. The valid input range is [0, 1).
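To see how confidence_level relates to the prediction interval, a query along these lines returns only the forecast rows together with their interval bounds. It is a sketch with a placeholder model name; the value 0.9 is an arbitrary choice within the valid range.

```sql
-- Sketch only: `mydataset.mymodel` is a placeholder model name.
SELECT
  time_series_timestamp,
  time_series_data AS forecast_value,
  confidence_level,
  prediction_interval_lower_bound,
  prediction_interval_upper_bound
FROM
  ML.EXPLAIN_FORECAST(MODEL `mydataset.mymodel`,
                      STRUCT(3 AS horizon, 0.9 AS confidence_level))
WHERE
  -- History rows carry NULL in the interval columns, so keep forecasts only.
  time_series_type = 'forecast'
ORDER BY
  time_series_timestamp;
```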
ML.EXPLAIN_FORECAST output
The ML.EXPLAIN_FORECAST function returns the following columns:
time_series_id_col or time_series_id_cols: The identifiers of a time series. This column is only present when forecasting multiple time series in one model creation query by specifying the TIME_SERIES_ID_COL option. The column names and types are inherited from the TIME_SERIES_ID_COL option.
time_series_timestamp (TIMESTAMP): The timestamp of the time series. This column has a type of TIMESTAMP, regardless of the type of the input time_series_timestamp_col. For each time series, the output rows are sorted in chronological order of time_series_timestamp.
time_series_type (STRING): A value of either history or forecast. The rows with history in this column are used in training, either directly from the training table, or from interpolation using the training data.
time_series_data (FLOAT64): The data of the time series. For history rows, time_series_data is either the training data or the interpolated value using the training data. For forecast rows, time_series_data is the forecast value.
time_series_adjusted_data (FLOAT64): The adjusted data of the time series. For history rows, this is the value after cleaning spikes and dips, adjusting the step changes, and removing the residuals. It is the aggregation of all the valid components: holiday effect, seasonal components, and trend. For forecast rows, this is the forecast value, which is the same as the value of time_series_data.
standard_error (FLOAT64): The standard error of the residuals during the ARIMA fitting. The values are the same for all history rows. For forecast rows, this value increases with time, as the forecast values become less reliable.
confidence_level (FLOAT64): The user-specified confidence level or, if unspecified, the default value. This value is the same for forecast rows and NULL for history rows.
prediction_interval_lower_bound (FLOAT64): The lower bound of the prediction result. Only forecast rows have values other than NULL in this column.
prediction_interval_upper_bound (FLOAT64): The upper bound of the prediction result. Only forecast rows have values other than NULL in this column.
trend (FLOAT64): The long-term increase or decrease in the time series data.
seasonal_period_yearly (FLOAT64): The time series data value affected by the time of the year. This value is NULL if no yearly effect is found.
seasonal_period_quarterly (FLOAT64): The time series data value affected by the time of the quarter. This value is NULL if no quarterly effect is found.
seasonal_period_monthly (FLOAT64): The time series data value affected by the time of the month. This value is NULL if no monthly effect is found.
seasonal_period_weekly (FLOAT64): The time series data value affected by the time of the week. This value is NULL if no weekly effect is found.
seasonal_period_daily (FLOAT64): The time series data value affected by the time of the day. This value is NULL if no daily effect is found.
holiday_effect (FLOAT64): The time series data value affected by different holidays. This is the aggregation value of all the holiday effects. This value is NULL if no holiday effect is found.
spikes_and_dips (FLOAT64): The unexpectedly high or low values of the time series. For history rows, the value is NULL if no spike or dip is found. This value is NULL for forecast rows.
step_changes (FLOAT64): The abrupt or structural change in the distributional properties of the time series. For history rows, this value is NULL if no step change is found. This value is NULL for forecast rows.
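The description of time_series_adjusted_data as the aggregation of trend, seasonal components, and holiday effect can be explored with a query like the one below. It is a sketch under two assumptions that this page does not state explicitly: that the aggregation is a plain sum, and that components reported as NULL contribute nothing. The model name is a placeholder, and small floating-point differences between the two columns would be expected even if the assumptions hold.

```sql
-- Sketch only: assumes the components combine additively and that a NULL
-- component means "no effect found". `mydataset.mymodel` is a placeholder.
SELECT
  time_series_timestamp,
  time_series_adjusted_data,
  trend
    + IFNULL(seasonal_period_yearly, 0)
    + IFNULL(seasonal_period_quarterly, 0)
    + IFNULL(seasonal_period_monthly, 0)
    + IFNULL(seasonal_period_weekly, 0)
    + IFNULL(seasonal_period_daily, 0)
    + IFNULL(holiday_effect, 0) AS summed_components
FROM
  ML.EXPLAIN_FORECAST(MODEL `mydataset.mymodel`)
WHERE
  time_series_type = 'history'
ORDER BY
  time_series_timestamp;
```

Comparing summed_components with time_series_adjusted_data row by row is a quick way to check whether the additive assumption matches a given model.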
ML.EXPLAIN_FORECAST example
The following query uses ML.EXPLAIN_FORECAST to forecast 30 time points with a confidence level of 0.8.
SELECT
  *
FROM
  ML.EXPLAIN_FORECAST(MODEL `mydataset.mymodel`,
                      STRUCT(30 AS horizon, 0.8 AS confidence_level))