This page explains what hierarchical forecasting is and what its objectives are, and describes training strategies that you can use to reduce bias in your forecasting models.
For detailed instructions on how to configure hierarchical forecasting when training your forecasting model using the API, see Train a forecast model.
What is hierarchical forecasting
Time series are often structured in a nested hierarchy. For example, the entire inventory of products that a retailer sells can be divided into categories of products. The categories can be further divided into individual products. When forecasting future sales, the forecasts for the products of a category should add up to the forecast for the category itself, and so forth up the hierarchy.
Similarly, the time dimension of a single time series can also exhibit a hierarchy. For example, forecasted sales for an individual product at the day level should add up to the product's forecasted weekly sales. The following figure shows this group and temporal hierarchy as a matrix:
Hierarchical forecasting has three objectives:
- Reduce overall bias to improve metrics over all time series (total sales).
- Reduce temporal bias to improve metrics over the horizon (season sales).
- Reduce group level bias to improve metrics over a group of time series (item sales).
In Vertex AI, hierarchical forecasting takes into account the hierarchical structure of time series by incorporating additional loss terms for aggregated predictions.
Hierarchical loss = (1 x loss) +
(temporal total weight x temporal total loss) +
(group total weight x group total loss) +
(group temporal total weight x group temporal total loss)
For example, if the hierarchical group is "category", the prediction at the "category" level is the sum of the predictions for all "products" in the category. If the objective of the model is mean absolute error (MAE), the loss includes the MAE for predictions at both the "product" and "category" levels. This helps improve the consistency of forecasts at different levels of the hierarchy and, in some cases, may even improve metrics at the lowest level.
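The following Python sketch works through the loss formula above on a toy data set with MAE as the objective. It is an illustration only: the function name hierarchical_mae, the data, and the way each aggregated term is computed (MAE over summed values) are assumptions, and this page does not specify how Vertex AI normalizes or computes these terms internally.

```python
def mae(actual, predicted):
    """Mean absolute error over two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def hierarchical_mae(actual, predicted, groups,
                     group_total_weight=0.0,
                     temporal_total_weight=0.0,
                     group_temporal_total_weight=0.0):
    """Hypothetical illustration of the hierarchical loss formula.

    actual, predicted: dict of series ID -> list of values over the horizon.
    groups: dict of series ID -> group label (for example, a category).
    """
    def by_group(series):
        # Sum all series in the same group, step by step.
        totals = {}
        for series_id in sorted(series):
            label = groups[series_id]
            values = series[series_id]
            current = totals.get(label)
            totals[label] = (list(values) if current is None
                             else [c + v for c, v in zip(current, values)])
        return totals

    def over_time(series):
        # Sum each series over the whole horizon.
        return {key: [sum(values)] for key, values in series.items()}

    def flat(series):
        # Flatten in a fixed key order so actuals and predictions line up.
        return [v for key in sorted(series) for v in series[key]]

    loss = mae(flat(actual), flat(predicted))
    temporal = mae(flat(over_time(actual)), flat(over_time(predicted)))
    group = mae(flat(by_group(actual)), flat(by_group(predicted)))
    group_temporal = mae(flat(over_time(by_group(actual))),
                         flat(over_time(by_group(predicted))))
    return (1.0 * loss
            + temporal_total_weight * temporal
            + group_total_weight * group
            + group_temporal_total_weight * group_temporal)


# Made-up data: three products in two categories, three forecast steps.
groups = {"apples": "fruit", "pears": "fruit", "kale": "vegetables"}
actual = {"apples": [10, 12, 11], "pears": [5, 6, 4], "kale": [8, 7, 9]}
predicted = {"apples": [9, 11, 10], "pears": [4, 6, 3], "kale": [9, 8, 9]}

product_only = hierarchical_mae(actual, predicted, groups)
with_category = hierarchical_mae(actual, predicted, groups,
                                 group_total_weight=10.0)
print(product_only, with_category)
```

In this toy data, the two fruit products are both underpredicted, so their errors add up at the category level instead of canceling out. With a nonzero group weight, that category-level error is penalized in addition to the product-level error.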
Configure hierarchical aggregation for model training
You can configure hierarchical aggregation when training your forecast models by configuring AutoMLForecastingTrainingJob in the Vertex AI SDK or by configuring hierarchyConfig in the Vertex AI API.
Available parameters for AutoMLForecastingTrainingJob and hierarchyConfig include:
- group_columns
- group_total_weight
- temporal_total_weight
- group_temporal_total_weight
The parameters allow for different combinations of group and time aggregated losses. They also allow you to assign weights to increase the priority of minimizing the aggregated loss relative to the individual loss. For example, if a weight is set to 2.0, the corresponding aggregated loss is weighted twice as much as the individual loss.
group_columns
Column names in your training input table that identify the grouping for the hierarchy level. The columns must be time_series_attribute_columns. If the group column is not set, all time series are treated as part of the same group, and the loss is aggregated over all time series.
group_total_weight
Weight of the group aggregated loss relative to the individual loss. Disabled if set to 0.0 or not set.
temporal_total_weight
Weight of the time aggregated loss relative to the individual loss. Disabled if set to 0.0 or not set.
group_temporal_total_weight
Weight of the total (group x time) aggregated loss relative to the individual loss. Disabled if set to 0.0 or not set. If the group column is not set, all time series are treated as part of the same group, and the loss is aggregated over all time series.
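The rules in the parameter descriptions above can be collected in a small helper. The following is a minimal sketch: build_hierarchy_config is a hypothetical function, not part of the Vertex AI SDK, and the dictionary it returns only mirrors the parameter names on this page. How the values are actually passed to AutoMLForecastingTrainingJob or hierarchyConfig is covered in Train a forecast model.

```python
def build_hierarchy_config(time_series_attribute_columns,
                           group_columns=None,
                           group_total_weight=None,
                           temporal_total_weight=None,
                           group_temporal_total_weight=None):
    """Hypothetical helper that checks the settings described above.

    Returns a plain dict containing only the settings that are enabled.
    """
    group_columns = list(group_columns or [])

    # Group columns must be time series attribute columns.
    unknown = [c for c in group_columns
               if c not in time_series_attribute_columns]
    if unknown:
        raise ValueError(
            f"group_columns not in time_series_attribute_columns: {unknown}")

    weights = {
        "group_total_weight": group_total_weight,
        "temporal_total_weight": temporal_total_weight,
        "group_temporal_total_weight": group_temporal_total_weight,
    }
    # A weight that is unset (None) or 0.0 disables that loss term,
    # so it is left out of the resulting config.
    config = {name: float(w) for name, w in weights.items() if w}

    # With no group columns, all time series form a single group.
    if group_columns:
        config["group_columns"] = group_columns
    return config


config = build_hierarchy_config(
    time_series_attribute_columns=["region", "category"],
    group_columns=["region"],
    group_total_weight=10.0,
    temporal_total_weight=0.0,  # disabled, so omitted from the result
)
print(config)
```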
Strategies to reduce bias
Consider starting with one type of aggregation (group or time) with a weight of 10.0, and then halve or double the value based on the results.
Reduce overall bias
In fine-grained forecasts for distributing stock across stores, where weighted absolute percentage error (WAPE) at the product x store x date level is used as the forecasting metric, forecasts often underpredict at the aggregate levels. To compensate for this overall bias, you can try the following:
- Set group_total_weight to 10.0.
- Leave group_columns unset.
- Leave other weights unset.
This aggregates over all time series and reduces overall bias.
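To see why a fine-grained metric alone can hide overall bias, consider the following sketch. The numbers are made up, and the helper names wape and aggregate_bias are illustrative rather than part of any Vertex AI API. Both forecasts have the same WAPE at the fine-grained level, but only one of them underpredicts the total.

```python
def wape(actual, predicted):
    """Weighted absolute percentage error at the fine-grained level."""
    abs_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    return abs_error / sum(abs(a) for a in actual)


def aggregate_bias(actual, predicted):
    """Relative error of the summed forecast; negative means underprediction."""
    return (sum(predicted) - sum(actual)) / sum(actual)


# Made-up sales for four product x store x date cells.
actual = [4, 6, 5, 5]
always_low = [3, 5, 4, 4]   # every cell is one unit too low
mixed = [5, 5, 6, 4]        # errors of the same size, but they cancel out

print(wape(actual, always_low), aggregate_bias(actual, always_low))
print(wape(actual, mixed), aggregate_bias(actual, mixed))
```

A loss that looks only at individual cells cannot distinguish these two forecasts. Adding an aggregated loss term gives the model a reason to prefer the one whose errors cancel out in the total.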
Reduce temporal bias
In long-term planning, forecasts may be made at a product x region x week level, but the relevant metrics may be measured with respect to seasonal totals. To compensate for this temporal bias, you can try the following:
- Set temporal_total_weight to 10.0.
- Leave group_columns unset.
- Leave other weights unset.
This aggregates over all dates in the horizon of a time series, and reduces temporal bias.
Reduce group level bias
For forecasts that serve multiple purposes in the replenishment process, fine-grained forecasts at the product x store x date or week level may be aggregated up to product x distribution center x date levels for distribution, or to product category x date levels for materials orders. To reduce bias at these aggregated levels, you can try the following:
- Set group_total_weight to 10.0.
- Set group_columns, for example, ["region"] or ["region", "category"]. Setting multiple group columns uses their combined value to define the group. For best results, use group columns with 100 or fewer distinct combined values.
- Leave other weights unset.
This aggregates over all time series in the same group for the same date, and reduces bias at the group level.
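Before you choose group columns, it can help to check how many distinct combined values they produce. The following sketch assumes the time series attributes are available as a list of dictionaries. The column name series_id, the sample rows, and the helper combined_groups are hypothetical.

```python
RECOMMENDED_MAX_GROUPS = 100  # recommendation stated on this page


def combined_groups(rows, group_columns):
    """Map each combined group value to the time series IDs that share it."""
    groups = {}
    for row in rows:
        # Two series are in the same group only if they match on
        # every group column.
        key = tuple(row[column] for column in group_columns)
        groups.setdefault(key, set()).add(row["series_id"])
    return groups


# Made-up attribute rows, one per time series.
rows = [
    {"series_id": "s1", "region": "west", "category": "fruit"},
    {"series_id": "s2", "region": "west", "category": "fruit"},
    {"series_id": "s3", "region": "west", "category": "dairy"},
    {"series_id": "s4", "region": "east", "category": "fruit"},
]

by_region = combined_groups(rows, ["region"])
by_region_and_category = combined_groups(rows, ["region", "category"])

for name, groups in [("region", by_region),
                     ("region + category", by_region_and_category)]:
    within = len(groups) <= RECOMMENDED_MAX_GROUPS
    print(f"{name}: {len(groups)} groups, within recommendation: {within}")
```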
Limits
- Only one level of time series aggregation is supported. If more than one grouping column is specified, such as "product, store", time series are in the same group only if they share the same values of both "product" and "store".
- We recommend using 100 or fewer groups.