Index
AutoMlForecasting (message)
AutoMlForecastingInputs (message)
AutoMlForecastingInputs.Granularity (message)
AutoMlForecastingInputs.Transformation (message)
AutoMlForecastingInputs.Transformation.AutoTransformation (message)
AutoMlForecastingInputs.Transformation.CategoricalTransformation (message)
AutoMlForecastingInputs.Transformation.NumericTransformation (message)
AutoMlForecastingInputs.Transformation.TextTransformation (message)
AutoMlForecastingInputs.Transformation.TimestampTransformation (message)
AutoMlForecastingMetadata (message)
AutoMlImageClassification (message)
AutoMlImageClassificationInputs (message)
AutoMlImageClassificationInputs.ModelType (enum)
AutoMlImageClassificationMetadata (message)
AutoMlImageClassificationMetadata.SuccessfulStopReason (enum)
AutoMlImageObjectDetection (message)
AutoMlImageObjectDetectionInputs (message)
AutoMlImageObjectDetectionInputs.ModelType (enum)
AutoMlImageObjectDetectionMetadata (message)
AutoMlImageObjectDetectionMetadata.SuccessfulStopReason (enum)
AutoMlImageSegmentation (message)
AutoMlImageSegmentationInputs (message)
AutoMlImageSegmentationInputs.ModelType (enum)
AutoMlImageSegmentationMetadata (message)
AutoMlImageSegmentationMetadata.SuccessfulStopReason (enum)
AutoMlTables (message)
AutoMlTablesInputs (message)
AutoMlTablesInputs.Transformation (message)
AutoMlTablesInputs.Transformation.AutoTransformation (message)
AutoMlTablesInputs.Transformation.CategoricalArrayTransformation (message)
AutoMlTablesInputs.Transformation.CategoricalTransformation (message)
AutoMlTablesInputs.Transformation.NumericArrayTransformation (message)
AutoMlTablesInputs.Transformation.NumericTransformation (message)
AutoMlTablesInputs.Transformation.TextArrayTransformation (message)
AutoMlTablesInputs.Transformation.TextTransformation (message)
AutoMlTablesInputs.Transformation.TimestampTransformation (message)
AutoMlTablesMetadata (message)
AutoMlTextClassification (message)
AutoMlTextClassificationInputs (message)
AutoMlTextExtraction (message)
AutoMlTextExtractionInputs (message)
AutoMlTextSentiment (message)
AutoMlTextSentimentInputs (message)
AutoMlVideoActionRecognition (message)
AutoMlVideoActionRecognitionInputs (message)
AutoMlVideoActionRecognitionInputs.ModelType (enum)
AutoMlVideoClassification (message)
AutoMlVideoClassificationInputs (message)
AutoMlVideoClassificationInputs.ModelType (enum)
AutoMlVideoObjectTracking (message)
AutoMlVideoObjectTrackingInputs (message)
AutoMlVideoObjectTrackingInputs.ModelType (enum)
CustomJobMetadata (message)
CustomTask (message)
ExportEvaluatedDataItemsConfig (message)
HierarchyConfig (message)
HyperparameterTuningJobMetadata (message)
HyperparameterTuningJobSpec (message)
HyperparameterTuningTask (message)
Seq2SeqPlusForecasting (message)
Seq2SeqPlusForecastingInputs (message)
Seq2SeqPlusForecastingInputs.Granularity (message)
Seq2SeqPlusForecastingInputs.Transformation (message)
Seq2SeqPlusForecastingInputs.Transformation.AutoTransformation (message)
Seq2SeqPlusForecastingInputs.Transformation.CategoricalTransformation (message)
Seq2SeqPlusForecastingInputs.Transformation.NumericTransformation (message)
Seq2SeqPlusForecastingInputs.Transformation.TextTransformation (message)
Seq2SeqPlusForecastingInputs.Transformation.TimestampTransformation (message)
Seq2SeqPlusForecastingMetadata (message)
WindowConfig (message)
AutoMlForecasting
A TrainingJob that trains and uploads an AutoML Forecasting Model.
The input parameters of this TrainingJob.
The metadata information.
AutoMlForecastingInputs
target_column
string
The name of the column that the Model is to predict values for. This column must be unavailable at forecast.
time_series_identifier_column
string
The name of the column that identifies the time series.
time_column
string
The name of the column that identifies time order in the time series. This column must be available at forecast.
Each transformation applies its transform function to the given input column, and the result is used for training. When creating a transformation for a BigQuery Struct column, the column should be flattened using "." as the delimiter.
optimization_objective
string
Objective function the model is optimizing towards. The training process creates a model that optimizes the value of the objective function over the validation set.
The supported optimization objectives:
"minimize-rmse" (default) - Minimize root-mean-squared error (RMSE).
"minimize-mae" - Minimize mean-absolute error (MAE).
"minimize-rmsle" - Minimize root-mean-squared log error (RMSLE).
"minimize-rmspe" - Minimize root-mean-squared percentage error (RMSPE).
"minimize-wape-mae" - Minimize the combination of weighted absolute percentage error (WAPE) and mean-absolute-error (MAE).
"minimize-quantile-loss" - Minimize the quantile loss at the quantiles defined in quantiles.
"minimize-mape" - Minimize the mean absolute percentage error (MAPE).
train_budget_milli_node_hours
int64
Required. The train budget for creating this model, expressed in milli node hours, i.e. a value of 1,000 in this field means 1 node hour.
The training cost of the model will not exceed this budget. The final cost will be kept as close to the budget as possible, though it may end up noticeably smaller, at the backend's discretion; this can happen in particular when further model training ceases to provide any improvements.
If the budget is set to a value known to be insufficient to train a model for the given dataset, training is not attempted and an error is returned.
The train budget must be between 1,000 and 72,000 milli node hours, inclusive.
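The unit conversion and range check above can be sketched as follows (the helper names are hypothetical, not part of the API):

```python
def milli_node_hours_to_node_hours(milli: int) -> float:
    """1,000 milli node hours equal 1 node hour."""
    return milli / 1000.0

def validate_forecasting_budget(milli: int) -> None:
    """Enforce the documented range: 1,000 to 72,000 milli node hours, inclusive."""
    if not 1000 <= milli <= 72000:
        raise ValueError(
            f"train_budget_milli_node_hours must be in [1000, 72000], got {milli}"
        )

validate_forecasting_budget(24000)  # 24 node hours: within the allowed range
```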
weight_column
string
Column name that should be used as the weight column. Higher values in this column give more importance to the row during model training. The column must have numeric values between 0 and 10000, inclusive; 0 means the row is ignored for training. If the weight column field is not set, then all rows are assumed to have an equal weight of 1.
time_series_attribute_columns[]
string
Column names that should be used as attribute columns. The value of these columns does not vary as a function of time. For example, store ID or item color.
available_at_forecast_columns[]
string
Names of columns that are available and provided when a forecast is requested. These columns contain information for the given entity (identified by the time_series_identifier_column column) that is known at forecast. For example, predicted weather for a specific day.
Expected difference in time granularity between rows in the data.
forecast_horizon
int64
The amount of time into the future for which forecasted values for the target are returned. Expressed in number of units defined by the data_granularity
field.
context_window
int64
The amount of time into the past training and prediction data is used for model training and prediction respectively. Expressed in number of units defined by the data_granularity
field.
Configuration for exporting test set predictions to a BigQuery table. If this configuration is absent, then the export is not performed.
quantiles[]
double
Quantiles to use for the minimize-quantile-loss optimization_objective, or for probabilistic inference. Up to 5 quantiles are allowed, with values between 0 and 1, exclusive. Required if the value of optimization_objective is minimize-quantile-loss; represents the percent quantiles to use for that objective. Quantiles must be unique.
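The constraints on quantiles[] can be expressed as a short validator (a sketch; the function name is hypothetical):

```python
def validate_quantiles(quantiles: list, optimization_objective: str) -> None:
    """Check the documented constraints on the quantiles[] field."""
    if optimization_objective == "minimize-quantile-loss" and not quantiles:
        raise ValueError("quantiles is required for minimize-quantile-loss")
    if len(quantiles) > 5:
        raise ValueError("at most 5 quantiles are allowed")
    if len(set(quantiles)) != len(quantiles):
        raise ValueError("quantiles must be unique")
    if any(not 0.0 < q < 1.0 for q in quantiles):
        raise ValueError("quantiles must be between 0 and 1, exclusive")

validate_quantiles([0.1, 0.5, 0.9], "minimize-quantile-loss")  # passes all checks
```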
Configuration that defines the hierarchical relationship of time series and parameters for hierarchical forecasting strategies.
Config containing strategy for generating sliding windows.
holiday_regions[]
string
The geographical regions on which the holiday effect is applied in modeling, by adding a holiday categorical array feature that includes all holidays matching the date. This option is only allowed when data_granularity is day. By default, holiday effect modeling is disabled. To turn it on, specify the holiday region using this option.
enable_probabilistic_inference
bool
If probabilistic inference is enabled, the model will fit a distribution that captures the uncertainty of a prediction. At inference time, the predictive distribution is used to make a point prediction that minimizes the optimization objective. For example, the mean of a predictive distribution is the point prediction that minimizes RMSE loss. If quantiles are specified, then the quantiles of the distribution are also returned. The optimization objective cannot be minimize-quantile-loss.
validation_options
string
Validation options for the data validation component. The available options are:
"fail-pipeline" - (default) Run validation and fail the pipeline if validation fails.
"ignore-validation" - Ignore the results of the validation and continue the pipeline.
additional_experiments[]
string
Additional experiment flags for the time series forecasting training.
Granularity
A duration of time expressed in time granularity units.
unit
string
The time granularity unit of this time period. The supported units are:
"minute"
"hour"
"day"
"week"
"month"
"year"
quantity
int64
The number of granularity units between data points in the training data. If granularity_unit is minute, the value can be 1, 5, 10, 15, or 30. For all other values of granularity_unit, the value must be 1.
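The unit and quantity rules above can be sketched as a validator (hypothetical helper, mirroring the documented constraints):

```python
SUPPORTED_UNITS = {"minute", "hour", "day", "week", "month", "year"}
VALID_MINUTE_QUANTITIES = {1, 5, 10, 15, 30}

def validate_granularity(unit: str, quantity: int) -> None:
    """Check the documented Granularity constraints."""
    if unit not in SUPPORTED_UNITS:
        raise ValueError(f"unsupported granularity unit: {unit!r}")
    if unit == "minute":
        if quantity not in VALID_MINUTE_QUANTITIES:
            raise ValueError("minute granularity quantity must be 1, 5, 10, 15, or 30")
    elif quantity != 1:
        raise ValueError(f"quantity must be 1 when unit is {unit!r}")

validate_granularity("minute", 15)  # allowed
validate_granularity("day", 1)      # allowed
```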
Transformation
Union field transformation_detail. The transformation that the training pipeline will apply to the input columns. transformation_detail can be only one of the following:
AutoTransformation
The training pipeline will infer the proper transformation based on the statistics of the dataset.
column_name
string
CategoricalTransformation
The training pipeline will perform the following transformation functions:
The categorical string as is--no change to case, punctuation, spelling, tense, and so on.
Convert the category name to a dictionary lookup index and generate an embedding for each index.
Categories that appear less than 5 times in the training dataset are treated as the "unknown" category. The "unknown" category gets its own special lookup index and resulting embedding.
column_name
string
NumericTransformation
The training pipeline will perform the following transformation functions:
The value converted to float32.
The z_score of the value.
log(value+1) when the value is greater than or equal to 0. Otherwise, this transformation is not applied and the value is considered a missing value.
z_score of log(value+1) when the value is greater than or equal to 0. Otherwise, this transformation is not applied and the value is considered a missing value.
A boolean value that indicates whether the value is valid.
column_name
string
TextTransformation
The training pipeline will perform the following transformation functions:
The text as is--no change to case, punctuation, spelling, tense, and so on.
Convert the category name to a dictionary lookup index and generate an embedding for each index.
column_name
string
TimestampTransformation
The training pipeline will perform the following transformation functions:
Apply the transformation functions for Numerical columns.
Determine the year, month, day, and weekday. Treat each value from the timestamp as a Categorical column.
Invalid numerical values (for example, values that fall outside of a typical timestamp range, or are extreme values) receive no special treatment and are not removed.
column_name
string
time_format
string
The format in which the time field is expressed. The time_format must either be one of:
unix-seconds
unix-milliseconds
unix-microseconds
unix-nanoseconds
(for, respectively, the number of seconds, milliseconds, microseconds, and nanoseconds since the start of the Unix epoch); or be written in strftime syntax.
If time_format is not set, then the default format is the RFC 3339 date-time format, where time-offset = "Z" (e.g. 1985-04-12T23:20:50.52Z).
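As an illustration of the accepted time_format values, here is a sketch using only the Python standard library (the helper is hypothetical and not part of any client library; the server-side parser may differ in detail):

```python
from datetime import datetime, timezone
from typing import Optional

# Scale factors for the documented unix-* formats.
UNIX_SCALES = {
    "unix-seconds": 1,
    "unix-milliseconds": 1_000,
    "unix-microseconds": 1_000_000,
    "unix-nanoseconds": 1_000_000_000,
}

def parse_time_value(value: str, time_format: Optional[str] = None) -> datetime:
    """Parse a time field per the documented time_format options:
    unix-* epoch counts, strftime syntax, or (default) RFC 3339 with "Z"."""
    if time_format in UNIX_SCALES:
        seconds = int(value) / UNIX_SCALES[time_format]
        return datetime.fromtimestamp(seconds, tz=timezone.utc)
    if time_format is not None:
        return datetime.strptime(value, time_format)  # strftime-style format
    # Default: RFC 3339 date-time with a "Z" (UTC) offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

print(parse_time_value("1985-04-12T23:20:50Z").year)  # 1985
```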
AutoMlForecastingMetadata
Model metadata specific to AutoML Forecasting.
train_cost_milli_node_hours
int64
Output only. The actual training cost of the model, expressed in milli node hours, i.e. a value of 1,000 in this field means 1 node hour. Guaranteed to not exceed the train budget.
evaluated_data_items_bigquery_uri
string
BigQuery destination URI for exported evaluated examples.
AutoMlImageClassification
A TrainingJob that trains and uploads an AutoML Image Classification Model.
The input parameters of this TrainingJob.
The metadata information.
AutoMlImageClassificationInputs
base_model_id
string
The ID of the base model. If it is specified, the new model will be trained based on the base model. Otherwise, the new model will be trained from scratch. The base model must be in the same Project and Location as the new Model to train, and have the same modelType.
budget_milli_node_hours
int64
The training budget for creating this model, expressed in milli node hours, i.e. a value of 1,000 in this field means 1 node hour. The actual metadata.costMilliNodeHours will be equal to or less than this value. If further model training ceases to provide any improvements, training stops without using the full budget and metadata.successfulStopReason is set to model-converged. Note: node_hour = actual_hour * number_of_nodes_involved. For modelType cloud (default), the budget must be between 8,000 and 800,000 milli node hours, inclusive. The default value is 192,000, which represents one day in wall time, assuming 8 nodes are used. For model types mobile-tf-low-latency-1, mobile-tf-versatile-1, and mobile-tf-high-accuracy-1, the training budget must be between 1,000 and 100,000 milli node hours, inclusive. The default value is 24,000, which represents one day in wall time on the single node that is used.
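The per-model-type budget ranges can be captured in a small lookup (hypothetical helper; values copied from the field description above):

```python
# Allowed budget ranges in milli node hours, keyed by modelType.
BUDGET_RANGES = {
    "cloud": (8_000, 800_000),
    "mobile-tf-low-latency-1": (1_000, 100_000),
    "mobile-tf-versatile-1": (1_000, 100_000),
    "mobile-tf-high-accuracy-1": (1_000, 100_000),
}

def validate_image_classification_budget(model_type: str, milli: int) -> None:
    """Check budget_milli_node_hours against the documented per-type range."""
    lo, hi = BUDGET_RANGES[model_type]
    if not lo <= milli <= hi:
        raise ValueError(
            f"budget for {model_type} must be in [{lo}, {hi}] milli node hours"
        )

validate_image_classification_budget("cloud", 192_000)  # default: one day on 8 nodes
```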
disable_early_stopping
bool
Use the entire training budget. This disables the early stopping feature. When false the early stopping feature is enabled, which means that AutoML Image Classification might stop training before the entire training budget has been used.
multi_label
bool
If false, a single-label (multi-class) Model will be trained (i.e. assuming that for each image just up to one annotation may be applicable). If true, a multi-label Model will be trained (i.e. assuming that for each image multiple annotations may be applicable).
uptrain_base_model_id
string
The ID of the base model for upTraining. If it is specified, the new model will be upTrained based on the base model. Otherwise, the new model will be trained from scratch. The base model for upTraining must be in the same Project and Location as the new Model to train, and have the same modelType.
ModelType
| Enum | Description |
|---|---|
| MODEL_TYPE_UNSPECIFIED | Should not be set. |
| CLOUD | A Model best tailored to be used within Google Cloud, and which cannot be exported. Default. |
| CLOUD_1 | A model type best tailored to be used within Google Cloud, which cannot be exported externally. Compared to the CLOUD model above, it is expected to have higher prediction accuracy. |
| MOBILE_TF_LOW_LATENCY_1 | A model that, in addition to being available within Google Cloud, can also be exported (see ModelService.ExportModel) as a TensorFlow or Core ML model and used on a mobile or edge device afterwards. Expected to have low latency, but may have lower prediction quality than other mobile models. |
| MOBILE_TF_VERSATILE_1 | A model that, in addition to being available within Google Cloud, can also be exported (see ModelService.ExportModel) as a TensorFlow or Core ML model and used on a mobile or edge device afterwards. |
| MOBILE_TF_HIGH_ACCURACY_1 | A model that, in addition to being available within Google Cloud, can also be exported (see ModelService.ExportModel) as a TensorFlow or Core ML model and used on a mobile or edge device afterwards. Expected to have a higher latency, but should also have a higher prediction quality than other mobile models. |
AutoMlImageClassificationMetadata
cost_milli_node_hours
int64
The actual training cost of creating this model, expressed in milli node hours, i.e. a value of 1,000 in this field means 1 node hour. Guaranteed to not exceed inputs.budgetMilliNodeHours.
For successful job completions, this is the reason why the job has finished.
SuccessfulStopReason
| Enum | Description |
|---|---|
| SUCCESSFUL_STOP_REASON_UNSPECIFIED | Should not be set. |
| BUDGET_REACHED | The inputs.budgetMilliNodeHours had been reached. |
| MODEL_CONVERGED | Further training of the Model ceased to increase its quality, since it already has converged. |
AutoMlImageObjectDetection
A TrainingJob that trains and uploads an AutoML Image Object Detection Model.
The input parameters of this TrainingJob.
The metadata information
AutoMlImageObjectDetectionInputs
budget_milli_node_hours
int64
The training budget for creating this model, expressed in milli node hours, i.e. a value of 1,000 in this field means 1 node hour. The actual metadata.costMilliNodeHours will be equal to or less than this value. If further model training ceases to provide any improvements, training stops without using the full budget and metadata.successfulStopReason is set to model-converged. Note: node_hour = actual_hour * number_of_nodes_involved. For modelType cloud (default), the budget must be between 20,000 and 900,000 milli node hours, inclusive. The default value is 216,000, which represents one day in wall time, assuming 9 nodes are used. For model types mobile-tf-low-latency-1, mobile-tf-versatile-1, and mobile-tf-high-accuracy-1, the training budget must be between 1,000 and 100,000 milli node hours, inclusive. The default value is 24,000, which represents one day in wall time on the single node that is used.
disable_early_stopping
bool
Use the entire training budget. This disables the early stopping feature. When false the early stopping feature is enabled, which means that AutoML Image Object Detection might stop training before the entire training budget has been used.
uptrain_base_model_id
string
The ID of the base model for upTraining. If it is specified, the new model will be upTrained based on the base model. Otherwise, the new model will be trained from scratch. The base model for upTraining must be in the same Project and Location as the new Model to train, and have the same modelType.
ModelType
| Enum | Description |
|---|---|
| MODEL_TYPE_UNSPECIFIED | Should not be set. |
| CLOUD_HIGH_ACCURACY_1 | A model best tailored to be used within Google Cloud, and which cannot be exported. Expected to have a higher latency, but should also have a higher prediction quality than other cloud models. |
| CLOUD_LOW_LATENCY_1 | A model best tailored to be used within Google Cloud, and which cannot be exported. Expected to have a low latency, but may have lower prediction quality than other cloud models. |
| CLOUD_1 | A model best tailored to be used within Google Cloud, and which cannot be exported. Compared to the CLOUD_HIGH_ACCURACY_1 and CLOUD_LOW_LATENCY_1 models above, it is expected to have higher prediction quality and lower latency. |
| MOBILE_TF_LOW_LATENCY_1 | A model that, in addition to being available within Google Cloud, can also be exported (see ModelService.ExportModel) and used on a mobile or edge device with TensorFlow afterwards. Expected to have low latency, but may have lower prediction quality than other mobile models. |
| MOBILE_TF_VERSATILE_1 | A model that, in addition to being available within Google Cloud, can also be exported (see ModelService.ExportModel) and used on a mobile or edge device with TensorFlow afterwards. |
| MOBILE_TF_HIGH_ACCURACY_1 | A model that, in addition to being available within Google Cloud, can also be exported (see ModelService.ExportModel) and used on a mobile or edge device with TensorFlow afterwards. Expected to have a higher latency, but should also have a higher prediction quality than other mobile models. |
| CLOUD_STREAMING_1 | A model best tailored to be used within Google Cloud, and which cannot be exported. Expected to best support predictions in streaming with lower latency and lower prediction quality than other cloud models. |
AutoMlImageObjectDetectionMetadata
cost_milli_node_hours
int64
The actual training cost of creating this model, expressed in milli node hours, i.e. a value of 1,000 in this field means 1 node hour. Guaranteed to not exceed inputs.budgetMilliNodeHours.
For successful job completions, this is the reason why the job has finished.
SuccessfulStopReason
| Enum | Description |
|---|---|
| SUCCESSFUL_STOP_REASON_UNSPECIFIED | Should not be set. |
| BUDGET_REACHED | The inputs.budgetMilliNodeHours had been reached. |
| MODEL_CONVERGED | Further training of the Model ceased to increase its quality, since it already has converged. |
AutoMlImageSegmentation
A TrainingJob that trains and uploads an AutoML Image Segmentation Model.
The input parameters of this TrainingJob.
The metadata information.
AutoMlImageSegmentationInputs
budget_milli_node_hours
int64
The training budget for creating this model, expressed in milli node hours, i.e. a value of 1,000 in this field means 1 node hour. The actual metadata.costMilliNodeHours will be equal to or less than this value. If further model training ceases to provide any improvements, training stops without using the full budget and metadata.successfulStopReason is set to model-converged. Note: node_hour = actual_hour * number_of_nodes_involved, or actual_wall_clock_hours = train_budget_milli_node_hours / (number_of_nodes_involved * 1000). For modelType cloud-high-accuracy-1 (default), the budget must be between 20,000 and 2,000,000 milli node hours, inclusive. The default value is 192,000, which represents one day in wall time (1000 milli * 24 hours * 8 nodes).
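The wall-clock formula above works out as follows (a minimal sketch; the function name is hypothetical):

```python
def wall_clock_hours(train_budget_milli_node_hours: int, nodes: int) -> float:
    """actual_wall_clock_hours = train_budget_milli_node_hours / (nodes * 1000)"""
    return train_budget_milli_node_hours / (nodes * 1000)

# The default segmentation budget of 192,000 milli node hours on 8 nodes
# corresponds to one day of wall time.
print(wall_clock_hours(192_000, 8))  # 24.0
```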
base_model_id
string
The ID of the base model. If it is specified, the new model will be trained based on the base model. Otherwise, the new model will be trained from scratch. The base model must be in the same Project and Location as the new Model to train, and have the same modelType.
ModelType
| Enum | Description |
|---|---|
| MODEL_TYPE_UNSPECIFIED | Should not be set. |
| CLOUD_HIGH_ACCURACY_1 | A model to be used via prediction calls to the uCAIP API. Expected to have a higher latency, but should also have a higher prediction quality than other models. |
| CLOUD_LOW_ACCURACY_1 | A model to be used via prediction calls to the uCAIP API. Expected to have a lower latency but relatively lower prediction quality. |
| MOBILE_TF_LOW_LATENCY_1 | A model that, in addition to being available within Google Cloud, can also be exported (see ModelService.ExportModel) as a TensorFlow model and used on a mobile or edge device afterwards. Expected to have low latency, but may have lower prediction quality than other mobile models. |
AutoMlImageSegmentationMetadata
cost_milli_node_hours
int64
The actual training cost of creating this model, expressed in milli node hours, i.e. a value of 1,000 in this field means 1 node hour. Guaranteed to not exceed inputs.budgetMilliNodeHours.
For successful job completions, this is the reason why the job has finished.
SuccessfulStopReason
| Enum | Description |
|---|---|
| SUCCESSFUL_STOP_REASON_UNSPECIFIED | Should not be set. |
| BUDGET_REACHED | The inputs.budgetMilliNodeHours had been reached. |
| MODEL_CONVERGED | Further training of the Model ceased to increase its quality, since it already has converged. |
AutoMlTables
A TrainingJob that trains and uploads an AutoML Tables Model.
The input parameters of this TrainingJob.
The metadata information.
AutoMlTablesInputs
prediction_type
string
The type of prediction the Model is to produce. "classification" - Predict one out of multiple target values for each row. "regression" - Predict a value based on its relation to other values. This type is available only to columns that contain semantically numeric values, i.e. integers or floating point numbers, even if stored as e.g. strings.
target_column
string
The column name of the target column that the model is to predict.
Each transformation applies its transform function to the given input column, and the result is used for training. When creating a transformation for a BigQuery Struct column, the column should be flattened using "." as the delimiter.
optimization_objective
string
Objective function the model is optimizing towards. The training process creates a model that maximizes/minimizes the value of the objective function over the validation set.
The supported optimization objectives depend on the prediction type. If the field is not set, a default objective function is used.
classification (binary): "maximize-au-roc" (default) - Maximize the area under the receiver operating characteristic (ROC) curve. "minimize-log-loss" - Minimize log loss. "maximize-au-prc" - Maximize the area under the precision-recall curve. "maximize-precision-at-recall" - Maximize precision for a specified recall value. "maximize-recall-at-precision" - Maximize recall for a specified precision value.
classification (multi-class): "minimize-log-loss" (default) - Minimize log loss.
regression: "minimize-rmse" (default) - Minimize root-mean-squared error (RMSE). "minimize-mae" - Minimize mean-absolute error (MAE). "minimize-rmsle" - Minimize root-mean-squared log error (RMSLE).
train_budget_milli_node_hours
int64
Required. The train budget for creating this model, expressed in milli node hours, i.e. a value of 1,000 in this field means 1 node hour.
The training cost of the model will not exceed this budget. The final cost will be kept as close to the budget as possible, though it may end up noticeably smaller, at the backend's discretion; this can happen in particular when further model training ceases to provide any improvements.
If the budget is set to a value known to be insufficient to train a model for the given dataset, training is not attempted and an error is returned.
The train budget must be between 1,000 and 72,000 milli node hours, inclusive.
disable_early_stopping
bool
Use the entire training budget. This disables the early stopping feature. By default, the early stopping feature is enabled, which means that AutoML Tables might stop training before the entire training budget has been used.
weight_column_name
string
Column name that should be used as the weight column. Higher values in this column give more importance to the row during model training. The column must have numeric values between 0 and 10000, inclusive; 0 means the row is ignored for training. If the weight column field is not set, then all rows are assumed to have an equal weight of 1.
Configuration for exporting test set predictions to a BigQuery table. If this configuration is absent, then the export is not performed.
additional_experiments[]
string
Additional experiment flags for the Tables training pipeline.
Union field additional_optimization_objective_config. Additional optimization objective configuration. Required for maximize-precision-at-recall and maximize-recall-at-precision, otherwise unused. additional_optimization_objective_config can be only one of the following:
optimization_objective_recall_value
float
Required when optimization_objective is "maximize-precision-at-recall". Must be between 0 and 1, inclusive.
optimization_objective_precision_value
float
Required when optimization_objective is "maximize-recall-at-precision". Must be between 0 and 1, inclusive.
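The union-field requirement for the two precision/recall objectives can be checked with a sketch like the following (hypothetical helper, not part of the API):

```python
def validate_objective_config(objective: str,
                              recall_value=None,
                              precision_value=None) -> None:
    """Enforce the documented requirements on the additional optimization
    objective configuration for AutoMlTablesInputs."""
    if objective == "maximize-precision-at-recall":
        if recall_value is None or not 0.0 <= recall_value <= 1.0:
            raise ValueError(
                "optimization_objective_recall_value must be set, in [0, 1]"
            )
    elif objective == "maximize-recall-at-precision":
        if precision_value is None or not 0.0 <= precision_value <= 1.0:
            raise ValueError(
                "optimization_objective_precision_value must be set, in [0, 1]"
            )
    # For any other objective, the union field is unused.

validate_objective_config("maximize-precision-at-recall", recall_value=0.8)  # OK
```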
Transformation
transformation_detail
. The transformation that the training pipeline will apply to the input columns. transformation_detail
can be only one of the following:AutoTransformation
The training pipeline will infer the proper transformation based on the statistics of the dataset.
column_name
string
CategoricalArrayTransformation
Treats the column as a categorical array and performs the following transformation functions:
* For each element in the array, convert the category name to a dictionary lookup index and generate an embedding for each index. Combine the embeddings of all elements into a single embedding using the mean.
* Empty arrays are treated as an embedding of zeroes.
column_name
string
CategoricalTransformation
The training pipeline will perform the following transformation functions:
* The categorical string as is--no change to case, punctuation, spelling, tense, and so on.
* Convert the category name to a dictionary lookup index and generate an embedding for each index.
* Categories that appear less than 5 times in the training dataset are treated as the "unknown" category. The "unknown" category gets its own special lookup index and resulting embedding.
column_name
string
NumericArrayTransformation
Treats the column as a numerical array and performs the following transformation functions:
* All transformations for Numerical types are applied to the average of all elements.
* The average of empty arrays is treated as zero.
column_name
string
invalid_values_allowed
bool
If invalid values are allowed, the training pipeline will create a boolean feature that indicates whether the value is valid. Otherwise, the training pipeline will discard the input row from the training data.
NumericTransformation
The training pipeline will perform the following transformation functions:
* The value converted to float32.
* The z_score of the value.
* log(value+1) when the value is greater than or equal to 0. Otherwise, this transformation is not applied and the value is considered a missing value.
* z_score of log(value+1) when the value is greater than or equal to 0. Otherwise, this transformation is not applied and the value is considered a missing value.
* A boolean value that indicates whether the value is valid.
column_name
string
invalid_values_allowed
bool
If invalid values are allowed, the training pipeline will create a boolean feature that indicates whether the value is valid. Otherwise, the training pipeline will discard the input row from the training data.
TextArrayTransformation
Treats the column as a text array and performs the following transformation functions:
* Concatenate all text values in the array into a single text value using a space (" ") as a delimiter, and then treat the result as a single text value. Apply the transformations for Text columns.
* Empty arrays are treated as empty text.
column_name
string
TextTransformation
The training pipeline will perform the following transformation functions:
* The text as is--no change to case, punctuation, spelling, tense, and so on.
* Tokenize text to words. Convert each word to a dictionary lookup index and generate an embedding for each index. Combine the embeddings of all elements into a single embedding using the mean.
* Tokenization is based on unicode script boundaries.
* Missing values get their own lookup index and resulting embedding.
* Stop-words receive no special treatment and are not removed.
column_name
string
TimestampTransformation
The training pipeline will perform the following transformation functions:
* Apply the transformation functions for Numerical columns.
* Determine the year, month, day, and weekday. Treat each value from the timestamp as a Categorical column.
* Invalid numerical values (for example, values that fall outside of a typical timestamp range, or are extreme values) receive no special treatment and are not removed.
column_name
string
time_format
string
The format in which the time field is expressed. The time_format must either be one of:
* unix-seconds
* unix-milliseconds
* unix-microseconds
* unix-nanoseconds
(for, respectively, the number of seconds, milliseconds, microseconds, and nanoseconds since the start of the Unix epoch); or be written in strftime syntax. If time_format is not set, then the default format is the RFC 3339 date-time format, where time-offset = "Z" (e.g. 1985-04-12T23:20:50.52Z).
invalid_values_allowed
bool
If invalid values are allowed, the training pipeline will create a boolean feature that indicates whether the value is valid. Otherwise, the training pipeline will discard the input row from the training data.
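Putting the transformation types together, a transformations list for AutoMlTablesInputs might look like the following JSON-style dicts. This is an illustrative sketch: the column names are invented, and the union keys are assumed to follow the message names above. Note the BigQuery Struct field flattened with "." as the delimiter:

```python
# Hypothetical transformations array for AutoMlTablesInputs. Each entry sets
# exactly one member of the transformation_detail union.
transformations = [
    {"auto": {"column_name": "address.city"}},  # flattened Struct column
    {"numeric": {"column_name": "price", "invalid_values_allowed": True}},
    {"categorical": {"column_name": "product_category"}},
    {"timestamp": {"column_name": "purchase_time", "time_format": "unix-seconds"}},
]

# Sanity check: one union member per entry, and every member names a column.
assert all(len(t) == 1 for t in transformations)
assert all("column_name" in detail for t in transformations for detail in t.values())
```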
AutoMlTablesMetadata
Model metadata specific to AutoML Tables.
train_cost_milli_node_hours
int64
Output only. The actual training cost of the model, expressed in milli node hours, i.e. a value of 1,000 in this field means 1 node hour. Guaranteed to not exceed the train budget.
evaluated_data_items_bigquery_uri
string
BigQuery destination URI for exported evaluated examples.
AutoMlTextClassification
A TrainingJob that trains and uploads an AutoML Text Classification Model.
The input parameters of this TrainingJob.
AutoMlTextClassificationInputs
multi_label
bool
AutoMlTextExtraction
A TrainingJob that trains and uploads an AutoML Text Extraction Model.
The input parameters of this TrainingJob.
AutoMlTextExtractionInputs
This type has no fields.
AutoMlTextSentiment
A TrainingJob that trains and uploads an AutoML Text Sentiment Model.
The input parameters of this TrainingJob.
AutoMlTextSentimentInputs
sentiment_max
int32
A sentiment is expressed as an integer ordinal, where higher value means a more positive sentiment. The range of sentiments that will be used is between 0 and sentimentMax (inclusive on both ends), and all the values in the range must be represented in the dataset before a model can be created. Only the Annotations with this sentimentMax will be used for training. sentimentMax value must be between 1 and 10 (inclusive).
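A client-side sketch of the constraints above; the helper name and signature are illustrative, not part of the API.

```python
def validate_sentiments(labels, sentiment_max):
    # sentimentMax must be between 1 and 10 (inclusive), and every ordinal
    # from 0 through sentimentMax must appear in the dataset labels.
    if not 1 <= sentiment_max <= 10:
        return False
    return set(range(sentiment_max + 1)) <= set(labels)
```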
AutoMlVideoActionRecognition
A TrainingJob that trains and uploads an AutoML Video Action Recognition Model.
The input parameters of this TrainingJob.
AutoMlVideoActionRecognitionInputs
ModelType
Enums | |
---|---|
MODEL_TYPE_UNSPECIFIED | Should not be set. |
CLOUD | A model best tailored to be used within Google Cloud, and which cannot be exported. Default. |
MOBILE_VERSATILE_1 | A model that, in addition to being available within Google Cloud, can also be exported (see ModelService.ExportModel) as a TensorFlow or TensorFlow Lite model and used on a mobile or edge device afterwards. |
MOBILE_JETSON_VERSATILE_1 | A model that, in addition to being available within Google Cloud, can also be exported (see ModelService.ExportModel) to a Jetson device afterwards. |
MOBILE_CORAL_VERSATILE_1 | A model that, in addition to being available within Google Cloud, can also be exported (see ModelService.ExportModel) as a TensorFlow or TensorFlow Lite model and used on a Coral device afterwards. |
AutoMlVideoClassification
A TrainingJob that trains and uploads an AutoML Video Classification Model.
The input parameters of this TrainingJob.
AutoMlVideoClassificationInputs
ModelType
Enums | |
---|---|
MODEL_TYPE_UNSPECIFIED | Should not be set. |
CLOUD | A model best tailored to be used within Google Cloud, and which cannot be exported. Default. |
MOBILE_VERSATILE_1 | A model that, in addition to being available within Google Cloud, can also be exported (see ModelService.ExportModel) as a TensorFlow or TensorFlow Lite model and used on a mobile or edge device afterwards. |
MOBILE_JETSON_VERSATILE_1 | A model that, in addition to being available within Google Cloud, can also be exported (see ModelService.ExportModel) to a Jetson device afterwards. |
AutoMlVideoObjectTracking
A TrainingJob that trains and uploads an AutoML Video ObjectTracking Model.
The input parameters of this TrainingJob.
AutoMlVideoObjectTrackingInputs
ModelType
Enums | |
---|---|
MODEL_TYPE_UNSPECIFIED | Should not be set. |
CLOUD | A model best tailored to be used within Google Cloud, and which cannot be exported. Default. |
MOBILE_VERSATILE_1 | A model that, in addition to being available within Google Cloud, can also be exported (see ModelService.ExportModel) as a TensorFlow or TensorFlow Lite model and used on a mobile or edge device afterwards. |
MOBILE_CORAL_VERSATILE_1 | A versatile model that is meant to be exported (see ModelService.ExportModel) and used on a Google Coral device. |
MOBILE_CORAL_LOW_LATENCY_1 | A model that trades off quality for low latency, to be exported (see ModelService.ExportModel) and used on a Google Coral device. |
MOBILE_JETSON_VERSATILE_1 | A versatile model that is meant to be exported (see ModelService.ExportModel) and used on an NVIDIA Jetson device. |
MOBILE_JETSON_LOW_LATENCY_1 | A model that trades off quality for low latency, to be exported (see ModelService.ExportModel) and used on an NVIDIA Jetson device. |
CustomJobMetadata
backing_custom_job
string
The resource name of the CustomJob that has been created to carry out this custom task.
CustomTask
A TrainingJob that trains a custom code Model.
The input parameters of this CustomTask.
The metadata information.
ExportEvaluatedDataItemsConfig
Configuration for exporting test set predictions to a BigQuery table.
destination_bigquery_uri
string
URI of desired destination BigQuery table. Expected format: bq://{project_id}:{dataset_id}:{table}
If not specified, then results are exported to the following auto-created BigQuery table: {project_id}:export_evaluated_examples_{model_name}_{yyyy_MM_dd'T'HH_mm_ss_SSS'Z'}.evaluated_examples
override_existing_table
bool
If true and an export destination is specified, then the contents of the destination are overwritten. Otherwise, if the export destination already exists, then the export operation fails.
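For illustration, the two fields above might be populated as a plain JSON-style payload. The project, dataset, and table names below are placeholders, not values from the source.

```python
# Hypothetical ExportEvaluatedDataItemsConfig payload; field names follow the
# camelCase JSON convention, and every value here is a placeholder.
export_config = {
    "destinationBigqueryUri": "bq://my-project:my_dataset:evaluated_examples",
    "overrideExistingTable": True,
}
```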
HierarchyConfig
Configuration that defines the hierarchical relationship of time series and parameters for hierarchical forecasting strategies.
group_columns[]
string
A list of time series attribute column names that define the time series hierarchy. Only one level of hierarchy is supported, e.g. 'region' for a hierarchy of stores or 'department' for a hierarchy of products. If multiple columns are specified, time series will be grouped by their combined values, e.g. ('blue', 'large') for 'color' and 'size'. Up to 5 columns are accepted. If no group columns are specified, all time series are considered to be part of the same group.
group_total_weight
double
The weight of the loss for predictions aggregated over time series in the same group.
temporal_total_weight
double
The weight of the loss for predictions aggregated over the horizon for a single time series.
group_temporal_total_weight
double
The weight of the loss for predictions aggregated over both the horizon and time series in the same hierarchy group.
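One plausible reading of the three weights is as coefficients on aggregated-loss terms added to the base per-prediction loss. The sketch below is an assumption about how such terms might combine, not the backend's actual loss function.

```python
def total_loss(base, group_loss, temporal_loss, group_temporal_loss,
               group_total_weight=0.0, temporal_total_weight=0.0,
               group_temporal_total_weight=0.0):
    # Each weight scales the loss computed on the corresponding aggregated
    # predictions; with the default weights of 0 the aggregates are ignored.
    return (base
            + group_total_weight * group_loss
            + temporal_total_weight * temporal_loss
            + group_temporal_total_weight * group_temporal_loss)
```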
HyperparameterTuningJobMetadata
backing_hyperparameter_tuning_job
string
The resource name of the HyperparameterTuningJob that has been created to carry out this HyperparameterTuning task.
best_trial_backing_custom_job
string
The resource name of the CustomJob that has been created to run the best Trial of this HyperparameterTuning task.
HyperparameterTuningJobSpec
Study configuration of the HyperparameterTuningJob.
The spec of a trial job. The same spec applies to the CustomJobs created in all the trials.
max_trial_count
int32
The desired total number of Trials.
parallel_trial_count
int32
The desired number of Trials to run in parallel.
max_failed_trial_count
int32
The number of failed Trials that need to be seen before failing the HyperparameterTuningJob.
If set to 0, Vertex AI decides how many Trials must fail before the whole job fails.
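These counts interact: trials run in batches of at most parallel_trial_count until max_trial_count trials have started. A rough client-side estimate of the number of sequential batches (the helper name is illustrative):

```python
import math

def min_sequential_batches(max_trial_count, parallel_trial_count):
    # Lower bound on the number of scheduling rounds, ignoring trial
    # failures and early stopping.
    return math.ceil(max_trial_count / parallel_trial_count)
```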
HyperparameterTuningTask
A TrainingJob that tunes Hyperparameters of a custom code Model.
The input parameters of this HyperparameterTuningTask.
The metadata information.
Seq2SeqPlusForecasting
A TrainingJob that trains and uploads an AutoML Forecasting Model.
The input parameters of this TrainingJob.
The metadata information.
Seq2SeqPlusForecastingInputs
target_column
string
The name of the column that the Model is to predict values for. This column must be unavailable at forecast.
time_series_identifier_column
string
The name of the column that identifies the time series.
time_column
string
The name of the column that identifies time order in the time series. This column must be available at forecast.
Each transformation will apply its transform function to the given input column, and the result will be used for training. When creating a transformation for a BigQuery Struct column, the column should be flattened using "." as the delimiter.
optimization_objective
string
Objective function the model is optimizing towards. The training process creates a model that optimizes the value of the objective function over the validation set.
The supported optimization objectives:
"minimize-rmse" (default) - Minimize root-mean-squared error (RMSE).
"minimize-mae" - Minimize mean-absolute error (MAE).
"minimize-rmsle" - Minimize root-mean-squared log error (RMSLE).
"minimize-rmspe" - Minimize root-mean-squared percentage error (RMSPE).
"minimize-wape-mae" - Minimize the combination of weighted absolute percentage error (WAPE) and mean-absolute-error (MAE).
"minimize-quantile-loss" - Minimize the quantile loss at the quantiles defined in quantiles.
"minimize-mape" - Minimize the mean absolute percentage error (MAPE).
train_budget_milli_node_hours
int64
Required. The train budget for creating this model, expressed in milli node hours; a value of 1,000 in this field means 1 node hour.
The training cost of the model will not exceed this budget. The final cost will be close to the budget, though it may end up being noticeably smaller, at the backend's discretion. This especially may happen when further model training ceases to provide any improvements.
If the budget is set to a value known to be insufficient to train a model for the given dataset, the training won't be attempted and will error.
The train budget must be between 1,000 and 72,000 milli node hours, inclusive.
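The unit conversion and the documented range can be sketched as follows; the helper names are illustrative.

```python
def node_hours_to_milli(node_hours):
    # 1 node hour == 1,000 milli node hours.
    return int(node_hours * 1000)

def is_valid_train_budget(milli_node_hours):
    # The documented allowed range, inclusive on both ends.
    return 1_000 <= milli_node_hours <= 72_000
```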
weight_column
string
Column name that should be used as the weight column. Higher values in this column give more importance to the row during model training. The column must have numeric values between 0 and 10000 inclusive; 0 means the row is ignored for training. If the weight column field is not set, then all rows are assumed to have an equal weight of 1. This column must be available at forecast.
time_series_attribute_columns[]
string
Column names that should be used as attribute columns. The value of these columns does not vary as a function of time. For example, store ID or item color.
available_at_forecast_columns[]
string
Names of columns that are available and provided when a forecast is requested. These columns contain information for the given entity (identified by the time_series_identifier_column column) that is known at forecast. For example, predicted weather for a specific day.
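To illustrate how the column-role fields above fit together, a hypothetical inputs fragment in camelCase JSON form; every column name here is a placeholder.

```python
# Hypothetical Seq2SeqPlusForecastingInputs fragment (placeholder values).
forecasting_inputs = {
    "targetColumn": "sales",                  # must be unavailable at forecast
    "timeSeriesIdentifierColumn": "store_id",
    "timeColumn": "date",                     # must be available at forecast
    "timeSeriesAttributeColumns": ["store_region", "item_color"],
    "availableAtForecastColumns": ["date", "predicted_weather"],
}
```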
Expected difference in time granularity between rows in the data.
forecast_horizon
int64
The amount of time into the future for which forecasted values for the target are returned. Expressed in the number of units defined by the data_granularity field.
context_window
int64
The amount of time into the past that training and prediction data is used for model training and prediction respectively. Expressed in the number of units defined by the data_granularity field.
holiday_regions[]
string
The geographical region based on which the holiday effect is applied in modeling, by adding a holiday categorical array feature that includes all holidays matching the date. This option is only allowed when data_granularity is day. By default, holiday effect modeling is disabled. To turn it on, specify the holiday region using this option.
Configuration for exporting test set predictions to a BigQuery table. If this configuration is absent, then the export is not performed.
Config containing strategy for generating sliding windows.
quantiles[]
double
Quantiles to use for the minimize-quantile-loss optimization_objective. Up to 5 quantiles are allowed, of values between 0 and 1, exclusive. Required if the value of optimization_objective is minimize-quantile-loss. Represents the percent quantiles to use for that objective. Quantiles must be unique.
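The documented constraints on quantiles can be checked client-side; the validator below is an illustrative sketch, not part of the API.

```python
def validate_quantiles(quantiles):
    # At most 5 values, each strictly between 0 and 1, all unique.
    return (len(quantiles) <= 5
            and len(set(quantiles)) == len(quantiles)
            and all(0 < q < 1 for q in quantiles))
```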
validation_options
string
Validation options for the data validation component. The available options are:
"fail-pipeline" - default; performs the validation and fails the pipeline if the validation fails.
"ignore-validation" - ignores the results of the validation and continues.
additional_experiments[]
string
Additional experiment flags for the time series forecasting training.
Configuration that defines the hierarchical relationship of time series and parameters for hierarchical forecasting strategies.
Granularity
A duration of time expressed in time granularity units.
unit
string
The time granularity unit of this time period. The supported units are:
"minute"
"hour"
"day"
"week"
"month"
"year"
quantity
int64
The number of granularity_units between data points in the training data. If granularity_unit is minute, the quantity can be 1, 5, 10, 15, or 30. For all other values of granularity_unit, the quantity must be 1.
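The unit/quantity rules above can be sketched as a small validator (illustrative only):

```python
VALID_UNITS = {"minute", "hour", "day", "week", "month", "year"}
VALID_MINUTE_QUANTITIES = {1, 5, 10, 15, 30}

def validate_granularity(unit, quantity):
    # Only "minute" allows quantities other than 1.
    if unit not in VALID_UNITS:
        return False
    if unit == "minute":
        return quantity in VALID_MINUTE_QUANTITIES
    return quantity == 1
```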
Transformation
Union field transformation_detail. The transformation that the training pipeline will apply to the input columns. transformation_detail can be only one of the following:
AutoTransformation
The training pipeline will infer the proper transformation based on the statistics of the dataset.
column_name
string
CategoricalTransformation
The training pipeline will perform the following transformation functions.
The categorical string as is--no change to case, punctuation, spelling, tense, and so on.
Convert the category name to a dictionary lookup index and generate an embedding for each index.
Categories that appear less than 5 times in the training dataset are treated as the "unknown" category. The "unknown" category gets its own special lookup index and resulting embedding.
column_name
string
NumericTransformation
The training pipeline will perform the following transformation functions.
The value converted to float32.
The z_score of the value.
log(value+1) when the value is greater than or equal to 0. Otherwise, this transformation is not applied and the value is considered a missing value.
z_score of log(value+1) when the value is greater than or equal to 0. Otherwise, this transformation is not applied and the value is considered a missing value.
column_name
string
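The four numeric transformations listed above can be sketched as follows. The dataset statistics (mean/std of the raw column, and of its log-transformed values) are assumed inputs here; the real pipeline computes them internally.

```python
import math

def numeric_features(value, mean, std, log_mean, log_std):
    # mean/std: statistics of the raw column; log_mean/log_std: statistics
    # of log(value+1) over the non-negative values (caller-supplied here).
    v = float(value)
    z = (v - mean) / std
    if v >= 0:
        log1p = math.log(v + 1)
        z_log1p = (log1p - log_mean) / log_std
    else:
        # Per the docs, the log transformations are not applied and the
        # value is considered missing.
        log1p = None
        z_log1p = None
    return {"float": v, "z_score": z, "log1p": log1p, "z_log1p": z_log1p}
```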
TextTransformation
The training pipeline will perform the following transformation functions.
The text as is--no change to case, punctuation, spelling, tense, and so on.
Convert the category name to a dictionary lookup index and generate an embedding for each index.
column_name
string
TimestampTransformation
The training pipeline will perform the following transformation functions.
Apply the transformation functions for Numerical columns.
Determine the year, month, day, and weekday. Treat each value from the timestamp as a Categorical column.
Invalid numerical values (for example, values that fall outside of a typical timestamp range, or are extreme values) receive no special treatment and are not removed.
column_name
string
time_format
string
The format in which that time field is expressed. The time_format must either be one of:
unix-seconds
unix-milliseconds
unix-microseconds
unix-nanoseconds
(for the number of seconds, milliseconds, microseconds, and nanoseconds since the start of the Unix epoch, respectively);
or be written in strftime syntax.
If time_format is not set, then the default format is the RFC 3339 date-time format, where time-offset = "Z" (e.g. 1985-04-12T23:20:50.52Z).
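An illustrative client-side parser for the formats listed above. The real pipeline parses timestamps server-side; this sketch only approximates the behavior, and the handling of the strftime and RFC 3339 branches is an assumption.

```python
from datetime import datetime, timezone

UNIX_DIVISORS = {
    "unix-seconds": 1,
    "unix-milliseconds": 10**3,
    "unix-microseconds": 10**6,
    "unix-nanoseconds": 10**9,
}

def parse_timestamp(raw, time_format=None):
    if time_format in UNIX_DIVISORS:
        # Integer count since the Unix epoch, scaled to seconds.
        return datetime.fromtimestamp(int(raw) / UNIX_DIVISORS[time_format],
                                      tz=timezone.utc)
    if time_format:
        # Otherwise the format is assumed to be strftime syntax.
        return datetime.strptime(raw, time_format)
    # Default: RFC 3339 date-time with time-offset "Z".
    return datetime.fromisoformat(raw.replace("Z", "+00:00"))
```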
Seq2SeqPlusForecastingMetadata
Model metadata specific to Seq2Seq Plus Forecasting.
train_cost_milli_node_hours
int64
Output only. The actual training cost of the model, expressed in milli node hours; a value of 1,000 in this field means 1 node hour. Guaranteed not to exceed the train budget.
evaluated_data_items_bigquery_uri
string
BigQuery destination uri for exported evaluated examples.
WindowConfig
Config that contains the strategy used to generate sliding windows in time series training. A window is a series of rows that comprise the context up to the time of prediction, and the horizon following. The corresponding row for each window marks the start of the forecast horizon. Each window is used as an input example for training/evaluation.
Union field strategy. strategy can be only one of the following:
column
string
Name of the column that should be used to generate sliding windows. The column should contain either booleans or string booleans; if the value of the row is True, generate a sliding window with the horizon starting at that row. The column will not be used as a feature in training.
stride_length
int64
Stride length used to generate input examples. Within one time series, every stride_length rows will be used to generate a sliding window.
max_count
int64
Maximum number of windows that should be generated across all time series.
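The stride_length strategy above can be sketched as generating horizon-start indices within one time series. Whether the backend's first window starts at the first row is an assumption here.

```python
def window_starts(num_rows, stride_length):
    # One window's forecast horizon starts every stride_length rows,
    # assumed to begin at the first row of the series.
    return list(range(0, num_rows, stride_length))
```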