Package google.cloud.automl.v1beta1

Index

AutoMl

AutoML Server API.

Resource names are assigned by the server. Once a resource is deleted, the server never reuses its name.

The ID of a resource is the last element of the item's resource name. For example, for projects/{project_id}/locations/{location_id}/datasets/{dataset_id}, the ID of the item is {dataset_id}.
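For example, a minimal (hypothetical) helper that extracts the ID from a resource name:

```python
def resource_id(resource_name: str) -> str:
    """Return the last path element of an AutoML resource name."""
    return resource_name.rstrip("/").split("/")[-1]

name = "projects/my-project/locations/us-central1/datasets/TBL123"
print(resource_id(name))  # -> TBL123
```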

Currently the only supported location_id is "us-central1".

On any input that is documented to expect a string parameter in snake_case or kebab-case, either of those cases is accepted.

CreateDataset

rpc CreateDataset(CreateDatasetRequest) returns (Dataset)

Creates a dataset.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

CreateModel

rpc CreateModel(CreateModelRequest) returns (Operation)

Creates a model. Returns a Model in the response field when it completes. When you create a model, several model evaluations are created for it: a global evaluation, and one evaluation for each annotation spec.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeleteDataset

rpc DeleteDataset(DeleteDatasetRequest) returns (Operation)

Deletes a dataset and all of its contents. Returns an empty response in the response field when it completes, and delete_details in the metadata field.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeleteModel

rpc DeleteModel(DeleteModelRequest) returns (Operation)

Deletes a model. Returns google.protobuf.Empty in the response field when it completes, and delete_details in the metadata field.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

DeployModel

rpc DeployModel(DeployModelRequest) returns (Operation)

Deploys a model. If a model is already deployed, deploying it with the same parameters has no effect. Deploying with different parameters resets the deployment state without pausing the model's availability.

Returns an empty response in the response field when it completes.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ExportData

rpc ExportData(ExportDataRequest) returns (Operation)

Exports a dataset's data to the provided output location. Returns an empty response in the response field when it completes.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ExportEvaluatedExamples

rpc ExportEvaluatedExamples(ExportEvaluatedExamplesRequest) returns (Operation)

Exports examples on which the model was evaluated (i.e. which were in the TEST set of the dataset the model was created from), together with their ground truth annotations and the annotations created (predicted) by the model. The examples, ground truth and predictions are exported in the state they were at the moment the model was evaluated.

This export is available only for 30 days after the model evaluation is created.

Currently only available for Tables.

Returns an empty response in the response field when it completes.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetAnnotationSpec

rpc GetAnnotationSpec(GetAnnotationSpecRequest) returns (AnnotationSpec)

Gets an annotation spec.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetColumnSpec

rpc GetColumnSpec(GetColumnSpecRequest) returns (ColumnSpec)

Gets a column spec.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetDataset

rpc GetDataset(GetDatasetRequest) returns (Dataset)

Gets a dataset.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetModel

rpc GetModel(GetModelRequest) returns (Model)

Gets a model.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetModelEvaluation

rpc GetModelEvaluation(GetModelEvaluationRequest) returns (ModelEvaluation)

Gets a model evaluation.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

GetTableSpec

rpc GetTableSpec(GetTableSpecRequest) returns (TableSpec)

Gets a table spec.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ImportData

rpc ImportData(ImportDataRequest) returns (Operation)

Imports data into a dataset.

This method can only be called on an empty Dataset.

You must explicitly set the schema_inference_version field before calling the ImportData method.

For more information, see Importing items into a dataset.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListColumnSpecs

rpc ListColumnSpecs(ListColumnSpecsRequest) returns (ListColumnSpecsResponse)

Lists column specs in a table spec.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListDatasets

rpc ListDatasets(ListDatasetsRequest) returns (ListDatasetsResponse)

Lists datasets in a project.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListModelEvaluations

rpc ListModelEvaluations(ListModelEvaluationsRequest) returns (ListModelEvaluationsResponse)

Lists model evaluations.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListModels

rpc ListModels(ListModelsRequest) returns (ListModelsResponse)

Lists models.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ListTableSpecs

rpc ListTableSpecs(ListTableSpecsRequest) returns (ListTableSpecsResponse)

Lists table specs in a dataset.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

UndeployModel

rpc UndeployModel(UndeployModelRequest) returns (Operation)

Removes a deployed model. If the model is not deployed, this method has no effect.

Returns an empty response in the response field when it completes.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

UpdateColumnSpec

rpc UpdateColumnSpec(UpdateColumnSpecRequest) returns (ColumnSpec)

Updates a column spec.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

UpdateDataset

rpc UpdateDataset(UpdateDatasetRequest) returns (Dataset)

Updates a dataset.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

UpdateTableSpec

rpc UpdateTableSpec(UpdateTableSpecRequest) returns (TableSpec)

Updates a table spec.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

PredictionService

AutoML Prediction API.

On any input that is documented to expect a string parameter in snake_case or kebab-case, either of those cases is accepted.

BatchPredict

rpc BatchPredict(BatchPredictRequest) returns (Operation)

Performs a batch prediction and returns the ID of a long-running operation. You can request the operation result by using the GetOperation method. When the operation has completed, call GetOperation to retrieve a BatchPredictResult from the response field.

See Making batch predictions for more details.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

Predict

rpc Predict(PredictRequest) returns (PredictResponse)

Performs an online prediction. The prediction result is returned directly in the response.

You can call the predict method on a row with column values that match the columns of your model. The data for your row can be up to 5MB.

See Making a prediction for more details.

Authorization Scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

AnnotationPayload

Contains annotation information that is relevant to AutoML.

Fields
annotation_spec_id

string

Output only. The resource ID of the annotation spec that this annotation pertains to. The annotation spec comes from either an ancestor dataset or the dataset that was used to train the model in use.

display_name

string

Output only. The value of display_name at the time the model was trained. Because this field returns the value from model training time, different models trained from the same dataset can return different values: the model owner could update the display_name between any two model trainings.

Union field detail. Output only. Additional information about the annotation, specific to the AutoML domain. detail can be only one of the following:
classification

ClassificationAnnotation

Annotation details for classification predictions.

tables

TablesAnnotation

Annotation details for Tables.

AnnotationSpec

A definition of an annotation spec.

Fields
name

string

Output only. Resource name of the annotation spec. Form:

'projects/{project_id}/locations/{location_id}/datasets/{dataset_id}/annotationSpecs/{annotation_spec_id}'

display_name

string

Required. The name of the annotation spec to show in the interface. The name can be up to 32 characters long and must match the regexp [a-zA-Z0-9_]+.
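The display_name constraint can be checked locally (a hypothetical validator mirroring the documented rule; not part of the API):

```python
import re

DISPLAY_NAME_RE = re.compile(r"[a-zA-Z0-9_]+")

def is_valid_display_name(name: str) -> bool:
    """True if the name is up to 32 characters and matches [a-zA-Z0-9_]+."""
    return len(name) <= 32 and DISPLAY_NAME_RE.fullmatch(name) is not None
```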

example_count

int32

Output only. The number of examples in the parent dataset labeled by the annotation spec.

ArrayStats

The data statistics of a series of ARRAY values.

Fields
member_stats

DataStats

Stats of all the values of all arrays, as if they were a single long series of data. The type depends on the element type of the array.

BatchPredictInputConfig

Input configuration for BatchPredict Action.

See Making batch predictions for more information.

You can provide prediction data to AutoML Tables in two ways:

The URI of a BigQuery table. The size of data in the BigQuery table must be 100 GB or less.

The column names must match the column names used to train the model. The data in each column must match the data type used to train the model. The columns can be in any order.

One or more URIs from Google Cloud Storage identifying CSV files that contain the rows to make predictions for. Each file must be 10 GB or less in size. The total size of all files must be 100 GB or less.

The first file specified must have a header row with column names. The column names must match the column names used to train the model. The data in each column must match the data type used to train the model. The columns can be in any order.

If the first row of a subsequent file is the same as the original header row, then that row is also treated as a header. All other rows contain values for the corresponding columns.
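The header-row convention above can be sketched client-side (a hypothetical helper, not part of the API):

```python
import csv
import io

def read_prediction_rows(csv_texts):
    """Concatenate CSV shards the way BatchPredict interprets them:
    the first file's first row is the header; in later files, a first
    row identical to that header is treated as a header and skipped."""
    header, rows = None, []
    for text in csv_texts:
        reader = csv.reader(io.StringIO(text))
        first = next(reader)
        if header is None:
            header = first          # header from the first file
        elif first != header:
            rows.append(first)      # a data row, not a repeated header
        rows.extend(reader)
    return header, rows
```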

Fields
Union field source. Required. The source of the input. source can be only one of the following:
gcs_source

GcsSource

The Google Cloud Storage location for the input content.

bigquery_source

BigQuerySource

The BigQuery location for the input content.

Union field auxiliary_source. Optional. Side inputs that help in prediction. For Tables with the FORECASTING prediction_type: historical data rows are required here, even if they are identical to the rows on which the model was trained. The historical rows must have non-NULL target column values. auxiliary_source can be only one of the following:

gcs_auxiliary_source

GcsSource

The Google Cloud Storage location for the input content.

bigquery_auxiliary_source

BigQuerySource

The BigQuery location for the input content.

BatchPredictOperationMetadata

Details of BatchPredict operation.

Fields
input_config

BatchPredictInputConfig

Output only. The input config that was given upon starting this batch predict operation.

output_info

BatchPredictOutputInfo

Output only. Information further describing this batch predict's output.

BatchPredictOutputInfo

Further describes this batch predict's output. Supplements BatchPredictOutputConfig.

Fields
Union field output_location. The output location into which prediction output is written. output_location can be only one of the following:
gcs_output_directory

string

The full path of the Google Cloud Storage directory created, into which the prediction output is written.

bigquery_output_dataset

string

The path of the BigQuery dataset created, in bq://projectId.bqDatasetId format, into which the prediction output is written.

BatchPredictOutputConfig

Output configuration for BatchPredict action.

See Making batch predictions for more information.

You can configure the output location for predict results from AutoML Tables in two ways:

The URI of a BigQuery project. In the given project a new dataset is created with the name "prediction_<model-display-name>_<timestamp>", where <model-display-name> is the model's display name made BigQuery-dataset-name compatible, and <timestamp> is in YYYY_MM_DDThh_mm_ss_sssZ format.

Two tables are created in the dataset: predictions, and errors.

Each column name in the predictions table is the display name of an input column, followed by the target column with a name in the format "predicted_<target_column_spec_display_name>".

The input feature columns contain the respective values of successfully predicted rows, and the target column contains an ARRAY of AnnotationPayload values, represented as STRUCTs, containing TablesAnnotation values.

The errors table contains rows for which the prediction failed. Each column name in the errors table is in the format "errors_<target_column_spec_display_name>". The column contains a google.rpc.Status value, represented as a STRUCT, with the status code and message.

The URI for a Google Cloud Storage bucket where the prediction results are stored. AutoML Tables creates a directory in the specified bucket. The name of the directory is in the format "prediction-<model-display-name>-<timestamp>" where <model-display-name> is the display name of the model and <timestamp> is in YYYY-MM-DDThh:mm:ss.sssZ ISO-8601 format.

AutoML Tables creates new files in the directory, named tables_1.csv, tables_2.csv, ..., tables_N.csv, which together cover all of the rows with successful predictions.

If prediction failed for any rows, AutoML Tables also adds files named errors_1.csv, errors_2.csv, ..., errors_N.csv, which together cover all of the rows where the prediction failed. The error files contain the google.rpc.Status code and message values.
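The BigQuery dataset naming convention described above can be sketched as follows. This is a hypothetical helper, not part of the API; the exact sanitization rule (non-alphanumeric characters become underscores) is an assumption:

```python
import re
from datetime import datetime

def bq_prediction_dataset_name(model_display_name: str, when: datetime) -> str:
    """Sketch of "prediction_<model-display-name>_<timestamp>":
    the display name is made BigQuery-dataset-name compatible and
    the timestamp is rendered as YYYY_MM_DDThh_mm_ss_sssZ."""
    safe = re.sub(r"[^a-zA-Z0-9_]", "_", model_display_name)
    ts = when.strftime("%Y_%m_%dT%H_%M_%S") + f"_{when.microsecond // 1000:03d}Z"
    return f"prediction_{safe}_{ts}"
```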

Fields
Union field destination. Required. The destination of the output. destination can be only one of the following:
gcs_destination

GcsDestination

The Google Cloud Storage location of the directory where the output is to be written to.

bigquery_destination

BigQueryDestination

The BigQuery location where the output is to be written to.

BatchPredictRequest

Request message for PredictionService.BatchPredict.

Fields
name

string

Name of the model requested to serve the batch prediction.

Authorization requires the following Google IAM permission on the specified resource name:

  • automl.models.predict

input_config

BatchPredictInputConfig

Required. The input configuration for batch prediction.

output_config

BatchPredictOutputConfig

Required. The configuration specifying where the output predictions should be written.

params

map<string, string>

Additional domain-specific parameters for the predictions; any string must be up to 25000 characters long.

You can set the following fields:

See Making batch predictions for more details.

BatchPredictResult

Result of the batch predict. This message is returned in the response field of the Operation returned by PredictionService.BatchPredict.

Fields
metadata

map<string, string>

Additional domain-specific prediction response metadata.

BigQueryDestination

The BigQuery location for the output content.

Fields
output_uri

string

Required. BigQuery URI to a project, up to 2000 characters long. Accepted forms:

  • BigQuery path, e.g. bq://projectId

BigQuerySource

The BigQuery location for the input content.

Fields
input_uri

string

Required. BigQuery URI to a table, up to 2000 characters long. Accepted forms:

  • BigQuery path, e.g. bq://projectId.bqDatasetId.bqTableId
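The accepted table-URI form can be checked with a simple pattern. This is an illustrative sketch, not a complete validator; the character classes are an assumption and real project IDs allow additional characters:

```python
import re

# bq://projectId.bqDatasetId.bqTableId — three dot-separated segments.
TABLE_URI_RE = re.compile(r"bq://[^./]+\.[^./]+\.[^./]+")

def is_valid_bq_table_uri(uri: str) -> bool:
    """True if the URI looks like a BigQuery table path and is <= 2000 chars."""
    return len(uri) <= 2000 and TABLE_URI_RE.fullmatch(uri) is not None
```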

CategoryStats

The data statistics of a series of CATEGORY values.

Fields
top_category_stats[]

SingleCategoryStats

The statistics of the top 20 CATEGORY values, ordered by count.

SingleCategoryStats

The statistics of a single CATEGORY value.

Fields
value

string

The CATEGORY value.

count

int64

The number of occurrences of this value in the series.

ClassificationAnnotation

Contains annotation details specific to classification.

Fields
score

float

Output only. A confidence estimate between 0.0 and 1.0. A higher value means greater confidence that the annotation is positive. If a user approves an annotation as negative or positive, the score value remains unchanged. If a user creates an annotation, the score is 0 for negative or 1 for positive.
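The score semantics lend themselves to a simple client-side filter (a hypothetical helper; annotations are assumed to be (annotation_spec_id, score) pairs):

```python
def confident_labels(annotations, threshold=0.5):
    """Keep (annotation_spec_id, score) pairs whose score is at or above
    the threshold, ordered highest score first."""
    kept = [a for a in annotations if a[1] >= threshold]
    return sorted(kept, key=lambda a: a[1], reverse=True)
```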

ClassificationEvaluationMetrics

Model evaluation metrics for classification problems. For information on the prediction type, see BatchPredictRequest.params.

Fields
au_prc

float

Output only. The Area Under Precision-Recall Curve metric. Micro-averaged for the overall evaluation.

base_au_prc
(deprecated)

float

Output only. The Area Under Precision-Recall Curve metric based on priors. Micro-averaged for the overall evaluation. Deprecated.

au_roc

float

Output only. The Area Under Receiver Operating Characteristic curve metric. Micro-averaged for the overall evaluation.

log_loss

float

Output only. The Log Loss metric.

confidence_metrics_entry[]

ConfidenceMetricsEntry

Output only. Metrics for each confidence_threshold in 0.00,0.05,0.10,...,0.95,0.96,0.97,0.98,0.99 and position_threshold = INT32_MAX_VALUE. ROC and precision-recall curves, and other aggregated metrics are derived from them. The confidence metrics entries may also be supplied for additional values of position_threshold, but from these no aggregated metrics are computed.

confusion_matrix

ConfusionMatrix

Output only. Confusion matrix of the evaluation. Only set for MULTICLASS classification problems where the number of labels is no more than 10. Only set for model-level evaluation, not for evaluation per label.

annotation_spec_id[]

string

Output only. The annotation spec ids used for this evaluation.

ConfidenceMetricsEntry

Metrics for a single confidence threshold.

Fields
confidence_threshold

float

Output only. Metrics are computed with the assumption that the model never returns predictions with a score lower than this value.

position_threshold

int32

Output only. Metrics are computed with the assumption that the model always returns at most this many predictions (ordered by score, descending), all of which still need to meet the confidence_threshold.

recall

float

Output only. Recall (True Positive Rate) for the given confidence threshold.

precision

float

Output only. Precision for the given confidence threshold.

false_positive_rate

float

Output only. False Positive Rate for the given confidence threshold.

f1_score

float

Output only. The harmonic mean of recall and precision.

recall_at1

float

Output only. The Recall (True Positive Rate) when only considering the label that has the highest prediction score and not below the confidence threshold for each example.

precision_at1

float

Output only. The precision when only considering the label that has the highest prediction score and not below the confidence threshold for each example.

false_positive_rate_at1

float

Output only. The False Positive Rate when only considering the label that has the highest prediction score and not below the confidence threshold for each example.

f1_score_at1

float

Output only. The harmonic mean of recall_at1 and precision_at1.

true_positive_count

int64

Output only. The number of model-created labels that match a ground truth label.

false_positive_count

int64

Output only. The number of model-created labels that do not match a ground truth label.

false_negative_count

int64

Output only. The number of ground truth labels that are not matched by a model-created label.

true_negative_count

int64

Output only. The number of labels that the model did not create and that, had they been created, would not have matched a ground truth label.
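The interplay of the two thresholds, and the harmonic-mean definition of f1_score, can be sketched with hypothetical helpers (predictions assumed to be (label, score) pairs):

```python
def apply_thresholds(predictions, confidence_threshold, position_threshold):
    """Keep at most position_threshold predictions (by descending score),
    all of which must also meet the confidence_threshold."""
    ranked = sorted(predictions, key=lambda p: p[1], reverse=True)
    return [p for p in ranked[:position_threshold] if p[1] >= confidence_threshold]

def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```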

ConfusionMatrix

Confusion matrix of the model running the classification.

Fields
annotation_spec_id[]

string

Output only. IDs of the annotation specs used in the confusion matrix.

Returns a list of display_name values.

row[]

Row

Output only. Rows in the confusion matrix. The number of rows is equal to the size of annotation_spec_id. row[i].value[j] is the number of examples that have ground truth of the annotation_spec_id[i] and are predicted as annotation_spec_id[j] by the model being evaluated.

Row

Output only. A row in the confusion matrix.

Fields
example_count[]

int32

Output only. Value of the specific cell in the confusion matrix. The number of values each row has (i.e. the length of the row) is equal to the length of the annotation_spec_id field or, if that one is not populated, length of the display_name field.
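The cell semantics above can be sketched: with matrix[i][j] counting examples whose ground truth is annotation_spec_id[i] and whose prediction is annotation_spec_id[j], per-label counts follow directly (a hypothetical helper):

```python
def per_label_counts(matrix, label_index):
    """Derive true positives, false positives, and false negatives for one
    label from a confusion matrix where matrix[i][j] is the number of
    examples with ground truth i that were predicted as j."""
    tp = matrix[label_index][label_index]
    fn = sum(matrix[label_index]) - tp                    # rest of the row
    fp = sum(row[label_index] for row in matrix) - tp     # rest of the column
    return tp, fp, fn
```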

ColumnSpec

A representation of a column in a relational table. When listing them, column specs are returned in the same order in which they were given on import.

Fields
name

string

Output only. The resource name of the column spec. Form:

projects/{project_id}/locations/{location_id}/datasets/{dataset_id}/tableSpecs/{table_spec_id}/columnSpecs/{column_spec_id}

data_type

DataType

The data type of elements stored in the column.

display_name

string

Output only. The name of the column to show in the interface. The name can be up to 100 characters long, can consist only of ASCII Latin letters A-Z and a-z, ASCII digits 0-9, underscores (_), and forward slashes (/), and must start with a letter or a digit.

forecasting_metadata

ForecastingMetadata

Additional column metadata specific to the AutoML Tables FORECASTING prediction_type.

data_stats

DataStats

Output only. Stats of the series of values in the column. This field may be stale, see the ancestor's Dataset.tables_dataset_metadata.stats_update_time field for the timestamp at which these stats were last updated.

top_correlated_columns[]

CorrelatedColumn

Deprecated.

etag

string

Used to perform consistent read-modify-write updates. If not set, a blind "overwrite" update happens.

CorrelatedColumn

Identifies the table's column, and its correlation with the column this ColumnSpec describes.

Fields
column_spec_id

string

The column_spec_id of the correlated column, which belongs to the same table as the in-context column.

correlation_stats

CorrelationStats

Correlation between this and the in-context column.

ForecastingMetadata

Additional metadata for the column spec, used for the AutoML Tables FORECASTING prediction_type.

Fields
column_type

ColumnType

Required. The type of the column for FORECASTING model training purposes.

ColumnType

Describes how the column should be treated during FORECASTING model training.

Enums
COLUMN_TYPE_UNSPECIFIED An un-set value of this enum.
KEY Key columns are used to identify time series. The union of key columns in a row specifies which time series the row belongs to. If a table has no such column, all of its rows are assumed to pertain to a single time series for a single entity. Any such column must not be nullable. For example, a table's key could consist of three columns: city name, state, and country.
KEY_METADATA This column contains information describing static properties of the entities identified by the key column(s) (e.g. a city's ZIP code). Note: it is enough if these properties are static over the period intended for forecasting (e.g. ZIP codes may sometimes change, but rarely enough that they likely do not matter for forecasting within a scope of a few years).
TIME_SERIES_AVAILABLE_PAST_ONLY This column contains information for the given entity and time that is generally known for the past but not for the future (e.g. the population of a city in a given year, or the weather on a given day).
TIME_SERIES_AVAILABLE_PAST_AND_FUTURE This column contains information for the given entity and time that is generally known both for the past and for the sufficiently far future (see horizon, TablesModelMetadata.ForecastingConfig.horizon), e.g. whether the Olympics are held in the city in a given year.
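The KEY semantics above (the union of key columns identifies a time series) can be sketched with a hypothetical helper, treating rows as dicts:

```python
from collections import defaultdict

def group_by_time_series(rows, key_columns):
    """Group rows into time series keyed by the tuple of KEY column values."""
    series = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        series[key].append(row)
    return dict(series)
```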

CorrelationStats

A correlation statistic between two series of DataType values. The two series may have different DataTypes, but within a single series the DataType must be the same.

Fields
cramers_v

double

The correlation value using the Cramer's V measure.

CreateDatasetRequest

Request message for AutoMl.CreateDataset.

Fields
parent

string

The resource name of the project to create the dataset for.

Authorization requires the following Google IAM permission on the specified resource parent:

  • automl.datasets.create

dataset

Dataset

The dataset to create.

CreateModelOperationMetadata

Details of CreateModel operation.

CreateModelRequest

Request message for AutoMl.CreateModel.

Fields
parent

string

Resource name of the parent project where the model is being created.

Authorization requires the following Google IAM permission on the specified resource parent:

  • automl.models.create

model

Model

The model to create.

DataStats

The data statistics of a series of values that share the same DataType.

Fields
distinct_value_count

int64

The number of distinct values.

null_value_count

int64

The number of values that are null.

valid_value_count

int64

The number of values that are valid.

Union field stats. The data statistics specific to a DataType. stats can be only one of the following:
float64_stats

Float64Stats

The statistics for FLOAT64 DataType.

string_stats

StringStats

The statistics for STRING DataType.

timestamp_stats

TimestampStats

The statistics for TIMESTAMP DataType.

array_stats

ArrayStats

The statistics for ARRAY DataType.

struct_stats

StructStats

The statistics for STRUCT DataType.

category_stats

CategoryStats

The statistics for CATEGORY DataType.

DataType

Indicates the type of data that can be stored in a structured data entity (e.g. a table).

Fields
type_code

TypeCode

Required. The TypeCode for this type.

nullable

bool

If true, this DataType can also be NULL. In CSV files, a NULL value is expressed as an empty string.

Union field details. Details of DataTypes that need additional specification. details can be only one of the following:
list_element_type

DataType

If type_code == ARRAY, then list_element_type is the type of the elements.

struct_type

StructType

If type_code == STRUCT, then struct_type provides type information for the struct's fields.

time_format

string

If type_code == TIMESTAMP, then time_format provides the format in which that time field is expressed. The time_format must either be one of:

  • UNIX_SECONDS
  • UNIX_MILLISECONDS
  • UNIX_MICROSECONDS
  • UNIX_NANOSECONDS

(for the number of seconds, milliseconds, microseconds, and nanoseconds since the start of the Unix epoch, respectively); or be written in strftime syntax. If time_format is not set, then the default format described on the type_code is used.
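The time_format interpretations above can be sketched with a hypothetical helper (the UNIX_* handling assumes integer string values):

```python
from datetime import datetime, timezone

def parse_timestamp(value: str, time_format: str) -> datetime:
    """Interpret a column value under a DataType.time_format: one of the
    UNIX_* constants, or a strftime pattern."""
    unix_scales = {
        "UNIX_SECONDS": 1,
        "UNIX_MILLISECONDS": 1_000,
        "UNIX_MICROSECONDS": 1_000_000,
        "UNIX_NANOSECONDS": 1_000_000_000,
    }
    if time_format in unix_scales:
        # Integer count since the Unix epoch, scaled to seconds.
        return datetime.fromtimestamp(int(value) / unix_scales[time_format], tz=timezone.utc)
    # Otherwise treat time_format as a strftime-style pattern.
    return datetime.strptime(value, time_format)
```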

Dataset

A workspace for solving a single, particular machine learning (ML) problem. A workspace contains examples that may be annotated.

Fields
name

string

Output only. The resource name of the dataset. Form: projects/{project_id}/locations/{location_id}/datasets/{dataset_id}

display_name

string

Required. The name of the dataset to show in the interface. The name can be up to 32 characters long and can consist only of ASCII Latin letters A-Z and a-z, underscores (_), and ASCII digits 0-9.

description

string

User-provided description of the dataset. The description can be up to 25000 characters long.

example_count

int32

Output only. The number of examples in the dataset.

create_time

Timestamp

Output only. Timestamp when this dataset was created.

etag

string

Used to perform consistent read-modify-write updates. If not set, a blind "overwrite" update happens.

tables_dataset_metadata

TablesDatasetMetadata

Metadata for a dataset used for Tables.

DeleteDatasetRequest

Request message for AutoMl.DeleteDataset.

Fields
name

string

The resource name of the dataset to delete.

Authorization requires the following Google IAM permission on the specified resource name:

  • automl.datasets.delete

DeleteModelRequest

Request message for AutoMl.DeleteModel.

Fields
name

string

Resource name of the model being deleted.

Authorization requires the following Google IAM permission on the specified resource name:

  • automl.models.delete

DeleteOperationMetadata

Details of operations that perform deletes of any entities.

DeployModelOperationMetadata

Details of DeployModel operation.

DeployModelRequest

Request message for AutoMl.DeployModel.

Fields
name

string

Resource name of the model to deploy.

Authorization requires the following Google IAM permission on the specified resource name:

  • automl.models.deploy

DoubleRange

A range between two double numbers.

Fields
start

double

Start of the range, inclusive.

end

double

End of the range, exclusive.

ExamplePayload

Example data used for training or prediction.

Fields
Union field payload. Required. Input only. The example data. payload can be only one of the following:
row

Row

Example relational table row.

time_series

TimeSeries

Time series example.

ExportDataOperationMetadata

Details of ExportData operation.

Fields
output_info

ExportDataOutputInfo

Output only. Information further describing this export data's output.

ExportDataOutputInfo

Further describes this export data's output. Supplements OutputConfig.

Fields
Union field output_location. The output location to which the exported data is written. output_location can be only one of the following:
gcs_output_directory

string

The full path of the Google Cloud Storage directory created, into which the exported data is written.

bigquery_output_dataset

string

The path of the BigQuery dataset created, in bq://projectId.bqDatasetId format, into which the exported data is written.

ExportDataRequest

Request message for AutoMl.ExportData.

Fields
name

string

Required. The resource name of the dataset.

Authorization requires the following Google IAM permission on the specified resource name:

  • automl.datasets.export

output_config

OutputConfig

Required. The desired output location.

ExportEvaluatedExamplesOperationMetadata

Details of ExportEvaluatedExamples operation.

Fields
output_info

ExportEvaluatedExamplesOutputInfo

Output only. Information further describing the output of this evaluated examples export.

ExportEvaluatedExamplesOutputInfo

Further describes the output of the evaluated examples export. Supplements

ExportEvaluatedExamplesOutputConfig.

Fields
bigquery_output_dataset

string

The path of the BigQuery dataset created, in bq://projectId.bqDatasetId format, into which the output of export evaluated examples is written.

ExportEvaluatedExamplesOutputConfig

Output configuration for the ExportEvaluatedExamples action. Note that this call is available only for 30 days after the moment the model was evaluated. The output depends on the domain, as follows (note that only examples from the TEST set are exported):

  • For Tables:

bigquery_destination pointing to a BigQuery project must be set. In the given project a new dataset will be created with the name

export_evaluated_examples_<model-display-name>_<timestamp-of-export-call>, where <model-display-name> will be made BigQuery-dataset-name compatible (e.g. most special characters will become underscores), and the timestamp will be in YYYY_MM_DDThh_mm_ss_sssZ format, based on ISO-8601. In that dataset an evaluated_examples table will be created. It will have all the same columns as the

primary_table of the dataset from which the model was created, as they were at the moment of the model's evaluation (this includes the target column with its ground truth), followed by a column called predicted_<target-column-display-name>. That last column will contain the model's prediction result for each respective row, given as an ARRAY of AnnotationPayloads, represented as STRUCTs, containing TablesAnnotation.

Fields
bigquery_destination

BigQueryDestination

The BigQuery location where the output is to be written to.

ExportEvaluatedExamplesRequest

Request message for AutoMl.ExportEvaluatedExamples.

Fields
name

string

Required. The resource name of the model whose evaluated examples are to be exported.

Authorization requires the following Google IAM permission on the specified resource name:

  • automl.modelEvaluations.get

output_config

ExportEvaluatedExamplesOutputConfig

Required. The desired output location and configuration.

Float64Stats

The data statistics of a series of FLOAT64 values.

Fields
mean

double

The mean of the series.

standard_deviation

double

The standard deviation of the series.

quantiles[]

double

The k-quantile values of the data series of n values, ordered from i = 0 to i = k. The value at index i is, approximately, the (i*n/k)-th smallest value in the series; for i = 0 and i = k these are, respectively, the min and max values.

histogram_buckets[]

HistogramBucket

Histogram buckets of the data series. Sorted by the min value of the bucket, ascending; the number of buckets is generated dynamically. The buckets are non-overlapping and completely cover the whole FLOAT64 range, with the min of the first bucket being "-Infinity" and the max of the last one being "Infinity".
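As a rough sketch of how the quantiles[] field can be reproduced for a small series (the service's exact interpolation is unspecified, so `k_quantiles` is only an illustrative, hypothetical helper):

```python
def k_quantiles(values, k):
    # Returns k+1 values ordered from i = 0 to i = k; the value at index i
    # is approximately the (i*n/k)-th smallest value in the series, so
    # i = 0 yields the min and i = k yields the max.
    s = sorted(values)
    n = len(s)
    return [s[min(i * n // k, n - 1)] for i in range(k + 1)]

quartiles = k_quantiles(range(1, 101), 4)  # 4-quantiles of 1..100
```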

HistogramBucket

A bucket of a histogram.

Fields
min

double

The minimum value of the bucket, inclusive.

max

double

The maximum value of the bucket, exclusive unless max = "Infinity", in which case it's inclusive.

count

int64

The number of data values that are in the bucket, i.e. are between min and max values.

GcsDestination

The Google Cloud Storage location where the output is to be written to.

Fields
output_uri_prefix

string

Required. Google Cloud Storage URI to the output directory, up to 2000 characters long. Accepted form: a prefix path, e.g. gs://bucket/directory. The requesting user must have write permission to the bucket. The directory is created if it doesn't exist.

GcsSource

The Google Cloud Storage location for the input content.

Fields
input_uris[]

string

Required. Google Cloud Storage URIs to input files, up to 2000 characters long. Accepted form: a full object path, e.g. gs://bucket/directory/object.csv
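A client-side sanity check for the constraints above (full object path, gs:// scheme, length limit). `looks_like_gcs_object_uri` is a hypothetical helper, not part of any Google library:

```python
def looks_like_gcs_object_uri(uri: str) -> bool:
    # Must be a full object path such as gs://bucket/directory/object.csv,
    # and at most 2000 characters long.
    if not uri.startswith("gs://") or len(uri) > 2000:
        return False
    bucket_and_object = uri[len("gs://"):]
    return "/" in bucket_and_object  # a bucket alone is not an object path

ok = looks_like_gcs_object_uri("gs://bucket/directory/object.csv")  # True
bad = looks_like_gcs_object_uri("gs://bucket")                      # False
```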

GetAnnotationSpecRequest

Request message for AutoMl.GetAnnotationSpec.

Fields
name

string

The resource name of the annotation spec to retrieve.

Authorization requires the following Google IAM permission on the specified resource name:

  • automl.annotationSpecs.get

GetColumnSpecRequest

Request message for AutoMl.GetColumnSpec.

Fields
name

string

The resource name of the column spec to retrieve.

Authorization requires the following Google IAM permission on the specified resource name:

  • automl.columnSpecs.get

field_mask

FieldMask

Mask specifying which fields to read.

GetDatasetRequest

Request message for AutoMl.GetDataset.

Fields
name

string

The resource name of the dataset to retrieve.

Authorization requires the following Google IAM permission on the specified resource name:

  • automl.datasets.get

GetModelEvaluationRequest

Request message for AutoMl.GetModelEvaluation.

Fields
name

string

Resource name for the model evaluation.

Authorization requires the following Google IAM permission on the specified resource name:

  • automl.modelEvaluations.get

GetModelRequest

Request message for AutoMl.GetModel.

Fields
name

string

Resource name of the model.

Authorization requires the following Google IAM permission on the specified resource name:

  • automl.models.get

GetTableSpecRequest

Request message for AutoMl.GetTableSpec.

Fields
name

string

The resource name of the table spec to retrieve.

Authorization requires the following Google IAM permission on the specified resource name:

  • automl.tableSpecs.get

field_mask

FieldMask

Mask specifying which fields to read.

ImportDataOperationMetadata

Details of ImportData operation.

ImportDataRequest

Request message for AutoMl.ImportData.

Fields
name

string

Required. Dataset name. Dataset must already exist. All imported annotations and examples will be added.

Authorization requires the following Google IAM permission on the specified resource name:

  • automl.datasets.import

input_config

InputConfig

Required. The desired input location and its domain specific semantics, if any.

InputConfig

Input configuration for ImportData action.

See Preparing your training data for more information.

You can provide model training data to AutoML Tables in two ways:

  • A BigQuery table. Specify the BigQuery URI in the bigquerySource field of your input configuration. The size of the table cannot exceed 100 GB.

  • Comma-separated values (CSV) files. Store the CSV files in Google Cloud Storage and specify the URIs to the CSV files in the gcsSource field of your input configuration. Each CSV file cannot exceed 10 GB in size, and the total size of all CSV files cannot exceed 100 GB.

The first file specified must have a header containing column names. If the first row of a subsequent file is the same as the header of the first specified file, then that row is also treated as a header. All other rows contain values for the corresponding columns.

If any of the provided CSV files can't be parsed, or if more than a certain percentage of CSV rows cannot be processed, then the operation fails and nothing is imported. Regardless of overall success or failure, the per-row failures, up to a certain count cap, will be listed in Operation.metadata.partial_failures.

AutoML Tables requires imported table data to have at least 1,000 and no more than 100,000,000 rows, and at least 2 and no more than 1,000 columns. AutoML Tables infers the schema from the data when the data is imported. You can have at most five import requests running in parallel.

The Google Cloud Storage bucket must be Regional, and must reside in the us-central1 region.
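The size limits above can be checked client side before starting an import. A sketch mirroring the documented limits (it assumes 10 GB and 100 GB mean binary gigabytes; the service's own accounting may differ, and it enforces its own limits regardless):

```python
MIN_ROWS, MAX_ROWS = 1_000, 100_000_000
MIN_COLS, MAX_COLS = 2, 1_000
MAX_CSV_BYTES = 10 * 1024**3     # each CSV file at most 10 GB
MAX_TOTAL_BYTES = 100 * 1024**3  # all CSV files together at most 100 GB

def import_limit_errors(row_count, column_count, csv_sizes=()):
    # Collects human-readable violations of the documented import limits.
    errors = []
    if not MIN_ROWS <= row_count <= MAX_ROWS:
        errors.append("row count out of range")
    if not MIN_COLS <= column_count <= MAX_COLS:
        errors.append("column count out of range")
    if any(size > MAX_CSV_BYTES for size in csv_sizes):
        errors.append("a CSV file exceeds 10 GB")
    if sum(csv_sizes) > MAX_TOTAL_BYTES:
        errors.append("total CSV size exceeds 100 GB")
    return errors

ok = import_limit_errors(50_000, 20, [1024**3])  # []
too_small = import_limit_errors(10, 20)          # row count violation
```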

Fields
params

map<string, string>

Additional domain-specific parameters describing the semantics of the imported data; any string must be up to 25000 characters long.

You must supply the following fields:

  • schema_inference_version - (integer) Required. The version of the algorithm that should be used for the initial inference of the schema--the column data types--of the table that you are importing data into. Allowed values: "1".

Union field source. The source of the input. source can be only one of the following:
gcs_source

GcsSource

The Google Cloud Storage location for the input content. In ImportData, the gcs_source points to a csv with structure described in the comment.

bigquery_source

BigQuerySource

The BigQuery location for the input content.

ListColumnSpecsRequest

Request message for AutoMl.ListColumnSpecs.

Fields
parent

string

The resource name of the table spec to list column specs from.

Authorization requires the following Google IAM permission on the specified resource parent:

  • automl.columnSpecs.list

field_mask

FieldMask

Mask specifying which fields to read.

filter

string

Filter expression.

page_size

int32

Requested page size. The server can return fewer results than requested. If unspecified, the server will pick a default size.

page_token

string

A token identifying a page of results for the server to return. Typically obtained from the ListColumnSpecsResponse.next_page_token field of the previous AutoMl.ListColumnSpecs call.

ListColumnSpecsResponse

Response message for AutoMl.ListColumnSpecs.

Fields
column_specs[]

ColumnSpec

The column specs read.

next_page_token

string

A token to retrieve next page of results. Pass to ListColumnSpecsRequest.page_token to obtain that page.
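The next_page_token / page_token pair follows the standard list-pagination pattern: pass each response's token back as the next request's page_token until an empty token is returned. A sketch with a fake in-memory backend; `list_page` is a hypothetical stand-in for a call such as AutoMl.ListColumnSpecs:

```python
def list_all(list_page):
    # Drain every page: feed next_page_token back as page_token until empty.
    token = ""
    while True:
        items, token = list_page(page_token=token)
        yield from items
        if not token:
            break

# Fake three-page backend mapping page_token -> (items, next_page_token).
pages = {"": ([1, 2], "t1"), "t1": ([3], "t2"), "t2": ([4], "")}
everything = list(list_all(lambda page_token: pages[page_token]))  # [1, 2, 3, 4]
```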

ListDatasetsRequest

Request message for AutoMl.ListDatasets.

Fields
parent

string

The resource name of the project from which to list datasets.

Authorization requires the following Google IAM permission on the specified resource parent:

  • automl.datasets.list

filter

string

An expression for filtering the results of the request.

  • dataset_metadata: test for existence of metadata.

An example of using the filter is:

  • tables_dataset_metadata:* --> The dataset has tables_dataset_metadata.

  • regex(display_name, "^A") --> The dataset's display name starts with "A".

page_size

int32

Requested page size. The server may return fewer results than requested. If unspecified, the server will pick a default size.

page_token

string

A token identifying a page of results for the server to return. Typically obtained via ListDatasetsResponse.next_page_token of the previous AutoMl.ListDatasets call.

ListDatasetsResponse

Response message for AutoMl.ListDatasets.

Fields
datasets[]

Dataset

The datasets read.

next_page_token

string

A token to retrieve next page of results. Pass to ListDatasetsRequest.page_token to obtain that page.

ListModelEvaluationsRequest

Request message for AutoMl.ListModelEvaluations.

Fields
parent

string

Resource name of the model to list the model evaluations for. If modelId is set as "-", this will list model evaluations from across all models of the parent location.

Authorization requires the following Google IAM permission on the specified resource parent:

  • automl.modelEvaluations.list

filter

string

An expression for filtering the results of the request.

  • annotation_spec_id - for =, != or existence. See example below for the last.

Some examples of using the filter are:

  • annotation_spec_id!=4 --> The model evaluation was done for an annotation spec with an ID different from 4.
  • NOT annotation_spec_id:* --> The model evaluation was done for aggregate of all annotation specs.

page_size

int32

Requested page size.

page_token

string

A token identifying a page of results for the server to return. Typically obtained via ListModelEvaluationsResponse.next_page_token of the previous AutoMl.ListModelEvaluations call.

ListModelEvaluationsResponse

Response message for AutoMl.ListModelEvaluations.

Fields
model_evaluation[]

ModelEvaluation

List of model evaluations in the requested page.

next_page_token

string

A token to retrieve next page of results. Pass to the ListModelEvaluationsRequest.page_token field of a new AutoMl.ListModelEvaluations request to obtain that page.

ListModelsRequest

Request message for AutoMl.ListModels.

Fields
parent

string

Resource name of the project, from which to list the models.

Authorization requires the following Google IAM permission on the specified resource parent:

  • automl.models.list

filter

string

An expression for filtering the results of the request.

  • model_metadata: test for existence of metadata.

  • dataset_id: = or != a dataset ID.

Some examples of using the filter are:

  • tables_model_metadata:* --> The model has tables_model_metadata.

  • dataset_id=5 --> The model was created from a dataset with an ID of 5.

  • regex(display_name, "^A") --> The model's display name starts with "A".

page_size

int32

Requested page size.

page_token

string

A token identifying a page of results for the server to return. Typically obtained via ListModelsResponse.next_page_token of the previous AutoMl.ListModels call.

ListModelsResponse

Response message for AutoMl.ListModels.

Fields
model[]

Model

List of models in the requested page.

next_page_token

string

A token to retrieve next page of results. Pass to ListModelsRequest.page_token to obtain that page.

ListTableSpecsRequest

Request message for AutoMl.ListTableSpecs.

Fields
parent

string

The resource name of the dataset to list table specs from.

Authorization requires the following Google IAM permission on the specified resource parent:

  • automl.tableSpecs.list

field_mask

FieldMask

Mask specifying which fields to read.

filter

string

Filter expression.

page_size

int32

Requested page size. The server can return fewer results than requested. If unspecified, the server will pick a default size.

page_token

string

A token identifying a page of results for the server to return. Typically obtained from the ListTableSpecsResponse.next_page_token field of the previous AutoMl.ListTableSpecs call.

ListTableSpecsResponse

Response message for AutoMl.ListTableSpecs.

Fields
table_specs[]

TableSpec

The table specs read.

next_page_token

string

A token to retrieve next page of results. Pass to ListTableSpecsRequest.page_token to obtain that page.

Model

API proto representing a trained machine learning model.

Fields
name

string

Output only. Resource name of the model. Format: projects/{project_id}/locations/{location_id}/models/{model_id}

display_name

string

Required. The name of the model to show in the interface. The name can be up to 32 characters long and can consist only of ASCII Latin letters A-Z and a-z, underscores (_), and ASCII digits 0-9. It must start with a letter.
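The display_name rules above translate directly into a regular expression. A minimal client-side validator sketch; `is_valid_display_name` is a hypothetical helper:

```python
import re

# Up to 32 characters: ASCII letters, digits, and underscores only,
# and the name must start with a letter.
DISPLAY_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,31}")

def is_valid_display_name(name: str) -> bool:
    return DISPLAY_NAME_RE.fullmatch(name) is not None

ok = is_valid_display_name("my_model_1")   # True
bad = is_valid_display_name("1st_model")   # False: must start with a letter
```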

dataset_id

string

Required. The resource ID of the dataset used to create the model. The dataset must come from the same ancestor project and location.

create_time

Timestamp

Output only. Timestamp when the model training finished and can be used for prediction.

update_time

Timestamp

Output only. Timestamp when this model was last updated.

deployment_state

DeploymentState

Output only. Deployment state of the model. A model can only serve prediction requests after it gets deployed.

tables_model_metadata

TablesModelMetadata

Metadata for Tables models.

DeploymentState

Deployment state of the model.

Enums
DEPLOYMENT_STATE_UNSPECIFIED Should not be used; an unset enum has this value by default.
DEPLOYED Model is deployed.
UNDEPLOYED Model is not deployed.

ModelEvaluation

Evaluation results of a model.

Fields
name

string

Output only. Resource name of the model evaluation. Format:

projects/{project_id}/locations/{location_id}/models/{model_id}/modelEvaluations/{model_evaluation_id}

annotation_spec_id

string

Output only. The ID of the annotation spec that the model evaluation applies to. The ID is empty for the overall model evaluation. These IDs are the distinct values of the target column at the moment of the evaluation; annotation specs do not otherwise exist in the dataset for this problem type.

NOTE: Currently there is no way to obtain the display_name of the annotation spec from its ID. To see the display_names, review the model evaluations in the AutoML UI.

display_name

string

Output only. The value of display_name at the moment when the model was trained. Because this field returns a value captured at training time, different models trained from the same dataset may return different values here, since display names could have been changed between the two models' trainings. For Tables CLASSIFICATION

prediction_type-s, the distinct values of the target column at the moment of the model evaluation are populated here. The display_name is empty for the overall model evaluation.

create_time

Timestamp

Output only. Timestamp when this model evaluation was created.

evaluated_example_count

int32

Output only. The number of examples used for model evaluation, i.e. for which ground truth from time of model creation is compared against the predicted annotations created by the model. For overall ModelEvaluation (i.e. with annotation_spec_id not set) this is the total number of all examples used for evaluation. Otherwise, this is the count of examples that according to the ground truth were annotated by the

annotation_spec_id.

Union field metrics. Output only. Problem type specific evaluation metrics. metrics can be only one of the following:
classification_evaluation_metrics

ClassificationEvaluationMetrics

Evaluation metrics for classification models.

AutoML Tables classification applies when the target column has a data type of either CATEGORY or ARRAY(CATEGORY).

regression_evaluation_metrics

RegressionEvaluationMetrics

Model evaluation metrics for Tables regression. A Tables problem is considered a regression problem when the target column has the FLOAT64 DataType.

tables_evaluation_metrics

TablesEvaluationMetrics

Evaluation metrics for Tables models.

OperationMetadata

Metadata used across all long running operations returned by AutoML API.

Fields
progress_percent

int32

Output only. Progress of operation. Range: [0, 100]. Not used currently.

partial_failures[]

Status

Output only. Partial failures encountered. E.g. single files that couldn't be read. This field should never exceed 20 entries. Status details field will contain standard GCP error details.

create_time

Timestamp

Output only. Time when the operation was created.

update_time

Timestamp

Output only. Time when the operation was updated for the last time.

Union field details. Output only. Details of the specific operation. Even when this field is empty, its presence distinguishes different types of operations. details can be only one of the following:
delete_details

DeleteOperationMetadata

Details of a Delete operation.

deploy_model_details

DeployModelOperationMetadata

Details of a DeployModel operation.

undeploy_model_details

UndeployModelOperationMetadata

Details of an UndeployModel operation.

create_model_details

CreateModelOperationMetadata

Details of CreateModel operation.

import_data_details

ImportDataOperationMetadata

Details of ImportData operation.

batch_predict_details

BatchPredictOperationMetadata

Details of BatchPredict operation.

export_data_details

ExportDataOperationMetadata

Details of ExportData operation.

export_evaluated_examples_details

ExportEvaluatedExamplesOperationMetadata

Details of ExportEvaluatedExamples operation.

OutputConfig

Output configuration for ExportData.

You can specify an output destination of either Google Cloud Storage or BigQuery.

Exporting to Google Cloud Storage

You can specify the path to your Google Cloud Storage output URI using the gcs_destination field.

The outputs correspond to how the data was imported, and may be used as input to import data. The output formats are represented as EBNF, with literal commas and the same non-terminal symbol definitions as in InputConfig. The output is a CSV file (or files) with each line in the format:

ML_USE,GCS_FILE_PATH
  • ML_USE - Identifies the data set that the current row (file) applies to. This value can be one of the following:

    • TRAIN - Rows in this file are used to train the model.
    • TEST - Rows in this file are used to test the model during training.
    • UNASSIGNED - Rows in this file are not categorized. They are automatically divided into train and test data: 80% for training and 20% for testing.
  • GCS_FILE_PATH - Identifies a file stored in Google Cloud Storage that contains the rows for the given ML use.
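The UNASSIGNED behavior described above can be sketched as a random 80/20 split. The service's actual splitting algorithm is not documented here, so this is purely illustrative; `assign_ml_use` is a hypothetical helper:

```python
import random

def assign_ml_use(rows, seed=0):
    # Rows already marked TRAIN or TEST are kept; UNASSIGNED rows are
    # divided automatically, roughly 80% train / 20% test.
    rng = random.Random(seed)
    out = []
    for ml_use, row in rows:
        if ml_use == "UNASSIGNED":
            ml_use = "TRAIN" if rng.random() < 0.8 else "TEST"
        out.append((ml_use, row))
    return out

split = assign_ml_use([("UNASSIGNED", i) for i in range(1000)])
train_share = sum(1 for ml_use, _ in split if ml_use == "TRAIN") / len(split)
```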

The exported CSV file(s) are named tables_1.csv, tables_2.csv,..., tables_N.csv. Each exported file has a header line with the column names for the table, and the remaining lines in the CSV file contain a row of values for each respective column.

Exporting to BigQuery

You can specify the path to your BigQuery project output URI using the bigquery_destination field.

AutoML Tables creates a new dataset in the specified project with a name in the format export_data_<automl-dataset-display-name>_<timestamp-of-export-call>. <automl-dataset-display-name> is a dataset name compatible with BigQuery naming (for example, most special characters are replaced with underscores), and <timestamp-of-export-call> is in YYYY_MM_DDThh_mm_ss_sssZ format, based on ISO-8601.

The dataset has a table called primary_table that is filled with the data that was imported into the AutoML Tables dataset.
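The naming scheme above can be sketched as follows. The exact sanitization the service applies is unspecified; this sketch assumes every character that is not a letter, digit, or underscore becomes an underscore, which is what BigQuery dataset names require. `export_dataset_name` is a hypothetical helper:

```python
import re
from datetime import datetime, timezone

def export_dataset_name(display_name, when):
    safe = re.sub(r"[^A-Za-z0-9_]", "_", display_name)
    # YYYY_MM_DDThh_mm_ss_sssZ: an underscore-safe variant of ISO-8601.
    stamp = when.strftime("%Y_%m_%dT%H_%M_%S") + f"_{when.microsecond // 1000:03d}Z"
    return f"export_data_{safe}_{stamp}"

name = export_dataset_name(
    "my-dataset", datetime(2020, 1, 2, 3, 4, 5, 678000, tzinfo=timezone.utc))
# export_data_my_dataset_2020_01_02T03_04_05_678Z
```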

Fields
Union field destination. Required. The destination of the output. destination can be only one of the following:
gcs_destination

GcsDestination

The Google Cloud Storage location where the output is to be written to. In the given directory a new directory will be created with name: export_data-<dataset-display-name>-<timestamp-of-export-call> where timestamp is in YYYY-MM-DDThh:mm:ss.sssZ ISO-8601 format. All export output will be written into that directory.

Exported data is written as CSV file(s) named tables_1.csv, tables_2.csv,...,tables_N.csv. Each exported file has a header line with the column names for the table, and the remaining lines in the CSV file contain a row of values for each respective column.

bigquery_destination

BigQueryDestination

The BigQuery location where the output is to be written to.

PredictRequest

Request message for PredictionService.Predict.

Fields
name

string

Name of the model requested to serve the prediction.

Authorization requires the following Google IAM permission on the specified resource name:

  • automl.models.predict

payload

ExamplePayload

Required. Payload to perform a prediction on. The payload must match the problem type that the model was trained to solve.

params

map<string, string>

Additional domain-specific parameters, any string must be up to 25000 characters long.

You can set the following fields:

See Making a prediction for more details.

PredictResponse

Response message for PredictionService.Predict.

Fields
payload[]

AnnotationPayload

Prediction result.

metadata

map<string, string>

Additional domain-specific prediction response metadata.

RegressionEvaluationMetrics

Metrics for regression problems.

Fields
root_mean_squared_error

float

Output only. Root Mean Squared Error (RMSE).

mean_absolute_error

float

Output only. Mean Absolute Error (MAE).

mean_absolute_percentage_error

float

Output only. Mean absolute percentage error. Only set if all ground truth values are positive.

r_squared

float

Output only. R squared.

root_mean_squared_log_error

float

Output only. Root mean squared log error.

Row

A representation of a row in a relational table.

Fields
column_spec_ids[]
(deprecated)

string

Input Only. The resource IDs of the column specs describing the columns of the row. If set, it must contain, though possibly in a different order, all input feature

column_spec_ids of the Model this row is being passed to. Note: The values field below must match the order of this field, if this field is set.

values[]
(deprecated)

Value

Input Only. The values of the row cells, given in the same order as the column_spec_ids, or, if not set, then in the same order as input feature

column_specs of the Model this row is being passed to.

cells

map<string, Value>

Accepted on input and always populated in output. A map from column_spec_id to the value of the corresponding cell in the row. It must contain all column_spec_ids in the Model's

input_feature_column_specs. Should be empty if column_spec_ids or values are not empty.

StringStats

The data statistics of a series of STRING values.

Fields
top_unigram_stats[]

UnigramStats

The statistics of the top 20 unigrams, ordered by count.

UnigramStats

The statistics of a unigram.

Fields
value

string

The unigram.

count

int64

The number of occurrences of this unigram in the series.

StructStats

The data statistics of a series of STRUCT values.

Fields
field_stats

map<string, DataStats>

Map from a field name of the struct to data stats aggregated over series of all data in that field across all the structs.

StructType

StructType defines the DataType-s of a STRUCT type.

Fields
fields

map<string, DataType>

Unordered map of struct field names to their data types. Fields cannot be added or removed via Update. Their names and data types are still mutable.

TableSpec

A specification of a relational table. The table's schema is represented via its child column specs. It is pre-populated as part of ImportData by the schema inference algorithm, whose version is a required parameter of the ImportData InputConfig.

Note: While working with a table, at times the schema may be inconsistent with the data in the table (e.g. string in a FLOAT64 column). The consistency validation is done upon creation of a model.

Fields
name

string

Output only. The resource name of the table spec. Form:

projects/{project_id}/locations/{location_id}/datasets/{dataset_id}/tableSpecs/{table_spec_id}

time_column_spec_id

string

column_spec_id of the time column. Only used if the parent dataset's ml_use_column_spec_id is not set. Used to split rows into TRAIN, VALIDATE and TEST sets such that oldest rows go to TRAIN set, newest to TEST, and those in between to VALIDATE. Required type: TIMESTAMP. If both this column and ml_use_column are not set, then ML use of all rows will be assigned by AutoML. NOTE: Updates of this field will instantly affect any other users concurrently working with the dataset.

row_count

int64

Output only. The number of rows (i.e. examples) in the table.

column_count

int64

Output only. The number of columns of the table. That is, the number of child ColumnSpec-s.

input_configs[]

InputConfig

Output only. Input configs via which data currently residing in the table had been imported.

etag

string

Used to perform consistent read-modify-write updates. If not set, a blind "overwrite" update happens.

TablesAnnotation

Contains annotation details specific to Tables.

Fields
score

float

Output only. A confidence estimate between 0.0 and 1.0, inclusive. A higher value means greater confidence in the returned value. For

target_column_spec of FLOAT64 data type the score is not populated.

prediction_interval

DoubleRange

Output only. Only populated when

target_column_spec has FLOAT64 data type. An interval in which the exactly correct target value has a 95% chance of falling.

value

Value

The predicted value of the row's

target_column. The value depends on the column's DataType:

  • CATEGORY - the predicted (with the above confidence score) CATEGORY value.

  • FLOAT64 - the predicted (with above prediction_interval) FLOAT64 value.

tables_model_column_info[]

TablesModelColumnInfo

Output only. Auxiliary information for each of the model's

input_feature_column_specs with respect to this particular prediction. If no other fields than

column_spec_name and

column_display_name would be populated, then this whole field is not.

TablesClassificationMetrics

Metrics for Tables classification problems.

Fields
curve_metrics[]

CurveMetrics

Metrics building a curve.

CurveMetrics

Metrics curve data point for a single value.

Fields
value

string

The CATEGORY row value (for ARRAY, the unnested value) that the curve metrics are for.

position_threshold

int32

The position threshold value used to compute the metrics.

confidence_metrics_entries[]

TablesConfidenceMetricsEntry

Metrics that have confidence thresholds. Precision-recall curve and ROC curve can be derived from them.

auc_pr

float

The area under the precision-recall curve.

auc_roc

float

The area under the receiver operating characteristic (ROC) curve.

log_loss

float

The Log loss metric.

TablesConfidenceMetricsEntry

Metrics for a single confidence threshold.

Fields
confidence_threshold

float

The confidence threshold value used to compute the metrics.

false_positive_rate

float

FPR = #false positives / (#false positives + #true negatives)

true_positive_rate

float

TPR = #true positives / (#true positives + #false negatives)

recall

float

Recall = #true positives / (#true positives + #false negatives).

precision

float

Precision = #true positives / (#true positives + #false positives).

f1_score

float

The harmonic mean of recall and precision. (2 * precision * recall) / (precision + recall)

true_positive_count

int64

True positive count.

false_positive_count

int64

False positive count.

true_negative_count

int64

True negative count.

false_negative_count

int64

False negative count.
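All of the rate and score fields in this message can be derived from the four counts, using the formulas given in the field descriptions above. A minimal sketch; `confusion_metrics` is a hypothetical helper:

```python
def confusion_metrics(tp, fp, tn, fn):
    # Derives the documented rates and scores from the raw counts.
    fpr = fp / (fp + tn)       # false_positive_rate
    recall = tp / (tp + fn)    # also the true_positive_rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {"false_positive_rate": fpr, "true_positive_rate": recall,
            "precision": precision, "recall": recall, "f1_score": f1}

m = confusion_metrics(tp=80, fp=20, tn=80, fn=20)
# precision = recall = f1 = 0.8, false_positive_rate = 0.2
```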

TablesDatasetMetadata

Metadata for a dataset used for AutoML Tables.

Fields
primary_table_spec_id

string

Output only. The table_spec_id of the primary table of this dataset.

target_column_spec_id

string

column_spec_id of the primary table's column that should be used as the training & prediction target. This column must be non-nullable and have one of following data types (otherwise model creation will error):

  • CATEGORY

  • FLOAT64

If the type is CATEGORY , only up to 100 unique values may exist in that column across all rows.

NOTE: Updates of this field will instantly affect any other users concurrently working with the dataset.

weight_column_spec_id

string

column_spec_id of the primary table's column that should be used as the weight column, i.e. the higher the value the more important the row will be during model training. Required type: FLOAT64. Allowed values: 0 to 10000, inclusive on both ends; 0 means the row is ignored for training. If not set all rows are assumed to have equal weight of 1. NOTE: Updates of this field will instantly affect any other users concurrently working with the dataset.

ml_use_column_spec_id

string

column_spec_id of the primary table column which specifies a possible ML use of the row, i.e. the column will be used to split the rows into TRAIN, VALIDATE and TEST sets. Required type: STRING. This column, if set, must either have all of TRAIN, VALIDATE, TEST among its values, or only have TEST, UNASSIGNED values. In the latter case the rows with UNASSIGNED value will be assigned by AutoML. Note that if a given ml use distribution makes it impossible to create a "good" model, that call will error describing the issue. If both this column_spec_id and primary table's time_column_spec_id are not set, then all rows are treated as UNASSIGNED. NOTE: Updates of this field will instantly affect any other users concurrently working with the dataset.
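The constraint on the ml_use column's values can be checked client side. A sketch of the documented rule; `valid_ml_use_values` is a hypothetical helper:

```python
def valid_ml_use_values(values):
    # Either all of TRAIN, VALIDATE and TEST appear among the values,
    # or only TEST and UNASSIGNED values appear.
    distinct = set(values)
    return ({"TRAIN", "VALIDATE", "TEST"} <= distinct
            or distinct <= {"TEST", "UNASSIGNED"})

ok_full = valid_ml_use_values(["TRAIN", "VALIDATE", "TEST", "TEST"])  # True
ok_auto = valid_ml_use_values(["TEST", "UNASSIGNED"])                 # True
bad = valid_ml_use_values(["TRAIN", "TEST"])                          # False
```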

target_column_correlations

map<string, CorrelationStats>

Output only. Correlations between

TablesDatasetMetadata.target_column_spec_id, and other columns of the

TablesDatasetMetadata primary_table. Only set if the target column is set. Mapping from other column spec id to its CorrelationStats with the target column. This field may be stale; see the stats_update_time field for the timestamp at which these stats were last updated.

stats_update_time

Timestamp

The most recent timestamp when the target_column_correlations field and all descendant ColumnSpec.data_stats and ColumnSpec.top_correlated_columns fields were last (re-)generated. Any changes that happened to the dataset afterwards are not reflected in these fields' values. The regeneration happens in the background on a best-effort basis.

TablesEvaluationMetrics

Model evaluation metrics for Tables problems.

Fields
Union field metrics. Evaluation metrics specific to classification problem (if target column is of CATEGORY or ARRAY(CATEGORY) DataType), or regression problem (if target column is of FLOAT64 DataType). metrics can be only one of the following:
classification_metrics

TablesClassificationMetrics

Classification metrics.

regression_metrics

TablesRegressionMetrics

Regression metrics.

TablesModelColumnInfo

Information specific to a given column and Tables Model, in the context of the Model and the predictions created by it.

Fields
column_spec_name

string

Output only. The name of the ColumnSpec describing the column. Not populated when this proto is output to BigQuery.

column_display_name

string

Output only. The display name of the column (same as the display_name of its ColumnSpec).

feature_importance

float

Output only. When given as part of a Model (always populated): Measurement of how much the correctness of the model's predictions on the TEST data depends on the values in this column. A value between 0 and 1; higher means greater influence. These values are normalized: across all input feature columns of a given model they add up to 1.

When given back by Predict (populated iff the feature_importance param is set) or Batch Predict (populated iff the feature_importance param is set): Measurement of how impactful the value in this column was for the prediction returned for the given row. A value between 0 and 1; higher means larger impact. These values are normalized: across all input feature columns of a single predicted row they add up to 1.

TablesModelMetadata

Model metadata specific to AutoML Tables.

Fields
prediction_type

PredictionType

The type of prediction this model is providing.

target_column_spec

ColumnSpec

Column spec of the dataset's primary table's column the model is predicting. Snapshotted when model creation started. Only 3 fields are used:

  • name - May be set on CreateModel; if it's not, then the ColumnSpec corresponding to the current target_column_spec_id of the dataset the model is trained from is used. If neither is set, CreateModel will error.

  • display_name - Output only.

  • data_type - Output only.

input_feature_column_specs[]

ColumnSpec

Column specs of the dataset's primary table's columns, on which the model is trained and which are used as the input for predictions. The target_column as well as, according to the dataset's state upon model creation, the weight_column and ml_use_column must never be included here.

Only 3 fields are used:

  • name - May be set on CreateModel, if set only the columns specified are used, otherwise all primary table's columns (except the ones listed above) are used for the training and prediction input.

  • display_name - Output only.

  • data_type - Output only.

excluded_feature_column_specs[]

ColumnSpec

Output only. Column specs of the dataset's primary table's columns that were excluded from training. Derived from the input feature column specs. The target_column, weight_column, and ml_use_column are not included in this list because they are automatically excluded as prediction inputs. Only 3 fields are used: name, display_name, and data_type.

optimization_objective

string

Objective function the model is optimizing towards. The training process creates a model that maximizes/minimizes the value of the objective function over the validation set.

The supported optimization objectives depend on the prediction type. If the field is not set, a default objective function is used.

CLASSIFICATION_BINARY:

  • "MAXIMIZE_AU_ROC" (default) - Maximize the area under the receiver operating characteristic (ROC) curve.

  • "MINIMIZE_LOG_LOSS" - Minimize log loss.

  • "MAXIMIZE_AU_PRC" - Maximize the area under the precision-recall curve.

CLASSIFICATION_MULTI_CLASS:

  • "MINIMIZE_LOG_LOSS" (default) - Minimize log loss.

REGRESSION:

  • "MINIMIZE_RMSE" (default) - Minimize root-mean-squared error (RMSE).

  • "MINIMIZE_MAE" - Minimize mean-absolute error (MAE).

  • "MINIMIZE_RMSLE" - Minimize root-mean-squared log error (RMSLE).
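The supported-objective and default rules above can be expressed as a small lookup. This is an illustrative client-side sketch of the documented rules, not part of the API:

```python
# Default and supported optimization objectives per prediction type,
# transcribed from the documented lists above.
DEFAULT_OBJECTIVE = {
    "CLASSIFICATION_BINARY": "MAXIMIZE_AU_ROC",
    "CLASSIFICATION_MULTI_CLASS": "MINIMIZE_LOG_LOSS",
    "REGRESSION": "MINIMIZE_RMSE",
}

SUPPORTED_OBJECTIVES = {
    "CLASSIFICATION_BINARY": {"MAXIMIZE_AU_ROC", "MINIMIZE_LOG_LOSS",
                              "MAXIMIZE_AU_PRC"},
    "CLASSIFICATION_MULTI_CLASS": {"MINIMIZE_LOG_LOSS"},
    "REGRESSION": {"MINIMIZE_RMSE", "MINIMIZE_MAE", "MINIMIZE_RMSLE"},
}

def resolve_objective(prediction_type, objective=None):
    """Return the effective objective, falling back to the default
    when the optimization_objective field is not set."""
    if objective is None:
        return DEFAULT_OBJECTIVE[prediction_type]
    if objective not in SUPPORTED_OBJECTIVES[prediction_type]:
        raise ValueError(f"{objective} not supported for {prediction_type}")
    return objective
```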

tables_model_column_info[]

TablesModelColumnInfo

Output only. Auxiliary information for each of the input_feature_column_specs with respect to this particular model.

train_budget_milli_node_hours

int64

Required. The train budget of creating this model, expressed in milli node hours, i.e. a value of 1,000 in this field means 1 node hour.

The training cost of the model will not exceed this budget. The final cost is attempted to be close to the budget, though it may end up being noticeably smaller, at the backend's discretion. This may especially happen when further model training ceases to provide any improvements.

If the budget is set to a value known to be insufficient to train a model for the given dataset, the training won't be attempted and will error.

The train budget must be between 1,000 and 72,000 milli node hours, inclusive.
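The milli-node-hour unit and the documented budget bounds can be captured in a small helper (a hypothetical convenience function, not part of the API):

```python
def to_milli_node_hours(node_hours):
    """Convert node hours to the milli-node-hour unit used by
    train_budget_milli_node_hours; 1 node hour == 1,000 milli node
    hours. Enforces the documented 1,000-72,000 range."""
    budget = round(node_hours * 1000)
    if not 1_000 <= budget <= 72_000:
        raise ValueError(
            "train budget must be between 1,000 and 72,000 milli node hours")
    return budget
```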

train_cost_milli_node_hours

int64

Output only. The actual training cost of the model, expressed in milli node hours, i.e. a value of 1,000 in this field means 1 node hour. Guaranteed to not exceed the train budget.

disable_early_stopping

bool

Use the entire training budget. This disables the early stopping feature. By default, the early stopping feature is enabled, which means that AutoML Tables might stop training before the entire training budget has been used.

time_column_spec_id

string

ID of the time column. If set, it overrides the time_column_spec_id set on the dataset's primary table; if unset, the primary table's time_column_spec_id is used. Only used if ml_use_column is not set.

forecasting_config

ForecastingConfig

Additional model configuration specific to FORECASTING.

Union field data_split_strategy. The strategy used to split the data between training, validation, and test sets. If set at model creation time, it overrides ml_use_column. Otherwise, the dataset properties will be used to determine a strategy. If no strategy is set, then the dataset will be split using SplitPercentageConfig with 80% to the train set, 10% to the validation set, and 10% to the test set. data_split_strategy can be only one of the following:

ml_use_column_spec_id

string

ID of the column used to split the table.

split_percentage_config

SplitPercentageConfig

A data split strategy using percentage configuration.

ForecastingConfig

An additional configuration needed for the FORECASTING prediction type. By "key", the union of the ColumnSpec.ForecastingMetadata.ColumnType.KEY column(s) is meant. The table's rows are the data points of the time series, and their timestamps are the values in the primary_table's time_column.

Fields
granularity

TimeGranularity

Required. Describes the granularity of the time series. E.g. is a row for each key given per hour, per 3 months, or per year? For each key, the rows' timestamps should be spaced approximately by this granularity period. Missing rows are allowed, i.e. there may be periods for which data for a given key is missing; also, data for a key may start late or end early. On the other hand, within a single granularity period no more than one row per key is allowed. The time column must be at least as precise as this granularity requires.

horizon_periods

int64

Required. The number of periods the model is able to predict into the future, where each period is one unit of granularity as defined by the granularity field above. Forecasting supports only batch predictions, and when one is requested, both historical data rows and to-be-predicted rows must be provided. Prediction for a row fails if, for its key, the time difference between its timestamp and the timestamp of the key's latest historical row is larger than the horizon. If the to-be-predicted row's key is not present in the historical data, then its prediction fails if the time difference between its timestamp and the timestamp of the latest historical row for any key is over the horizon.
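The horizon rule above (a prediction fails when the gap past the latest historical row exceeds horizon_periods granularity units) can be sketched for a fixed-length granularity. This is an illustrative check, not the service's implementation:

```python
from datetime import datetime, timedelta

def within_horizon(predict_ts, latest_historical_ts,
                   granularity, horizon_periods):
    """True iff the to-be-predicted timestamp is no more than
    horizon_periods granularity units past the latest historical
    row's timestamp (granularity given as a timedelta)."""
    return predict_ts - latest_historical_ts <= granularity * horizon_periods
```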

PredictionType

The type of prediction this model is providing.

The available and default prediction types depend on

target_column's

data_type and, possibly, number of distinct values in that column:

  • CATEGORY, two distinct values: CLASSIFICATION_BINARY (default)

  • CATEGORY, more than two distinct values: CLASSIFICATION_MULTI_CLASS (default)

  • ARRAY(CATEGORY): CLASSIFICATION_MULTI_LABEL (default)

  • FLOAT64: REGRESSION (default), FORECASTING
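The defaulting rules above can be written out as a small function (a hypothetical helper illustrating the documented mapping, not part of the API):

```python
def default_prediction_type(data_type, distinct_values=None):
    """Default PredictionType for a target column, per the rules
    above. FORECASTING is never a default; it must be chosen
    explicitly for a FLOAT64 target."""
    if data_type == "CATEGORY":
        if distinct_values == 2:
            return "CLASSIFICATION_BINARY"
        return "CLASSIFICATION_MULTI_CLASS"
    if data_type == "ARRAY(CATEGORY)":
        return "CLASSIFICATION_MULTI_LABEL"
    if data_type == "FLOAT64":
        return "REGRESSION"
    raise ValueError(f"unsupported target data type: {data_type}")
```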

Enums
PREDICTION_TYPE_UNSPECIFIED An unset value of this enum; must not be used.
CLASSIFICATION_BINARY One out of two target values is picked per example.
CLASSIFICATION_MULTI_CLASS One out of multiple target values is picked per example.
CLASSIFICATION_MULTI_LABEL Multiple values are picked per example.
REGRESSION A value is chosen based on its relation to other values.
FORECASTING

A value for a new timestamp is chosen, based on the historical time series data. Time series for multiple entities, which are identified by key column(s), may be forecast by a single model. The use of this prediction type has the following requirements:

  • The primary_table_spec's time_column_spec_id must be set, and the column must be non-nullable.

  • Each input_feature_column_spec, except for the time column, must have forecasting_metadata set (note that the target_column_spec is separate and does not need this information).

  • The forecasting_config must be set.

SplitPercentageConfig

An additional configuration for a data split by percentages. train_set_percentage, validation_set_percentage, and test_set_percentage must all add up to 100.

Fields
train_set_percentage

int32

Required. The percentage of data to reserve for the training set.

validation_set_percentage

int32

Required. The percentage of data to reserve for the validation set.

test_set_percentage

int32

Required. The percentage of data to reserve for the test set.
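The sum-to-100 constraint and the documented 80/10/10 default can be sketched as a client-side check (a hypothetical helper, not part of the API):

```python
def validate_split(train_pct=80, validation_pct=10, test_pct=10):
    """Validate a SplitPercentageConfig-style split: the three
    percentages must add up to 100. Defaults mirror the documented
    80/10/10 split used when no strategy is set."""
    if train_pct + validation_pct + test_pct != 100:
        raise ValueError(
            "train, validation and test percentages must sum to 100")
    return {"train": train_pct, "validation": validation_pct, "test": test_pct}
```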

TablesRegressionMetrics

Metrics for Tables regression problems.

Fields
root_mean_squared_error

float

Root mean squared error.

mean_absolute_error

float

Mean absolute error.

mean_absolute_percentage_error

float

Mean absolute percentage error, only set if all of the target column's values are positive.

r_squared

float

R squared.

root_mean_squared_log_error

float

Root mean squared log error.
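The regression metrics above follow standard definitions. The sketch below shows reference formulas computed from ground-truth targets and predictions; it illustrates the definitions and is not the service's exact implementation:

```python
import math

def regression_metrics(y, p):
    """Standard-definition sketch of the Tables regression metrics,
    for targets y and predictions p (equal-length sequences)."""
    n = len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, p))
    mean_y = sum(y) / n
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    metrics = {
        "root_mean_squared_error": math.sqrt(ss_res / n),
        "mean_absolute_error": sum(abs(a - b) for a, b in zip(y, p)) / n,
        "r_squared": 1 - ss_res / ss_tot,
        "root_mean_squared_log_error": math.sqrt(
            sum((math.log1p(b) - math.log1p(a)) ** 2
                for a, b in zip(y, p)) / n),
    }
    # MAPE is only defined when all target values are positive,
    # matching the condition documented above.
    if all(a > 0 for a in y):
        metrics["mean_absolute_percentage_error"] = 100 * sum(
            abs((a - b) / a) for a, b in zip(y, p)) / n
    return metrics
```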

TimeGranularity

A duration of time expressed in time granularity units.

Fields
unit

TimeGranularityUnit

The unit of this time period.

quantity

int64

The number of units per period, e.g. 3 weeks or 2 months.

TimeGranularityUnit

Represents a unit of time granularity.

Enums
TIME_GRANULARITY_UNIT_UNSPECIFIED Should not be used.
HOUR A period of 60 minutes.
DAY A period of 24 hours.
WEEK A period of 7 days.
MONTH A period of one calendar month.
YEAR A period of 12 calendar months.
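The fixed-length units above map directly to durations; MONTH and YEAR are calendar periods with no fixed length, so they are deliberately omitted from this illustrative sketch (a hypothetical helper, not part of the API):

```python
from datetime import timedelta

# Length of one unit for the fixed-length TimeGranularityUnit values.
# MONTH and YEAR are calendar periods and have no fixed timedelta.
UNIT_LENGTH = {
    "HOUR": timedelta(hours=1),
    "DAY": timedelta(days=1),
    "WEEK": timedelta(weeks=1),
}

def granularity_period(unit, quantity):
    """Length of one TimeGranularity period, e.g. ("WEEK", 3) is
    a 21-day period."""
    return UNIT_LENGTH[unit] * quantity
```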

TimeSeries

Represents a time series.

Fields
row

Row

Required. The to-be-predicted row in the time series.

historical_rows[]

Row

Required in predict. Historical data in the time series.

TimestampStats

The data statistics of a series of TIMESTAMP values.

Fields
granular_stats

map<string, GranularStats>

The string key is the pre-defined granularity. Currently supported: hour_of_day, day_of_week, month_of_year. Granularities finer than the granularity of the timestamp data are not populated (e.g. if timestamps are at day granularity, then hour_of_day is not populated).

GranularStats

Stats split by a granularity defined in context.

Fields
buckets

map<int32, int64>

A map from granularity key to example count for that key. E.g. for hour_of_day, 13 means 1pm; for month_of_year, 5 means May.
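The bucket maps above could be derived from raw timestamps as sketched below. This is an illustrative reconstruction; the exact key numbering for day_of_week is an assumption (ISO numbering, Monday = 1) since the doc does not specify it:

```python
from collections import Counter
from datetime import datetime

def granular_buckets(timestamps):
    """Sketch of GranularStats buckets for the three supported
    granularities: map each timestamp to its granularity key and
    count examples per key. day_of_week uses ISO numbering
    (Monday = 1), an assumed convention."""
    return {
        "hour_of_day": dict(Counter(t.hour for t in timestamps)),
        "day_of_week": dict(Counter(t.isoweekday() for t in timestamps)),
        "month_of_year": dict(Counter(t.month for t in timestamps)),
    }
```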

TypeCode

TypeCode is used as a part of DataType.

Enums
TYPE_CODE_UNSPECIFIED Not specified. Should not be used.
FLOAT64 Encoded as number, or the strings "NaN", "Infinity", or "-Infinity".
TIMESTAMP Must be between 0 AD and 9999 AD. Encoded as string according to time_format, or, if that format is not set, then in RFC 3339 date-time format, where time-offset = "Z" (e.g. 1985-04-12T23:20:50.52Z).
STRING Encoded as string.
ARRAY Encoded as list, where the list elements are represented according to list_element_type.

STRUCT Encoded as struct, where field values are represented according to struct_type.
CATEGORY Values of this type are not further understood by AutoML, e.g. AutoML is unable to tell the order of values (as it could with FLOAT64), or is unable to say if one value contains another (as it could with STRING). Encoded as string (bytes should be base64-encoded, as described in RFC 4648, section 4).
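The FLOAT64 encoding above (a number, or one of the strings "NaN", "Infinity", "-Infinity") can be decoded as sketched below (a hypothetical helper, not part of any client library):

```python
def decode_float64(value):
    """Decode a FLOAT64-typed value per the TypeCode encoding above:
    a plain number, or one of the special string spellings."""
    specials = {
        "NaN": float("nan"),
        "Infinity": float("inf"),
        "-Infinity": float("-inf"),
    }
    if isinstance(value, str):
        return specials[value]
    return float(value)
```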

UndeployModelOperationMetadata

Details of UndeployModel operation.

UndeployModelRequest

Request message for AutoMl.UndeployModel.

Fields
name

string

Resource name of the model to undeploy.

Authorization requires the following Google IAM permission on the specified resource name:

  • automl.models.undeploy

UpdateColumnSpecRequest

Request message for AutoMl.UpdateColumnSpec

Fields
column_spec

ColumnSpec

The column spec which replaces the resource on the server.

Authorization requires the following Google IAM permission on the specified resource columnSpec:

  • automl.columnSpecs.update

update_mask

FieldMask

The update mask applies to the resource.

UpdateDatasetRequest

Request message for AutoMl.UpdateDataset

Fields
dataset

Dataset

The dataset which replaces the resource on the server.

Authorization requires the following Google IAM permission on the specified resource dataset:

  • automl.datasets.update

update_mask

FieldMask

The update mask applies to the resource.

UpdateTableSpecRequest

Request message for AutoMl.UpdateTableSpec

Fields
table_spec

TableSpec

The table spec which replaces the resource on the server.

Authorization requires the following Google IAM permission on the specified resource tableSpec:

  • automl.tableSpecs.update

update_mask

FieldMask

The update mask applies to the resource.
