Train an AutoML model (console)

This page describes how to train an AutoML model using the Google Cloud console.

For information about using the Vertex AI API to train an AutoML model, see Training an AutoML model using the Vertex AI API.

Before you begin

Before you train a model, prepare your training data, create a dataset, and associate the data with the dataset.

Train an AutoML model

  1. In the Google Cloud console, in the Vertex AI section, go to the Datasets page.

  2. Click the name of the dataset you want to use to train your model to open its details page.

  3. If your data type uses annotation sets, select the annotation set you want to use for this model.

  4. Click Train new model.

  5. On the Train new model page, complete the following steps for your data type:

    Image

    1. Select the model training method.

      • AutoML is a good choice for a wide range of use cases.
      • Seq2Seq+ is a good choice for experimentation. The algorithm is likely to converge faster than AutoML because its architecture is simpler and it uses a smaller search space. In our experiments, Seq2Seq+ performs well with a small time budget and on datasets smaller than 1 GB.
      Click Continue.

    2. Enter the display name for your new model.

    3. If you want to manually set how your training data is split, expand Advanced options and select a data split option. Learn more.

    4. Click Continue.

    5. Object Detection models only: In the Optimize your model section, select the option you want to optimize for (accuracy or latency).

    6. Click Continue.

    7. Classification models only (optional): In the Explainability section, select Generate explainable bitmaps for each image in the test set to enable Vertex Explainable AI. Choose visualization settings and click Continue.

      This feature has costs associated with it. See Pricing for more information.

    8. In the Compute and pricing window, enter the maximum number of hours you want your model to train for.

      This setting helps you put a cap on the training costs. The actual time elapsed can be longer than this value, because there are other operations involved in creating a new model.

    9. If you want to stop training when the model is no longer improving, select Enable early stopping.

    10. Click Start Training.

      Model training can take many hours, depending on the size and complexity of your data and your training budget, if you specified one. You can close this tab and return to it later. You will receive an email when your model has completed training.
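
The training budget in step 8 is entered in hours. If you later script the same job with the Vertex AI SDK for Python, the budget is passed as `budget_milli_node_hours`, where 1,000 equals one node hour. The following is a minimal sketch of that conversion, not an official utility: the helper name `hours_to_milli_node_hours` is hypothetical and is not part of any Google Cloud library.

```python
def hours_to_milli_node_hours(hours: float) -> int:
    """Convert a training budget in node hours to milli node hours.

    Hypothetical helper for illustration: 1 node hour = 1,000 milli node hours.
    """
    if hours <= 0:
        raise ValueError("The training budget must be greater than zero.")
    return round(hours * 1000)


# Example: an 8-hour budget entered in the console.
budget = hours_to_milli_node_hours(8)
print(budget)
```

As in the console, the value is only a cap on training cost. The elapsed time can still be longer because of the other operations involved in creating a model.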

    Tabular

    Select your objective.

    Classification/Regression

    1. Select the model training method.

      • AutoML is a good choice for a wide range of use cases.
      • Seq2Seq+ is a good choice for experimentation. The algorithm is likely to converge faster than AutoML because its architecture is simpler and it uses a smaller search space. In our experiments, Seq2Seq+ performs well with a small time budget and on datasets smaller than 1 GB.
      Click Continue.

    2. Enter the display name for your new model.

    3. Select your target column.

      The target column is the value that the model will predict.

      Learn more about target column requirements.

    4. If you would like to export your test dataset to BigQuery, check Export test dataset to BigQuery and provide the name of the table.

    5. If you want to manually control your data split, open the Advanced options.

      The default data split is random. Depending on your data, you can select Manual to use a Data split column, control the percentages for the data split, or provide a Time column. Learn more about data splits.

    6. Click Continue.

    7. If you haven't already, click Generate statistics.

      Generating statistics populates the Transformation dropdown menus.

    8. On the Training options page, review your column list and exclude any columns from training that should not be used to train the model.

      If you are using a data split column, it should be included.

    9. Review the transformations selected for your included features, along with whether invalid data is allowed, and make any required updates.

      Learn more about transformations and invalid data.

    10. If you want to specify a weight column, or change your optimization objective from the default, open the Advanced options and make your selections.

      Learn more about weight columns and optimization objectives.

    11. Click Continue.

    12. In the Compute and pricing window, enter the maximum number of hours you want your model to train for.

      This setting helps you put a cap on the training costs. The actual time elapsed can be longer than this value, because there are other operations involved in creating a new model.

      Suggested training time is related to the size of your training data. The table below shows suggested training time ranges by row count; a large number of columns will also increase the required training time.

      Rows                      Suggested training time
      Less than 100,000         1-3 hours
      100,000 - 1,000,000       1-6 hours
      1,000,000 - 10,000,000    1-12 hours
      More than 10,000,000      3-24 hours

      For information about training pricing, see the pricing page.

    13. Click Start Training.

      Model training can take many hours, depending on the size and complexity of your data and your training budget, if you specified one. You can close this tab and return to it later. You will receive an email when your model has completed training.
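
The suggested training times in step 12 can be expressed as a small lookup, which may help when you choose a budget for many datasets. This is a sketch only: the function name `suggested_training_hours` is hypothetical, and the handling of the boundary values (exactly 1,000,000 and 10,000,000 rows fall into the lower range here) is an assumption, because the table's ranges overlap at those points.

```python
from typing import Tuple


def suggested_training_hours(row_count: int) -> Tuple[int, int]:
    """Return the (minimum, maximum) suggested training hours for a row count.

    The ranges mirror the table in step 12. Boundary handling is an assumption.
    """
    if row_count < 0:
        raise ValueError("row_count must not be negative.")
    if row_count < 100_000:
        return (1, 3)
    if row_count <= 1_000_000:
        return (1, 6)
    if row_count <= 10_000_000:
        return (1, 12)
    return (3, 24)


low, high = suggested_training_hours(500_000)
print(f"Suggested training time: {low}-{high} hours")
```

The lookup uses row count only. A large number of columns also increases the required training time, so treat the result as a starting point.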

    Forecasting

    1. Select the model training method.

      • AutoML is a good choice for a wide range of use cases.
      • Seq2Seq+ is a good choice for experimentation. The algorithm is likely to converge faster than AutoML because its architecture is simpler and it uses a smaller search space. In our experiments, Seq2Seq+ performs well with a small time budget and on datasets smaller than 1 GB.
      Click Continue.

    2. Enter the display name for your new model.

    3. Select your target column.

      The target column is the value that the model will forecast. Learn more about target column requirements.

    4. If you did not set your Series identifier and Timestamp columns on your dataset, select them now.

    5. Select your Data granularity.

      Learn more.

    6. Enter your Context window and Forecast horizon.

      If you do not specify a Context window, it defaults to the value set for Forecast horizon. For more information, see Considerations for setting the context window and forecast horizon.

    7. If you would like to export your test dataset to BigQuery, check Export test dataset to BigQuery and provide the name of the table.

    8. If you want Vertex AI to continue training even if your data has validation errors, you can select Ignore validation.

      Unless you understand the source of data errors and their impact on model quality, you should allow Vertex AI to cancel training for validation errors.

    9. If you want to manually control your data split, open the Advanced options.

      The default data split is chronological, with the standard 80/10/10 percentages. If you would like to manually specify which rows are assigned to which split, select Manual and specify your Data split column.

      Learn more about data splits.

    10. Click Continue.

    11. If you haven't already, click Generate statistics.

      Generating statistics populates the Transformation dropdown menus.

    12. On the Training options page, review your column list and exclude any columns from training that should not be used to train the model.

      If you are using a data split column, it should be included.

    13. Review the transformations selected for your included features and make any required updates.

      Rows containing data that is invalid for the selected transformation are excluded from training. Learn more about transformations.

    14. For each column you included for training, specify the Feature type for how that feature relates to its time series, and whether it is available at forecast time. Learn more about feature type and availability.

    15. If you want to specify a weight column, or change your optimization objective from the default, open the Advanced options and make your selections.

      Learn more about weight columns and optimization objectives.

    16. Click Continue.

    17. In the Compute and pricing window, enter the maximum number of hours you want your model to train for.

      This setting helps you put a cap on the training costs. The actual time elapsed can be longer than this value, because there are other operations involved in creating a new model.

      Suggested training time is related to the size of your forecast horizon and your training data. The table below provides some sample forecasting training runs, and the range of training time that was needed to train a high-quality model.

      Rows          Features    Forecast horizon    Training time
      12 million    10          6                   3-6 hours
      20 million    50          13                  6-12 hours
      16 million    30          365                 24-48 hours

      For information about training pricing, see the pricing page.

    18. Click Start Training.

      Model training can take many hours, depending on the size and complexity of your data and your training budget, if you specified one. You can close this tab and return to it later. You will receive an email when your model has completed training.
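
Two defaults from the steps above can be illustrated in code: an unspecified Context window falls back to the Forecast horizon (step 6), and the default data split is chronological with 80/10/10 percentages (step 9). The following is a simplified sketch, not how Vertex AI implements either behavior. Both function names are hypothetical, and the split assumes that rows are ordered by timestamp alone, which ignores any grouping by series identifier that the service may apply.

```python
from typing import List, Optional, Sequence, Tuple


def resolve_context_window(forecast_horizon: int,
                           context_window: Optional[int] = None) -> int:
    """Use the forecast horizon when no context window is given."""
    return forecast_horizon if context_window is None else context_window


def chronological_split(
    timestamps: Sequence[str],
) -> Tuple[List[int], List[int], List[int]]:
    """Split row indices 80/10/10 by ascending timestamp.

    Assumes sortable timestamps, such as ISO 8601 strings. The oldest rows
    go to training, the next to validation, and the newest to test.
    """
    order = sorted(range(len(timestamps)), key=lambda i: timestamps[i])
    n = len(order)
    train_end = n * 80 // 100
    valid_end = train_end + n * 10 // 100
    return order[:train_end], order[train_end:valid_end], order[valid_end:]


# Ten daily rows listed newest first, to show that order is by timestamp.
rows = ["2024-01-%02d" % day for day in range(10, 0, -1)]
train, valid, test = chronological_split(rows)
print(len(train), len(valid), len(test))
print(resolve_context_window(forecast_horizon=13))
```

Keeping the newest rows for testing reflects the purpose of a chronological split: the model is evaluated on data that comes after the data it was trained on.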

    Text

    1. For the training method, select AutoML.

    2. Click Continue.

    3. Enter a name for the model.

    4. If you want to manually set how your training data is split, expand Advanced options and select a data split option. Learn more.

    5. Click Start Training.

      Model training can take many hours, depending on the size and complexity of your data and your training budget, if you specified one. You can close this tab and return to it later. You will receive an email when your model has completed training.

    Video

    1. Enter the display name for your new model.

    2. If you want to manually set how your training data is split, expand Advanced options and select a data split option. Learn more.

    3. Click Continue.

    4. Select the model training method.

      • AutoML is a good choice for a wide range of use cases.
      • Seq2Seq+ is a good choice for experimentation. The algorithm is likely to converge faster than AutoML because its architecture is simpler and it uses a smaller search space. In our experiments, Seq2Seq+ performs well with a small time budget and on datasets smaller than 1 GB.
      Click Continue.

    5. Click Start Training.

      Model training can take many hours, depending on the size and complexity of your data and your training budget, if you specified one. You can close this tab and return to it later. You will receive an email when your model has completed training.

      Several minutes after training starts, you can check the estimated training node hours in the model's properties. If you cancel the training, you are not charged for the current product.

What's next