Training Cloud-hosted models

You create a custom model by training it using a prepared dataset. AutoML API uses the items from the dataset to train the model, test it, and evaluate its performance. You review the results, adjust the training dataset as needed, and train a new model using the improved dataset.

Training a model can take several hours to complete. The AutoML API enables you to check the status of training.

Since AutoML Vision creates a new model each time you start training, your project may include numerous models. You can get a list of the models in your project and delete models you no longer need. Alternatively, you can use the Cloud AutoML Vision UI to list and delete models created via the AutoML API that you no longer need.

Note:

  • Unless otherwise specified in applicable terms of service or documentation, custom models created in Cloud AutoML products cannot be exported.
  • The maximum lifespan for a custom model is two years. You must create and train a new model to continue classifying content after that amount of time.
  • Edge models are optimized for inference on an Edge device. Consequently, Edge model accuracy will differ from Cloud model accuracy.

Using curl

To make it more convenient to run the curl samples in this topic, set the following environment variable. Replace project-id with the name of your GCP project.

export PROJECT_ID="project-id"

Training models

When you have a dataset with a solid set of labeled training items, you are ready to create and train the custom model.

Web UI

  1. Open the AutoML Vision UI.

    The Datasets page shows the available datasets for the current project.

  2. Select the dataset you want to use to train the custom model.

    The display name of the selected dataset appears in the title bar, and the page lists the individual items in the dataset along with their labels.

  3. When you are done reviewing the dataset, click the Train tab just below the title bar.

    The training page provides a basic analysis of your dataset and advises you about whether it is adequate for training. If AutoML Vision suggests changes, consider returning to the Images page and adding items or labels.

  4. When the dataset is ready, click Start Training.

Training a model can take several hours to complete. After the model is successfully trained, you will receive a message at the email address that you used to sign up for the program.

Integrated UI

  1. Open the Vision Dashboard.

    The Datasets page shows the available datasets for the current project.

  2. Select the dataset you want to use to train the custom model.

    The display name of the selected dataset appears in the title bar, and the page lists the individual items in the dataset along with their labels.

  3. When you are done reviewing the dataset, select the Train tab just below the title bar.

    The training page provides a basic analysis of your dataset and advises you about whether it is adequate for training. If AutoML Vision suggests changes, consider returning to the Images page and adding items or labels.

  4. When the dataset is ready, select Start Training.

  5. A side window with model training options appears. Choose a training budget appropriate for your dataset size.

    By default, 24 node hours should be sufficient for most datasets.

  6. Optional: Select the Deploy model to 1 node after training checkbox to opt in to automatic model deployment after training has completed.

    Automatic model deployment means your model will be available for prediction immediately after training.

  7. Select Start training.

Command-line

  • Replace dataset-id with the ID of your dataset. The ID is the last element of the name of your dataset. For example, if the name of your dataset is projects/434039606874/locations/us-central1/datasets/3104518874390609379, then the ID of your dataset is 3104518874390609379.
curl \
  -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json" \
  https://automl.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/models \
  -d '{
    "displayName": "test_model",
    "dataset_id": "dataset-id",
    "imageClassificationModelMetadata": {
      "trainBudget": "1"
    }
  }'

You should see output similar to the following. You can use the operation ID to get the status of the task. For an example, see Getting the status of an operation.

{
  "name": "projects/434039606874/locations/us-central1/operations/1979469554520652445",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.automl.v1beta1.OperationMetadata",
    "createTime": "2018-04-27T01:28:41.338120Z",
    "updateTime": "2018-04-27T01:28:41.338120Z",
    "cancellable": true
  }
}
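
The dataset ID and the operation ID are both the last element of the corresponding resource name. As a minimal sketch, the helper below extracts that element in Python. The function name resource_id is hypothetical and not part of the AutoML client library; the two names are the examples used in this topic.

```python
def resource_id(resource_name: str) -> str:
    """Return the last path element of a resource name."""
    # Strip any trailing slash so the final element is never empty.
    return resource_name.rstrip("/").rsplit("/", 1)[-1]


dataset_name = (
    "projects/434039606874/locations/us-central1/datasets/3104518874390609379"
)
operation_name = (
    "projects/434039606874/locations/us-central1/operations/1979469554520652445"
)

print(resource_id(dataset_name))    # 3104518874390609379
print(resource_id(operation_name))  # 1979469554520652445
```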

Python

# TODO(developer): Uncomment and set the following variables
# project_id = 'PROJECT_ID_HERE'
# compute_region = 'COMPUTE_REGION_HERE'
# dataset_id = 'DATASET_ID_HERE'
# model_name = 'MODEL_NAME_HERE'
# train_budget = integer amount for maximum cost of model

from google.cloud import automl_v1beta1 as automl

client = automl.AutoMlClient()

# A resource that represents Google Cloud Platform location.
project_location = client.location_path(project_id, compute_region)

# Set model name and model metadata for the image dataset.
my_model = {
    "display_name": model_name,
    "dataset_id": dataset_id,
    "image_classification_model_metadata": {"train_budget": train_budget}
    if train_budget
    else {},
}

# Create a model with the model metadata in the region.
response = client.create_model(project_location, my_model)

print("Training operation name: {}".format(response.operation.name))
print("Training started...")

Java

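import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.automl.v1beta1.AutoMlClient;
import com.google.cloud.automl.v1beta1.ImageClassificationModelMetadata;
import com.google.cloud.automl.v1beta1.LocationName;
import com.google.cloud.automl.v1beta1.Model;
import com.google.cloud.automl.v1beta1.OperationMetadata;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

/**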
 * Demonstrates using the AutoML client to create a model.
 *
 * @param projectId the Id of the project.
 * @param computeRegion the Region name.
 * @param dataSetId the Id of the dataset to which model is created.
 * @param modelName the Name of the model.
 * @param trainBudget the Budget for training the model.
 */
static void createModel(String projectId, String computeRegion, String dataSetId,
    String modelName, String trainBudget) {
  // Instantiates a client
  try (AutoMlClient client = AutoMlClient.create()) {

    // A resource that represents Google Cloud Platform location.
    LocationName projectLocation = LocationName.of(projectId, computeRegion);

    // Set model metadata.
    ImageClassificationModelMetadata imageClassificationModelMetadata =
        Long.valueOf(trainBudget) == 0
            ? ImageClassificationModelMetadata.newBuilder().build()
            : ImageClassificationModelMetadata.newBuilder()
                .setTrainBudget(Long.valueOf(trainBudget))
                .build();

    // Set model name and model metadata for the image dataset.
    Model myModel =
        Model.newBuilder()
            .setDisplayName(modelName)
            .setDatasetId(dataSetId)
            .setImageClassificationModelMetadata(imageClassificationModelMetadata)
            .build();

    // Create a model with the model metadata in the region.
    OperationFuture<Model, OperationMetadata> response =
        client.createModelAsync(projectLocation, myModel);

    System.out.println(
        String
            .format("Training operation name: %s", response.getInitialFuture().get().getName()));
    System.out.println("Training started...");
  } catch (IOException | ExecutionException | InterruptedException e) {
    e.printStackTrace();
  }
}

Node.js

  async function automlVisionCreateModel() {
    const automl = require(`@google-cloud/automl`).v1beta1;

    const client = new automl.AutoMlClient();

    /**
     * TODO(developer): Uncomment the following lines before running the sample.
     */
    // const projectId = `The GCLOUD_PROJECT string, e.g. "my-gcloud-project"`;
    // const computeRegion = `region-name, e.g. "us-central1"`;
    // const datasetId = `Id of the dataset`;
    // const modelName = `Name of the model, e.g. "myModel"`;
    // const trainBudget = `Budget for training model, e.g. 50`;

    // A resource that represents Google Cloud Platform location.
    const projectLocation = client.locationPath(projectId, computeRegion);

    // Omit the budget from the model metadata when it is zero.
    // A separate variable avoids reassigning the trainBudget constant.
    const modelMetadata = trainBudget === 0 ? {} : {trainBudget: trainBudget};

    // Set model name and model metadata for the dataset.
    const myModel = {
      displayName: modelName,
      datasetId: datasetId,
      imageClassificationModelMetadata: modelMetadata,
    };

    // Create a model with the model metadata in the region.
    const [operation, initialApiResponse] = await client.createModel({
      parent: projectLocation,
      model: myModel,
    });
    console.log(`Training operation name: `, initialApiResponse.name);
    console.log(`Training started...`);
    const [model] = await operation.promise();

    // Retrieve deployment state.
    let deploymentState = ``;
    if (model.deploymentState === 1) {
      deploymentState = `deployed`;
    } else if (model.deploymentState === 2) {
      deploymentState = `undeployed`;
    }

    // Display the model information.
    console.log(`Model name: ${model.name}`);
    console.log(`Model id: ${model.name.split(`/`).pop()}`);
    console.log(`Model display name: ${model.displayName}`);
    console.log(`Model create time:`);
    console.log(`\tseconds: ${model.createTime.seconds}`);
    console.log(`\tnanos: ${model.createTime.nanos}`);
    console.log(`Model deployment state: ${deploymentState}`);
  }

  automlVisionCreateModel().catch(console.error);

Getting the status of an operation

You can check the status of a long-running task (importing items into a dataset or training a model) using the operation ID from the response when you started the task.

You can check the status of operations only by using the AutoML API.

In the command below, replace operation-name with the full name of your operation. The full name has the format projects/{project-id}/locations/us-central1/operations/{operation-id}.

curl \
  -X GET \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json" \
  https://automl.googleapis.com/v1beta1/operation-name

You should see output similar to the following for an import operation:

{
  "name": "projects/434039606874/locations/us-central1/operations/2116326435840390257",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.automl.v1beta1.OperationMetadata",
    "progressPercentage": 100,
    "partialFailures": [
      {
        "code": 7,
        "message": "Duplicated files detected gs://my-project-lcm/training-data/astros.txt and gs://my-project-lcm/training-data/cubs.txt have the same content"
      }
    ],
    "createTime": "2018-04-27T01:39:59.821460Z",
    "updateTime": "2018-04-27T01:43:09.564770Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.protobuf.Empty"
  }
}
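
An import operation can report "done": true and still list items that were skipped. The following Python sketch shows one way to surface those entries. It assumes only the field names visible in the sample output above (metadata, partialFailures, code, message); the helper name partial_failures is hypothetical, and the JSON is an abridged copy of the sample.

```python
import json

# Abridged copy of the import operation output shown above.
operation_json = """
{
  "name": "projects/434039606874/locations/us-central1/operations/2116326435840390257",
  "metadata": {
    "progressPercentage": 100,
    "partialFailures": [
      {"code": 7, "message": "Duplicated files detected"}
    ]
  },
  "done": true,
  "response": {"@type": "type.googleapis.com/google.protobuf.Empty"}
}
"""


def partial_failures(operation: dict) -> list:
    """Return the partialFailures entries of an operation, or [] if none."""
    return operation.get("metadata", {}).get("partialFailures", [])


operation = json.loads(operation_json)
for failure in partial_failures(operation):
    print(f"code {failure['code']}: {failure['message']}")
```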

You should see output similar to the following for a create model operation:

{
  "name": "projects/434039606874/locations/us-central1/operations/2126599795587061786",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.automl.v1beta1.OperationMetadata",
    "progressPercentage": 100,
    "createTime": "2018-04-27T01:56:28.395640Z",
    "updateTime": "2018-04-27T02:04:12.336070Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.automl.v1beta1.Model",
    "name": "projects/434039606874/locations/us-central1/models/3745331181667467569",
    "createTime": "2018-04-27T02:00:22.329970Z",
    "imageClassificationModelMetadata": {
      "trainBudget": "1",
      "trainCost": "1",
      "stopReason": "BUDGET_REACHED"
    },
    "displayName": "a_98487760535e48319dd204e6394670"
  }
}
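
Because training can take several hours, the status check is usually repeated until the operation reports "done": true. The sketch below shows one possible polling loop in Python. The fetch argument is a hypothetical callable that performs the GET request shown above and returns the parsed JSON; the function name, the polling interval, and the timeout are illustrative assumptions, not values from the AutoML API. Here fetch is replaced by a stand-in so the example runs without network access.

```python
import time
from typing import Callable


def wait_for_operation(
    fetch: Callable[[str], dict],
    operation_name: str,
    poll_seconds: float = 30.0,
    timeout_seconds: float = 6 * 60 * 60,
) -> dict:
    """Poll until the operation reports "done": true, then return it."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        operation = fetch(operation_name)
        if operation.get("done"):
            # A finished long-running operation carries "error" or "response".
            if "error" in operation:
                raise RuntimeError(f"Operation failed: {operation['error']}")
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{operation_name} did not finish in time")
        time.sleep(poll_seconds)


# Stand-in for a real HTTP call: reports "not done" twice, then "done".
_replies = iter([
    {"name": "op", "metadata": {"progressPercentage": 10}},
    {"name": "op", "metadata": {"progressPercentage": 60}},
    {"name": "op", "done": True, "response": {"displayName": "test_model"}},
])

result = wait_for_operation(lambda name: next(_replies), "op", poll_seconds=0)
print(result["response"]["displayName"])
```

Passing the fetch function in as a parameter keeps the loop independent of how the request is authenticated and sent.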

Cancelling an operation

You can cancel an import or training task using the operation ID. In the command below, replace operation-name with the full name of your operation. The full name has the format projects/{project-id}/locations/us-central1/operations/{operation-id}.

curl \
  -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json" \
  https://automl.googleapis.com/v1beta1/operation-name:cancel
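
The same cancel request can be assembled in Python with the standard library. This is a sketch only: it builds the request without sending it, the function name build_cancel_request is hypothetical, and the access token is assumed to be obtained separately (for example from the gcloud command used in the curl samples). The empty request body is also an assumption, added so that a Content-Length header is present; the curl command above does not send one.

```python
import urllib.request

API_ROOT = "https://automl.googleapis.com/v1beta1"


def build_cancel_request(
    operation_name: str, access_token: str
) -> urllib.request.Request:
    """Build (but do not send) the POST request that cancels an operation."""
    return urllib.request.Request(
        url=f"{API_ROOT}/{operation_name}:cancel",
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        data=b"",  # Assumption: empty body, not shown in the curl command.
    )


request = build_cancel_request(
    "projects/434039606874/locations/us-central1/operations/1979469554520652445",
    access_token="TOKEN",  # Placeholder, not a real token.
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to urllib.request.urlopen once a real token is supplied.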