Cloud model creation via the AutoML API quickstart

This quickstart walks you through the process of:

  • Copying a CSV listing images and bounding boxes with labels into Google Cloud Storage.
  • Using AutoML Vision Object Detection to create your dataset, and train and deploy your model.

In this quickstart you use cURL commands to send requests to the AutoML Vision API. You can also complete every step listed here in the user interface by following the Quickstart using the user interface. For further directions on using the UI or the API, see the how-to guides.

Before you begin

Set up your project

  1. Sign in to your Google Account.

    If you don't already have one, sign up for a new account.

  2. In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.

    Go to the project selector page

  3. Make sure that billing is enabled for your Cloud project. Learn how to confirm that billing is enabled for your project.

  4. Enable the AutoML and Cloud Storage APIs.

    Enable the APIs

  5. Install the gcloud command-line tool.
  6. Follow the instructions to create a service account and download a key file for that account.
  7. Set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path to the service account key file that you downloaded when you created the service account.
    export GOOGLE_APPLICATION_CREDENTIALS=key-file
  8. Set the PROJECT_ID environment variable to your Project ID.
    export PROJECT_ID=your-project-id
    The AutoML API calls and resource names include your Project ID in them. The PROJECT_ID environment variable provides a convenient way to specify the ID.
  9. If you are an owner of your project, add your service account to the AutoML Editor IAM role. Replace service-account-name with the name of your new service account, for example service-account1@myproject.iam.gserviceaccount.com.
    gcloud auth login
    gcloud projects add-iam-policy-binding $PROJECT_ID \
       --member="serviceAccount:service-account-name" \
       --role="roles/automl.editor"
    
  10. Otherwise (if you are not a project owner), ask a project owner to add both your user ID and your service account to the AutoML Editor IAM role.
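
Before moving on, it can help to confirm that the variables exported in steps 7 and 8 are set in your current shell. The following is a minimal sketch rather than part of the official setup; the function name check_setup is hypothetical.

```shell
# check_setup: hypothetical helper that verifies the variables exported in
# steps 7 and 8. Prints a message and returns non-zero on the first problem.
check_setup() {
  if [ -z "${PROJECT_ID:-}" ]; then
    echo "PROJECT_ID is not set" >&2
    return 1
  fi
  if [ -z "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
    echo "GOOGLE_APPLICATION_CREDENTIALS is not set" >&2
    return 1
  fi
  if [ ! -r "${GOOGLE_APPLICATION_CREDENTIALS}" ]; then
    echo "Key file is not readable: ${GOOGLE_APPLICATION_CREDENTIALS}" >&2
    return 1
  fi
  echo "Setup looks complete for project ${PROJECT_ID}"
}
```

Run check_setup after the export commands. It only inspects your local environment and does not verify IAM roles or enabled APIs.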

Preparing a dataset

In this quickstart you use a dataset created from Open Images Dataset V4. This publicly available Salads dataset is located at gs://cloud-ml-data/img/openimage/csv/salads_ml_use.csv.

The CSV format is as follows:

TRAINING,gs://cloud-ml-data/img/openimage/3/2520/3916261642_0a504acd60_o.jpg,Salad,0.0,0.0954,,,0.977,0.957,,
VALIDATION,gs://cloud-ml-data/img/openimage/3/2520/3916261642_0a504acd60_o.jpg,Seafood,0.0154,0.1538,,,1.0,0.802,,
TEST,gs://cloud-ml-data/img/openimage/3/2520/3916261642_0a504acd60_o.jpg,Tomato,0.0,0.655,,,0.231,0.839,,

Figure: dataset image example (3916261642_0a504acd60_o.jpg)

Each row corresponds to an object localized inside a larger image, and each object is designated as training, validation, or test data. The three rows shown here describe three distinct objects in the same image, gs://cloud-ml-data/img/openimage/3/2520/3916261642_0a504acd60_o.jpg. Each of these rows has a different label (Salad, Seafood, and Tomato); other rows in the dataset carry the labels Baked goods or Cheese.

Bounding boxes are specified for each image using the top left and bottom right vertices:

  • (0,0) corresponds to the top left-most vertex.
  • (1,1) corresponds to the bottom right-most vertex.

For the first row shown above, the (x, y) coordinates for the top left vertex of the Salad labeled object are (0.0,0.0954), and the coordinates for the bottom right vertex of the object are (0.977,0.957).
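
To make the column layout concrete, the following sketch pulls the label and the two vertices out of a row with awk and runs a basic sanity check on the coordinates. It assumes the field positions seen in the rows above (label in field 3, top left vertex in fields 4 and 5, bottom right vertex in fields 8 and 9). The function names are hypothetical, and labels that contain commas are not handled.

```shell
# bbox_from_row: hypothetical helper that reads one CSV row in the format
# shown above and prints "label x_min y_min x_max y_max".
# Fields 4-5 hold the top left vertex and fields 8-9 the bottom right vertex;
# the empty fields 6-7 and 10-11 are skipped.
bbox_from_row() {
  printf '%s\n' "$1" | awk -F, '{ print $3, $4, $5, $8, $9 }'
}

# bbox_is_valid: returns 0 when both vertices lie in [0,1] and the top left
# vertex is above and to the left of the bottom right vertex.
bbox_is_valid() {
  printf '%s\n' "$1" | awk -F, '{
    ok = ($4 >= 0 && $5 >= 0 && $8 <= 1 && $9 <= 1 && $4 < $8 && $5 < $9)
    exit(ok ? 0 : 1)
  }'
}
```

For the first row shown above, bbox_from_row prints the label Salad followed by 0.0 0.0954 0.977 0.957, and bbox_is_valid succeeds.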

For more detailed information on how to format your CSV file and the minimum requirements for creating a valid dataset, see Preparing your training data.


Create a dataset and import training data

The curl command uses the gcloud auth application-default print-access-token command to obtain an access token for a service account that you set up earlier in the topic. The path to the service account key file is stored in the GOOGLE_APPLICATION_CREDENTIALS environment variable.

For the Beta release, use us-central1 as the region, regardless of your actual location.
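
Every request in this quickstart shares the same base URL, region, and headers. If you prefer not to repeat them, one option is to wrap them in small shell functions. This is a sketch under the assumption that PROJECT_ID is exported as described earlier; the names AUTOML_LOCATION, api_url, and automl_curl are hypothetical and are not part of the AutoML API or the gcloud tool.

```shell
# Hypothetical helpers; the names AUTOML_LOCATION, api_url, and automl_curl
# are illustrative only.
AUTOML_LOCATION="us-central1"

# api_url: prints the full v1beta1 URL for a path relative to the project
# and location, for example "datasets" or "models/your-model-id:deploy".
api_url() {
  printf 'https://automl.googleapis.com/v1beta1/projects/%s/locations/%s/%s\n' \
    "${PROJECT_ID}" "${AUTOML_LOCATION}" "$1"
}

# automl_curl: runs curl against a relative path with the headers used
# throughout this quickstart. Extra arguments are passed on to curl.
automl_curl() {
  automl_rel_path="$1"
  shift
  curl -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    -H "Content-Type: application/json" \
    "$@" "$(api_url "${automl_rel_path}")"
}
```

With these defined, automl_curl datasets sends the same request as the list datasets command, and automl_curl "models/your-model-id:undeploy" -X POST matches the undeploy command.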

Create a dataset

Use the following curl command to create a new dataset with a display name of your choosing:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json" \
"https://automl.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/datasets" -d '{
    "display_name": "your_display_name",
    "image_object_detection_dataset_metadata": {
    }
}'

The response returns a relative dataset ID (such as IOD5491013845671477445) that you need in the following steps.

{
  "name": "projects/${PROJECT_NO}/locations/us-central1/datasets/IOD5491013845671477445",
  "displayName": "your_display_name",
  "createTime": "2018-10-29T15:45:53.353442Z",
  "imageObjectDetectionDatasetMetadata": {}
}
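
The relative ID is the last path segment of the name field. The following sketch extracts it in the shell so that you can store it in a variable. The function names are hypothetical, and the sed expression is a naive text match rather than a JSON parser; a tool such as jq is more robust if you have it installed.

```shell
# name_from_response: hypothetical helper that reads a JSON response on stdin
# and prints the value of its first "name" field. This is a naive text match,
# not a JSON parser.
name_from_response() {
  sed -n 's/.*"name":[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# resource_id: prints the last path segment of a resource name, which is the
# relative ID (dataset, operation, or model) used in later requests.
resource_id() {
  printf '%s\n' "${1##*/}"
}
```

For example, piping the response above into name_from_response and passing the result to resource_id yields IOD5491013845671477445. The same approach applies to the operation and model names returned in later steps.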

Import data

Import your training data into your dataset. The importData command takes as input the URI of the CSV file that describes your data. Here you provide the publicly available Google Cloud Storage address where the Salads dataset is stored. This process may take up to 30 minutes.

  • Replace your-dataset-id with the dataset identifier for your dataset (not the display name). For example: IOD5491013845671477445.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json" \
https://automl.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/datasets/your-dataset-id:importData \
-d '{
  "input_config": {
    "gcs_source": {
       "input_uris": [
         "gs://cloud-ml-data/img/openimage/csv/salads_ml_use.csv"
        ]
    }
  }
}'

The response returns a relative operation ID (for example, IOD1555149246326374411) that can be used to get the status of the operation.

{
  "name": "projects/${PROJECT_NO}/locations/us-central1/operations/IOD1555149246326374411",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.automl.v1beta1.OperationMetadata",
    "createTime": "2018-10-29T15:56:29.176485Z",
    "updateTime": "2018-10-29T15:56:29.176485Z",
    "importDataDetails": {}
  }
}

Get the status of the import operation

You can query the status of your import data operation by using the following curl command.

  • Replace your-operation-id with the operation ID returned from the import data operation.
curl -X GET -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json" \
https://automl.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/operations/your-operation-id

The import operation can take some time to complete. When the import task is finished, you will see done: true in the status of the operation with no errors listed, as shown in the following example.

This request also returns any warnings from your dataset import, which makes it a useful way to find errors in your data. The Salads dataset deliberately includes some errors so that you can see an example of these warnings.

{
  "name": "projects/${PROJECT_NO}/locations/us-central1/operations/IOD1555149246326374411",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.automl.v1beta1.OperationMetadata",
    "createTime": "2018-10-29T15:56:29.176485Z",
    "updateTime": "2018-10-29T16:10:41.326614Z",
    "importDataDetails": {}
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.protobuf.Empty"
  }
}
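
Because an import or training operation can run for a long time, you may prefer to poll for completion instead of rerunning the status command by hand. The following is a sketch of one way to do that, assuming PROJECT_ID is exported as described earlier. The function names and the POLL_SECONDS variable are hypothetical, and the checks for done and error are plain text matches on the response rather than real JSON parsing.

```shell
# Hypothetical polling helpers; the function names and POLL_SECONDS are
# illustrative and not part of the AutoML API.
POLL_SECONDS="${POLL_SECONDS:-60}"

# fetch_operation: prints the current status JSON for an operation ID,
# using the same request shown above.
fetch_operation() {
  curl -s -X GET \
    -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    -H "Content-Type: application/json" \
    "https://automl.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/operations/$1"
}

# wait_for_operation: polls until the status contains "done": true.
# Returns 1 (and prints the status) if the status contains an "error" field.
wait_for_operation() {
  while true; do
    op_status="$(fetch_operation "$1")"
    if printf '%s\n' "${op_status}" | grep -q '"error"'; then
      printf '%s\n' "${op_status}" >&2
      return 1
    fi
    if printf '%s\n' "${op_status}" | grep -Eq '"done":[[:space:]]*true'; then
      return 0
    fi
    sleep "${POLL_SECONDS}"
  done
}
```

Call wait_for_operation your-operation-id to block until the operation finishes. The same function works for the training operation later in this quickstart, since both use the operations endpoint.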

Get a list of datasets

You can get a list of your datasets by using the following command.

curl \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
https://automl.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/datasets

You should see output similar to the following:

{
  "datasets": [
    {
      "name": "projects/${PROJECT_NO}/locations/us-central1/datasets/dataset-id1",
      "displayName": "display_name1",
      "createTime": "2018-10-29T15:45:53.353442Z",
      "exampleCount": 227,
      "imageObjectDetectionDatasetMetadata": {}
    },
    {
      "name": "projects/${PROJECT_NO}/locations/us-central1/datasets/dataset-id2",
      "displayName": "display_name2",
      "createTime": "2018-10-24T21:06:05.390059Z",
      "exampleCount": 227,
      "imageObjectDetectionDatasetMetadata": {}
    }
  ]
}

Train your model

Launch a model training application

After you have created your dataset and imported your training data into your dataset, you can train your custom model.

Train your model by using the following curl command.

  • Replace your-dataset-id with the dataset identifier for your dataset (not the display name).
  • Replace your_display_name with a name that you choose for your model.
  • Specify image_object_detection_model_metadata.model_type. Two available options are to optimize for latency or accuracy:
    • cloud-low-latency-1 - Optimizes training for latency.
    • cloud-high-accuracy-1 - Optimizes training for accuracy.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json" \
https://automl.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/models -d '{
  "datasetId": "your-dataset-id",
  "displayName": "your_display_name",
  "image_object_detection_model_metadata": {
    "model_type": "cloud-low-latency-1"
  }
}'
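
If you switch between the two model types, it can be convenient to generate the request body rather than edit it by hand. The following sketch builds the body from three arguments and rejects any model type other than the two listed above. The function name train_request_body is hypothetical, and the values are inserted without JSON escaping, so avoid quotes in the display name.

```shell
# train_request_body: hypothetical helper that prints the JSON body for the
# model training request. Arguments: dataset ID, display name, model type.
# Returns 1 for a model type other than the two listed above.
train_request_body() {
  case "$3" in
    cloud-low-latency-1|cloud-high-accuracy-1) ;;
    *)
      echo "Unknown model type: $3" >&2
      return 1
      ;;
  esac
  cat <<EOF
{
  "datasetId": "$1",
  "displayName": "$2",
  "image_object_detection_model_metadata": {
    "model_type": "$3"
  }
}
EOF
}
```

You can pass the result to curl with -d "$(train_request_body your-dataset-id your_display_name cloud-high-accuracy-1)" in place of the inline body.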

You should receive an operation ID for your train model operation (for example, IOD5644417707978784777), which you can use to get the status of the training operation.

{
  "name": "projects/${PROJECT_NO}/locations/us-central1/operations/IOD5644417707978784777",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.automl.v1beta1.OperationMetadata",
    "createTime": "2018-10-29T16:41:23.902167Z",
    "updateTime": "2018-10-29T16:41:23.902167Z",
    "createModelDetails": {}
  }
}

The training process may take several hours to complete.

Get the status of the model training operation

You can query the status of your model training operation by using the following curl command.

  • Replace your-operation-id with the operation ID for your training operation.
curl -X GET -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json" \
https://automl.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/operations/your-operation-id

You should see output similar to the following. When the operation is complete, you will see done: true with no errors listed.


{
  "name": "projects/${PROJECT_NO}/locations/us-central1/operations/IOD5644417707978784777",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.automl.v1beta1.OperationMetadata",
    "createTime": "2018-10-24T22:08:23.327323Z",
    "updateTime": "2018-10-24T23:41:18.452855Z",
    "createModelDetails": {}
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.automl.v1beta1.Model",
    "name": "projects/${PROJECT_NO}/locations/us-central1/models/IOD5644417707978784777"
  }
}

Verify that the model is available

After your model training operation successfully completes, you can verify that your model is available by using the following command to list the models for your project.

curl -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
https://automl.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/models

The response includes a model ID (for example, IOD5644417707978784778), which you can use to get the model's evaluation metrics. You should see output similar to the following:

{
  "model": [
    {
      "name": "projects/${PROJECT_NO}/locations/us-central1/models/IOD5644417707978784778",
      "displayName": "your_display_name",
      "datasetId": "IOD5491013845671477445",
      "createTime": "2018-10-24T23:37:00.858493Z",
      "updateTime": "2018-10-24T23:37:00.858493Z",
      "deploymentState": "DEPLOYED",
      "imageObjectDetectionModelMetadata": {
          "modelType": "cloud-low-latency-1",
          "nodeCount": "1",
          "nodeQps": 1.2987012987012987
      }
    }
  ]
}

Evaluate the model

After your model has finished training you can list model evaluation metrics using the following curl command.

  • Replace your-model-id with the identifier for your model.
curl -X GET -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json" \
https://automl.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/models/your-model-id/modelEvaluations

AutoML Vision Object Detection provides an aggregate set of evaluation metrics indicating how well the model performs overall, as well as evaluation metrics for each label, indicating how well the model performs for that label under differing thresholds.

Deploy your model

Before you can make a prediction you must manually deploy your model.

Use the following command to deploy your model. Replace your-model-id with the identifier for your model.

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json" \
  https://automl.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/models/your-model-id:deploy \
  -d '{
      "imageObjectDetectionModelDeploymentMetadata": {
        "nodeCount": 2
      }
    }'

Make a prediction

Send a prediction request

You can use your deployed model to make a prediction on a local image using the following JSON file and curl command.

  • Create a request JSON file called predict_request.json and provide a base64-encoded image in the "image_bytes" field.

predict_request.json

{
    "payload": {
        "image": {
            "image_bytes": "/9j/4QAYRXhpZgAA...base64-encoded-image...9tAVx/zDQDlGxn//2Q=="
        }
    }
}
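
One way to produce this file from a local image is to encode the image with the base64 command and write the result into the same structure. The following is a minimal sketch; the function name make_predict_request is hypothetical, and it assumes a base64 command is available on your system.

```shell
# make_predict_request: hypothetical helper that prints a request body in the
# format shown above for a local image file. The base64 output is joined onto
# one line because some implementations wrap long output.
make_predict_request() {
  encoded="$(base64 < "$1" | tr -d '\n')"
  printf '{\n  "payload": {\n    "image": {\n      "image_bytes": "%s"\n    }\n  }\n}\n' "${encoded}"
}
```

For example, make_predict_request salad.jpg > predict_request.json writes the request file for a local image named salad.jpg (a placeholder file name).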

  • Replace your-model-id with the identifier for your model.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json" \
"https://automl.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/models/your-model-id:predict" -d @path-to-json-file/predict_request.json

A successful request returns a response with one or more bounding boxes specified by two diagonally opposite normalizedVertices. Each bounding box identified has an associated confidence score and annotation (displayName).

{
  "payload": [
    {
      "imageObjectDetection": {
        "boundingBox": {
          "normalizedVertices": [
            {
              "x": 0.034553755,
              "y": 0.015524037
            },
            {
              "x": 0.941527,
              "y": 0.9912563
            }
          ]
        },
        "score": 0.9997793
      },
      "displayName": "Salad"
    },
    {
      "imageObjectDetection": {
        "boundingBox": {
          "normalizedVertices": [
            {
              "x": 0.11737197,
              "y": 0.7098793
            },
            {
              "x": 0.510878,
              "y": 0.87987
            }
          ]
        },
        "score": 0.63219965
      },
      "displayName": "Tomato"
    }
  ]
}
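
To draw or crop a returned box, you typically need pixel coordinates. The following sketch scales a normalized vertex by the image dimensions, assuming the response uses the same convention described for the CSV, where (0,0) is the top left corner and (1,1) is the bottom right corner. The function name to_pixels is hypothetical, and you supply the width and height of your own image.

```shell
# to_pixels: hypothetical helper that scales a normalized vertex (x, y) to
# pixel coordinates for an image of the given width and height, rounding to
# the nearest pixel. Arguments: x y width height.
to_pixels() {
  awk -v x="$1" -v y="$2" -v w="$3" -v h="$4" \
    'BEGIN { printf "%d %d\n", int(x * w + 0.5), int(y * h + 0.5) }'
}
```

For a 640 by 480 image, to_pixels 0.5 0.25 640 480 prints 320 120. Apply it to both vertices of a box to get its top left and bottom right corners in pixels.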

Undeploy your model (optional)

Your model incurs charges while it is deployed. To avoid incurring this model-hosting cost you can undeploy your model.

Run the following command to undeploy your model:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json" \
  "https://automl.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/models/your-model-id:undeploy"

Cleanup

If you no longer need your custom model and the related dataset, you can delete them.

List models

You can list the models for your project, along with their identifiers, by using the following command:

curl -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
https://automl.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/models

Delete a model

You can delete a model by using the following command.

  • Replace your-model-id with the identifier for your model.
curl -X DELETE -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json" \
https://automl.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/models/your-model-id

List datasets

You can list the datasets for your project, along with their identifiers, by using the following command:

curl -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
https://automl.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/datasets

Delete a dataset

You can delete a dataset by using the following command.

  • Replace your-dataset-id with the identifier for your dataset.
curl -X DELETE -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json" \
https://automl.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/datasets/your-dataset-id