Detect labels in an image by using the command line

This page shows you how to send a single request for three feature detection and annotation types to the Vision API by using the REST interface and the curl command.

Vision API enables easy integration of Google vision recognition technologies into developer applications. You can send image data and desired feature types to the Vision API, which then returns a corresponding response based on the image attributes you are interested in. For more information about the feature types offered, see the List of all Vision API features.

Before you begin

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

Install the Google Cloud CLI.

To initialize the gcloud CLI, run the following command:

gcloud init

Create or select a Google Cloud project.

Create a Google Cloud project:
```
gcloud projects create PROJECT_ID
```
Replace PROJECT_ID with a name for the Google Cloud project you are creating.
Select the Google Cloud project that you created:
```
gcloud config set project PROJECT_ID
```
Replace PROJECT_ID with your Google Cloud project name.

Make sure that billing is enabled for your Google Cloud project.

Enable the Vision API:

gcloud services enable vision.googleapis.com

Grant roles to your user account. Run the following command once for each of the following IAM roles: roles/storage.objectViewer

gcloud projects add-iam-policy-binding PROJECT_ID --member="user:USER_IDENTIFIER" --role=ROLE

Replace PROJECT_ID with your project ID.
Replace USER_IDENTIFIER with the identifier for your user account. For example, myemail@example.com. The command already supplies the user: prefix.
Replace ROLE with each individual role.

Make an image annotation request

After you complete the Before you begin steps, you can use the Vision API to annotate an image file.

In this example you use curl to send a request to the Vision API using the following image:

Cloud Storage URI:

gs://cloud-samples-data/vision/using_curl/shanghai.jpeg

HTTPS URL:

https://console.cloud.google.com/storage/browser/cloud-samples-data/vision/using_curl/shanghai.jpeg

Shanghai street image. — *Image credit*: Steve Long on Unsplash.

Create the request JSON

The following request.json file demonstrates how to request three images:annotate features and limit the results in the response.

Create the JSON request file with the following text, and save it as a request.json plain text file in your working directory:

                        

request.json

{
  "requests": [
    {
      "image": {
        "source": {
          "imageUri": "gs://cloud-samples-data/vision/using_curl/shanghai.jpeg"
        }
      },
      "features": [
        {
          "type": "LABEL_DETECTION",
          "maxResults": 3
        },
        {
          "type": "OBJECT_LOCALIZATION",
          "maxResults": 1
        },
        {
          "type": "TEXT_DETECTION",
          "maxResults": 1,
          "model": "builtin/latest"
        }
      ]
    }
  ]
}

Field value details

image.source.imageUri - Indicates the location of the image. This field accepts a Cloud Storage URI (as in this request) or a publicly accessible HTTP/HTTPS URL. Alternatively, use image.source.gcsImageUri for a Cloud Storage URI only, or image.content to pass a base64-encoded string representation of an image.
features - A list of objects, each representing a specific feature type. You can request multiple feature types for a single image.

type - The enum value specifying a feature.
maxResults (optional) - A limiting value on the results returned.
model (optional) - If applicable, you can specify either builtin/stable (the default if unset) or builtin/latest to choose your model. Refer to the Release notes topic for a list of recently updated models.
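
The image.content alternative described in the field details can be sketched as follows. This is a minimal sketch, assuming you have a local image file. The function name make_inline_request and the default output file name request_inline.json are hypothetical and are not part of the Vision API or the gcloud CLI.

```shell
# Hypothetical helper: writes a request body that embeds a local image
# as base64 in image.content instead of referencing it by URI.
make_inline_request() {
  image_file="$1"                        # path to a local image (assumed to exist)
  out_file="${2:-request_inline.json}"   # hypothetical output file name

  # Encode the file and strip line wraps, because some base64 tools wrap output.
  image_b64="$(base64 < "$image_file" | tr -d '\n')"

  # Unquoted heredoc delimiter so that ${image_b64} is expanded.
  cat > "$out_file" <<EOF
{
  "requests": [
    {
      "image": { "content": "${image_b64}" },
      "features": [
        { "type": "LABEL_DETECTION", "maxResults": 3 }
      ]
    }
  ]
}
EOF
}

# Usage (the file name is only an example):
# make_inline_request ./my_image.jpg
```

The resulting file can be sent the same way as request.json. Embedding the image makes the request body larger, so a URI is usually more convenient for images that are already hosted.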

Send the request

You use curl and the body content from request.json to send the request to the Vision API. Enter the following on your command line, replacing PROJECT_ID with your project ID:

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "x-goog-user-project: PROJECT_ID" \
    -H "Content-Type: application/json; charset=utf-8" \
    https://vision.googleapis.com/v1/images:annotate -d @request.json
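
If you want to keep the response for later inspection, the same call can be wrapped so that it saves the body to a file and checks the HTTP status. The following is a sketch under stated assumptions, not an official script: the function name annotate and the default output file name response.json are hypothetical. The curl options -sS, -o, and -w are standard curl flags.

```shell
# Hypothetical wrapper around the curl call above: saves the response body
# to a file and fails if the HTTP status is not 200.
annotate() {
  request_file="$1"                     # for example, request.json
  project_id="$2"                       # your Google Cloud project ID
  response_file="${3:-response.json}"   # hypothetical output file name

  # -sS hides the progress meter but still prints errors.
  # -o writes the body to a file; -w prints only the status code to stdout.
  status="$(curl -sS -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "x-goog-user-project: ${project_id}" \
    -H "Content-Type: application/json; charset=utf-8" \
    -o "$response_file" \
    -w '%{http_code}' \
    https://vision.googleapis.com/v1/images:annotate \
    -d @"$request_file")"

  if [ "$status" != "200" ]; then
    echo "Request failed with HTTP status ${status}; see ${response_file}" >&2
    return 1
  fi
  echo "Response saved to ${response_file}"
}

# Usage:
# annotate request.json PROJECT_ID
```

A 200 status only means the batch call was accepted. An individual entry in responses can still carry an error field, so it is worth checking the saved file as well.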

Interpret the response

You should see a JSON response similar to the one below.

The request JSON body specified maxResults for each annotation type. Consequently, you will see the following in the response JSON:

three labelAnnotations results
one textAnnotations result (shortened for clarity)
one localizedObjectAnnotations result

Response

{
  "responses": [
    {
      "labelAnnotations": [
        {
          "mid": "/m/09g5pq",
          "description": "People",
          "score": 0.9504782,
          "topicality": 0.9504782
        },
        {
          "mid": "/m/01c8br",
          "description": "Street",
          "score": 0.8911568,
          "topicality": 0.8911568
        },
        {
          "mid": "/m/079bkr",
          "description": "Mode of transport",
          "score": 0.89089024,
          "topicality": 0.89089024
        }
      ],
      "textAnnotations": [
        {
          "locale": "zh",
          "description": "牛牛面馆\n",
          "boundingPoly": {
            "vertices": [
              {
                "x": 159,
                "y": 212
              },
              {
                "x": 947,
                "y": 212
              },
              {
                "x": 947,
                "y": 354
              },
              {
                "x": 159,
                "y": 354
              }
            ]
          }
        },
        ...
      ],
      "fullTextAnnotation": {
        "pages": [
          {
            ...
                "paragraphs": [
                  {
                    ...
                    "words": [
                      {
                        ...
                        "symbols": [
                          {
                            ...
                ],
                "blockType": "TEXT"
              }
            ]
          }
        ],
        "text": "牛牛面馆\n"
      },
      "localizedObjectAnnotations": [
        {
          "mid": "/m/01g317",
          "name": "Person",
          "score": 0.94413143,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.26063988,
                "y": 0.46869153
              },
              {
                "x": 0.40736017,
                "y": 0.46869153
              },
              {
                "x": 0.40736017,
                "y": 0.8957791
              },
              {
                "x": 0.26063988,
                "y": 0.8957791
              }
            ]
          }
        }
      ]
    }
  ]
}

Label detection results

description: "People", score: 0.950
description: "Street", score: 0.891
description: "Mode of transport", score: 0.890

Text detection results

text: 牛牛面馆\n
vertices: (x: 159, y: 212), (x: 947, y: 212), (x: 947, y: 354), (x: 159, y: 354)

Object detection results

name: "Person", score: 0.944
normalized vertices: (x: 0.260, y: 0.468), (x: 0.407, y: 0.468), (x: 0.407, y: 0.895), (x: 0.260, y: 0.895)
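
Unlike the text detection vertices, which are pixel coordinates, the object localization vertices are normalized to the range 0 to 1. To draw the bounding box, you multiply each x value by the image width and each y value by the image height. The following sketch shows that arithmetic with awk. The function name to_pixels is hypothetical, and the 1000 x 800 dimensions are assumed for illustration only; they are not the real size of the sample image.

```shell
# Hypothetical helper: scales one normalized vertex to pixel coordinates.
# Arguments: image width, image height, normalized x, normalized y.
to_pixels() {
  awk -v w="$1" -v h="$2" -v x="$3" -v y="$4" \
    'BEGIN { printf "%.0f %.0f\n", x * w, y * h }'
}

# Example with assumed dimensions of 1000 x 800 pixels and the
# top-left vertex of the "Person" object from the response above.
to_pixels 1000 800 0.26063988 0.46869153
```

With these assumed dimensions, the example prints 261 375. Substitute the actual width and height of your image to get usable coordinates.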

Congratulations! You've sent your first request to the Vision API.

Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, delete the Google Cloud project with the resources.

Optional: Revoke credentials from the gcloud CLI.

gcloud auth revoke

What's next

See a list of all feature types and their uses.
Get started with the Vision API in your language of choice by using a Vision API Client Library.
Use the How-to guides to learn more about specific features, see example annotations, and get annotations for an individual file or image.
Learn about batch image and file (PDF/TIFF/GIF) annotation.
Browse through a comprehensive list of client library code samples.