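Detect labels in an image by using the command line

This page shows how to send three feature detection and annotation requests to the Vision API by using the REST interface and the curl command.

The Vision API makes it easy to integrate Google's vision recognition technology into developer applications. You send image data and the desired feature types to the Vision API, and it returns a response based on the image attributes you asked about. For more information about the available feature types, see the list of all Vision API features.

Before you begin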

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

Install the Google Cloud CLI.

If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

To initialize the gcloud CLI, run the following command:

gcloud init

Create or select a Google Cloud project.

Create a Google Cloud project:
```
gcloud projects create PROJECT_ID
```
Replace PROJECT_ID with a name for the Google Cloud project you are creating.
Select the Google Cloud project that you created:
```
gcloud config set project PROJECT_ID
```
Replace PROJECT_ID with your Google Cloud project name.

Verify that billing is enabled for your Google Cloud project.

Enable the Vision API:

gcloud services enable vision.googleapis.com

Grant roles to your user account. Run the following command once for each of the following IAM roles: roles/storage.objectViewer

gcloud projects add-iam-policy-binding PROJECT_ID --member="user:USER_IDENTIFIER" --role=ROLE

Replace the following:

PROJECT_ID: your project ID.
USER_IDENTIFIER: the identifier for your user account—for example, myemail@example.com.
ROLE: the IAM role that you grant to your user account.

Make an image annotation request

After you complete the Before you begin steps, you can use the Vision API to annotate an image file.

In this example, you use curl to send a request to the Vision API for the following image:

Cloud Storage URI:

gs://cloud-samples-data/vision/using_curl/shanghai.jpeg

HTTPS URL:

https://console.cloud.google.com/storage/browser/cloud-samples-data/vision/using_curl/shanghai.jpeg

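Image of a Shanghai street. Image credit: Steve Long, via Unsplash.

Create the request JSON

The following request.json file shows how to request three images:annotate features and how to limit the results in the response.

Create a JSON request file with the following text, and save it as a plain text file named request.json in your working directory: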

request.json

{
  "requests": [
    {
      "image": {
        "source": {
          "imageUri": "gs://cloud-samples-data/vision/using_curl/shanghai.jpeg"
        }
      },
      "features": [
        {
          "type": "LABEL_DETECTION",
          "maxResults": 3
        },
        {
          "type": "OBJECT_LOCALIZATION",
          "maxResults": 1
        },
        {
          "type": "TEXT_DETECTION",
          "maxResults": 1,
          "model": "builtin/latest"
        }
      ]
    }
  ]
}

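Field value details

image.source.imageUri - the location of the image. This request uses a Cloud Storage URI (gs://), which indicates that the image is stored in a Cloud Storage bucket. You can change the value to a publicly accessible HTTP or HTTPS URL, or replace image.source with image.content to pass a base64-encoded string representation of the image.
features - an array of objects, each representing a specific feature type. You can request multiple feature types for a single image.

type - an enum value that specifies the feature.
maxResults (optional) - the maximum number of results to return.
model (optional) - where applicable, selects the model: builtin/stable (the default if not set) or builtin/latest. For a list of recently updated models, see the release notes.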
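The fields described above can also be assembled programmatically. The following is a minimal sketch, not part of the original page: `build_annotate_request` is a hypothetical helper name, and the code only builds the request body, switching between `image.source.imageUri` and a base64-encoded `image.content`.

```python
import base64
import json


def build_annotate_request(features, image_uri=None, image_bytes=None):
    """Build an images:annotate request body for a single image.

    Pass exactly one of image_uri (a gs:// or http(s):// URI) or
    image_bytes (the raw contents of an image file).
    """
    if (image_uri is None) == (image_bytes is None):
        raise ValueError("pass exactly one of image_uri or image_bytes")
    if image_uri is not None:
        image = {"source": {"imageUri": image_uri}}
    else:
        # JSON cannot carry raw bytes, so the image is base64-encoded.
        image = {"content": base64.b64encode(image_bytes).decode("ascii")}
    return {"requests": [{"image": image, "features": features}]}


# The same three features as in request.json.
features = [
    {"type": "LABEL_DETECTION", "maxResults": 3},
    {"type": "OBJECT_LOCALIZATION", "maxResults": 1},
    {"type": "TEXT_DETECTION", "maxResults": 1, "model": "builtin/latest"},
]
body = build_annotate_request(
    features,
    image_uri="gs://cloud-samples-data/vision/using_curl/shanghai.jpeg",
)
print(json.dumps(body, indent=2))
```

Writing the printed output to request.json should give a file equivalent to the one shown above. For a local image, pass `image_bytes=open(path, "rb").read()` instead of `image_uri`.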

Send the request

Use curl and the body content from request.json to send the request to the Vision API. Enter the following on your command line:

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "x-goog-user-project: PROJECT_ID" \
    -H "Content-Type: application/json; charset=utf-8" \
    https://vision.googleapis.com/v1/images:annotate -d @request.json
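For readers who prefer a script to curl, the same request can be sketched with the Python standard library. This is an illustration only, not the method this page prescribes: `make_annotate_request` and `send` are hypothetical helper names, and the access token is assumed to come from `gcloud auth print-access-token`, as in the curl command.

```python
import json
import urllib.request

ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def make_annotate_request(body, access_token, project_id):
    """Build, but do not send, the POST request that the curl command issues."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "x-goog-user-project": project_id,
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


# Example (needs network access and gcloud credentials):
#   import subprocess
#   token = subprocess.run(
#       ["gcloud", "auth", "print-access-token"],
#       capture_output=True, text=True, check=True,
#   ).stdout.strip()
#   with open("request.json") as f:
#       body = json.load(f)
#   print(send(make_annotate_request(body, token, "PROJECT_ID")))
```

Keeping request construction separate from sending makes it possible to inspect the headers and body without touching the network.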

Interpret the response

You should see a JSON response similar to the following.

The request JSON body specified maxResults for each annotation type. Consequently, the response JSON contains the following:

3 labelAnnotations results
1 textAnnotations result (shortened for clarity)
1 localizedObjectAnnotations result

Response

{
  "responses": [
    {
      "labelAnnotations": [
        {
          "mid": "/m/09g5pq",
          "description": "People",
          "score": 0.9504782,
          "topicality": 0.9504782
        },
        {
          "mid": "/m/01c8br",
          "description": "Street",
          "score": 0.8911568,
          "topicality": 0.8911568
        },
        {
          "mid": "/m/079bkr",
          "description": "Mode of transport",
          "score": 0.89089024,
          "topicality": 0.89089024
        }
      ],
      "textAnnotations": [
        {
          "locale": "zh",
          "description": "牛牛面馆\n",
          "boundingPoly": {
            "vertices": [
              {
                "x": 159,
                "y": 212
              },
              {
                "x": 947,
                "y": 212
              },
              {
                "x": 947,
                "y": 354
              },
              {
                "x": 159,
                "y": 354
              }
            ]
          }
        },
        ...
      ],
      "fullTextAnnotation": {
        "pages": [
          {
            ...
                "paragraphs": [
                  {
                    ...
                    "words": [
                      {
                        ...
                        "symbols": [
                          {
                            ...
                ],
                "blockType": "TEXT"
              }
            ]
          }
        ],
        "text": "牛牛面馆\n"
      },
      "localizedObjectAnnotations": [
        {
          "mid": "/m/01g317",
          "name": "Person",
          "score": 0.94413143,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.26063988,
                "y": 0.46869153
              },
              {
                "x": 0.40736017,
                "y": 0.46869153
              },
              {
                "x": 0.40736017,
                "y": 0.8957791
              },
              {
                "x": 0.26063988,
                "y": 0.8957791
              }
            ]
          }
        }
      ]
    }
  ]
}
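As a rough illustration of how the response above might be read in code, the following sketch pulls out the label, text, and object fields. `summarize` is a hypothetical helper, and `SAMPLE` is a trimmed copy of the response with the bounding polygons, `fullTextAnnotation`, and the elided entries left out.

```python
# Trimmed copy of the response above (values copied from this page).
SAMPLE = {
    "responses": [
        {
            "labelAnnotations": [
                {"mid": "/m/09g5pq", "description": "People", "score": 0.9504782},
                {"mid": "/m/01c8br", "description": "Street", "score": 0.8911568},
                {"mid": "/m/079bkr", "description": "Mode of transport", "score": 0.89089024},
            ],
            "textAnnotations": [
                {"locale": "zh", "description": "牛牛面馆\n"},
            ],
            "localizedObjectAnnotations": [
                {"mid": "/m/01g317", "name": "Person", "score": 0.94413143},
            ],
        }
    ]
}


def summarize(response):
    """Return (labels, texts, objects) for one entry of `responses`.

    Each annotation list is treated as optional, on the assumption that
    a feature with no results may be missing from the response.
    """
    labels = [(a["description"], a["score"])
              for a in response.get("labelAnnotations", [])]
    texts = [a["description"] for a in response.get("textAnnotations", [])]
    objects = [(a["name"], a["score"])
               for a in response.get("localizedObjectAnnotations", [])]
    return labels, texts, objects


# `responses` has one entry per image in the request.
for response in SAMPLE["responses"]:
    labels, texts, objects = summarize(response)
    for description, score in labels:
        print(f"label: {description} ({score})")
    for text in texts:
        print(f"text: {text!r}")
    for name, score in objects:
        print(f"object: {name} ({score})")
```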

Label detection results

Description: "People", Score: 0.950
Description: "Street", Score: 0.891
Description: "Mode of transport", Score: 0.890

Text detection results

Text: 牛牛面馆\n
Vertices: (x: 159, y: 212), (x: 947, y: 212), (x: 947, y: 354), (x: 159, y: 354)

Object detection results

Name: "Person", Score: 0.944
Normalized vertices: (x: 0.260, y: 0.468), (x: 0.407, y: 0.468), (x: 0.407, y: 0.895), (x: 0.260, y: 0.895)
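Text detection reports vertices in pixels, whereas object localization reports normalized vertices, that is, fractions of the image width and height. The sketch below converts the latter to pixel coordinates. `to_pixels` is a hypothetical helper, and the 1000 x 500 image size is an assumption made for illustration, because this page does not state the dimensions of the sample image.

```python
def to_pixels(normalized_vertices, width, height):
    """Scale normalized [0, 1] vertices to integer pixel coordinates."""
    return [
        # Assumption: a coordinate missing from the JSON is treated as 0.
        (round(v.get("x", 0.0) * width), round(v.get("y", 0.0) * height))
        for v in normalized_vertices
    ]


# Normalized vertices of the "Person" object in the response.
person_box = [
    {"x": 0.26063988, "y": 0.46869153},
    {"x": 0.40736017, "y": 0.46869153},
    {"x": 0.40736017, "y": 0.8957791},
    {"x": 0.26063988, "y": 0.8957791},
]

# Hypothetical image size; substitute the real width and height.
print(to_pixels(person_box, width=1000, height=500))
```

With the real image dimensions, the resulting corner points can be passed to a drawing library to outline the detected object.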

Congratulations! You've sent your first request to the Vision API.

Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, delete the Google Cloud project together with its resources.

Optional: Revoke credentials from the gcloud CLI.

gcloud auth revoke

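What's next

See the list of all feature types and their uses.
Get started with the Vision API by using a Vision API client library in the programming language of your choice.
Use the how-to guides to learn more about specific feature types and to see samples for annotating individual files or images.
Learn how to annotate images and files (PDF / TIFF / GIF) in batch.
Browse the comprehensive list of client library code samples.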
