コマンドラインを使用して画像内のラベルを検出する

このページでは、REST インターフェースと curl コマンドを使用して、Vision API に 3 つの特徴検出リクエストとアノテーションリクエストを送信する方法について説明します。

Vision API を使用すると、Google の視覚認識技術をデベロッパーのアプリケーションに簡単に統合できます。Vision API に画像データと目的特徴タイプを送信すると、目的の画像属性に基づく対応するレスポンスが返されます。利用可能な特徴タイプの詳細については、Vision API のすべての機能の一覧をご覧ください。

始める前に

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

Install the Google Cloud CLI.

To initialize the gcloud CLI, run the following command:

gcloud init

Create or select a Google Cloud project.

Create a Google Cloud project:
```
gcloud projects create PROJECT_ID
```
Replace PROJECT_ID with a name for the Google Cloud project you are creating.
Select the Google Cloud project that you created:
```
gcloud config set project PROJECT_ID
```
Replace PROJECT_ID with your Google Cloud project name.

Make sure that billing is enabled for your Google Cloud project.

Enable the Vision API:

gcloud services enable vision.googleapis.com

Grant roles to your user account. Run the following command once for each of the following IAM roles: roles/storage.objectViewer

gcloud projects add-iam-policy-binding PROJECT_ID --member="user:USER_IDENTIFIER" --role=ROLE

Replace PROJECT_ID with your project ID.
Replace USER_IDENTIFIER with the identifier for your user account. For example, user:myemail@example.com.
Replace ROLE with each individual role.

Install the Google Cloud CLI.

To initialize the gcloud CLI, run the following command:

gcloud init

Create or select a Google Cloud project.

Create a Google Cloud project:
```
gcloud projects create PROJECT_ID
```
Replace PROJECT_ID with a name for the Google Cloud project you are creating.
Select the Google Cloud project that you created:
```
gcloud config set project PROJECT_ID
```
Replace PROJECT_ID with your Google Cloud project name.

Make sure that billing is enabled for your Google Cloud project.

Enable the Vision API:

gcloud services enable vision.googleapis.com

Grant roles to your user account. Run the following command once for each of the following IAM roles: roles/storage.objectViewer

gcloud projects add-iam-policy-binding PROJECT_ID --member="user:USER_IDENTIFIER" --role=ROLE

Replace PROJECT_ID with your project ID.
Replace USER_IDENTIFIER with the identifier for your user account. For example, user:myemail@example.com.
Replace ROLE with each individual role.

画像アノテーションリクエストを作成する

始める前にの手順を完了すると、Vision API を使用して画像ファイルにアノテーションを付けられるようになります。

この例では、次の画像で curl を使用して Vision API にリクエストを送信します。

Cloud Storage URI:

gs://cloud-samples-data/vision/using_curl/shanghai.jpeg

HTTPS URL:

https://console.cloud.google.com/storage/browser/cloud-samples-data/vision/using_curl/shanghai.jpeg

上海の街の画像 — 画像クレジット: Steve Long、Unsplash より抜粋

JSON リクエストを作成する

次の request.json ファイルでは、3 つの images:annotate 機能をリクエストする方法と、レスポンスの結果を制限する方法について説明します。

次のテキストを含む JSON リクエストファイルを作成し、作業ディレクトリに request.json 書式なしテキストファイルとして保存します。

request.json

{
  "requests": [
    {
      "image": {
        "source": {
          "imageUri": "gs://cloud-samples-data/vision/using_curl/shanghai.jpeg"
        }
      },
      "features": [
        {
          "type": "LABEL_DETECTION",
          "maxResults": 3
        },
        {
          "type": "OBJECT_LOCALIZATION",
          "maxResults": 1
        },
        {
          "type": "TEXT_DETECTION",
          "maxResults": 1,
          "model": "builtin/latest"
        }
      ]
    }
  ]
}

フィールド値の詳細

image.source.gcsImageUri - Cloud Storage バケットに保存されている画像を示します。このリクエストを公開されている URI の image.source.imageUri に変更するか、base64 でエンコードされた画像の文字列表現を渡す image.content に変更します。
features - 特定の特徴タイプを表すオブジェクトです。1 つの画像に複数の機能タイプをリクエストできます。

type - 機能を指定する列挙値です。
maxResults（省略可） - 返される結果の上限値です。
model（省略可） - 該当する場合、builtin/stable（設定されていない場合はデフォルト）または builtin/latest を指定して、モデルを選択します。最近更新されたモデルの一覧については、リリースノートのトピックをご覧ください。

リクエストを送信する

request.json の curl と本文のコンテンツを使用して、リクエストを Vision API に送信します。コマンドラインで次のように入力します。

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "x-goog-user-project: PROJECT_ID" \
    -H "Content-Type: application/json; charset=utf-8" \
    https://vision.googleapis.com/v1/images:annotate -d @request.json

レスポンスを解釈する

以下のような JSON レスポンスが表示されます。

各アノテーション型に maxResults が指定された JSON 本文のリクエスト。したがって、レスポンスの JSON には次のようになります。

3 つの labelAnnotations の結果
1 つの textAnnotations の結果（わかりやすくするため短縮しています）
1 つの localizedObjectAnnotations の結果

レスポンス

{
  "responses": [
    {
      "labelAnnotations": [
        {
          "mid": "/m/09g5pq",
          "description": "People",
          "score": 0.9504782,
          "topicality": 0.9504782
        },
        {
          "mid": "/m/01c8br",
          "description": "Street",
          "score": 0.8911568,
          "topicality": 0.8911568
        },
        {
          "mid": "/m/079bkr",
          "description": "Mode of transport",
          "score": 0.89089024,
          "topicality": 0.89089024
        }
      ],
      "textAnnotations": [
        {
          "locale": "zh",
          "description": "牛牛面馆\n",
          "boundingPoly": {
            "vertices": [
              {
                "x": 159,
                "y": 212
              },
              {
                "x": 947,
                "y": 212
              },
              {
                "x": 947,
                "y": 354
              },
              {
                "x": 159,
                "y": 354
              }
            ]
          }
        },
        ...
      ],
      "fullTextAnnotation": {
        "pages": [
          {
            ...
                "paragraphs": [
                  {
                    ...
                    "words": [
                      {
                        ...
                        "symbols": [
                          {
                            ...
                ],
                "blockType": "TEXT"
              }
            ]
          }
        ],
        "text": "牛牛面馆\n"
      },
      "localizedObjectAnnotations": [
        {
          "mid": "/m/01g317",
          "name": "Person",
          "score": 0.94413143,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.26063988,
                "y": 0.46869153
              },
              {
                "x": 0.40736017,
                "y": 0.46869153
              },
              {
                "x": 0.40736017,
                "y": 0.8957791
              },
              {
                "x": 0.26063988,
                "y": 0.8957791
              }
            ]
          }
        }
      ]
    }
  ]
}

ラベル検出の結果

説明: 「人物」、スコア: 0.950
説明: 「通り」、スコア: 0.891
説明: 「交通手段」、スコア: 0.890

テキスト検出の結果

テキスト: 牛牛面馆\ n
頂点: （x: 159、y: 212）、（x: 947、y: 212）、（x: 947、y: 354）、（x: 159、y: 354）

オブジェクト検出の結果

名前: 「人物」、スコア: 0.944
正規化された頂点: （x: 0.260、y: 0.468）、（x: 0.407、y: 0.468）、（x: 0.407、y: 0.895）、（x: 0.260、y: 0.895）

これで完了です。Vision API への最初のリクエストを送信しました。

クリーンアップ

このページで使用したリソースに対して Google Cloud アカウントで課金されないようにするには、Google Cloud プロジェクトとそのリソースを削除します。

Optional: Revoke credentials from the gcloud CLI.

gcloud auth revoke

次のステップ

すべての機能タイプとその用途のリストをご覧ください。
お使いのプログラミング言語に対応した Vision API クライアントライブラリを使用して、Vision API の使用を開始しましょう。
入門ガイドで機能タイプの詳細や、個々のファイルまたは画像のアノテーションやサンプルをご覧ください。
一括処理で画像やファイル（PDF / TIFF / GIF）にアノテーションを設定する方法をご確認ください。
クライアントライブラリのコードサンプルの全体的なリストをご覧ください。

コマンドラインを使用して画像内のラベルを検出する

始める前に

画像アノテーション リクエストを作成する

JSON リクエストを作成する

リクエストを送信する

レスポンスを解釈する

ラベル検出の結果

テキスト検出の結果

オブジェクト検出の結果

クリーンアップ

次のステップ

画像アノテーションリクエストを作成する