Offline batch image annotation

The Vision API can run detection services offline (asynchronously) and annotate a large batch of image files using any Vision feature type. For example, you can specify one or more Vision API features (such as TEXT_DETECTION, LABEL_DETECTION, and LANDMARK_DETECTION) for a single batch of images.

Output from an offline batch request is written as JSON files to the Cloud Storage bucket you specify.

Online (synchronous) requests - An online annotation request (images:annotate or files:annotate) immediately returns inline annotations to the user. Online annotation requests limit the number of files you can annotate in one request. An images:annotate request accepts only a small number of images (at most 16). A files:annotate request accepts only one file, and only a small number of pages (at most 5) in that file can be annotated.
Offline (asynchronous) requests - An offline annotation request (images:asyncBatchAnnotate or files:asyncBatchAnnotate) starts a long-running operation (LRO) and does not immediately return a response to the caller. When the LRO completes, the annotations are stored as files in the Cloud Storage bucket you specify. An images:asyncBatchAnnotate request accepts up to 2,000 images per request. A files:asyncBatchAnnotate request accepts larger batches of files and can annotate more pages per file (up to 2,000) in one pass than an online request can.

Limitations

The Vision API accepts up to 2,000 image files. A larger batch of image files returns an error.
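One way to stay under this limit is to split a long list of image URIs into groups of at most 2,000 before building requests. The following is a minimal sketch in plain Python. The helper name `chunk_image_uris` and the `example-bucket` URIs are hypothetical and used only for illustration; the sketch does not call the Vision API.

```python
from typing import Iterator, List, Sequence

# Limit stated in the "Limitations" section above.
MAX_IMAGES_PER_REQUEST = 2000


def chunk_image_uris(
    uris: Sequence[str], limit: int = MAX_IMAGES_PER_REQUEST
) -> Iterator[List[str]]:
    """Yield consecutive slices of `uris`, each holding at most `limit` items."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    for start in range(0, len(uris), limit):
        yield list(uris[start : start + limit])


# Hypothetical URIs, for illustration only.
uris = [f"gs://example-bucket/images/{i}.jpg" for i in range(4500)]
batches = list(chunk_image_uris(uris))
print([len(batch) for batch in batches])  # prints [2000, 2000, 500]
```

Each resulting group could then be sent as its own offline batch request.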

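Currently supported feature types

Feature type
`CROP_HINTS`	Determine suggested vertices for a crop region on an image.
`DOCUMENT_TEXT_DETECTION`	Perform OCR on dense text images, such as documents (PDF/TIFF), and on images with handwriting. `TEXT_DETECTION` can be used for sparse text images. Takes precedence when both `DOCUMENT_TEXT_DETECTION` and `TEXT_DETECTION` are present.
`FACE_DETECTION`	Detect faces within the image.
`IMAGE_PROPERTIES`	Compute a set of image properties, such as the image's dominant colors.
`LABEL_DETECTION`	Add labels based on image content.
`LANDMARK_DETECTION`	Detect geographic landmarks within the image.
`LOGO_DETECTION`	Detect company logos within the image.
`OBJECT_LOCALIZATION`	Detect and extract multiple objects in an image.
`SAFE_SEARCH_DETECTION`	Run SafeSearch to detect potentially unsafe or undesirable content.
`TEXT_DETECTION`	Perform optical character recognition (OCR) on text within the image. Text detection is optimized for areas of sparse text within a larger image. If the image is a document (PDF/TIFF), has dense text, or contains handwriting, use `DOCUMENT_TEXT_DETECTION` instead.
`WEB_DETECTION`	Detect topical entities such as news, events, or celebrities within the image, and find similar images on the web using the power of Google Image Search.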

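Sample code

The following code samples run the offline annotation service on a batch of image files located in Cloud Storage.

Note: In the following code samples, each element of the requests (requests_element/requestsElement) corresponds to a single image. To annotate more than one image, create a request element for each image and add it to the array of requests (requests).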

Java

Before trying this sample, follow the Java setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API Java reference documentation.

import com.google.cloud.vision.v1.AnnotateImageRequest;
import com.google.cloud.vision.v1.AsyncBatchAnnotateImagesRequest;
import com.google.cloud.vision.v1.AsyncBatchAnnotateImagesResponse;
import com.google.cloud.vision.v1.Feature;
import com.google.cloud.vision.v1.GcsDestination;
import com.google.cloud.vision.v1.Image;
import com.google.cloud.vision.v1.ImageAnnotatorClient;
import com.google.cloud.vision.v1.ImageSource;
import com.google.cloud.vision.v1.OutputConfig;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

public class AsyncBatchAnnotateImages {

  public static void asyncBatchAnnotateImages()
      throws InterruptedException, ExecutionException, IOException {
    String inputImageUri = "gs://cloud-samples-data/vision/label/wakeupcat.jpg";
    String outputUri = "gs://YOUR_BUCKET_ID/path/to/save/results/";
    asyncBatchAnnotateImages(inputImageUri, outputUri);
  }

  public static void asyncBatchAnnotateImages(String inputImageUri, String outputUri)
      throws IOException, ExecutionException, InterruptedException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (ImageAnnotatorClient imageAnnotatorClient = ImageAnnotatorClient.create()) {

      // You can send multiple images to be annotated; this sample demonstrates how to do this with
      // one image. If you want to use multiple images, you have to create an `AnnotateImageRequest`
      // object for each image that you want annotated.
      // First specify where the vision api can find the image
      ImageSource source = ImageSource.newBuilder().setImageUri(inputImageUri).build();
      Image image = Image.newBuilder().setSource(source).build();

      // Set the type of annotation you want to perform on the image
      // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.Feature.Type
      Feature feature = Feature.newBuilder().setType(Feature.Type.LABEL_DETECTION).build();

      // Build the request object for that one image. Note: for additional images you have to create
      // additional `AnnotateImageRequest` objects and store them in a list to be used below.
      AnnotateImageRequest imageRequest =
          AnnotateImageRequest.newBuilder().setImage(image).addFeatures(feature).build();

      // Set where to store the results for the images that will be annotated.
      GcsDestination gcsDestination = GcsDestination.newBuilder().setUri(outputUri).build();
      OutputConfig outputConfig =
          OutputConfig.newBuilder()
              .setGcsDestination(gcsDestination)
              .setBatchSize(2) // The max number of responses to output in each JSON file
              .build();

      // Add each `AnnotateImageRequest` object to the batch request and add the output config.
      AsyncBatchAnnotateImagesRequest request =
          AsyncBatchAnnotateImagesRequest.newBuilder()
              .addRequests(imageRequest)
              .setOutputConfig(outputConfig)
              .build();

      // Make the asynchronous batch request.
      AsyncBatchAnnotateImagesResponse response =
          imageAnnotatorClient.asyncBatchAnnotateImagesAsync(request).get();

      // The output is written to GCS with the provided output_uri as prefix
      String gcsOutputUri = response.getOutputConfig().getGcsDestination().getUri();
      System.out.format("Output written to GCS with prefix: %s%n", gcsOutputUri);
    }
  }
}

Node.js

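Before trying this sample, follow the Node.js setup instructions in the Vision Quickstart Using Client Libraries. For more information, see the Vision Node.js API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.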

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const inputImageUri = 'gs://cloud-samples-data/vision/label/wakeupcat.jpg';
// const outputUri = 'gs://YOUR_BUCKET_ID/path/to/save/results/';

// Imports the Google Cloud client libraries
const {ImageAnnotatorClient} = require('@google-cloud/vision').v1;

// Instantiates a client
const client = new ImageAnnotatorClient();

// You can send multiple images to be annotated, this sample demonstrates how to do this with
// one image. If you want to use multiple images, you have to create a request object for each image that you want annotated.
async function asyncBatchAnnotateImages() {
  // Set the type of annotation you want to perform on the image
  // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.Feature.Type
  const features = [{type: 'LABEL_DETECTION'}];

  // Build the image request object for that one image. Note: for additional images you have to create
  // additional image request objects and store them in a list to be used below.
  const imageRequest = {
    image: {
      source: {
        imageUri: inputImageUri,
      },
    },
    features: features,
  };

  // Set where to store the results for the images that will be annotated.
  const outputConfig = {
    gcsDestination: {
      uri: outputUri,
    },
    batchSize: 2, // The max number of responses to output in each JSON file
  };

  // Add each image request object to the batch request and add the output config.
  const request = {
    requests: [
      imageRequest, // add additional request objects here
    ],
    outputConfig,
  };

  // Make the asynchronous batch request.
  const [operation] = await client.asyncBatchAnnotateImages(request);

  // Wait for the operation to complete
  const [filesResponse] = await operation.promise();

  // The output is written to GCS with the provided output_uri as prefix
  const destinationUri = filesResponse.outputConfig.gcsDestination.uri;
  console.log(`Output written to GCS with prefix: ${destinationUri}`);
}

asyncBatchAnnotateImages();

Python

Before trying this sample, follow the Python setup instructions in the Vision Quickstart Using Client Libraries. For more information, see the Vision Python API reference documentation.


from google.cloud import vision_v1

def sample_async_batch_annotate_images(
    input_image_uri="gs://cloud-samples-data/vision/label/wakeupcat.jpg",
    output_uri="gs://your-bucket/prefix/",
):
    """Perform async batch image annotation."""
    client = vision_v1.ImageAnnotatorClient()

    source = {"image_uri": input_image_uri}
    image = {"source": source}
    features = [
        {"type_": vision_v1.Feature.Type.LABEL_DETECTION},
        {"type_": vision_v1.Feature.Type.IMAGE_PROPERTIES},
    ]

    # Each requests element corresponds to a single image.  To annotate more
    # images, create a request element for each image and add it to
    # the array of requests
    requests = [{"image": image, "features": features}]
    gcs_destination = {"uri": output_uri}

    # The max number of responses to output in each JSON file
    batch_size = 2
    output_config = {"gcs_destination": gcs_destination, "batch_size": batch_size}

    operation = client.async_batch_annotate_images(
        requests=requests, output_config=output_config
    )

    print("Waiting for operation to complete...")
    response = operation.result(90)

    # The output is written to GCS with the provided output_uri as prefix
    gcs_output_uri = response.output_config.gcs_destination.uri
    print(f"Output written to GCS with prefix: {gcs_output_uri}")
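
The Python sample above annotates a single image. As the note before the samples says, annotating several images means creating one request element per image. A minimal sketch of that step follows, assuming the same dictionary shape as the sample. The helper `build_requests` and the `example-bucket` URIs are hypothetical. To keep the sketch dependency-free it uses plain strings for the feature types; in real use you would pass `vision_v1.Feature.Type` values as the sample does.

```python
from typing import Any, Dict, List, Sequence


def build_requests(
    image_uris: Sequence[str], feature_types: Sequence[Any]
) -> List[Dict[str, Any]]:
    """Build one request element per image URI, each with its own features list."""
    return [
        {
            "image": {"source": {"image_uri": uri}},
            # A fresh list per request, so elements do not share mutable state.
            "features": [{"type_": feature_type} for feature_type in feature_types],
        }
        for uri in image_uris
    ]


# Hypothetical URIs and placeholder feature names, for illustration only.
requests = build_requests(
    ["gs://example-bucket/a.jpg", "gs://example-bucket/b.jpg"],
    ["LABEL_DETECTION", "IMAGE_PROPERTIES"],
)
print(len(requests))  # prints 2
```

The resulting list has the same structure as the `requests` argument passed to `client.async_batch_annotate_images` in the sample.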

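Response

A successful request writes response JSON files to the Cloud Storage bucket you specified in the code sample. The number of responses per JSON file is determined by batch_size in the code sample (setBatchSize in the Java sample, batchSize in the Node.js sample).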
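To make the effect of the batch size concrete, the sketch below estimates how many output files a batch produces and what they might be called. The `output-<first>-to-<last>.json` pattern is inferred from the example file names on this page and is an assumption, not a documented guarantee. The helper `expected_output_files` is hypothetical.

```python
from typing import List


def expected_output_files(num_images: int, batch_size: int) -> List[str]:
    """List the output file names expected for `num_images` responses.

    Assumes the naming pattern seen in this page's example output.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    names = []
    for first in range(1, num_images + 1, batch_size):
        last = min(first + batch_size - 1, num_images)
        names.append(f"output-{first}-to-{last}.json")
    return names


print(expected_output_files(3, 2))
# prints ['output-1-to-2.json', 'output-3-to-3.json']
```

With three images and a batch size of 2, this matches the two example files on this page.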

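Depending on which features you request for an image, the returned response is similar to a regular Vision API feature detection response.

The following responses show LABEL_DETECTION and TEXT_DETECTION annotations for image1.png, IMAGE_PROPERTIES annotations for image2.jpg, and OBJECT_LOCALIZATION annotations for image3.jpg.

The response also includes a context field showing the file's URI.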

`offline_batch_output/output-1-to-2.json`

{
  "responses": [
    {
      "labelAnnotations": [
        {
          "mid": "/m/07s6nbt",
          "description": "Text",
          "score": 0.93413997,
          "topicality": 0.93413997
        },
        {
          "mid": "/m/0dwx7",
          "description": "Logo",
          "score": 0.8733531,
          "topicality": 0.8733531
        },
        ...
        {
          "mid": "/m/03bxgrp",
          "description": "Company",
          "score": 0.5682425,
          "topicality": 0.5682425
        }
      ],
      "textAnnotations": [
        {
          "locale": "en",
          "description": "Google\n",
          "boundingPoly": {
            "vertices": [
              {
                "x": 72,
                "y": 40
              },
              {
                "x": 613,
                "y": 40
              },
              {
                "x": 613,
                "y": 233
              },
              {
                "x": 72,
                "y": 233
              }
            ]
          }
        },
        ...
                ],
                "blockType": "TEXT"
              }
            ]
          }
        ],
        "text": "Google\n"
      },
      "context": {
        "uri": "gs://cloud-samples-data/vision/document_understanding/image1.png"
      }
    },
    {
      "imagePropertiesAnnotation": {
        "dominantColors": {
          "colors": [
            {
              "color": {
                "red": 229,
                "green": 230,
                "blue": 238
              },
              "score": 0.2744754,
              "pixelFraction": 0.075339235
            },
            ...
            {
              "color": {
                "red": 86,
                "green": 87,
                "blue": 95
              },
              "score": 0.025770646,
              "pixelFraction": 0.13109145
            }
          ]
        }
      },
      "cropHintsAnnotation": {
        "cropHints": [
          {
            "boundingPoly": {
              "vertices": [
                {},
                {
                  "x": 1599
                },
                {
                  "x": 1599,
                  "y": 1199
                },
                {
                  "y": 1199
                }
              ]
            },
            "confidence": 0.79999995,
            "importanceFraction": 1
          }
        ]
      },
      "context": {
        "uri": "gs://cloud-samples-data/vision/document_understanding/image2.jpg"
      }
    }
  ]
}

`offline_batch_output/output-3-to-3.json`

{
  "responses": [
    {
      "context": {
        "uri": "gs://cloud-samples-data/vision/document_understanding/image3.jpg"
      },
      "localizedObjectAnnotations": [
        {
          "mid": "/m/0bt9lr",
          "name": "Dog",
          "score": 0.9669734,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.6035543,
                "y": 0.1357359
              },
              {
                "x": 0.98546547,
                "y": 0.1357359
              },
              {
                "x": 0.98546547,
                "y": 0.98426414
              },
              {
                "x": 0.6035543,
                "y": 0.98426414
              }
            ]
          }
        },
        ...
        {
          "mid": "/m/0jbk",
          "name": "Animal",
          "score": 0.58003056,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.014534635,
                "y": 0.1357359
              },
              {
                "x": 0.37197515,
                "y": 0.1357359
              },
              {
                "x": 0.37197515,
                "y": 0.98426414
              },
              {
                "x": 0.014534635,
                "y": 0.98426414
              }
            ]
          }
        }
      ]
    }
  ]
}
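
Because each response carries a context field with the image URI, an output file can be mapped back to its input images after download. The following is a minimal sketch that assumes the file content is already available as a JSON string; fetching it from Cloud Storage is not shown. The helper `summarize_responses` is hypothetical, and the sample input is a trimmed-down version of the image3.jpg response above.

```python
import json
from typing import Dict, List


def summarize_responses(output_json: str) -> Dict[str, List[str]]:
    """Map each image URI to the annotation fields present in its response."""
    summary: Dict[str, List[str]] = {}
    for response in json.loads(output_json).get("responses", []):
        uri = response["context"]["uri"]
        # Every key other than "context" is treated as an annotation field.
        summary[uri] = sorted(key for key in response if key != "context")
    return summary


# Trimmed-down sample based on the image3.jpg response shown above.
sample = json.dumps(
    {
        "responses": [
            {
                "context": {
                    "uri": "gs://cloud-samples-data/vision/document_understanding/image3.jpg"
                },
                "localizedObjectAnnotations": [
                    {"mid": "/m/0bt9lr", "name": "Dog", "score": 0.9669734}
                ],
            }
        ]
    }
)

for uri, fields in summarize_responses(sample).items():
    print(uri, fields)
```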