Detect text in files (PDF/TIFF)

Note:

If you use this API in a mobile device app, try Firebase Machine Learning and ML Kit. They provide platform-specific SDKs (for Android and iOS) for using Cloud Vision services, as well as on-device ML Vision APIs and on-device inference with custom ML models.
The Vision API can also send online (synchronous) requests for all features and annotate a small batch of PDF, TIFF, or GIF files. A synchronous request detects and annotates the batch of files in a single call. For details, see the guide on annotating a small batch of files online.

The Vision API can detect and transcribe text from PDF and TIFF files stored in Cloud Storage.

Document text detection from PDF and TIFF must be requested with the files:asyncBatchAnnotate function. This function performs an offline (asynchronous) request, and you check its status through the operations resource.

Output from a PDF/TIFF request is written to JSON files created in the Cloud Storage bucket that you specify.

Limitations

The Vision API accepts PDF/TIFF files of up to 2,000 pages. Larger files return an error.

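Authentication

API keys are not supported for files:asyncBatchAnnotate requests. For instructions on authenticating with a service account, see Using a service account.

The account used for authentication must have access to the Cloud Storage bucket that you specify for the output (roles/editor, or roles/storage.objectCreator or above).

You can use an API key to query the status of the operation. For instructions, see Using an API key.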

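Document text detection requests

Currently, PDF and TIFF document detection is available only for files stored in Cloud Storage buckets. The response JSON files are also saved to a Cloud Storage bucket.

Page from the 2010 US Census PDF: `gs://cloud-samples-data/vision/pdf_tiff/census2010.pdf`, **Source**: United States Census Bureau

Note: This feature returns results as normalizedVertices in the range [0,1], not as actual pixel values (vertices).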

REST

Before using any of the request data, make the following replacements:

CLOUD_STORAGE_BUCKET: a Cloud Storage bucket/directory in which to save the output files, in the following form:
- gs://bucket/directory/
The requesting user must have write permission to the bucket.
CLOUD_STORAGE_FILE_URI: the path to a valid file (PDF/TIFF) in a Cloud Storage bucket. You need at least read permission on the file. Example:
- gs://cloud-samples-data/vision/pdf_tiff/census2010.pdf
FEATURE_TYPE: a valid feature type. files:asyncBatchAnnotate requests accept the following feature types:
- DOCUMENT_TEXT_DETECTION
- TEXT_DETECTION
PROJECT_ID: your Google Cloud project ID.

Field-specific considerations:

inputConfig replaces the image field used in other Vision API requests. It contains two child fields:
- gcsSource.uri - the Google Cloud Storage URI of the PDF or TIFF file (it must be accessible to the user or service account making the request).
- mimeType - one of the accepted file types: application/pdf or image/tiff.
outputConfig specifies the output details. It contains two child fields:
- gcsDestination.uri - a valid Google Cloud Storage URI. The bucket must be writable by the user or service account making the request. Output files are named output-x-to-y, where x and y are the PDF/TIFF page numbers contained in that output file. If a file already exists, its contents are overwritten.
- batchSize - the number of output pages to include in each JSON output file.
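
The output-x-to-y naming and the batchSize field described above can be illustrated with a short sketch. It assumes that pages are grouped into consecutive batches, that the last batch may be shorter, and that each file carries a .json extension (as in the output-1-to-1.json example on this page). The function name expected_output_names is hypothetical and is not part of the Vision API.

```python
from typing import List


def expected_output_names(total_pages: int, batch_size: int) -> List[str]:
    """Return the output file names we would expect for a document.

    Assumption: pages are batched consecutively starting at page 1,
    and the final batch may contain fewer than batch_size pages.
    """
    if total_pages < 1 or batch_size < 1:
        raise ValueError("total_pages and batch_size must be positive")
    names = []
    for first in range(1, total_pages + 1, batch_size):
        last = min(first + batch_size - 1, total_pages)
        names.append(f"output-{first}-to-{last}.json")
    return names


# A 5-page document with batchSize 2 yields three files.
print(expected_output_names(5, 2))
# → ['output-1-to-2.json', 'output-3-to-4.json', 'output-5-to-5.json']
```

A list like this can help you check that every expected file has arrived under the destination prefix before you start parsing.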

HTTP method and URL:

POST https://vision.googleapis.com/v1/files:asyncBatchAnnotate

Request JSON body:

{
  "requests":[
    {
      "inputConfig": {
        "gcsSource": {
          "uri": "CLOUD_STORAGE_FILE_URI"
        },
        "mimeType": "application/pdf"
      },
      "features": [
        {
          "type": "FEATURE_TYPE"
        }
      ],
      "outputConfig": {
        "gcsDestination": {
          "uri": "CLOUD_STORAGE_BUCKET"
        },
        "batchSize": 1
      }
    }
  ]
}
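
If you would rather generate the request body than edit it by hand, a minimal sketch follows. It builds the same JSON structure as above and rejects the feature types and MIME types that this page does not list. The helper name build_request_body and the example bucket gs://bucket/directory/ are placeholders, not part of the API.

```python
import json

# Values listed on this page for files:asyncBatchAnnotate.
ALLOWED_FEATURES = {"DOCUMENT_TEXT_DETECTION", "TEXT_DETECTION"}
ALLOWED_MIME_TYPES = {"application/pdf", "image/tiff"}


def build_request_body(source_uri, destination_uri,
                       feature_type="DOCUMENT_TEXT_DETECTION",
                       mime_type="application/pdf", batch_size=1):
    """Build the JSON body for a files:asyncBatchAnnotate request."""
    if feature_type not in ALLOWED_FEATURES:
        raise ValueError(f"unsupported feature type: {feature_type}")
    if mime_type not in ALLOWED_MIME_TYPES:
        raise ValueError(f"unsupported MIME type: {mime_type}")
    return {
        "requests": [
            {
                "inputConfig": {
                    "gcsSource": {"uri": source_uri},
                    "mimeType": mime_type,
                },
                "features": [{"type": feature_type}],
                "outputConfig": {
                    "gcsDestination": {"uri": destination_uri},
                    "batchSize": batch_size,
                },
            }
        ]
    }


body = build_request_body(
    "gs://cloud-samples-data/vision/pdf_tiff/census2010.pdf",
    "gs://bucket/directory/",  # placeholder destination
)
print(json.dumps(body, indent=2))
```

The printed JSON can be saved as request.json and sent with any HTTP client that supplies a valid access token.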

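To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or that you are using Cloud Shell, which logs you in to the gcloud CLI automatically. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and run the following command: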

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_ID" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://vision.googleapis.com/v1/files:asyncBatchAnnotate"

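PowerShell

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and run the following command: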

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://vision.googleapis.com/v1/files:asyncBatchAnnotate" | Select-Object -Expand Content

Response:

A successful asyncBatchAnnotate request returns a response that contains only a name field, like the following:

{
  "name": "projects/usable-auth-library/operations/1efec2285bd442df"
}

This name identifies a long-running operation with an associated ID (for example, 1efec2285bd442df). You can query it with the v1.operations API.

To retrieve your Vision annotation response, send a GET request to the v1.operations endpoint, passing the operation ID in the URL:

GET https://vision.googleapis.com/v1/operations/operation-id

For example:

curl -X GET -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json" \
https://vision.googleapis.com/v1/projects/project-id/locations/location-id/operations/1efec2285bd442df
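
As a small illustration of how the returned name relates to the status URL, the sketch below extracts the operation ID and builds a URL to query. It assumes the full operation name can be appended after /v1/ on the host that served the request; the page shows both a short and a project-scoped URL form, so treat the result as something to verify against your own responses. The function names are hypothetical.

```python
def operation_id(name: str) -> str:
    """Return the trailing ID segment of an operation name."""
    return name.rsplit("/", 1)[-1]


def operation_url(name: str, host: str = "vision.googleapis.com") -> str:
    """Build a status URL for an operation name.

    Assumption: the name returned by asyncBatchAnnotate can be used
    directly as the resource path after /v1/.
    """
    return f"https://{host}/v1/{name.lstrip('/')}"


name = "projects/usable-auth-library/operations/1efec2285bd442df"
print(operation_id(name))   # → 1efec2285bd442df
print(operation_url(name))
```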

If the operation is still in progress:

{
  "name": "operations/1efec2285bd442df",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.vision.v1.OperationMetadata",
    "state": "RUNNING",
    "createTime": "2019-05-15T21:10:08.401917049Z",
    "updateTime": "2019-05-15T21:10:33.700763554Z"
  }
}

When the operation completes, state is DONE and the results are written to the Google Cloud Storage location that you specified:

{
  "name": "operations/1efec2285bd442df",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.vision.v1.OperationMetadata",
    "state": "DONE",
    "createTime": "2019-05-15T20:56:30.622473785Z",
    "updateTime": "2019-05-15T20:56:41.666379749Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.vision.v1.AsyncBatchAnnotateFilesResponse",
    "responses": [
      {
        "outputConfig": {
          "gcsDestination": {
            "uri": "gs://your-bucket-name/folder/"
          },
          "batchSize": 1
        }
      }
    ]
  }
}
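
When you poll the operation from a script, you typically need only three things from these responses: the state, the done flag, and the output destination. The following sketch reads them from an already-decoded operation dictionary. It relies only on the fields shown in the two responses above; summarize_operation is a hypothetical helper name.

```python
def summarize_operation(operation: dict) -> dict:
    """Extract the state, done flag, and output URIs from an operation."""
    destinations = []
    for item in operation.get("response", {}).get("responses", []):
        uri = item.get("outputConfig", {}).get("gcsDestination", {}).get("uri")
        if uri:
            destinations.append(uri)
    return {
        "state": operation.get("metadata", {}).get("state"),
        # "done" is absent while the operation is still running.
        "done": operation.get("done", False),
        "destinations": destinations,
    }


running = {
    "name": "operations/1efec2285bd442df",
    "metadata": {"state": "RUNNING"},
}
print(summarize_operation(running))
# → {'state': 'RUNNING', 'done': False, 'destinations': []}
```

A polling loop can call this after each GET request and stop once done is true.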

The JSON in the output file is similar to that of an image's [document text detection request](/vision/docs/ocr), with an added context field that shows the location of the specified PDF or TIFF and the page number within that file.

output-1-to-1.json
    
{
  "inputConfig": {
    "gcsSource": {
      "uri": "gs://cloud-samples-data/vision/pdf_tiff/census2010.pdf"
    },
    "mimeType": "application/pdf"
  },
  "responses": [
    {
      "fullTextAnnotation": {
        "pages": [
          {
            "property": {
              "detectedLanguages": [
                {
                  "languageCode": "en",
                  "confidence": 0.94
                }
              ]
            },
            "width": 612,
            "height": 792,
            "blocks": [
              {
                "boundingBox": {
                  "normalizedVertices": [
                    {
                      "x": 0.12908497,
                      "y": 0.10479798
                    },
                    ...
                    {
                      "x": 0.12908497,
                      "y": 0.1199495
                    }
                  ]
                },
                "paragraphs": [
                  {
                  ...
                    },
                    "words": [
                      {
                        ...
                        },
                        "symbols": [
                          {
                          ...
                            "text": "C",
                            "confidence": 0.99
                          },
                          {
                            "property": {
                              "detectedLanguages": [
                                {
                                  "languageCode": "en"
                                }
                              ]
                            },
                            "text": "O",
                            "confidence": 0.99
                          },
             ...
             }
            ]
          }
        ],
        "text": "CONTENTS\n.\n1-1\nII-1\nIII-1\nList of Statistical Tables...
        \nHow to Use This Census Report ..\nTable Finding Guide .\nUser
        Notes .......\nStatistical Tables.........\nAppendixes
        \nA Geographic Terms and Concepts .........\nB Definitions of
        Subject Characteristics.\nData Collection and Processing Procedures...
        \nQuestionnaire. ........\nE Maps .................\nF Operational
        Overview and accuracy of the Data.......\nG Residence Rule and
        Residence Situations for the \n2010 Census of the United States...
        \nH Acknowledgments .....\nE\n*Appendix may be found in the separate
        volume, CPH-1-A, Summary Population and\nHousing Characteristics,
        Selected Appendixes, on the Internet at
        <www.census.gov\n/prod/cen2010/cph-1-a.pdf>.\nContents\n"
      },
      "context": {
        "uri": "gs://cloud-samples-data/vision/pdf_tiff/census2010.pdf",
        "pageNumber": 1
      }
    }
  ]
}
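
Because the bounding boxes are returned as normalizedVertices, you usually need to scale them by the page width and height before drawing them. The sketch below shows one way to do that, using the width, height, and vertex values from the sample output above. It assumes that a coordinate missing from the JSON means 0 (zero values are often omitted in JSON-serialized responses) and that rounding to whole units is acceptable. The helper name to_page_units is hypothetical.

```python
def to_page_units(normalized_vertices, width, height):
    """Scale normalized [0,1] vertices to the page's own units."""
    return [
        (round(v.get("x", 0.0) * width), round(v.get("y", 0.0) * height))
        for v in normalized_vertices
    ]


# Values taken from the sample output (page is 612 x 792).
vertices = [
    {"x": 0.12908497, "y": 0.10479798},
    {"x": 0.12908497, "y": 0.1199495},
]
print(to_page_units(vertices, 612, 792))
# → [(79, 83), (79, 95)]
```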

Go

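Before trying this sample, follow the Go setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Go API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.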


// detectAsyncDocumentURI performs Optical Character Recognition (OCR) on a
// PDF file stored in GCS.
func detectAsyncDocumentURI(w io.Writer, gcsSourceURI, gcsDestinationURI string) error {
	ctx := context.Background()

	client, err := vision.NewImageAnnotatorClient(ctx)
	if err != nil {
		return err
	}
	// Release the client's connections when the function returns.
	defer client.Close()

	request := &visionpb.AsyncBatchAnnotateFilesRequest{
		Requests: []*visionpb.AsyncAnnotateFileRequest{
			{
				Features: []*visionpb.Feature{
					{
						Type: visionpb.Feature_DOCUMENT_TEXT_DETECTION,
					},
				},
				InputConfig: &visionpb.InputConfig{
					GcsSource: &visionpb.GcsSource{Uri: gcsSourceURI},
					// Supported MimeTypes are: "application/pdf" and "image/tiff".
					MimeType: "application/pdf",
				},
				OutputConfig: &visionpb.OutputConfig{
					GcsDestination: &visionpb.GcsDestination{Uri: gcsDestinationURI},
					// How many pages should be grouped into each json output file.
					BatchSize: 2,
				},
			},
		},
	}

	operation, err := client.AsyncBatchAnnotateFiles(ctx, request)
	if err != nil {
		return err
	}

	fmt.Fprintf(w, "Waiting for the operation to finish.")

	resp, err := operation.Wait(ctx)
	if err != nil {
		return err
	}

	fmt.Fprintf(w, "%v", resp)

	return nil
}

Java

Before trying this sample, follow the Java setup instructions in the Vision API quickstart using client libraries. For more information, see the Vision API Java reference documentation.

/**
 * Performs document text OCR with PDF/TIFF as source files on Google Cloud Storage.
 *
 * @param gcsSourcePath The path to the remote file on Google Cloud Storage to detect document
 *     text on.
 * @param gcsDestinationPath The path to the remote file on Google Cloud Storage to store the
 *     results on.
 * @throws Exception on errors while closing the client.
 */
public static void detectDocumentsGcs(String gcsSourcePath, String gcsDestinationPath)
    throws Exception {

  // Initialize client that will be used to send requests. This client only needs to be created
  // once, and can be reused for multiple requests. After completing all of your requests, call
  // the "close" method on the client to safely clean up any remaining background resources.
  try (ImageAnnotatorClient client = ImageAnnotatorClient.create()) {
    List<AsyncAnnotateFileRequest> requests = new ArrayList<>();

    // Set the GCS source path for the remote file.
    GcsSource gcsSource = GcsSource.newBuilder().setUri(gcsSourcePath).build();

    // Create the configuration with the specified MIME (Multipurpose Internet Mail Extensions)
    // types
    InputConfig inputConfig =
        InputConfig.newBuilder()
            .setMimeType(
                "application/pdf") // Supported MimeTypes: "application/pdf", "image/tiff"
            .setGcsSource(gcsSource)
            .build();

    // Set the GCS destination path for where to save the results.
    GcsDestination gcsDestination =
        GcsDestination.newBuilder().setUri(gcsDestinationPath).build();

    // Create the configuration for the output with the batch size.
    // The batch size sets how many pages should be grouped into each json output file.
    OutputConfig outputConfig =
        OutputConfig.newBuilder().setBatchSize(2).setGcsDestination(gcsDestination).build();

    // Select the Feature required by the vision API
    Feature feature = Feature.newBuilder().setType(Feature.Type.DOCUMENT_TEXT_DETECTION).build();

    // Build the OCR request
    AsyncAnnotateFileRequest request =
        AsyncAnnotateFileRequest.newBuilder()
            .addFeatures(feature)
            .setInputConfig(inputConfig)
            .setOutputConfig(outputConfig)
            .build();

    requests.add(request);

    // Perform the OCR request
    OperationFuture<AsyncBatchAnnotateFilesResponse, OperationMetadata> response =
        client.asyncBatchAnnotateFilesAsync(requests);

    System.out.println("Waiting for the operation to finish.");

    // Wait for the request to finish. (The result is not used, since the API saves the result to
    // the specified location on GCS.)
    List<AsyncAnnotateFileResponse> result =
        response.get(180, TimeUnit.SECONDS).getResponsesList();

    // Once the request has completed and the output has been
    // written to GCS, we can list all the output files.
    Storage storage = StorageOptions.getDefaultInstance().getService();

    // Get the destination location from the gcsDestinationPath
    Pattern pattern = Pattern.compile("gs://([^/]+)/(.+)");
    Matcher matcher = pattern.matcher(gcsDestinationPath);

    if (matcher.find()) {
      String bucketName = matcher.group(1);
      String prefix = matcher.group(2);

      // Get the list of objects with the given prefix from the GCS bucket
      Bucket bucket = storage.get(bucketName);
      com.google.api.gax.paging.Page<Blob> pageList = bucket.list(BlobListOption.prefix(prefix));

      Blob firstOutputFile = null;

      // List objects with the given prefix.
      System.out.println("Output files:");
      for (Blob blob : pageList.iterateAll()) {
        System.out.println(blob.getName());

        // Process the first output file from GCS.
        // Since we specified batch size = 2, the first response contains
        // the first two pages of the input file.
        if (firstOutputFile == null) {
          firstOutputFile = blob;
        }
      }

      // Get the contents of the file and convert the JSON contents to an AnnotateFileResponse
      // object. If the Blob is small read all its content in one request
      // (Note: the file is a .json file)
      // Storage guide: https://cloud.google.com/storage/docs/downloading-objects
      String jsonContents = new String(firstOutputFile.getContent());
      Builder builder = AnnotateFileResponse.newBuilder();
      JsonFormat.parser().merge(jsonContents, builder);

      // Build the AnnotateFileResponse object
      AnnotateFileResponse annotateFileResponse = builder.build();

      // Parse through the object to get the actual response for the first page of the input file.
      AnnotateImageResponse annotateImageResponse = annotateFileResponse.getResponses(0);

      // Here we print the full text from the first page.
      // The response contains more information:
      // annotation/pages/blocks/paragraphs/words/symbols
      // including confidence score and bounding boxes
      System.out.format("%nText: %s%n", annotateImageResponse.getFullTextAnnotation().getText());
    } else {
      System.out.println("No MATCH");
    }
  }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Node.js API reference documentation.


// Imports the Google Cloud client libraries
const vision = require('@google-cloud/vision').v1;

// Creates a client
const client = new vision.ImageAnnotatorClient();

/**
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// Bucket where the file resides
// const bucketName = 'my-bucket';
// Path to PDF file within bucket
// const fileName = 'path/to/document.pdf';
// The folder to store the results
// const outputPrefix = 'results'

const gcsSourceUri = `gs://${bucketName}/${fileName}`;
const gcsDestinationUri = `gs://${bucketName}/${outputPrefix}/`;

const inputConfig = {
  // Supported mime_types are: 'application/pdf' and 'image/tiff'
  mimeType: 'application/pdf',
  gcsSource: {
    uri: gcsSourceUri,
  },
};
const outputConfig = {
  gcsDestination: {
    uri: gcsDestinationUri,
  },
};
const features = [{type: 'DOCUMENT_TEXT_DETECTION'}];
const request = {
  requests: [
    {
      inputConfig: inputConfig,
      features: features,
      outputConfig: outputConfig,
    },
  ],
};

const [operation] = await client.asyncBatchAnnotateFiles(request);
const [filesResponse] = await operation.promise();
const destinationUri =
  filesResponse.responses[0].outputConfig.gcsDestination.uri;
console.log('Json saved to: ' + destinationUri);

Python

Before trying this sample, follow the Python setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Python API reference documentation.

def async_detect_document(gcs_source_uri, gcs_destination_uri):
    """OCR with PDF/TIFF as source files on GCS"""
    import json
    import re
    from google.cloud import vision
    from google.cloud import storage

    # Supported mime_types are: 'application/pdf' and 'image/tiff'
    mime_type = "application/pdf"

    # How many pages should be grouped into each json output file.
    batch_size = 2

    client = vision.ImageAnnotatorClient()

    feature = vision.Feature(type_=vision.Feature.Type.DOCUMENT_TEXT_DETECTION)

    gcs_source = vision.GcsSource(uri=gcs_source_uri)
    input_config = vision.InputConfig(gcs_source=gcs_source, mime_type=mime_type)

    gcs_destination = vision.GcsDestination(uri=gcs_destination_uri)
    output_config = vision.OutputConfig(
        gcs_destination=gcs_destination, batch_size=batch_size
    )

    async_request = vision.AsyncAnnotateFileRequest(
        features=[feature], input_config=input_config, output_config=output_config
    )

    operation = client.async_batch_annotate_files(requests=[async_request])

    print("Waiting for the operation to finish.")
    operation.result(timeout=420)

    # Once the request has completed and the output has been
    # written to GCS, we can list all the output files.
    storage_client = storage.Client()

    match = re.match(r"gs://([^/]+)/(.+)", gcs_destination_uri)
    bucket_name = match.group(1)
    prefix = match.group(2)

    bucket = storage_client.get_bucket(bucket_name)

    # List objects with the given prefix, filtering out folders.
    blob_list = [
        blob
        for blob in list(bucket.list_blobs(prefix=prefix))
        if not blob.name.endswith("/")
    ]
    print("Output files:")
    for blob in blob_list:
        print(blob.name)

    # Process the first output file from GCS.
    # Since we specified batch_size=2, the first response contains
    # the first two pages of the input file.
    output = blob_list[0]

    json_string = output.download_as_bytes().decode("utf-8")
    response = json.loads(json_string)

    # The actual response for the first page of the input file.
    first_page_response = response["responses"][0]
    annotation = first_page_response["fullTextAnnotation"]

    # Here we print the full text from the first page.
    # The response contains more information:
    # annotation/pages/blocks/paragraphs/words/symbols
    # including confidence scores and bounding boxes
    print("Full text:\n")
    print(annotation["text"])
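
The sample above splits the destination URI with a regular expression that returns no match when the URI has no object prefix (for example, gs://my-bucket/). As one possible alternative, the sketch below splits the URI with the standard library and raises a clear error for anything that is not a gs:// URI. The helper name split_gcs_uri and the bucket name my-bucket are illustrative only.

```python
from urllib.parse import urlparse


def split_gcs_uri(uri: str):
    """Split a gs:// URI into (bucket, prefix); the prefix may be empty."""
    parsed = urlparse(uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a Cloud Storage URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


print(split_gcs_uri("gs://my-bucket/results/"))  # → ('my-bucket', 'results/')
print(split_gcs_uri("gs://my-bucket"))           # → ('my-bucket', '')
```

An empty prefix lists every object in the bucket, so you may want to require a prefix in production code.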

gcloud

The gcloud command to use depends on the file type.

To detect text in a PDF, use the gcloud ml vision detect-text-pdf command, as shown in the following example:
```
gcloud ml vision detect-text-pdf gs://my_bucket/input_file  gs://my_bucket/out_put_prefix
```
To detect text in a TIFF, use the gcloud ml vision detect-text-tiff command, as shown in the following example:
```
gcloud ml vision detect-text-tiff gs://my_bucket/input_file  gs://my_bucket/out_put_prefix
```

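Additional languages

C#: Follow the C# setup instructions on the client libraries page, and then see the Vision reference documentation for .NET.

PHP: Follow the PHP setup instructions on the client libraries page, and then see the Vision reference documentation for PHP.

Ruby: Follow the Ruby setup instructions on the client libraries page, and then see the Vision reference documentation for Ruby.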

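Multi-regional support

This feature currently applies only to the OCR features (the TEXT_DETECTION and DOCUMENT_TEXT_DETECTION types).

You can specify continent-level data storage and OCR processing. The following regions are currently supported:

us: United States only
eu: European Union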

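Locations

Cloud Vision gives you some control over where your project's resources are stored and processed. In particular, you can configure Cloud Vision to store and process your data only in the European Union.

By default, Cloud Vision stores and processes resources in a global location, which means that Cloud Vision does not guarantee that your resources remain within a particular location or region. If you choose the European Union as the location, your data is stored and processed only in the European Union. Your users can still access the data from any location.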

Set the location using the API

The Vision API supports a global API endpoint (vision.googleapis.com) and two region-based endpoints: a European Union endpoint (eu-vision.googleapis.com) and a United States endpoint (us-vision.googleapis.com). Use these endpoints for region-specific processing. For example, to store and process your data only in the European Union, use the URI eu-vision.googleapis.com instead of vision.googleapis.com in your REST API calls:

https://eu-vision.googleapis.com/v1/projects/PROJECT_ID/locations/eu/images:annotate
https://eu-vision.googleapis.com/v1/projects/PROJECT_ID/locations/eu/images:asyncBatchAnnotate
https://eu-vision.googleapis.com/v1/projects/PROJECT_ID/locations/eu/files:annotate
https://eu-vision.googleapis.com/v1/projects/PROJECT_ID/locations/eu/files:asyncBatchAnnotate

To store and process your data only in the United States, use the US endpoint (us-vision.googleapis.com) with the same methods.
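
The regional URLs listed above follow one pattern, so a script can derive them from the region ID. The following is a minimal sketch that assumes only the two regions named on this page (us and eu). The function names and the project ID my-project are placeholders.

```python
SUPPORTED_REGIONS = {"us", "eu"}  # regions listed on this page


def regional_host(region_id: str) -> str:
    """Return the region-based Vision API host for a region ID."""
    if region_id not in SUPPORTED_REGIONS:
        raise ValueError(f"unsupported region: {region_id}")
    return f"{region_id}-vision.googleapis.com"


def regional_url(project_id: str, region_id: str,
                 method: str = "files:asyncBatchAnnotate") -> str:
    """Build the regional REST URL for a Vision API method."""
    host = regional_host(region_id)
    return (f"https://{host}/v1/projects/{project_id}"
            f"/locations/{region_id}/{method}")


print(regional_url("my-project", "eu"))
# → https://eu-vision.googleapis.com/v1/projects/my-project/locations/eu/files:asyncBatchAnnotate
```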

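Set the location using the client libraries

By default, the Vision API client libraries access the global API endpoint (vision.googleapis.com). To store and process your data only in the European Union, you must set the endpoint (eu-vision.googleapis.com) explicitly. The following code samples show how to configure this setting.

Note: This feature returns results as normalizedVertices in the range [0,1], not as actual pixel values (vertices).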

REST

Before using any of the request data, make the following replacements:

REGION_ID: one of the valid regional location identifiers:
- us: United States only
- eu: European Union
CLOUD_STORAGE_IMAGE_URI: the path to a valid file (PDF/TIFF) in a Cloud Storage bucket. You need at least read permission on the file. Example:
- gs://cloud-samples-data/vision/pdf_tiff/census2010.pdf
CLOUD_STORAGE_BUCKET: a Cloud Storage bucket/directory in which to save the output files, in the following form:
- gs://bucket/directory/
The requesting user must have write permission to the bucket.
FEATURE_TYPE: a valid feature type. files:asyncBatchAnnotate requests accept the following feature types:
- DOCUMENT_TEXT_DETECTION
- TEXT_DETECTION
PROJECT_ID: your Google Cloud project ID.

Field-specific considerations:

inputConfig replaces the image field used in other Vision API requests. It contains two child fields:
- gcsSource.uri - the Google Cloud Storage URI of the PDF or TIFF file (it must be accessible to the user or service account making the request).
- mimeType - one of the accepted file types: application/pdf or image/tiff.
outputConfig specifies the output details. It contains two child fields:
- gcsDestination.uri - a valid Google Cloud Storage URI. The bucket must be writable by the user or service account making the request. Output files are named output-x-to-y, where x and y are the PDF/TIFF page numbers contained in that output file. If a file already exists, its contents are overwritten.
- batchSize - the number of output pages to include in each JSON output file.

HTTP method and URL:

POST https://REGION_ID-vision.googleapis.com/v1/projects/PROJECT_ID/locations/REGION_ID/files:asyncBatchAnnotate

Request JSON body:

{
  "requests":[
    {
      "inputConfig": {
        "gcsSource": {
          "uri": "CLOUD_STORAGE_IMAGE_URI"
        },
        "mimeType": "application/pdf"
      },
      "features": [
        {
          "type": "FEATURE_TYPE"
        }
      ],
      "outputConfig": {
        "gcsDestination": {
          "uri": "CLOUD_STORAGE_BUCKET"
        },
        "batchSize": 1
      }
    }
  ]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and run the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_ID" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://REGION_ID-vision.googleapis.com/v1/projects/PROJECT_ID/locations/REGION_ID/files:asyncBatchAnnotate"

PowerShell

Save the request body in a file named request.json, and run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://REGION_ID-vision.googleapis.com/v1/projects/PROJECT_ID/locations/REGION_ID/files:asyncBatchAnnotate" | Select-Object -Expand Content

Response:

A successful asyncBatchAnnotate request returns a response that contains only a name field, like the following:

{
  "name": "projects/usable-auth-library/operations/1efec2285bd442df"
}

This name identifies a long-running operation with an associated ID (for example, 1efec2285bd442df). You can query it with the v1.operations API.

To retrieve your Vision annotation response, send a GET request to the v1.operations endpoint, passing the operation ID in the URL:

GET https://vision.googleapis.com/v1/operations/operation-id

For example:

curl -X GET -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json" \
https://vision.googleapis.com/v1/projects/project-id/locations/location-id/operations/1efec2285bd442df

If the operation is still in progress:

{
  "name": "operations/1efec2285bd442df",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.vision.v1.OperationMetadata",
    "state": "RUNNING",
    "createTime": "2019-05-15T21:10:08.401917049Z",
    "updateTime": "2019-05-15T21:10:33.700763554Z"
  }
}

When the operation completes, state is DONE and the results are written to the Google Cloud Storage location that you specified:

{
  "name": "operations/1efec2285bd442df",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.vision.v1.OperationMetadata",
    "state": "DONE",
    "createTime": "2019-05-15T20:56:30.622473785Z",
    "updateTime": "2019-05-15T20:56:41.666379749Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.vision.v1.AsyncBatchAnnotateFilesResponse",
    "responses": [
      {
        "outputConfig": {
          "gcsDestination": {
            "uri": "gs://your-bucket-name/folder/"
          },
          "batchSize": 1
        }
      }
    ]
  }
}

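If you used the DOCUMENT_TEXT_DETECTION feature, the JSON in the output file is similar to an image's document text detection response. If you used the TEXT_DETECTION feature, it is similar to a text detection response. The output has an added context field that shows the location of the specified PDF or TIFF and the page number within that file.

output-1-to-1.json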
    
{
  "inputConfig": {
    "gcsSource": {
      "uri": "gs://cloud-samples-data/vision/pdf_tiff/census2010.pdf"
    },
    "mimeType": "application/pdf"
  },
  "responses": [
    {
      "fullTextAnnotation": {
        "pages": [
          {
            "property": {
              "detectedLanguages": [
                {
                  "languageCode": "en",
                  "confidence": 0.94
                }
              ]
            },
            "width": 612,
            "height": 792,
            "blocks": [
              {
                "boundingBox": {
                  "normalizedVertices": [
                    {
                      "x": 0.12908497,
                      "y": 0.10479798
                    },
                    ...
                    {
                      "x": 0.12908497,
                      "y": 0.1199495
                    }
                  ]
                },
                "paragraphs": [
                  {
                  ...
                    },
                    "words": [
                      {
                        ...
                        },
                        "symbols": [
                          {
                          ...
                            "text": "C",
                            "confidence": 0.99
                          },
                          {
                            "property": {
                              "detectedLanguages": [
                                {
                                  "languageCode": "en"
                                }
                              ]
                            },
                            "text": "O",
                            "confidence": 0.99
                          },
             ...
             }
            ]
          }
        ],
        "text": "CONTENTS\n.\n1-1\nII-1\nIII-1\nList of Statistical Tables...
        \nHow to Use This Census Report ..\nTable Finding Guide .\nUser
        Notes .......\nStatistical Tables.........\nAppendixes
        \nA Geographic Terms and Concepts .........\nB Definitions of
        Subject Characteristics.\nData Collection and Processing Procedures...
        \nQuestionnaire. ........\nE Maps .................\nF Operational
        Overview and accuracy of the Data.......\nG Residence Rule and
        Residence Situations for the \n2010 Census of the United States...
        \nH Acknowledgments .....\nE\n*Appendix may be found in the separate
        volume, CPH-1-A, Summary Population and\nHousing Characteristics,
        Selected Appendixes, on the Internet at
        <www.census.gov\n/prod/cen2010/cph-1-a.pdf>.\nContents\n"
      },
      "context": {
        "uri": "gs://cloud-samples-data/vision/pdf_tiff/census2010.pdf",
        "pageNumber": 1
      }
    }
  ]
}

Go

import (
	"context"
	"fmt"

	vision "cloud.google.com/go/vision/apiv1"
	"google.golang.org/api/option"
)

// setEndpoint changes your endpoint.
func setEndpoint(endpoint string) error {
	// endpoint := "eu-vision.googleapis.com:443"

	ctx := context.Background()
	client, err := vision.NewImageAnnotatorClient(ctx, option.WithEndpoint(endpoint))
	if err != nil {
		return fmt.Errorf("NewImageAnnotatorClient: %w", err)
	}
	defer client.Close()

	return nil
}

Java

ImageAnnotatorSettings settings =
    ImageAnnotatorSettings.newBuilder().setEndpoint("eu-vision.googleapis.com:443").build();

// Initialize client that will be used to send requests. This client only needs to be created
// once, and can be reused for multiple requests. After completing all of your requests, call
// the "close" method on the client to safely clean up any remaining background resources.
ImageAnnotatorClient client = ImageAnnotatorClient.create(settings);

Node.js

// Imports the Google Cloud client library
const vision = require('@google-cloud/vision');

async function setEndpoint() {
  // Specifies the location of the api endpoint
  const clientOptions = {apiEndpoint: 'eu-vision.googleapis.com'};

  // Creates a client
  const client = new vision.ImageAnnotatorClient(clientOptions);

  // Performs text detection on the image file
  const [result] = await client.textDetection('./resources/wakeupcat.jpg');
  const labels = result.textAnnotations;
  console.log('Text:');
  labels.forEach(label => console.log(label.description));
}
setEndpoint();

Python

from google.cloud import vision

client_options = {"api_endpoint": "eu-vision.googleapis.com"}

client = vision.ImageAnnotatorClient(client_options=client_options)
