Annotate a small batch of files online

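Online (synchronous) requests - An online annotation request (images:annotate or files:annotate) immediately returns inline annotations to the user. Online annotation requests limit the number of files you can annotate in a single request. With an images:annotate request, you can specify only a small number of images (16 or fewer) to annotate. With a files:annotate request, you can specify only a single file, and only a small number of pages (5 or fewer) in that file to annotate.
Offline (asynchronous) requests - An offline annotation request (images:asyncBatchAnnotate or files:asyncBatchAnnotate) starts a long-running operation (LRO) and does not return annotations to the caller immediately. When the LRO completes, the annotations are stored as files in a Cloud Storage bucket that you specify. An images:asyncBatchAnnotate request lets you specify up to 2,000 images per request. A files:asyncBatchAnnotate request lets you specify larger batches of files, and more pages per file (up to 2,000) to annotate at one time, than an online request allows.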
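
As a rough illustration of the limits above, the following sketch picks a request method from the size of the job. The function name `choose_method` and its arguments are hypothetical and not part of the Vision API; only the method names and the numeric limits come from the text above.

```python
# Limits quoted in the text above.
MAX_ONLINE_IMAGES = 16
MAX_ONLINE_PAGES = 5
MAX_OFFLINE_IMAGES = 2000
MAX_OFFLINE_PAGES = 2000


def choose_method(kind: str, count: int) -> str:
    """Return the Vision method name suited to `count` images or pages.

    `kind` is "images" (count = number of images) or "files"
    (count = number of pages to annotate in a single file).
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    if kind == "images":
        if count <= MAX_ONLINE_IMAGES:
            return "images:annotate"
        if count <= MAX_OFFLINE_IMAGES:
            return "images:asyncBatchAnnotate"
    elif kind == "files":
        if count <= MAX_ONLINE_PAGES:
            return "files:annotate"
        if count <= MAX_OFFLINE_PAGES:
            return "files:asyncBatchAnnotate"
    else:
        raise ValueError(f"unknown kind: {kind!r}")
    raise ValueError(f"{count} exceeds the per-request limit for {kind}")


print(choose_method("files", 3))    # files:annotate
print(choose_method("images", 40))  # images:asyncBatchAnnotate
```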

The Vision API can annotate multiple pages or frames of PDF, TIFF, and GIF files stored in Cloud Storage online (immediately).

For each file, you can select 5 frames (GIF, image/gif) or pages (PDF, application/pdf, or TIFF, image/tiff) and request online feature detection and annotation.

The example annotations on this page are for DOCUMENT_TEXT_DETECTION, but online annotation of a small batch of files is available for all Vision features.

Note: The Vision API also supports offline (asynchronous) annotation of PDF and TIFF files, but currently only for the DOCUMENT_TEXT_DETECTION feature type.

Offline asynchronous requests return the response JSON files to a Cloud Storage bucket. These requests accept files of up to 2,000 pages. For more information, see Detect text in files (PDF/TIFF).

First five pages of the PDF file: gs://cloud-samples-data/vision/document_understanding/custom_0773375000.pdf

Page 1

...
"text": "á\n7.1.15\nOIL, GAS AND MINERAL LEASE
\nNORVEL J. CHITTIM, ET AL\n.\n.
\nTO\nW. L. SCHEIG\n"
},
"context": {"pageNumber": 1}
...

Page 2

...
"text": "...\n.\n*\n.\n.\n.\nA\nNY\nALA...\n7
\n| THE STATE OF TEXAS
\nOIL, GAS AND MINERAL LEASE
\nCOUNTY OF MAVERICK ]
\nTHIS AGREEMENT made this 14 day of_June
\n1954, between Norvel J. Chittim and his wife, Lieschen G. Chittim;
\nMary Anne Chittim Parker, joined herein pro forma by her husband,
\nJoseph Bright Parker; Dorothea Chittim Oppenheimer, joined herein
\npro forma by her husband, Fred J. Oppenheimer; Tuleta Chittim
\nWright, joined herein pro forma by her husband, Gilbert G. Wright,
\nJr.; Gilbert G. Wright, III; Dela Wright White, joined herein pro
\nforma by her husband, John H. White; Anne Wright Basse, joined
\nherein pro forma by her husband, E. A. Basse, Jr.; Norvel J.
\nChittim, Independent Executor and Trustee for Estate of Marstella
\nChittim, Deceased; Mary Louise Roswell, joined herein pro forma by
\nher husband, Charles M. 'Roswell; and James M. Chittim and his wife,
\nThelma Neal Chittim; as LESSORS, and W. L. Scheig of San Antonio,
\nTexas, as LESSEE,

\nW I T N E s s E T H:
\n1. Lessors, in consideration of $10.00, cash in hand paid,
\nof the royalties herein provided, and of the agreement of Lessee
\nherein contained, hereby grant, lease and let exclusively unto
\nLessee the tracts of land hereinafter described for the purpose of
\ntesting for mineral indications, and in such tests use the Seismo-
\ngraph, Torsion Balance, Core Drill, or any other tools, machinery,
\nequipment or explosive necessary and proper; and also prospecting,
\ndrilling and mining for and producing oil, gas and other minerals
\n(except metallic minerals), laying pipe lines, building tanks,
\npower stations, telephone lines and other structures thereon to
\nproduce, save, take care of, treat, transport and own said pro-
\nducts and housing its employees (Lessee to conduct its geophysical
\nwork in such manner as not to damage the buildings, water tanks
\nor wells of Lessors, or the livestock of Lessors or Lessors' ten- !
\nants, )said lands being situated in Maverick, Zavalla and Dimmit
\nCounties, Texas, to-wit:\n3-1.\n"
},
"context": {"pageNumber": 2}
...

Page 3

...
"text": "Being a tract consisting of 140,769.86 acres, more or
\nless, out of what is known as the \"Chittim Ranch\" in said counties,
\nas designated and described in Exhibit \"A\" hereto attached and
\nmade a part hereof as if fully written herein. It being under-
\nstood that the acreage intended to be included in this lease aggre-
\ngates approximately 140,769.86 acres whether it actually comprises
\nmore or less, but for the purpose of calculating the payments
\nhereinafter provided for, it is agreed that the land included with-
\nin the terms of this lease is One hundred forty thousand seven
\nhundred sixty-nine and eighty-six one hundredths (140,769.86) acres,
\nand that each survey listed above contains the acreage stated above.
\nIt is understood that tract designated \"TRACT II\" in
\nExhibit \"A\" is subject to a one-sixteenth (1/16) royalty reserved.
\nto the State of Texas, and the rights of the State of Texas must
\nbe respected in the development of the said property.

\n2. Subject to the other provisions hereof, this lease shall
\nbe for a term of ten (10) years from date hereof (called \"Primary
\nTerm\"), and as long thereafter as oil, gas or other minerals
\n(except metallic minerals) are produced from said land hereunder
\nin paying quantities, subject, however, to all of the terms and
\nprovisions of this lease. After expiration of the primary term,
\nthis lease shall terminate as to all lands included herein, save
\nand except as to those tracts which lessee maintains in force and
\neffect according to the requirements hereof.
\n3. The royalties to be paid by Lessee are (a) on oil, one-
\neighth (1/8) of that produced and saved from said land, the same to
\nbe delivered at the well or to the credit of Lessors into the pipe i
\nline to which the well may be connected; (b) on gas, including
\ni casinghead gas or other gaseous or vaporous substance, produced
\nfrom the leased premises and sold or used by Lessee off the leased
\npremises or in the manufacture of gasoline or other products, the
\nmarket value, at the mouth of the well, of one-eighth (1/8) of
\n.\n3-2-\n?\n"
},
"context": {"pageNumber": 3}
...

Page 4

...
"text": "•\n:\n.\nthe gas or casinghead gas so used or sold. On all gas or casing-
\nhead gas sold at the well, the royalty shall be one-eighth (1/8)
\nof the amounts realized from such sales. While gas from any well
\nproducing gas only is being used or sold by. Lessee, Lessor may have
\nenough of said gas for all stoves and inside lights in the prin-
\ncipal dwelling house on the leased premises by making Lessors' own
\nconnections with the well and by assuming all risk and paying all
\nexpenses. And (c) on all other minerals (except metallic minerals)
\nmined and marketed, one tenth (1/10). either in kind or value at the
\nwell or mine at Lessee's election.
\nFor the purpose of royalty payments under 3 (b) hereof,
\nall liquid hydrocarbons (including distillate) recovered and saved
\n| by Lessee in separators or traps on the leased premises shall be
\nconsidered as oil. Should such a plant be constructed by another
\nthan Lessee to whom Lessee should sell or deliver the gas or cas-
\ninghead gas produced from the leased premises for processing, then
\nthe royalty thereon shall be one-eighth (1/8) of the amounts
\nrealized by Lessee from such sales or deliveries.

\nOr if such plant is owned or constructed or operated by
\nLessee, then the royalty shall be on the basis of one-eighth (1/8) |
\nof the prevailing price in the area for such products..
\nThe provisions of this paragraph shall control as to any
\nconflict with Paragraph 3 (b). Lessors shall also be entitled to
\nsaid royalty interest in all residue gas .obtained, saved and mar-
\nketed from said premises, or used off the premises, or that may be
\nreplaced in the reservoir by 'any recycling process, settlement
\ntherefor to be made to Lessors when such gas is marketed or used
\noff the premises. !
\nIf at the expiration of the primary term of this lease
\nLessee has not found and produced oil or gas in paying quantities
\nin any formation lying fifty (50) feet below the base of what is
\nknown as the Rhodessa section at the particular point where the
\nwell is drilled, then, subject to the further provisions hereof,
\nthis lease shall terminate as to all horizons below fifty (50)
\nI feet below the Rhodessa section. And if at the expiration of the
\n3 -3-\n"
},
"context": {"pageNumber": 4}
...

Page 5

...
"text": ".\n.\n:\nI\n.\n.\n.:250:-....\n.\n...\n.\n....\n....\n..\n..\n. ..
\n.\n..\n.\n...\n...\n.-\n.\n.\n..\n..\n17\n.\n:\n-\n-\n-\n.\n..\n.
\nprimary term production of oil or gas in paying quantities is not
\nfound in the Jurassic, then this lease shall terminate as to the
\nJurassic and lower formations unless Lessee shall have completed
\nat least two (2) tests in the Jurassic. And after the primary
\nterm Lessee shall complete at least one (1) Jurassic test each
\nthree years on said property as to which this lease is still in
\neffect, until paying production is obtained in or below the
\nJurassic, or upon failure so to do Lessee shall release this
\nlease as to all formations below the top of the Jurassic. Upon
\ncompliance with the above provisions as to Jurassic tests, and
\nif production is found in the Jurassic, then, subject to the
\nother provisions hereof, this lease shall be effective as to all
\nhorizons, including the Jurassic..
\n5. It is understood and expressly agreed that the consider-
\niation first recited in this lease, the down cash payment, receipt
\nof which is hereby acknowledged by Lessors, is full and adequate
\nconsideration to maintain this lease in full force and effect for
\na period of one year from the date hereof, and does not impose
\nany obligation on the part of Lessee to drill and develop this
\nlease during the said term of one year from date of this lease.

\n6. This lease shall terminate as to both parties unless
\non or before one year from this date, Lessee shall pay to or ten- !
\nder to Lessors or to the credit of Lessors, in the National Bank
\nof Commerce, at San Antonio, Texas, (which bank and its successors
\nare Lessors' agent, and shall continue as the depository for all \"
\nrental payable hereunder regardless of changes in ownership of
\nsaid land or the rental), the sum of One Dollar ($1.00) per acre
\nas to all acreage then covered by this lease, and not surrendered,
\nor maintained by production of oil, gas or other minerals, or by
\ndrilling-reworking operations, all as hereinafter fully set out, :
\nwhich shall maintain this lease in full force and effect for
\nanother twelve-month period, without imposing any obligation on
\nthe part of Lessee to drill and develop this lease. In like
\nmanner, and upon like payment or tender annually, Lessee may
\nmaintain this lease .in full force and effect for successive
\ntwelve-month periods during the primary term, without imposing
\n.\n--.\n.\n.\n.\n-\n::\n---
\n-\n3\n.\n..-\n-\n-\n:.\n.\n::\n.
\n3-4-\n"
},
"context": {"pageNumber": 5}
...

Limitations

At most 5 pages are annotated per request. You can specify which pages to annotate, up to 5 pages.
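
If more than five pages of a file need annotations, one possible workaround is to split the page list across several online requests. The helper below is only a sketch: `chunk_pages` is a hypothetical name, and for large jobs the offline (asynchronous) request is likely the better fit.

```python
from typing import Iterator, List

MAX_PAGES_PER_REQUEST = 5  # online limit stated above


def chunk_pages(
    pages: List[int], size: int = MAX_PAGES_PER_REQUEST
) -> Iterator[List[int]]:
    """Yield consecutive slices of `pages`, each with at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(pages), size):
        yield pages[start:start + size]


# Twelve pages become three requests' worth of page lists.
for batch in chunk_pages(list(range(1, 13))):
    print(batch)
# [1, 2, 3, 4, 5]
# [6, 7, 8, 9, 10]
# [11, 12]
```

Each yielded list can then be used as the `pages` value of a separate request.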

Authentication

API keys are not supported for files:annotate requests.

Set up your Google Cloud project and authentication

If you have not yet created a Google Cloud project, create one now by following these steps.

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Verify that billing is enabled for your Google Cloud project.

Enable the Vision API.

Install the Google Cloud CLI.

If you use an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

To initialize the gcloud CLI, run the following command:

gcloud init

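Currently supported feature types

Feature type	Description
`CROP_HINTS`	Determines suggested vertices for a crop region on an image.
`DOCUMENT_TEXT_DETECTION`	Performs OCR on dense-text images, such as documents (PDF/TIFF), and on images that contain handwriting. `TEXT_DETECTION` can be used for sparse-text images. `DOCUMENT_TEXT_DETECTION` takes precedence when both `DOCUMENT_TEXT_DETECTION` and `TEXT_DETECTION` are present.
`FACE_DETECTION`	Detects faces within the image.
`IMAGE_PROPERTIES`	Computes a set of image properties, such as the image's dominant colors.
`LABEL_DETECTION`	Adds labels based on the image content.
`LANDMARK_DETECTION`	Detects geographic landmarks within the image.
`LOGO_DETECTION`	Detects company logos within the image.
`OBJECT_LOCALIZATION`	Detects and extracts multiple objects in an image.
`SAFE_SEARCH_DETECTION`	Runs SafeSearch to detect potentially unsafe or undesirable content.
`TEXT_DETECTION`	Performs optical character recognition (OCR) on text within the image. Text detection is optimized for areas of sparse text within a larger image. If the image is a document (PDF/TIFF), has dense text, or contains handwriting, use `DOCUMENT_TEXT_DETECTION` instead.
`WEB_DETECTION`	Detects topical entities such as news, events, or celebrities within the image, and finds similar images on the web using Google Image Search.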
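
Because the `features` field of a request is a list, a single request can name more than one of the types in the table. The sketch below builds that list and rejects names that are not in the table; `build_features` is a hypothetical helper, not a client-library function.

```python
from typing import Dict, List

# Feature type names copied from the table above.
SUPPORTED_FEATURE_TYPES = {
    "CROP_HINTS",
    "DOCUMENT_TEXT_DETECTION",
    "FACE_DETECTION",
    "IMAGE_PROPERTIES",
    "LABEL_DETECTION",
    "LANDMARK_DETECTION",
    "LOGO_DETECTION",
    "OBJECT_LOCALIZATION",
    "SAFE_SEARCH_DETECTION",
    "TEXT_DETECTION",
    "WEB_DETECTION",
}


def build_features(*names: str) -> List[Dict[str, str]]:
    """Return a `features` list for a request body, validating each name."""
    unknown = [name for name in names if name not in SUPPORTED_FEATURE_TYPES]
    if unknown:
        raise ValueError(f"unsupported feature types: {unknown}")
    return [{"type": name} for name in names]


print(build_features("DOCUMENT_TEXT_DETECTION", "LABEL_DETECTION"))
```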

Sample code

You can send an annotation request with a locally stored file, or with a file that is stored in Cloud Storage.

Using a locally stored file

Use the following code samples to get annotations for any feature from a locally stored file.

REST

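To perform online PDF/TIFF/GIF feature detection for a small batch of files, send a POST request and provide the appropriate request body.

Before using any of the request data, make the following replacements:

BASE64_ENCODED_FILE: The base64 representation (ASCII string) of your binary file data. This string should look similar to the following:
- JVBERi0xLjUNCiW1tbW1...ydHhyZWYNCjk5NzM2OQ0KJSVFT0Y=
For more information, see Base64 encoding.
PROJECT_ID: Your Google Cloud project ID.

Field-specific considerations:

inputConfig.mimeType - One of the following: application/pdf, image/tiff, or image/gif.
pages - The specific pages of the file on which to perform feature detection.

HTTP method and URL:

POST https://vision.googleapis.com/v1/files:annotate

Request JSON body: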

{
  "requests": [
    {
      "inputConfig": {
        "content": "BASE64_ENCODED_FILE",
        "mimeType": "application/pdf"
      },
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        }
      ],
      "pages": [
        1,2,3,4,5
      ]
    }
  ]
}
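
The request body above can also be assembled programmatically. The following is a minimal sketch using only the Python standard library; `build_request_body` is a hypothetical helper name, and the placeholder bytes stand in for the contents of a real file read from disk.

```python
import base64
import json
from typing import Dict, List


def build_request_body(
    file_bytes: bytes,
    mime_type: str,
    pages: List[int],
    feature_type: str = "DOCUMENT_TEXT_DETECTION",
) -> Dict:
    """Build a files:annotate request body for one locally stored file."""
    # The API expects the file content as a base64-encoded ASCII string.
    encoded = base64.b64encode(file_bytes).decode("ascii")
    return {
        "requests": [
            {
                "inputConfig": {"content": encoded, "mimeType": mime_type},
                "features": [{"type": feature_type}],
                "pages": pages,
            }
        ]
    }


# Placeholder bytes; in practice, read them with open(path, "rb").read().
body = build_request_body(b"%PDF-1.5", "application/pdf", [1, 2, 3, 4, 5])
print(json.dumps(body, indent=2))
```

Writing the result of `json.dumps(body)` to request.json produces a file in the shape that the commands below expect.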

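To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or that you are using Cloud Shell, which logs you in to the gcloud CLI automatically. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and run the following command: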

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_ID" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://vision.googleapis.com/v1/files:annotate"

PowerShell

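Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and run the following command: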

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://vision.googleapis.com/v1/files:annotate" | Select-Object -Expand Content
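
For readers who prefer a script to a shell command, the same HTTP request can be prepared with the Python standard library. This is a sketch under assumptions: `build_http_request` is a hypothetical helper, and the access token is expected to come from a separate step such as `gcloud auth print-access-token`. The code only constructs the request object; the commented lines show how it could be sent.

```python
import json
import urllib.request

ENDPOINT = "https://vision.googleapis.com/v1/files:annotate"


def build_http_request(
    body: dict, access_token: str, project_id: str
) -> urllib.request.Request:
    """Prepare a POST request with the same headers as the curl command."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=data,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "x-goog-user-project": project_id,
            "Content-Type": "application/json; charset=utf-8",
        },
    )


# Sending the request needs a real token and project ID:
# with open("request.json", encoding="utf-8") as f:
#     request = build_http_request(json.load(f), token, "PROJECT_ID")
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```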

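Response:

A successful annotate request immediately returns a JSON response.

For this feature (DOCUMENT_TEXT_DETECTION), the JSON response is similar to the response of an image document text detection request. The response contains bounding boxes for blocks, broken down into paragraphs, words, and individual symbols. The full text is also detected. The response also contains a context field that shows the location of the specified PDF or TIFF file and the page number within the file.

The following JSON response is for a single page (page 2) and has been shortened for clarity.

Response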

{
  "responses": [
    {
      "responses": [
        {
          "fullTextAnnotation": {
            "pages": [
              {
                "property": {
                  "detectedLanguages": [
                    {
                      "languageCode": "en",
                      "confidence": 0.99
                    },
                    {
                      "languageCode": "pl",
                      "confidence": 0.01
                    }
                  ]
                },
                "width": 1342,
                "height": 2234,
                "blocks": [
                  {
                    "boundingBox": {
                      "vertices": [
                      ...
                      ]
                    },
                    "paragraphs": [
                      {
                        "boundingBox": {
                          "vertices": [
                          ...
                          ]
                        },
                        "words": [
                          {
                            "property": {
                              "detectedLanguages": [
                                {
                                  "languageCode": "en"
                                }
                              ]
                            },
                            "boundingBox": {
                              "vertices": [
                              ...
                              ]
                            },
                            "symbols": [
                              {
                                "property": {
                                  "detectedLanguages": [
                                    {
                                      "languageCode": "en"
                                    }
                                  ],
                                  "detectedBreak": {
                                    "type": "SPACE"
                                  }
                                },
                                "boundingBox": {
                                  "vertices": [
                                ...
                                  ]
                                },
                                "text": "#",
                                "confidence": 0.07
                              }
                            ],
                            "confidence": 0.07
                          },
                          ...
                    ],
                    "blockType": "TEXT",
                    "confidence": 0.88
                  },
                  ...
            ...
            "text": "# THE STATE OF TEXAS\n0\nOIL, GAS AND MINERAL LEASE\n
            COUNTY OF MAVERICK\nTHIS AGREEMENT made this 14 day of_June\n1954,
            between Norvel J. Chittim and his wife, Lieschen G. Chittim;\nMary
            Anne Chittim Parker, joined herein pro forma by her husband,\nJoseph
            Bright Parker; Dorothea Chittim Oppenheimer, joined herein\nji pro
            forma by her husband, Fred J. Oppenheimer; Tuleta Chittim\nWright,
            joined herein pro forma by her husband, Gilbert G. Wright,\nJr.;
            Gilbert G. Wright, III; Delă Wright White, joined herein pro\nforma
            by her husband, John H. White; Anne Wright Basse, joined\nherein
            pro forma by her husband, E. A. Basse, Jr.; Norvel J.\nChittim,
            Independent Executor and Trustee for Estate of Marstella\nChittim,
            Deceased; Mary Louise Roswell, joined herein pro forma by\nher
            husband, Charles M. 'Roswell; and James M. Chittim and his wife\n
            Thelma Neal Chittim; as LESSORS, and W. L. Scheig of San Antonio,\n
            Texas, as LESSEE,\n10\nW ITNESS ETH:\nLessors, in consideration of
            $10.00, cash in hand paid, i\nof the royalties herein provided,
            and of the agreement of Lessee\nherein contained, hereby grant,
            lease and let exclusively unto\nLessee the tracts of land
            hereinafter described for the purpose of\ntesting for mineral
            indications, and in such tests use the Seismo-\ngraph, Torsion
            Balance, Core Drill, or any other tools, machinery,\nequipment
            or explosive necessary and proper; and also prospecting,\ndrilling
            and mining for and producing oil, gas and other minerals i\n
            (except metallic minerals), laying pipe lines, building tanks,\n
            power stations, telephone lines and other structures thereon to\n
            produce, save, take care of, treat, transport and own said pro-\n
            ducts and housing its employees (Lessee to conduct its geophysical\n
            work in such manner as not to damage the buildings, water tanks\n
            or wells of Lessors, or the livestock of Lessors or Lessors' ten-\n
            ants, ) said lands being situated in Maverick, Zavalla and Dimmit\n
            Counties, Texas, to-wit:\n3 -1.\n"
          },
          "context": {
            "pageNumber": 2
          }
        }
      ]
    }
  ]
}
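
Given the nesting shown above (an outer `responses` entry per file, an inner `responses` entry per page), the detected text can be collected per page. The following sketch works on the parsed JSON as a plain dictionary; `extract_page_texts` is a hypothetical helper, and the sample dictionary is a heavily shortened stand-in for a real response.

```python
from typing import Dict, List, Optional, Tuple


def extract_page_texts(response: Dict) -> List[Tuple[Optional[int], str]]:
    """Return (page number, full text) pairs from a files:annotate response."""
    results = []
    for file_response in response.get("responses", []):
        for page_response in file_response.get("responses", []):
            page_number = page_response.get("context", {}).get("pageNumber")
            text = page_response.get("fullTextAnnotation", {}).get("text", "")
            results.append((page_number, text))
    return results


# Shortened stand-in for the response shown above.
sample = {
    "responses": [
        {
            "responses": [
                {
                    "fullTextAnnotation": {"text": "OIL, GAS AND MINERAL LEASE\n"},
                    "context": {"pageNumber": 2},
                }
            ]
        }
    ]
}

for page_number, text in extract_page_texts(sample):
    print(f"Page {page_number}: {text!r}")
```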

Java

Before trying this sample, follow the Java setup instructions in the Vision API quickstart: Using client libraries. For more information, see the Vision API Java reference documentation.

import com.google.cloud.vision.v1.AnnotateFileRequest;
import com.google.cloud.vision.v1.AnnotateImageResponse;
import com.google.cloud.vision.v1.BatchAnnotateFilesRequest;
import com.google.cloud.vision.v1.BatchAnnotateFilesResponse;
import com.google.cloud.vision.v1.Block;
import com.google.cloud.vision.v1.Feature;
import com.google.cloud.vision.v1.ImageAnnotatorClient;
import com.google.cloud.vision.v1.InputConfig;
import com.google.cloud.vision.v1.Page;
import com.google.cloud.vision.v1.Paragraph;
import com.google.cloud.vision.v1.Symbol;
import com.google.cloud.vision.v1.Word;
import com.google.protobuf.ByteString;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class BatchAnnotateFiles {

  public static void batchAnnotateFiles() throws IOException {
    String filePath = "path/to/your/file.pdf";
    batchAnnotateFiles(filePath);
  }

  public static void batchAnnotateFiles(String filePath) throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (ImageAnnotatorClient imageAnnotatorClient = ImageAnnotatorClient.create()) {
      // You can send multiple files to be annotated; this sample demonstrates how to do this with
      // one file. If you want to use multiple files, you have to create an `AnnotateFileRequest`
      // object for each file that you want annotated.
      // First, read the file's contents.
      Path path = Paths.get(filePath);
      byte[] data = Files.readAllBytes(path);
      ByteString content = ByteString.copyFrom(data);

      // Specify the input config with the file's contents and its type.
      // Supported mime_type: application/pdf, image/tiff, image/gif
      // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#inputconfig
      InputConfig inputConfig =
          InputConfig.newBuilder().setMimeType("application/pdf").setContent(content).build();

      // Set the type of annotation you want to perform on the file
      // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.Feature.Type
      Feature feature = Feature.newBuilder().setType(Feature.Type.DOCUMENT_TEXT_DETECTION).build();

      // Build the request object for that one file. Note: for additional file you have to create
      // additional `AnnotateFileRequest` objects and store them in a list to be used below.
      // Since we are sending a file of type `application/pdf`, we can use the `pages` field to
      // specify which pages to process. The service can process up to 5 pages per document file.
      // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.AnnotateFileRequest
      AnnotateFileRequest fileRequest =
          AnnotateFileRequest.newBuilder()
              .setInputConfig(inputConfig)
              .addFeatures(feature)
              .addPages(1) // Process the first page
              .addPages(2) // Process the second page
              .addPages(-1) // Process the last page
              .build();

      // Add each `AnnotateFileRequest` object to the batch request.
      BatchAnnotateFilesRequest request =
          BatchAnnotateFilesRequest.newBuilder().addRequests(fileRequest).build();

      // Make the synchronous batch request.
      BatchAnnotateFilesResponse response = imageAnnotatorClient.batchAnnotateFiles(request);

      // Process the results, just get the first result, since only one file was sent in this
      // sample.
      for (AnnotateImageResponse imageResponse :
          response.getResponsesList().get(0).getResponsesList()) {
        System.out.format("Full text: %s%n", imageResponse.getFullTextAnnotation().getText());
        for (Page page : imageResponse.getFullTextAnnotation().getPagesList()) {
          for (Block block : page.getBlocksList()) {
            System.out.format("%nBlock confidence: %s%n", block.getConfidence());
            for (Paragraph par : block.getParagraphsList()) {
              System.out.format("\tParagraph confidence: %s%n", par.getConfidence());
              for (Word word : par.getWordsList()) {
                System.out.format("\t\tWord confidence: %s%n", word.getConfidence());
                for (Symbol symbol : word.getSymbolsList()) {
                  System.out.format(
                      "\t\t\tSymbol: %s, (confidence: %s)%n",
                      symbol.getText(), symbol.getConfidence());
                }
              }
            }
          }
        }
      }
    }
  }
}

Node.js

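Before trying this sample, follow the Node.js setup instructions in the Vision quickstart: Using client libraries. For more information, see the Vision Node.js API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.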

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const fileName = 'path/to/your/file.pdf';

// Imports the Google Cloud client libraries
const {ImageAnnotatorClient} = require('@google-cloud/vision').v1;
const fs = require('fs').promises;

// Instantiates a client
const client = new ImageAnnotatorClient();

// You can send multiple files to be annotated, this sample demonstrates how to do this with
// one file. If you want to use multiple files, you have to create a request object for each file that you want annotated.
async function batchAnnotateFiles() {
  // First, specify the input config with the file's contents and its type.
  // Supported mime_type: application/pdf, image/tiff, image/gif
  // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#inputconfig
  const inputConfig = {
    mimeType: 'application/pdf',
    content: await fs.readFile(fileName),
  };

  // Set the type of annotation you want to perform on the file
  // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.Feature.Type
  const features = [{type: 'DOCUMENT_TEXT_DETECTION'}];

  // Build the request object for that one file. Note: for additional files you have to create
  // additional file request objects and store them in a list to be used below.
  // Since we are sending a file of type `application/pdf`, we can use the `pages` field to
  // specify which pages to process. The service can process up to 5 pages per document file.
  // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.AnnotateFileRequest
  const fileRequest = {
    inputConfig: inputConfig,
    features: features,
    // Annotate the first two pages and the last one (max 5 pages)
    // First page starts at 1, and not 0. Last page is -1.
    pages: [1, 2, -1],
  };

  // Add each `AnnotateFileRequest` object to the batch request.
  const request = {
    requests: [fileRequest],
  };

  // Make the synchronous batch request.
  const [result] = await client.batchAnnotateFiles(request);

  // Process the results, just get the first result, since only one file was sent in this
  // sample.
  const responses = result.responses[0].responses;

  for (const response of responses) {
    console.log(`Full text: ${response.fullTextAnnotation.text}`);
    for (const page of response.fullTextAnnotation.pages) {
      for (const block of page.blocks) {
        console.log(`Block confidence: ${block.confidence}`);
        for (const paragraph of block.paragraphs) {
          console.log(` Paragraph confidence: ${paragraph.confidence}`);
          for (const word of paragraph.words) {
            const symbol_texts = word.symbols.map(symbol => symbol.text);
            const word_text = symbol_texts.join('');
            console.log(
              `  Word text: ${word_text} (confidence: ${word.confidence})`
            );
            for (const symbol of word.symbols) {
              console.log(
                `   Symbol: ${symbol.text} (confidence: ${symbol.confidence})`
              );
            }
          }
        }
      }
    }
  }
}

batchAnnotateFiles();

Python

Before trying this sample, follow the Python setup instructions in the Vision quickstart: Using client libraries. For more information, see the Vision Python API reference documentation.



from google.cloud import vision_v1


def sample_batch_annotate_files(file_path="path/to/your/document.pdf"):
    """Perform batch file annotation."""
    client = vision_v1.ImageAnnotatorClient()

    # Supported mime_type: application/pdf, image/tiff, image/gif
    mime_type = "application/pdf"
    with open(file_path, "rb") as f:
        content = f.read()
    input_config = {"mime_type": mime_type, "content": content}
    features = [{"type_": vision_v1.Feature.Type.DOCUMENT_TEXT_DETECTION}]

    # The service can process up to 5 pages per document file. Here we specify
    # the first, second, and last page of the document to be processed.
    pages = [1, 2, -1]
    requests = [{"input_config": input_config, "features": features, "pages": pages}]

    response = client.batch_annotate_files(requests=requests)
    for image_response in response.responses[0].responses:
        print(f"Full text: {image_response.full_text_annotation.text}")
        for page in image_response.full_text_annotation.pages:
            for block in page.blocks:
                print(f"\nBlock confidence: {block.confidence}")
                for par in block.paragraphs:
                    print(f"\tParagraph confidence: {par.confidence}")
                    for word in par.words:
                        print(f"\t\tWord confidence: {word.confidence}")
                        for symbol in word.symbols:
                            print(
                                "\t\t\tSymbol: {}, (confidence: {})".format(
                                    symbol.text, symbol.confidence
                                )
                            )
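
The samples above pass `pages = [1, 2, -1]`, where page numbering starts at 1 and -1 denotes the last page. To make that convention concrete, the sketch below maps such a list to absolute page numbers for a document of known length. `resolve_pages` is a hypothetical helper that runs locally and is not part of the client library; treating other negative values (for example -2 as the second-to-last page) the same way is an assumption extrapolated from the -1 case.

```python
from typing import List

MAX_PAGES_PER_REQUEST = 5  # online limit for a single file


def resolve_pages(pages: List[int], total_pages: int) -> List[int]:
    """Map 1-based and negative page numbers to absolute 1-based numbers."""
    if len(pages) > MAX_PAGES_PER_REQUEST:
        raise ValueError("at most 5 pages can be requested online")
    resolved = []
    for page in pages:
        # Page 0 does not exist, and the magnitude must fit the document.
        if page == 0 or abs(page) > total_pages:
            raise ValueError(f"page {page} is out of range")
        # Negative values count from the end: -1 is the last page.
        resolved.append(page if page > 0 else total_pages + page + 1)
    return resolved


print(resolve_pages([1, 2, -1], total_pages=10))  # [1, 2, 10]
```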

Using a file on Cloud Storage

Use the following code samples to get annotations for any feature from a file on Cloud Storage.

REST

To perform online PDF/TIFF/GIF feature detection for a small batch of files, send a POST request and provide the appropriate request body.

Before using any of the request data, make the following replacements:

CLOUD_STORAGE_FILE_URI: The path to a valid file (PDF/TIFF) in a Cloud Storage bucket. You must have at least read permission for the file. Example:
- gs://cloud-samples-data/vision/document_understanding/custom_0773375000.pdf
PROJECT_ID: Your Google Cloud project ID.

Field-specific considerations:

inputConfig.mimeType - One of the following: application/pdf, image/tiff, or image/gif.
pages - The specific pages of the file on which to perform feature detection.

HTTP method and URL:

POST https://vision.googleapis.com/v1/files:annotate

Request JSON body:

{
  "requests": [
    {
      "inputConfig": {
        "gcsSource": {
          "uri": "CLOUD_STORAGE_FILE_URI"
        },
        "mimeType": "application/pdf"
      },
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        }
      ],
      "pages": [
        1,2,3,4,5
      ]
    }
  ]
}
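
The Cloud Storage variant of the request body differs only in `inputConfig`, which carries a `gcsSource.uri` instead of inline content. The sketch below builds it from a URI. `build_gcs_request_body` is a hypothetical helper, and inferring the MIME type from the file extension is an assumption made for convenience; the API itself takes whatever `mimeType` the request states.

```python
import os
from typing import Dict, List

# Extension-to-MIME mapping for the three supported types (an assumption:
# the file name is trusted to reflect the real content type).
MIME_TYPES = {
    ".pdf": "application/pdf",
    ".tif": "image/tiff",
    ".tiff": "image/tiff",
    ".gif": "image/gif",
}


def build_gcs_request_body(
    gcs_uri: str,
    pages: List[int],
    feature_type: str = "DOCUMENT_TEXT_DETECTION",
) -> Dict:
    """Build a files:annotate request body for one file in Cloud Storage."""
    if not gcs_uri.startswith("gs://"):
        raise ValueError("expected a gs:// URI")
    extension = os.path.splitext(gcs_uri)[1].lower()
    if extension not in MIME_TYPES:
        raise ValueError(f"unsupported file extension: {extension!r}")
    return {
        "requests": [
            {
                "inputConfig": {
                    "gcsSource": {"uri": gcs_uri},
                    "mimeType": MIME_TYPES[extension],
                },
                "features": [{"type": feature_type}],
                "pages": pages,
            }
        ]
    }


body = build_gcs_request_body(
    "gs://cloud-samples-data/vision/document_understanding/custom_0773375000.pdf",
    [1, 2, 3, 4, 5],
)
print(body["requests"][0]["inputConfig"])
```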

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and run the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_ID" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://vision.googleapis.com/v1/files:annotate"

PowerShell

Save the request body in a file named request.json, and run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://vision.googleapis.com/v1/files:annotate" | Select-Object -Expand Content
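
Where neither curl nor PowerShell is available, the same POST request can be expressed with Python's standard library. This is a sketch under stated assumptions: `build_http_request` is a hypothetical helper, and the access token and project ID are placeholders supplied by the caller (for example, the output of `gcloud auth print-access-token`). The request is only constructed here; the call that sends it is left commented out.

```python
import json
import urllib.request

ENDPOINT = "https://vision.googleapis.com/v1/files:annotate"


def build_http_request(body, access_token, project_id):
    """Build, without sending, the POST request issued by the commands above."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "x-goog-user-project": project_id,
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


# Placeholder values: substitute a real access token and your own project ID.
request = build_http_request({"requests": []}, "ACCESS_TOKEN", "PROJECT_ID")
print(request.get_method(), request.full_url)

# To send the request and print the JSON response:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```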

Response:

A successful annotate request immediately returns a JSON response.

The following JSON response is for a single page (page 2) and has been shortened for clarity.

{
  "responses": [
    {
      "responses": [
        {
          "fullTextAnnotation": {
            "pages": [
              {
                "property": {
                  "detectedLanguages": [
                    {
                      "languageCode": "en",
                      "confidence": 0.99
                    },
                    {
                      "languageCode": "pl",
                      "confidence": 0.01
                    }
                  ]
                },
                "width": 1342,
                "height": 2234,
                "blocks": [
                  {
                    "boundingBox": {
                      "vertices": [
                      ...
                      ]
                    },
                    "paragraphs": [
                      {
                        "boundingBox": {
                          "vertices": [
                          ...
                          ]
                        },
                        "words": [
                          {
                            "property": {
                              "detectedLanguages": [
                                {
                                  "languageCode": "en"
                                }
                              ]
                            },
                            "boundingBox": {
                              "vertices": [
                              ...
                              ]
                            },
                            "symbols": [
                              {
                                "property": {
                                  "detectedLanguages": [
                                    {
                                      "languageCode": "en"
                                    }
                                  ],
                                  "detectedBreak": {
                                    "type": "SPACE"
                                  }
                                },
                                "boundingBox": {
                                  "vertices": [
                                ...
                                  ]
                                },
                                "text": "#",
                                "confidence": 0.07
                              }
                            ],
                            "confidence": 0.07
                          },
                          ...
                    ],
                    "blockType": "TEXT",
                    "confidence": 0.88
                  },
                  ...
            ...
            "text": "# THE STATE OF TEXAS\n0\nOIL, GAS AND MINERAL LEASE\n
            COUNTY OF MAVERICK\nTHIS AGREEMENT made this 14 day of_June\n1954,
            between Norvel J. Chittim and his wife, Lieschen G. Chittim;\nMary
            Anne Chittim Parker, joined herein pro forma by her husband,\nJoseph
            Bright Parker; Dorothea Chittim Oppenheimer, joined herein\nji pro
            forma by her husband, Fred J. Oppenheimer; Tuleta Chittim\nWright,
            joined herein pro forma by her husband, Gilbert G. Wright,\nJr.;
            Gilbert G. Wright, III; Delă Wright White, joined herein pro\nforma
            by her husband, John H. White; Anne Wright Basse, joined\nherein
            pro forma by her husband, E. A. Basse, Jr.; Norvel J.\nChittim,
            Independent Executor and Trustee for Estate of Marstella\nChittim,
            Deceased; Mary Louise Roswell, joined herein pro forma by\nher
            husband, Charles M. 'Roswell; and James M. Chittim and his wife\n
            Thelma Neal Chittim; as LESSORS, and W. L. Scheig of San Antonio,\n
            Texas, as LESSEE,\n10\nW ITNESS ETH:\nLessors, in consideration of
            $10.00, cash in hand paid, i\nof the royalties herein provided,
            and of the agreement of Lessee\nherein contained, hereby grant,
            lease and let exclusively unto\nLessee the tracts of land
            hereinafter described for the purpose of\ntesting for mineral
            indications, and in such tests use the Seismo-\ngraph, Torsion
            Balance, Core Drill, or any other tools, machinery,\nequipment
            or explosive necessary and proper; and also prospecting,\ndrilling
            and mining for and producing oil, gas and other minerals i\n
            (except metallic minerals), laying pipe lines, building tanks,\n
            power stations, telephone lines and other structures thereon to\n
            produce, save, take care of, treat, transport and own said pro-\n
            ducts and housing its employees (Lessee to conduct its geophysical\n
            work in such manner as not to damage the buildings, water tanks\n
            or wells of Lessors, or the livestock of Lessors or Lessors' ten-\n
            ants, ) said lands being situated in Maverick, Zavalla and Dimmit\n
            Counties, Texas, to-wit:\n3 -1.\n"
          },
          "context": {
            "pageNumber": 2
          }
        }
      ]
    }
  ]
}
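
The response nests one inner `responses` list per file, with one entry per annotated page. The following Python sketch shows one way to pull out the detected text for each page from the decoded JSON. The function name `extract_page_texts` and the trimmed-down `sample` dictionary are illustrative stand-ins, not real API output.

```python
def extract_page_texts(response_body):
    """Map each page number to the full text detected on that page."""
    texts = {}
    for file_response in response_body.get("responses", []):
        for page_response in file_response.get("responses", []):
            page_number = page_response.get("context", {}).get("pageNumber")
            annotation = page_response.get("fullTextAnnotation", {})
            texts[page_number] = annotation.get("text", "")
    return texts


# Hypothetical, heavily shortened stand-in with the same shape as the response above.
sample = {
    "responses": [
        {
            "responses": [
                {
                    "fullTextAnnotation": {"text": "OIL, GAS AND MINERAL LEASE\n"},
                    "context": {"pageNumber": 1},
                },
                {
                    "fullTextAnnotation": {"text": "THE STATE OF TEXAS\n"},
                    "context": {"pageNumber": 2},
                },
            ]
        }
    ]
}

for page_number, text in extract_page_texts(sample).items():
    print(f"Page {page_number}: {text!r}")
```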

Java

import com.google.cloud.vision.v1.AnnotateFileRequest;
import com.google.cloud.vision.v1.AnnotateImageResponse;
import com.google.cloud.vision.v1.BatchAnnotateFilesRequest;
import com.google.cloud.vision.v1.BatchAnnotateFilesResponse;
import com.google.cloud.vision.v1.Block;
import com.google.cloud.vision.v1.Feature;
import com.google.cloud.vision.v1.GcsSource;
import com.google.cloud.vision.v1.ImageAnnotatorClient;
import com.google.cloud.vision.v1.InputConfig;
import com.google.cloud.vision.v1.Page;
import com.google.cloud.vision.v1.Paragraph;
import com.google.cloud.vision.v1.Symbol;
import com.google.cloud.vision.v1.Word;
import java.io.IOException;

public class BatchAnnotateFilesGcs {

  public static void batchAnnotateFilesGcs() throws IOException {
    String gcsUri = "gs://cloud-samples-data/vision/document_understanding/kafka.pdf";
    batchAnnotateFilesGcs(gcsUri);
  }

  public static void batchAnnotateFilesGcs(String gcsUri) throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (ImageAnnotatorClient imageAnnotatorClient = ImageAnnotatorClient.create()) {
      // You can send multiple files to be annotated. This sample demonstrates how to do this with
      // one file. To annotate multiple files, create an `AnnotateFileRequest` object for each
      // file that you want annotated.
      // First, specify where the Vision API can find the file.
      GcsSource gcsSource = GcsSource.newBuilder().setUri(gcsUri).build();

      // Specify the input config with the file's uri and its type.
      // Supported mime_type: application/pdf, image/tiff, image/gif
      // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#inputconfig
      InputConfig inputConfig =
          InputConfig.newBuilder().setMimeType("application/pdf").setGcsSource(gcsSource).build();

      // Set the type of annotation you want to perform on the file
      // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.Feature.Type
      Feature feature = Feature.newBuilder().setType(Feature.Type.DOCUMENT_TEXT_DETECTION).build();

      // Build the request object for that one file. Note: for additional files you have to create
      // additional `AnnotateFileRequest` objects and store them in a list to be used below.
      // Since we are sending a file of type `application/pdf`, we can use the `pages` field to
      // specify which pages to process. The service can process up to 5 pages per document file.
      // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.AnnotateFileRequest
      AnnotateFileRequest fileRequest =
          AnnotateFileRequest.newBuilder()
              .setInputConfig(inputConfig)
              .addFeatures(feature)
              .addPages(1) // Process the first page
              .addPages(2) // Process the second page
              .addPages(-1) // Process the last page
              .build();

      // Add each `AnnotateFileRequest` object to the batch request.
      BatchAnnotateFilesRequest request =
          BatchAnnotateFilesRequest.newBuilder().addRequests(fileRequest).build();

      // Make the synchronous batch request.
      BatchAnnotateFilesResponse response = imageAnnotatorClient.batchAnnotateFiles(request);

      // Process the results, just get the first result, since only one file was sent in this
      // sample.
      for (AnnotateImageResponse imageResponse :
          response.getResponsesList().get(0).getResponsesList()) {
        System.out.format("Full text: %s%n", imageResponse.getFullTextAnnotation().getText());
        for (Page page : imageResponse.getFullTextAnnotation().getPagesList()) {
          for (Block block : page.getBlocksList()) {
            System.out.format("%nBlock confidence: %s%n", block.getConfidence());
            for (Paragraph par : block.getParagraphsList()) {
              System.out.format("\tParagraph confidence: %s%n", par.getConfidence());
              for (Word word : par.getWordsList()) {
                System.out.format("\t\tWord confidence: %s%n", word.getConfidence());
                for (Symbol symbol : word.getSymbolsList()) {
                  System.out.format(
                      "\t\t\tSymbol: %s, (confidence: %s)%n",
                      symbol.getText(), symbol.getConfidence());
                }
              }
            }
          }
        }
      }
    }
  }
}

Node.js

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const gcsSourceUri = 'gs://cloud-samples-data/vision/document_understanding/kafka.pdf';

// Imports the Google Cloud client libraries
const {ImageAnnotatorClient} = require('@google-cloud/vision').v1;

// Instantiates a client
const client = new ImageAnnotatorClient();

// You can send multiple files to be annotated, this sample demonstrates how to do this with
// one file. If you want to use multiple files, you have to create a request object for each file that you want annotated.
async function batchAnnotateFiles() {
  // First Specify the input config with the file's uri and its type.
  // Supported mime_type: application/pdf, image/tiff, image/gif
  // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#inputconfig
  const inputConfig = {
    mimeType: 'application/pdf',
    gcsSource: {
      uri: gcsSourceUri,
    },
  };

  // Set the type of annotation you want to perform on the file
  // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.Feature.Type
  const features = [{type: 'DOCUMENT_TEXT_DETECTION'}];

  // Build the request object for that one file. Note: for additional files you have to create
  // additional file request objects and store them in a list to be used below.
  // Since we are sending a file of type `application/pdf`, we can use the `pages` field to
  // specify which pages to process. The service can process up to 5 pages per document file.
  // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.AnnotateFileRequest
  const fileRequest = {
    inputConfig: inputConfig,
    features: features,
    // Annotate the first two pages and the last one (max 5 pages)
    // First page starts at 1, and not 0. Last page is -1.
    pages: [1, 2, -1],
  };

  // Add each `AnnotateFileRequest` object to the batch request.
  const request = {
    requests: [fileRequest],
  };

  // Make the synchronous batch request.
  const [result] = await client.batchAnnotateFiles(request);

  // Process the results, just get the first result, since only one file was sent in this
  // sample.
  const responses = result.responses[0].responses;

  for (const response of responses) {
    console.log(`Full text: ${response.fullTextAnnotation.text}`);
    for (const page of response.fullTextAnnotation.pages) {
      for (const block of page.blocks) {
        console.log(`Block confidence: ${block.confidence}`);
        for (const paragraph of block.paragraphs) {
          console.log(` Paragraph confidence: ${paragraph.confidence}`);
          for (const word of paragraph.words) {
            const symbol_texts = word.symbols.map(symbol => symbol.text);
            const word_text = symbol_texts.join('');
            console.log(
              `  Word text: ${word_text} (confidence: ${word.confidence})`
            );
            for (const symbol of word.symbols) {
              console.log(
                `   Symbol: ${symbol.text} (confidence: ${symbol.confidence})`
              );
            }
          }
        }
      }
    }
  }
}

batchAnnotateFiles();

Python


from google.cloud import vision_v1


def sample_batch_annotate_files(
    storage_uri="gs://cloud-samples-data/vision/document_understanding/kafka.pdf",
):
    """Perform batch file annotation."""
    mime_type = "application/pdf"

    client = vision_v1.ImageAnnotatorClient()

    gcs_source = {"uri": storage_uri}
    input_config = {"gcs_source": gcs_source, "mime_type": mime_type}
    features = [{"type_": vision_v1.Feature.Type.DOCUMENT_TEXT_DETECTION}]

    # The service can process up to 5 pages per document file.
    # Here we specify the first, second, and last page of the document to be
    # processed.
    pages = [1, 2, -1]
    requests = [{"input_config": input_config, "features": features, "pages": pages}]

    response = client.batch_annotate_files(requests=requests)
    for image_response in response.responses[0].responses:
        print(f"Full text: {image_response.full_text_annotation.text}")
        for page in image_response.full_text_annotation.pages:
            for block in page.blocks:
                print(f"\nBlock confidence: {block.confidence}")
                for par in block.paragraphs:
                    print(f"\tParagraph confidence: {par.confidence}")
                    for word in par.words:
                        print(f"\t\tWord confidence: {word.confidence}")
                        for symbol in word.symbols:
                            print(
                                "\t\t\tSymbol: {}, (confidence: {})".format(
                                    symbol.text, symbol.confidence
                                )
                            )
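
The Node.js sample rebuilds each word's text by joining its symbols, whereas the Python sample prints only confidences. The sketch below shows the equivalent step in Python. It relies only on the attribute names used in the sample (`words`, `symbols`, `text`, `confidence`); the helper names and the `SimpleNamespace` stand-in objects are assumptions made so that it runs without calling the API.

```python
from types import SimpleNamespace


def word_text(word):
    """Join the text of every symbol in a word, as the Node.js sample does."""
    return "".join(symbol.text for symbol in word.symbols)


def paragraph_words(paragraph):
    """Return (text, confidence) pairs for each word in a paragraph."""
    return [(word_text(word), word.confidence) for word in paragraph.words]


# Stand-in objects that mimic the attribute layout of the response messages.
word = SimpleNamespace(
    confidence=0.98,
    symbols=[SimpleNamespace(text=char) for char in "LEASE"],
)
paragraph = SimpleNamespace(words=[word])
print(paragraph_words(paragraph))
```

In the sample, the same helpers could be called with each `par` from the `for par in block.paragraphs` loop.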

Try it

Try online feature detection on a small batch file below.

You can use the PDF file that is already specified or specify your own file in its place.

The request specifies the following three feature types:

DOCUMENT_TEXT_DETECTION
LABEL_DETECTION
CROP_HINTS

You can add or remove feature types by changing the corresponding object in the request ({"type": "FEATURE_NAME"}).

Select Execute to send the request.

Request body:

{
  "requests": [
    {
      "inputConfig": {
        "gcsSource": {
          "uri": "gs://cloud-samples-data/vision/document_understanding/custom_0773375000.pdf"
        },
        "mimeType": "application/pdf"
      },
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        },
        {
          "type": "LABEL_DETECTION"
        },
        {
          "type": "CROP_HINTS"
        }
      ],
      "pages": [
        1,
        2,
        3,
        4,
        5
      ]
    }
  ]
}
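
Swapping feature types in and out of a request body like the one above can also be scripted. The following Python sketch uses a hypothetical helper, `with_feature_types`, that returns a copy of a request body with a new feature list and leaves the original untouched.

```python
import copy


def with_feature_types(body, feature_types):
    """Return a copy of the body in which every file request uses the given features."""
    updated = copy.deepcopy(body)
    for file_request in updated["requests"]:
        file_request["features"] = [{"type": name} for name in feature_types]
    return updated


original = {
    "requests": [
        {
            "inputConfig": {
                "gcsSource": {
                    "uri": "gs://cloud-samples-data/vision/document_understanding/custom_0773375000.pdf"
                },
                "mimeType": "application/pdf",
            },
            "features": [
                {"type": "DOCUMENT_TEXT_DETECTION"},
                {"type": "LABEL_DETECTION"},
                {"type": "CROP_HINTS"},
            ],
            "pages": [1, 2, 3, 4, 5],
        }
    ]
}

# Keep only text detection and label detection.
trimmed = with_feature_types(original, ["DOCUMENT_TEXT_DETECTION", "LABEL_DETECTION"])
print(trimmed["requests"][0]["features"])
```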
