Small batch file annotation online

Online (synchronous) requests: an online annotation request (images:annotate or files:annotate) immediately returns inline annotations to the user. Online annotation requests limit the number of files you can annotate in a single request. With an images:annotate request, you can specify only a small number of images (<=16) to annotate. With a files:annotate request, you can specify only a single file and a small number of pages (<=5) to annotate in that file.
Offline (asynchronous) requests: an offline annotation request (images:asyncBatchAnnotate or files:asyncBatchAnnotate) starts a long-running operation and does not immediately return a response to the caller. When the long-running operation completes, the annotations are stored as files in a Cloud Storage bucket that you specify. An images:asyncBatchAnnotate request lets you specify up to 2,000 images per request. A files:asyncBatchAnnotate request lets you specify larger batches of files and more pages (<=2,000) per file to annotate at one time than online requests allow.
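
The limits above can be turned into a small decision helper. The following is a minimal sketch, not part of any Vision client library: the function and constant names are hypothetical, and the thresholds are simply the ones quoted in the two items above.

```python
# Limits quoted in the text above. All names here are illustrative only.
MAX_ONLINE_IMAGES = 16
MAX_ONLINE_PAGES = 5
MAX_ASYNC_IMAGES = 2000
MAX_ASYNC_PAGES = 2000


def pick_image_method(image_count: int) -> str:
    """Return the REST method that fits a batch of `image_count` images."""
    if image_count < 1:
        raise ValueError("image_count must be at least 1")
    if image_count <= MAX_ONLINE_IMAGES:
        return "images:annotate"
    if image_count <= MAX_ASYNC_IMAGES:
        return "images:asyncBatchAnnotate"
    raise ValueError("more than 2,000 images: split the batch into several requests")


def pick_file_method(page_count: int) -> str:
    """Return the REST method that fits `page_count` pages of a single file."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    if page_count <= MAX_ONLINE_PAGES:
        return "files:annotate"
    if page_count <= MAX_ASYNC_PAGES:
        return "files:asyncBatchAnnotate"
    raise ValueError("more than 2,000 pages: split the file before annotating")
```

For example, `pick_file_method(5)` selects the online method, while `pick_file_method(6)` falls through to the asynchronous one.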

The Cloud Vision API can provide online (immediate) annotation of multiple pages or images from PDF, TIFF, or GIF files stored in Cloud Storage.

You can request online feature detection and annotation of up to five images (GIF; "image/gif") or pages (PDF; "application/pdf", or TIFF; "image/tiff") of your choice for each file.

The example annotations on this page are for DOCUMENT_TEXT_DETECTION, but small batch online annotation is available for all Vision features.

Note: The Cloud Vision API also supports offline asynchronous PDF/TIFF file annotation, which is currently available only for the DOCUMENT_TEXT_DETECTION feature type.

Offline asynchronous requests return JSON response files to your Cloud Storage bucket and accept files of up to 2,000 pages. For more information, see Detect text in files (PDF or TIFF).

First five pages of a PDF file: gs://cloud-samples-data/vision/document_understanding/custom_0773375000.pdf

Page 1


...
"text": "á\n7.1.15\nOIL, GAS AND MINERAL LEASE
\nNORVEL J. CHITTIM, ET AL\n.\n.
\nTO\nW. L. SCHEIG\n"
},
"context": {"pageNumber": 1}
...

Page 2


...
"text": "...\n.\n*\n.\n.\n.\nA\nNY\nALA...\n7
\n| THE STATE OF TEXAS
\nOIL, GAS AND MINERAL LEASE
\nCOUNTY OF MAVERICK ]
\nTHIS AGREEMENT made this 14 day of_June
\n1954, between Norvel J. Chittim and his wife, Lieschen G. Chittim;
\nMary Anne Chittim Parker, joined herein pro forma by her husband,
\nJoseph Bright Parker; Dorothea Chittim Oppenheimer, joined herein
\npro forma by her husband, Fred J. Oppenheimer; Tuleta Chittim
\nWright, joined herein pro forma by her husband, Gilbert G. Wright,
\nJr.; Gilbert G. Wright, III; Dela Wright White, joined herein pro
\nforma by her husband, John H. White; Anne Wright Basse, joined
\nherein pro forma by her husband, E. A. Basse, Jr.; Norvel J.
\nChittim, Independent Executor and Trustee for Estate of Marstella
\nChittim, Deceased; Mary Louise Roswell, joined herein pro forma by
\nher husband, Charles M. 'Roswell; and James M. Chittim and his wife,
\nThelma Neal Chittim; as LESSORS, and W. L. Scheig of San Antonio,
\nTexas, as LESSEE,


\nW I T N E s s E T H:
\n1. Lessors, in consideration of $10.00, cash in hand paid,
\nof the royalties herein provided, and of the agreement of Lessee
\nherein contained, hereby grant, lease and let exclusively unto
\nLessee the tracts of land hereinafter described for the purpose of
\ntesting for mineral indications, and in such tests use the Seismo-
\ngraph, Torsion Balance, Core Drill, or any other tools, machinery,
\nequipment or explosive necessary and proper; and also prospecting,
\ndrilling and mining for and producing oil, gas and other minerals
\n(except metallic minerals), laying pipe lines, building tanks,
\npower stations, telephone lines and other structures thereon to
\nproduce, save, take care of, treat, transport and own said pro-
\nducts and housing its employees (Lessee to conduct its geophysical
\nwork in such manner as not to damage the buildings, water tanks
\nor wells of Lessors, or the livestock of Lessors or Lessors' ten- !
\nants, )said lands being situated in Maverick, Zavalla and Dimmit
\nCounties, Texas, to-wit:\n3-1.\n"
},
"context": {"pageNumber": 2}
...

Page 3


...
"text": "Being a tract consisting of 140,769.86 acres, more or
\nless, out of what is known as the \"Chittim Ranch\" in said counties,
\nas designated and described in Exhibit \"A\" hereto attached and
\nmade a part hereof as if fully written herein. It being under-
\nstood that the acreage intended to be included in this lease aggre-
\ngates approximately 140,769.86 acres whether it actually comprises
\nmore or less, but for the purpose of calculating the payments
\nhereinafter provided for, it is agreed that the land included with-
\nin the terms of this lease is One hundred forty thousand seven
\nhundred sixty-nine and eighty-six one hundredths (140,769.86) acres,
\nand that each survey listed above contains the acreage stated above.
\nIt is understood that tract designated \"TRACT II\" in
\nExhibit \"A\" is subject to a one-sixteenth (1/16) royalty reserved.
\nto the State of Texas, and the rights of the State of Texas must
\nbe respected in the development of the said property.


\n2. Subject to the other provisions hereof, this lease shall
\nbe for a term of ten (10) years from date hereof (called \"Primary
\nTerm\"), and as long thereafter as oil, gas or other minerals
\n(except metallic minerals) are produced from said land hereunder
\nin paying quantities, subject, however, to all of the terms and
\nprovisions of this lease. After expiration of the primary term,
\nthis lease shall terminate as to all lands included herein, save
\nand except as to those tracts which lessee maintains in force and
\neffect according to the requirements hereof.
\n3. The royalties to be paid by Lessee are (a) on oil, one-
\neighth (1/8) of that produced and saved from said land, the same to
\nbe delivered at the well or to the credit of Lessors into the pipe i
\nline to which the well may be connected; (b) on gas, including
\ni casinghead gas or other gaseous or vaporous substance, produced
\nfrom the leased premises and sold or used by Lessee off the leased
\npremises or in the manufacture of gasoline or other products, the
\nmarket value, at the mouth of the well, of one-eighth (1/8) of
\n.\n3-2-\n?\n"
},
"context": {"pageNumber": 3}
...

Page 4


...
"text": "•\n:\n.\nthe gas or casinghead gas so used or sold. On all gas or casing-
\nhead gas sold at the well, the royalty shall be one-eighth (1/8)
\nof the amounts realized from such sales. While gas from any well
\nproducing gas only is being used or sold by. Lessee, Lessor may have
\nenough of said gas for all stoves and inside lights in the prin-
\ncipal dwelling house on the leased premises by making Lessors' own
\nconnections with the well and by assuming all risk and paying all
\nexpenses. And (c) on all other minerals (except metallic minerals)
\nmined and marketed, one tenth (1/10). either in kind or value at the
\nwell or mine at Lessee's election.
\nFor the purpose of royalty payments under 3 (b) hereof,
\nall liquid hydrocarbons (including distillate) recovered and saved
\n| by Lessee in separators or traps on the leased premises shall be
\nconsidered as oil. Should such a plant be constructed by another
\nthan Lessee to whom Lessee should sell or deliver the gas or cas-
\ninghead gas produced from the leased premises for processing, then
\nthe royalty thereon shall be one-eighth (1/8) of the amounts
\nrealized by Lessee from such sales or deliveries.


\nOr if such plant is owned or constructed or operated by
\nLessee, then the royalty shall be on the basis of one-eighth (1/8) |
\nof the prevailing price in the area for such products..
\nThe provisions of this paragraph shall control as to any
\nconflict with Paragraph 3 (b). Lessors shall also be entitled to
\nsaid royalty interest in all residue gas .obtained, saved and mar-
\nketed from said premises, or used off the premises, or that may be
\nreplaced in the reservoir by 'any recycling process, settlement
\ntherefor to be made to Lessors when such gas is marketed or used
\noff the premises. !
\nIf at the expiration of the primary term of this lease
\nLessee has not found and produced oil or gas in paying quantities
\nin any formation lying fifty (50) feet below the base of what is
\nknown as the Rhodessa section at the particular point where the
\nwell is drilled, then, subject to the further provisions hereof,
\nthis lease shall terminate as to all horizons below fifty (50)
\nI feet below the Rhodessa section. And if at the expiration of the
\n3 -3-\n"
},
"context": {"pageNumber": 4}
...

Page 5


...
"text": ".\n.\n:\nI\n.\n.\n.:250:-....\n.\n...\n.\n....\n....\n..\n..\n. ..
\n.\n..\n.\n...\n...\n.-\n.\n.\n..\n..\n17\n.\n:\n-\n-\n-\n.\n..\n.
\nprimary term production of oil or gas in paying quantities is not
\nfound in the Jurassic, then this lease shall terminate as to the
\nJurassic and lower formations unless Lessee shall have completed
\nat least two (2) tests in the Jurassic. And after the primary
\nterm Lessee shall complete at least one (1) Jurassic test each
\nthree years on said property as to which this lease is still in
\neffect, until paying production is obtained in or below the
\nJurassic, or upon failure so to do Lessee shall release this
\nlease as to all formations below the top of the Jurassic. Upon
\ncompliance with the above provisions as to Jurassic tests, and
\nif production is found in the Jurassic, then, subject to the
\nother provisions hereof, this lease shall be effective as to all
\nhorizons, including the Jurassic..
\n5. It is understood and expressly agreed that the consider-
\niation first recited in this lease, the down cash payment, receipt
\nof which is hereby acknowledged by Lessors, is full and adequate
\nconsideration to maintain this lease in full force and effect for
\na period of one year from the date hereof, and does not impose
\nany obligation on the part of Lessee to drill and develop this
\nlease during the said term of one year from date of this lease.


\n6. This lease shall terminate as to both parties unless
\non or before one year from this date, Lessee shall pay to or ten- !
\nder to Lessors or to the credit of Lessors, in the National Bank
\nof Commerce, at San Antonio, Texas, (which bank and its successors
\nare Lessors' agent, and shall continue as the depository for all \"
\nrental payable hereunder regardless of changes in ownership of
\nsaid land or the rental), the sum of One Dollar ($1.00) per acre
\nas to all acreage then covered by this lease, and not surrendered,
\nor maintained by production of oil, gas or other minerals, or by
\ndrilling-reworking operations, all as hereinafter fully set out, :
\nwhich shall maintain this lease in full force and effect for
\nanother twelve-month period, without imposing any obligation on
\nthe part of Lessee to drill and develop this lease. In like
\nmanner, and upon like payment or tender annually, Lessee may
\nmaintain this lease .in full force and effect for successive
\ntwelve-month periods during the primary term, without imposing
\n.\n--.\n.\n.\n.\n-\n::\n---
\n-\n3\n.\n..-\n-\n-\n:.\n.\n::\n.
\n3-4-\n"
},
"context": {"pageNumber": 5}
...

Limits

At most five pages are annotated. Users can specify which five pages to annotate.
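
When a few more than five pages are needed, one possible approach is to split the page list into groups of five and send one online request per group. The sketch below only does the splitting. The helper name is hypothetical, and for large documents the asynchronous route described earlier is the documented option.

```python
from typing import Iterator, List, Sequence

MAX_PAGES_PER_REQUEST = 5  # online limit stated above


def chunk_pages(
    pages: Sequence[int], size: int = MAX_PAGES_PER_REQUEST
) -> Iterator[List[int]]:
    """Yield consecutive groups of at most `size` page numbers."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(pages), size):
        yield list(pages[start:start + size])


# Pages 1 to 12 become three groups: 1-5, 6-10, and 11-12.
batches = list(chunk_pages(range(1, 13)))
```

Each group would then go into the `pages` field of its own files:annotate request.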

Authentication

API keys are not supported for files:annotate requests.

Set up your Google Cloud project and authentication

If you have not created a Google Cloud project, do so now by following these steps.

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Make sure that billing is enabled for your Google Cloud project.

Enable the Vision API.

Install the Google Cloud CLI.

To initialize the gcloud CLI, run the following command:

gcloud init

Currently supported feature types

Feature type	Description
`CROP_HINTS`	Suggests vertices for cropping a region of an image.
`DOCUMENT_TEXT_DETECTION`	Performs optical character recognition (OCR) on dense text images, such as documents (PDF/TIFF) and images with handwriting. `TEXT_DETECTION` can be used for sparse text images. Takes precedence when both `DOCUMENT_TEXT_DETECTION` and `TEXT_DETECTION` are present.
`FACE_DETECTION`	Detects faces in the image.
`IMAGE_PROPERTIES`	Computes a set of image properties, such as the image's dominant colors.
`LABEL_DETECTION`	Adds labels based on image content.
`LANDMARK_DETECTION`	Detects geographic landmarks in the image.
`LOGO_DETECTION`	Detects company logos in the image.
`OBJECT_LOCALIZATION`	Detects and extracts multiple objects in an image.
`SAFE_SEARCH_DETECTION`	Runs SafeSearch to detect potentially unsafe or undesirable content.
`TEXT_DETECTION`	Performs optical character recognition (OCR) on text in the image. Text detection is optimized for areas of sparse text within a larger image. If the image is a dense-text PDF or TIFF document, or if it contains handwriting, use `DOCUMENT_TEXT_DETECTION` instead.
`WEB_DETECTION`	Detects topical entities such as news, events, or celebrities in the image, and finds similar images on the web using the power of Google Image Search.
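
Because any of these feature types can be requested, it can help to check names before they are sent. The following sketch builds the `features` array of a request body from a list of names. The helper and the set are hypothetical and mirror only the table above, not an official list maintained by the client libraries.

```python
# Feature names copied from the table above. The helper itself is illustrative.
SUPPORTED_FEATURES = {
    "CROP_HINTS",
    "DOCUMENT_TEXT_DETECTION",
    "FACE_DETECTION",
    "IMAGE_PROPERTIES",
    "LABEL_DETECTION",
    "LANDMARK_DETECTION",
    "LOGO_DETECTION",
    "OBJECT_LOCALIZATION",
    "SAFE_SEARCH_DETECTION",
    "TEXT_DETECTION",
    "WEB_DETECTION",
}


def build_features(*names: str) -> list:
    """Return a `features` list for a request body, rejecting unknown names."""
    unknown = [name for name in names if name not in SUPPORTED_FEATURES]
    if unknown:
        raise ValueError(f"unsupported feature type(s): {', '.join(unknown)}")
    return [{"type": name} for name in names]


features = build_features("DOCUMENT_TEXT_DETECTION", "LABEL_DETECTION")
```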

Code samples

You can send an annotation request with a locally stored file, or use a file that is stored in Cloud Storage.

Using a locally stored file

Use the following code samples to get any feature annotation for a locally stored file.

REST

To perform online PDF/TIFF/GIF feature detection for a small batch of files, make a POST request and provide the appropriate request body:

Before using any of the request data below, make the following replacements:

BASE64_ENCODED_FILE: the base64 representation (ASCII string) of your binary file data. This string should look similar to the following string:
- JVBERi0xLjUNCiW1tbW1...ydHhyZWYNCjk5NzM2OQ0KJSVFT0Y=
See the base64 encoding section for more information.
PROJECT_ID: your Google Cloud project ID.
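
The BASE64_ENCODED_FILE value can be produced with Python's standard `base64` module. This is a minimal sketch with hypothetical function names. It relies only on the fact that a PDF file starts with the bytes `%PDF-`, which is why the sample string above starts with `JVBERi0x`.

```python
import base64


def encode_file_bytes(data: bytes) -> str:
    """Return the base64 representation of raw file bytes as an ASCII string."""
    return base64.b64encode(data).decode("ascii")


def encode_file(path: str) -> str:
    """Read a local file and return its base64 representation."""
    with open(path, "rb") as f:
        return encode_file_bytes(f.read())


# The first six bytes of a PDF header encode to the prefix seen in the sample string.
prefix = encode_file_bytes(b"%PDF-1")
```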

Field-specific considerations:

inputConfig.mimeType: one of the following: "application/pdf", "image/tiff", or "image/gif".
pages: the specific pages of the file on which to perform feature detection.

HTTP method and URL:

POST https://vision.googleapis.com/v1/files:annotate

Request JSON body:

{
  "requests": [
    {
      "inputConfig": {
        "content": "BASE64_ENCODED_FILE",
        "mimeType": "application/pdf"
      },
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        }
      ],
      "pages": [
        1,2,3,4,5
      ]
    }
  ]
}
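
The same body can be generated programmatically and saved as request.json. The sketch below is illustrative only: the helper names are hypothetical, the MIME types and the five-page ceiling come from this page, and rejecting an empty page list is a choice made by this helper rather than a documented API rule.

```python
import json

ALLOWED_MIME_TYPES = {"application/pdf", "image/tiff", "image/gif"}
MAX_PAGES = 5


def build_annotate_body(
    encoded_file: str,
    mime_type: str = "application/pdf",
    pages=(1, 2, 3, 4, 5),
    feature_type: str = "DOCUMENT_TEXT_DETECTION",
) -> dict:
    """Build a files:annotate request body for one base64-encoded file."""
    if mime_type not in ALLOWED_MIME_TYPES:
        raise ValueError(f"unsupported mimeType: {mime_type}")
    if not 1 <= len(pages) <= MAX_PAGES:
        raise ValueError("specify between 1 and 5 pages")
    return {
        "requests": [
            {
                "inputConfig": {"content": encoded_file, "mimeType": mime_type},
                "features": [{"type": feature_type}],
                "pages": list(pages),
            }
        ]
    }


def write_request_file(body: dict, path: str = "request.json") -> None:
    """Save the body so that curl or PowerShell can send it."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(body, f, indent=2)
```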

To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you in to the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_ID" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://vision.googleapis.com/v1/files:annotate"

PowerShell

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://vision.googleapis.com/v1/files:annotate" | Select-Object -Expand Content
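
For readers who would rather stay in Python than use curl or PowerShell, the same HTTP request can be assembled with the standard library. This is a sketch under stated assumptions: the function names are hypothetical, and the access token is assumed to have been obtained separately, for example by running gcloud auth print-access-token.

```python
import json
import urllib.request

ENDPOINT = "https://vision.googleapis.com/v1/files:annotate"


def build_request(body: dict, access_token: str, project_id: str) -> urllib.request.Request:
    """Assemble the POST request with the same headers as the curl command."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "x-goog-user-project": project_id,
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (needs network access)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping `build_request` separate from `send` makes it possible to inspect the request before any network call is made.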

Response:

A successful annotate request immediately returns a JSON response.

For this feature (DOCUMENT_TEXT_DETECTION), the JSON response is similar to that of an image's document text detection request. The response contains bounding boxes for blocks broken down by paragraphs, words, and individual symbols. The full text is also detected. The response also contains a context field showing the location of the PDF or TIFF that was specified and the result's page number in the file.

The following response JSON is only for a single page (page 2) and has been shortened for clarity.



    {
      "responses": [
        {
          "responses": [
            {
              "fullTextAnnotation": {
                "pages": [
                  {
                    "property": {
                      "detectedLanguages": [
                        {
                          "languageCode": "en",
                          "confidence": 0.99
                        },
                        {
                          "languageCode": "pl",
                          "confidence": 0.01
                        }
                      ]
                    },
                    "width": 1342,
                    "height": 2234,
                    "blocks": [
                      {
                        "boundingBox": {
                          "vertices": [
                          ...
                          ]
                        },
                        "paragraphs": [
                          {
                            "boundingBox": {
                              "vertices": [
                              ...
                              ]
                            },
                            "words": [
                              {
                                "property": {
                                  "detectedLanguages": [
                                    {
                                      "languageCode": "en"
                                    }
                                  ]
                                },
                                "boundingBox": {
                                  "vertices": [
                                  ...
                                  ]
                                },
                                "symbols": [
                                  {
                                    "property": {
                                      "detectedLanguages": [
                                        {
                                          "languageCode": "en"
                                        }
                                      ],
                                      "detectedBreak": {
                                        "type": "SPACE"
                                      }
                                    },
                                    "boundingBox": {
                                      "vertices": [
                                    ...
                                      ]
                                    },
                                    "text": "#",
                                    "confidence": 0.07
                                  }
                                ],
                                "confidence": 0.07
                              },
                              ...
                        ],
                        "blockType": "TEXT",
                        "confidence": 0.88
                      },
                      ...
                ...
                "text": "# THE STATE OF TEXAS\n0\nOIL, GAS AND MINERAL LEASE\n
                COUNTY OF MAVERICK\nTHIS AGREEMENT made this 14 day of_June\n1954,
                between Norvel J. Chittim and his wife, Lieschen G. Chittim;\nMary
                Anne Chittim Parker, joined herein pro forma by her husband,\nJoseph
                Bright Parker; Dorothea Chittim Oppenheimer, joined herein\nji pro
                forma by her husband, Fred J. Oppenheimer; Tuleta Chittim\nWright,
                joined herein pro forma by her husband, Gilbert G. Wright,\nJr.;
                Gilbert G. Wright, III; Delă Wright White, joined herein pro\nforma
                by her husband, John H. White; Anne Wright Basse, joined\nherein
                pro forma by her husband, E. A. Basse, Jr.; Norvel J.\nChittim,
                Independent Executor and Trustee for Estate of Marstella\nChittim,
                Deceased; Mary Louise Roswell, joined herein pro forma by\nher
                husband, Charles M. 'Roswell; and James M. Chittim and his wife\n
                Thelma Neal Chittim; as LESSORS, and W. L. Scheig of San Antonio,\n
                Texas, as LESSEE,\n10\nW ITNESS ETH:\nLessors, in consideration of
                $10.00, cash in hand paid, i\nof the royalties herein provided,
                and of the agreement of Lessee\nherein contained, hereby grant,
                lease and let exclusively unto\nLessee the tracts of land
                hereinafter described for the purpose of\ntesting for mineral
                indications, and in such tests use the Seismo-\ngraph, Torsion
                Balance, Core Drill, or any other tools, machinery,\nequipment
                or explosive necessary and proper; and also prospecting,\ndrilling
                and mining for and producing oil, gas and other minerals i\n
                (except metallic minerals), laying pipe lines, building tanks,\n
                power stations, telephone lines and other structures thereon to\n
                produce, save, take care of, treat, transport and own said pro-\n
                ducts and housing its employees (Lessee to conduct its geophysical\n
                work in such manner as not to damage the buildings, water tanks\n
                or wells of Lessors, or the livestock of Lessors or Lessors' ten-\n
                ants, ) said lands being situated in Maverick, Zavalla and Dimmit\n
                Counties, Texas, to-wit:\n3 -1.\n"
              },
              "context": {
                "pageNumber": 2
              }
            }
          ]
        }
      ]
    }
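
Once the JSON response has been parsed into a dictionary, the full text of each page can be pulled out of the nested `responses` arrays. The sketch below assumes the shape shown above and a single input file. The function name is hypothetical.

```python
def texts_by_page(response: dict) -> dict:
    """Map each page number to its full text, following the response shape above.

    Assumes one input file; with several files, page numbers would collide.
    """
    pages = {}
    for file_response in response.get("responses", []):
        for page_response in file_response.get("responses", []):
            page_number = page_response.get("context", {}).get("pageNumber")
            text = page_response.get("fullTextAnnotation", {}).get("text", "")
            pages[page_number] = text
    return pages


# A trimmed response with the same nesting as the sample above.
sample = {
    "responses": [
        {
            "responses": [
                {
                    "fullTextAnnotation": {"text": "# THE STATE OF TEXAS\n"},
                    "context": {"pageNumber": 2},
                }
            ]
        }
    ]
}
result = texts_by_page(sample)
```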

Java

Before trying this sample, follow the Java setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Java API reference documentation.

import com.google.cloud.vision.v1.AnnotateFileRequest;
import com.google.cloud.vision.v1.AnnotateImageResponse;
import com.google.cloud.vision.v1.BatchAnnotateFilesRequest;
import com.google.cloud.vision.v1.BatchAnnotateFilesResponse;
import com.google.cloud.vision.v1.Block;
import com.google.cloud.vision.v1.Feature;
import com.google.cloud.vision.v1.ImageAnnotatorClient;
import com.google.cloud.vision.v1.InputConfig;
import com.google.cloud.vision.v1.Page;
import com.google.cloud.vision.v1.Paragraph;
import com.google.cloud.vision.v1.Symbol;
import com.google.cloud.vision.v1.Word;
import com.google.protobuf.ByteString;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class BatchAnnotateFiles {

  public static void batchAnnotateFiles() throws IOException {
    String filePath = "path/to/your/file.pdf";
    batchAnnotateFiles(filePath);
  }

  public static void batchAnnotateFiles(String filePath) throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (ImageAnnotatorClient imageAnnotatorClient = ImageAnnotatorClient.create()) {
      // You can send multiple files to be annotated; this sample demonstrates how to do this with
      // one file. If you want to use multiple files, you have to create an `AnnotateFileRequest`
      // object for each file that you want annotated.
      // First read the file's contents
      Path path = Paths.get(filePath);
      byte[] data = Files.readAllBytes(path);
      ByteString content = ByteString.copyFrom(data);

      // Specify the input config with the file's contents and its type.
      // Supported mime_type: application/pdf, image/tiff, image/gif
      // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#inputconfig
      InputConfig inputConfig =
          InputConfig.newBuilder().setMimeType("application/pdf").setContent(content).build();

      // Set the type of annotation you want to perform on the file
      // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.Feature.Type
      Feature feature = Feature.newBuilder().setType(Feature.Type.DOCUMENT_TEXT_DETECTION).build();

      // Build the request object for that one file. Note: for additional file you have to create
      // additional `AnnotateFileRequest` objects and store them in a list to be used below.
      // Since we are sending a file of type `application/pdf`, we can use the `pages` field to
      // specify which pages to process. The service can process up to 5 pages per document file.
      // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.AnnotateFileRequest
      AnnotateFileRequest fileRequest =
          AnnotateFileRequest.newBuilder()
              .setInputConfig(inputConfig)
              .addFeatures(feature)
              .addPages(1) // Process the first page
              .addPages(2) // Process the second page
              .addPages(-1) // Process the last page
              .build();

      // Add each `AnnotateFileRequest` object to the batch request.
      BatchAnnotateFilesRequest request =
          BatchAnnotateFilesRequest.newBuilder().addRequests(fileRequest).build();

      // Make the synchronous batch request.
      BatchAnnotateFilesResponse response = imageAnnotatorClient.batchAnnotateFiles(request);

      // Process the results, just get the first result, since only one file was sent in this
      // sample.
      for (AnnotateImageResponse imageResponse :
          response.getResponsesList().get(0).getResponsesList()) {
        System.out.format("Full text: %s%n", imageResponse.getFullTextAnnotation().getText());
        for (Page page : imageResponse.getFullTextAnnotation().getPagesList()) {
          for (Block block : page.getBlocksList()) {
            System.out.format("%nBlock confidence: %s%n", block.getConfidence());
            for (Paragraph par : block.getParagraphsList()) {
              System.out.format("\tParagraph confidence: %s%n", par.getConfidence());
              for (Word word : par.getWordsList()) {
                System.out.format("\t\tWord confidence: %s%n", word.getConfidence());
                for (Symbol symbol : word.getSymbolsList()) {
                  System.out.format(
                      "\t\t\tSymbol: %s, (confidence: %s)%n",
                      symbol.getText(), symbol.getConfidence());
                }
              }
            }
          }
        }
      }
    }
  }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Node.js API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const fileName = 'path/to/your/file.pdf';

// Imports the Google Cloud client libraries
const {ImageAnnotatorClient} = require('@google-cloud/vision').v1;
const fs = require('fs').promises;

// Instantiates a client
const client = new ImageAnnotatorClient();

// You can send multiple files to be annotated, this sample demonstrates how to do this with
// one file. If you want to use multiple files, you have to create a request object for each file that you want annotated.
async function batchAnnotateFiles() {
  // First Specify the input config with the file's path and its type.
  // Supported mime_type: application/pdf, image/tiff, image/gif
  // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#inputconfig
  const inputConfig = {
    mimeType: 'application/pdf',
    content: await fs.readFile(fileName),
  };

  // Set the type of annotation you want to perform on the file
  // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.Feature.Type
  const features = [{type: 'DOCUMENT_TEXT_DETECTION'}];

  // Build the request object for that one file. Note: for additional files you have to create
  // additional file request objects and store them in a list to be used below.
  // Since we are sending a file of type `application/pdf`, we can use the `pages` field to
  // specify which pages to process. The service can process up to 5 pages per document file.
  // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.AnnotateFileRequest
  const fileRequest = {
    inputConfig: inputConfig,
    features: features,
    // Annotate the first two pages and the last one (max 5 pages)
    // First page starts at 1, and not 0. Last page is -1.
    pages: [1, 2, -1],
  };

  // Add each `AnnotateFileRequest` object to the batch request.
  const request = {
    requests: [fileRequest],
  };

  // Make the synchronous batch request.
  const [result] = await client.batchAnnotateFiles(request);

  // Process the results, just get the first result, since only one file was sent in this
  // sample.
  const responses = result.responses[0].responses;

  for (const response of responses) {
    console.log(`Full text: ${response.fullTextAnnotation.text}`);
    for (const page of response.fullTextAnnotation.pages) {
      for (const block of page.blocks) {
        console.log(`Block confidence: ${block.confidence}`);
        for (const paragraph of block.paragraphs) {
          console.log(` Paragraph confidence: ${paragraph.confidence}`);
          for (const word of paragraph.words) {
            const symbol_texts = word.symbols.map(symbol => symbol.text);
            const word_text = symbol_texts.join('');
            console.log(
              `  Word text: ${word_text} (confidence: ${word.confidence})`
            );
            for (const symbol of word.symbols) {
              console.log(
                `   Symbol: ${symbol.text} (confidence: ${symbol.confidence})`
              );
            }
          }
        }
      }
    }
  }
}

batchAnnotateFiles();

Python

Before trying this sample, follow the Python setup instructions in the Vision quickstart using client libraries. For more information, see the Cloud Vision API reference documentation for Python.



from google.cloud import vision_v1

def sample_batch_annotate_files(file_path="path/to/your/document.pdf"):
    """Perform batch file annotation."""
    client = vision_v1.ImageAnnotatorClient()

    # Supported mime_type: application/pdf, image/tiff, image/gif
    mime_type = "application/pdf"
    with open(file_path, "rb") as f:
        content = f.read()
    input_config = {"mime_type": mime_type, "content": content}
    features = [{"type_": vision_v1.Feature.Type.DOCUMENT_TEXT_DETECTION}]

    # The service can process up to 5 pages per document file. Here we specify
    # the first, second, and last page of the document to be processed.
    pages = [1, 2, -1]
    requests = [{"input_config": input_config, "features": features, "pages": pages}]

    response = client.batch_annotate_files(requests=requests)
    for image_response in response.responses[0].responses:
        print(f"Full text: {image_response.full_text_annotation.text}")
        for page in image_response.full_text_annotation.pages:
            for block in page.blocks:
                print(f"\nBlock confidence: {block.confidence}")
                for par in block.paragraphs:
                    print(f"\tParagraph confidence: {par.confidence}")
                    for word in par.words:
                        print(f"\t\tWord confidence: {word.confidence}")
                        for symbol in word.symbols:
                            print(
                                "\t\t\tSymbol: {}, (confidence: {})".format(
                                    symbol.text, symbol.confidence
                                )
                            )

Using a file on Cloud Storage

Use the following code samples to get feature annotations for a file stored on Cloud Storage.

REST

To perform online PDF, TIFF, or GIF feature detection for a small batch of files, send a POST request and provide the appropriate request body:

Before using any of the request data below, make the following replacements:

CLOUD_STORAGE_FILE_URI: the path to a valid file (PDF/TIFF) in a Cloud Storage bucket. You must have at least read privileges on the file. Example:
```
gs://cloud-samples-data/vision/document_understanding/custom_0773375000.pdf
```
PROJECT_ID: your Google Cloud project ID.

Field-specific considerations:

inputConfig.mimeType: one of the following: "application/pdf", "image/tiff", or "image/gif".
pages: the specific pages of the file on which to perform feature detection. Page numbers start at 1, -1 refers to the last page, and an online request can cover at most five pages.
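The constraints described on this page (three supported MIME types, at most five pages per online request, pages numbered from 1) can be checked locally before a request is sent. The following is a minimal sketch, not part of the Vision client library: `build_file_request` is a hypothetical helper that assembles the same request body shape shown on this page and rejects inputs that break those constraints.

```python
import json

SUPPORTED_MIME_TYPES = {"application/pdf", "image/tiff", "image/gif"}
MAX_PAGES_ONLINE = 5  # online (synchronous) limit stated on this page


def build_file_request(uri, mime_type, feature_types, pages):
    """Return a files:annotate request body as a dict (hypothetical helper)."""
    if mime_type not in SUPPORTED_MIME_TYPES:
        raise ValueError(f"Unsupported mimeType: {mime_type}")
    if len(pages) > MAX_PAGES_ONLINE:
        raise ValueError(f"At most {MAX_PAGES_ONLINE} pages per online request")
    if any(page == 0 for page in pages):
        # Pages are 1-based; -1 refers to the last page.
        raise ValueError("Page numbers start at 1, not 0")
    return {
        "requests": [
            {
                "inputConfig": {
                    "gcsSource": {"uri": uri},
                    "mimeType": mime_type,
                },
                "features": [{"type": name} for name in feature_types],
                "pages": list(pages),
            }
        ]
    }


body = build_file_request(
    "gs://cloud-samples-data/vision/document_understanding/custom_0773375000.pdf",
    "application/pdf",
    ["DOCUMENT_TEXT_DETECTION"],
    [1, 2, -1],
)
print(json.dumps(body, indent=2))
```

The printed JSON can be saved as request.json and sent with any of the options on this page.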

HTTP method and URL:

POST https://vision.googleapis.com/v1/files:annotate

Request JSON body:

{
  "requests": [
    {
      "inputConfig": {
        "gcsSource": {
          "uri": "CLOUD_STORAGE_FILE_URI"
        },
        "mimeType": "application/pdf"
      },
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        }
      ],
      "pages": [
        1,2,3,4,5
      ]
    }
  ]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and run the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_ID" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://vision.googleapis.com/v1/files:annotate"

PowerShell

Save the request body in a file named request.json, and run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://vision.googleapis.com/v1/files:annotate" | Select-Object -Expand Content
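The same POST request can be assembled from Python's standard library. This is a minimal sketch rather than an official sample: `build_annotate_request` is a hypothetical helper, and the token and project ID are placeholders (a real token would come from `gcloud auth print-access-token`). The request object is only built here, not sent.

```python
import json
import urllib.request

ANNOTATE_URL = "https://vision.googleapis.com/v1/files:annotate"


def build_annotate_request(access_token, project_id, body):
    """Build (but do not send) the POST request for files:annotate."""
    return urllib.request.Request(
        ANNOTATE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "x-goog-user-project": project_id,
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


# Placeholder values for illustration only.
request = build_annotate_request("ACCESS_TOKEN", "PROJECT_ID", {"requests": []})
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request).read()
```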

Response:

A successful annotate request immediately returns a JSON response.

The following JSON response covers a single page only (page 2) and has been shortened for clarity.




    {
      "responses": [
        {
          "responses": [
            {
              "fullTextAnnotation": {
                "pages": [
                  {
                    "property": {
                      "detectedLanguages": [
                        {
                          "languageCode": "en",
                          "confidence": 0.99
                        },
                        {
                          "languageCode": "pl",
                          "confidence": 0.01
                        }
                      ]
                    },
                    "width": 1342,
                    "height": 2234,
                    "blocks": [
                      {
                        "boundingBox": {
                          "vertices": [
                          ...
                          ]
                        },
                        "paragraphs": [
                          {
                            "boundingBox": {
                              "vertices": [
                              ...
                              ]
                            },
                            "words": [
                              {
                                "property": {
                                  "detectedLanguages": [
                                    {
                                      "languageCode": "en"
                                    }
                                  ]
                                },
                                "boundingBox": {
                                  "vertices": [
                                  ...
                                  ]
                                },
                                "symbols": [
                                  {
                                    "property": {
                                      "detectedLanguages": [
                                        {
                                          "languageCode": "en"
                                        }
                                      ],
                                      "detectedBreak": {
                                        "type": "SPACE"
                                      }
                                    },
                                    "boundingBox": {
                                      "vertices": [
                                    ...
                                      ]
                                    },
                                    "text": "#",
                                    "confidence": 0.07
                                  }
                                ],
                                "confidence": 0.07
                              },
                              ...
                        ],
                        "blockType": "TEXT",
                        "confidence": 0.88
                      },
                      ...
                ...
                "text": "# THE STATE OF TEXAS\n0\nOIL, GAS AND MINERAL LEASE\n
                COUNTY OF MAVERICK\nTHIS AGREEMENT made this 14 day of_June\n1954,
                between Norvel J. Chittim and his wife, Lieschen G. Chittim;\nMary
                Anne Chittim Parker, joined herein pro forma by her husband,\nJoseph
                Bright Parker; Dorothea Chittim Oppenheimer, joined herein\nji pro
                forma by her husband, Fred J. Oppenheimer; Tuleta Chittim\nWright,
                joined herein pro forma by her husband, Gilbert G. Wright,\nJr.;
                Gilbert G. Wright, III; Delă Wright White, joined herein pro\nforma
                by her husband, John H. White; Anne Wright Basse, joined\nherein
                pro forma by her husband, E. A. Basse, Jr.; Norvel J.\nChittim,
                Independent Executor and Trustee for Estate of Marstella\nChittim,
                Deceased; Mary Louise Roswell, joined herein pro forma by\nher
                husband, Charles M. 'Roswell; and James M. Chittim and his wife\n
                Thelma Neal Chittim; as LESSORS, and W. L. Scheig of San Antonio,\n
                Texas, as LESSEE,\n10\nW ITNESS ETH:\nLessors, in consideration of
                $10.00, cash in hand paid, i\nof the royalties herein provided,
                and of the agreement of Lessee\nherein contained, hereby grant,
                lease and let exclusively unto\nLessee the tracts of land
                hereinafter described for the purpose of\ntesting for mineral
                indications, and in such tests use the Seismo-\ngraph, Torsion
                Balance, Core Drill, or any other tools, machinery,\nequipment
                or explosive necessary and proper; and also prospecting,\ndrilling
                and mining for and producing oil, gas and other minerals i\n
                (except metallic minerals), laying pipe lines, building tanks,\n
                power stations, telephone lines and other structures thereon to\n
                produce, save, take care of, treat, transport and own said pro-\n
                ducts and housing its employees (Lessee to conduct its geophysical\n
                work in such manner as not to damage the buildings, water tanks\n
                or wells of Lessors, or the livestock of Lessors or Lessors' ten-\n
                ants, ) said lands being situated in Maverick, Zavalla and Dimmit\n
                Counties, Texas, to-wit:\n3 -1.\n"
              },
              "context": {
                "pageNumber": 2
              }
            }
          ]
        }
      ]
    }
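The nesting of the response (a file-level `responses` list, then one entry per annotated page) can be walked as follows. This is a minimal sketch that assumes the JSON has already been loaded into a Python dict; `extract_page_texts` is a hypothetical helper, and the `sample` dict is a hand-made, heavily trimmed stand-in rather than real API output.

```python
def extract_page_texts(response_body):
    """Map each annotated page number to its full detected text."""
    texts = {}
    # Outer list: one entry per file in the request.
    for file_response in response_body.get("responses", []):
        # Inner list: one entry per annotated page of that file.
        for page_response in file_response.get("responses", []):
            page_number = page_response.get("context", {}).get("pageNumber")
            annotation = page_response.get("fullTextAnnotation", {})
            texts[page_number] = annotation.get("text", "")
    return texts


sample = {
    "responses": [
        {
            "responses": [
                {
                    "fullTextAnnotation": {"text": "# THE STATE OF TEXAS\n"},
                    "context": {"pageNumber": 2},
                }
            ]
        }
    ]
}
print(extract_page_texts(sample))
```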

Java

import com.google.cloud.vision.v1.AnnotateFileRequest;
import com.google.cloud.vision.v1.AnnotateImageResponse;
import com.google.cloud.vision.v1.BatchAnnotateFilesRequest;
import com.google.cloud.vision.v1.BatchAnnotateFilesResponse;
import com.google.cloud.vision.v1.Block;
import com.google.cloud.vision.v1.Feature;
import com.google.cloud.vision.v1.GcsSource;
import com.google.cloud.vision.v1.ImageAnnotatorClient;
import com.google.cloud.vision.v1.InputConfig;
import com.google.cloud.vision.v1.Page;
import com.google.cloud.vision.v1.Paragraph;
import com.google.cloud.vision.v1.Symbol;
import com.google.cloud.vision.v1.Word;
import java.io.IOException;

public class BatchAnnotateFilesGcs {

  public static void batchAnnotateFilesGcs() throws IOException {
    String gcsUri = "gs://cloud-samples-data/vision/document_understanding/kafka.pdf";
    batchAnnotateFilesGcs(gcsUri);
  }

  public static void batchAnnotateFilesGcs(String gcsUri) throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (ImageAnnotatorClient imageAnnotatorClient = ImageAnnotatorClient.create()) {
      // You can send multiple files to be annotated, this sample demonstrates how to do this with
      // one file. If you want to use multiple files, you have to create an `AnnotateFileRequest`
      // object for each file that you want annotated.
      // First specify where the Vision API can find the file
      GcsSource gcsSource = GcsSource.newBuilder().setUri(gcsUri).build();

      // Specify the input config with the file's uri and its type.
      // Supported mime_type: application/pdf, image/tiff, image/gif
      // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#inputconfig
      InputConfig inputConfig =
          InputConfig.newBuilder().setMimeType("application/pdf").setGcsSource(gcsSource).build();

      // Set the type of annotation you want to perform on the file
      // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.Feature.Type
      Feature feature = Feature.newBuilder().setType(Feature.Type.DOCUMENT_TEXT_DETECTION).build();

      // Build the request object for that one file. Note: for additional files you have to create
      // additional `AnnotateFileRequest` objects and store them in a list to be used below.
      // Since we are sending a file of type `application/pdf`, we can use the `pages` field to
      // specify which pages to process. The service can process up to 5 pages per document file.
      // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.AnnotateFileRequest
      AnnotateFileRequest fileRequest =
          AnnotateFileRequest.newBuilder()
              .setInputConfig(inputConfig)
              .addFeatures(feature)
              .addPages(1) // Process the first page
              .addPages(2) // Process the second page
              .addPages(-1) // Process the last page
              .build();

      // Add each `AnnotateFileRequest` object to the batch request.
      BatchAnnotateFilesRequest request =
          BatchAnnotateFilesRequest.newBuilder().addRequests(fileRequest).build();

      // Make the synchronous batch request.
      BatchAnnotateFilesResponse response = imageAnnotatorClient.batchAnnotateFiles(request);

      // Process the results, just get the first result, since only one file was sent in this
      // sample.
      for (AnnotateImageResponse imageResponse :
          response.getResponsesList().get(0).getResponsesList()) {
        System.out.format("Full text: %s%n", imageResponse.getFullTextAnnotation().getText());
        for (Page page : imageResponse.getFullTextAnnotation().getPagesList()) {
          for (Block block : page.getBlocksList()) {
            System.out.format("%nBlock confidence: %s%n", block.getConfidence());
            for (Paragraph par : block.getParagraphsList()) {
              System.out.format("\tParagraph confidence: %s%n", par.getConfidence());
              for (Word word : par.getWordsList()) {
                System.out.format("\t\tWord confidence: %s%n", word.getConfidence());
                for (Symbol symbol : word.getSymbolsList()) {
                  System.out.format(
                      "\t\t\tSymbol: %s, (confidence: %s)%n",
                      symbol.getText(), symbol.getConfidence());
                }
              }
            }
          }
        }
      }
    }
  }
}

Node.js

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const gcsSourceUri = 'gs://cloud-samples-data/vision/document_understanding/kafka.pdf';

// Imports the Google Cloud client libraries
const {ImageAnnotatorClient} = require('@google-cloud/vision').v1;

// Instantiates a client
const client = new ImageAnnotatorClient();

// You can send multiple files to be annotated, this sample demonstrates how to do this with
// one file. If you want to use multiple files, you have to create a request object for each file that you want annotated.
async function batchAnnotateFiles() {
  // First Specify the input config with the file's uri and its type.
  // Supported mime_type: application/pdf, image/tiff, image/gif
  // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#inputconfig
  const inputConfig = {
    mimeType: 'application/pdf',
    gcsSource: {
      uri: gcsSourceUri,
    },
  };

  // Set the type of annotation you want to perform on the file
  // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.Feature.Type
  const features = [{type: 'DOCUMENT_TEXT_DETECTION'}];

  // Build the request object for that one file. Note: for additional files you have to create
  // additional file request objects and store them in a list to be used below.
  // Since we are sending a file of type `application/pdf`, we can use the `pages` field to
  // specify which pages to process. The service can process up to 5 pages per document file.
  // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.AnnotateFileRequest
  const fileRequest = {
    inputConfig: inputConfig,
    features: features,
    // Annotate the first two pages and the last one (max 5 pages)
    // First page starts at 1, and not 0. Last page is -1.
    pages: [1, 2, -1],
  };

  // Add each `AnnotateFileRequest` object to the batch request.
  const request = {
    requests: [fileRequest],
  };

  // Make the synchronous batch request.
  const [result] = await client.batchAnnotateFiles(request);

  // Process the results, just get the first result, since only one file was sent in this
  // sample.
  const responses = result.responses[0].responses;

  for (const response of responses) {
    console.log(`Full text: ${response.fullTextAnnotation.text}`);
    for (const page of response.fullTextAnnotation.pages) {
      for (const block of page.blocks) {
        console.log(`Block confidence: ${block.confidence}`);
        for (const paragraph of block.paragraphs) {
          console.log(` Paragraph confidence: ${paragraph.confidence}`);
          for (const word of paragraph.words) {
            const symbol_texts = word.symbols.map(symbol => symbol.text);
            const word_text = symbol_texts.join('');
            console.log(
              `  Word text: ${word_text} (confidence: ${word.confidence})`
            );
            for (const symbol of word.symbols) {
              console.log(
                `   Symbol: ${symbol.text} (confidence: ${symbol.confidence})`
              );
            }
          }
        }
      }
    }
  }
}

batchAnnotateFiles();

Python


from google.cloud import vision_v1

def sample_batch_annotate_files(
    storage_uri="gs://cloud-samples-data/vision/document_understanding/kafka.pdf",
):
    """Perform batch file annotation."""
    mime_type = "application/pdf"

    client = vision_v1.ImageAnnotatorClient()

    gcs_source = {"uri": storage_uri}
    input_config = {"gcs_source": gcs_source, "mime_type": mime_type}
    features = [{"type_": vision_v1.Feature.Type.DOCUMENT_TEXT_DETECTION}]

    # The service can process up to 5 pages per document file.
    # Here we specify the first, second, and last page of the document to be
    # processed.
    pages = [1, 2, -1]
    requests = [{"input_config": input_config, "features": features, "pages": pages}]

    response = client.batch_annotate_files(requests=requests)
    for image_response in response.responses[0].responses:
        print(f"Full text: {image_response.full_text_annotation.text}")
        for page in image_response.full_text_annotation.pages:
            for block in page.blocks:
                print(f"\nBlock confidence: {block.confidence}")
                for par in block.paragraphs:
                    print(f"\tParagraph confidence: {par.confidence}")
                    for word in par.words:
                        print(f"\t\tWord confidence: {word.confidence}")
                        for symbol in word.symbols:
                            print(
                                "\t\t\tSymbol: {}, (confidence: {})".format(
                                    symbol.text, symbol.confidence
                                )
                            )

Try it

Try small batch online feature detection below.

You can use the PDF file already specified or provide your own file in its place.

Three feature types are specified for this request:

DOCUMENT_TEXT_DETECTION
LABEL_DETECTION
CROP_HINTS

You can add or remove feature types by changing the appropriate object in the request ({"type": "FEATURE_NAME"}).

To send the request, select Execute.

Request body:

{
  "requests": [
    {
      "inputConfig": {
        "gcsSource": {
          "uri": "gs://cloud-samples-data/vision/document_understanding/custom_0773375000.pdf"
        },
        "mimeType": "application/pdf"
      },
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        },
        {
          "type": "LABEL_DETECTION"
        },
        {
          "type": "CROP_HINTS"
        }
      ],
      "pages": [
        1,
        2,
        3,
        4,
        5
      ]
    }
  ]
}
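Adding or removing feature types in a request body like the one above can also be done programmatically. The following is a minimal sketch: `with_features` is a hypothetical helper that returns a modified copy of the body, and it does not check that a name is a valid Vision feature type.

```python
import copy


def with_features(request_body, add=(), remove=()):
    """Return a copy of the body with feature types added and/or removed."""
    body = copy.deepcopy(request_body)  # leave the caller's dict untouched
    for file_request in body["requests"]:
        kept = [f for f in file_request["features"] if f["type"] not in remove]
        present = {f["type"] for f in kept}
        for name in add:
            if name not in present:  # avoid duplicate feature entries
                kept.append({"type": name})
                present.add(name)
        file_request["features"] = kept
    return body


original = {
    "requests": [
        {
            "features": [
                {"type": "DOCUMENT_TEXT_DETECTION"},
                {"type": "LABEL_DETECTION"},
                {"type": "CROP_HINTS"},
            ]
        }
    ]
}
updated = with_features(original, add=["LABEL_DETECTION"], remove=["CROP_HINTS"])
print([f["type"] for f in updated["requests"][0]["features"]])
```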