Batch-Bildannotation offline

Die Vision API kann mit jedem beliebigen Feature-Typ von Vision asynchrone Offline-Erkennungsdienste und Annotationen für große Batches von Bilddateien ausführen. Sie können beispielsweise ein oder mehrere Vision API-Features für einen einzelnen Batch von Bildern angeben, zum Beispiel TEXT_DETECTION, LABEL_DETECTION und LANDMARK_DETECTION.

Die Ausgabedaten einer Offline-Batchanfrage werden in eine JSON-Datei geschrieben, die im angegebenen Cloud Storage-Bucket erstellt wird.

Online-Anfragen (synchron): Eine Anfrage für Online-Annotationen (images:annotate oder files:annotate) gibt sofort Inline-Annotationen an den Nutzer zurück. Bei Online-Annotationsanfragen ist die Anzahl der Dateien begrenzt, die in einer einzelnen Anfrage annotiert werden können. Mit einer images:annotate-Anfrage können Sie nur eine kleine Anzahl von Bildern (<= 16) angeben, die annotiert werden sollen. Mit einer files:annotate-Anfrage können Sie nur eine einzelne Datei und nur eine geringe Anzahl von Seiten (<= 5) in der Datei angeben, die annotiert werden soll.
Offline-Anfragen (asynchron): Eine Anfrage für Offline-Annotationen (images:asyncBatchAnnotate oder files:asyncBatchAnnotate) startet einen Vorgang mit langer Ausführungszeit und gibt nicht sofort eine Antwort an den Aufrufer zurück. Wenn der Vorgang mit langer Ausführungszeit abgeschlossen ist, werden die Annotationen in einem von Ihnen angegebenen Cloud Storage-Bucket als Dateien gespeichert. Sie können mit einer images:asyncBatchAnnotate-Anfrage bis zu 2.000 Bilder pro Anfrage angeben. Eine files:asyncBatchAnnotate-Anfrage gestattet es Ihnen, zeitgleich größere Batches von Dateien und mehr Seiten pro Datei (<= 2.000) für die Annotation anzugeben, als dies bei Online-Anfragen möglich ist.

Beschränkungen

Die Vision API akzeptiert bis zu 2.000 Bilddateien. Bei einem größeren Batch von Bilddateien wird ein Fehler zurückgegeben.

Derzeit unterstützte Featuretypen

Featuretyp
`CROP_HINTS`	Ermittelt Vorschläge für Eckpunkte für einen Bildausschnitt.
`DOCUMENT_TEXT_DETECTION`	Führt in Bildern mit hohem Textanteil eine OCR durch, zum Beispiel Dokumente (PDF/TIFF) und Bilder mit Handschrift. `TEXT_DETECTION` kann für Bilder mit wenig Text verwendet werden. Hat Vorrang, wenn `DOCUMENT_TEXT_DETECTION` und `TEXT_DETECTION` vorhanden sind.
`FACE_DETECTION`	Erkennt Gesichter im Bild.
`IMAGE_PROPERTIES`	Berechnet eine Reihe von Bildeigenschaften, zum Beispiel die dominanten Farben des Bildes.
`LABEL_DETECTION`	Fügt Labels ausgehend vom Bildinhalt hinzu.
`LANDMARK_DETECTION`	Erkennt geografische Sehenswürdigkeiten im Bild.
`LOGO_DETECTION`	Erkennt Firmenlogos im Bild.
`OBJECT_LOCALIZATION`	Erkennt und extrahiert mehrere Objekte in einem Bild.
`SAFE_SEARCH_DETECTION`	Führt SafeSearch aus, um potenziell unsichere oder unerwünschte Inhalte zu erkennen.
`TEXT_DETECTION`	Führt eine OCR für Text im Bild durch. Die Texterkennung ist für Bereiche mit wenig Text innerhalb eines größeren Bildes optimiert. Verwenden Sie `DOCUMENT_TEXT_DETECTION`, wenn das Bild ein Dokument ist (PDF/TIFF), viel Text oder Handschrift enthält.
`WEB_DETECTION`	Mit der Google Bildersuche lassen sich thematische Entitäten wie Nachrichten, Veranstaltungen oder Prominente im Bild erkennen und nach ähnlichen Bildern im Web suchen.

Beispielcode

Verwenden Sie die folgenden Codebeispiele, um Offline-Annotationsdienste für einen Batch von Bilddateien in Cloud Storage auszuführen.

Hinweis: In den folgenden Codebeispielen entspricht jedes Anfrageelement (requests_element/requestsElement) einem einzelnen Bild. Wenn Sie weitere Bilder annotieren möchten, erstellen Sie für jedes Bild ein Anfrageelement und fügen es dem Anfragearray hinzu (requests).

Java

Bevor Sie dieses Beispiel anwenden, folgen Sie der Anleitung für die Einrichtung von Java in der Vision API-Kurzanleitung zur Verwendung von Clientbibliotheken. Weitere Informationen finden Sie in der Java-Referenzdokumentation zur Vision API.

import com.google.cloud.vision.v1.AnnotateImageRequest;
import com.google.cloud.vision.v1.AsyncBatchAnnotateImagesRequest;
import com.google.cloud.vision.v1.AsyncBatchAnnotateImagesResponse;
import com.google.cloud.vision.v1.Feature;
import com.google.cloud.vision.v1.GcsDestination;
import com.google.cloud.vision.v1.Image;
import com.google.cloud.vision.v1.ImageAnnotatorClient;
import com.google.cloud.vision.v1.ImageSource;
import com.google.cloud.vision.v1.OutputConfig;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

public class AsyncBatchAnnotateImages {

  public static void asyncBatchAnnotateImages()
      throws InterruptedException, ExecutionException, IOException {
    String inputImageUri = "gs://cloud-samples-data/vision/label/wakeupcat.jpg";
    String outputUri = "gs://YOUR_BUCKET_ID/path/to/save/results/";
    asyncBatchAnnotateImages(inputImageUri, outputUri);
  }

  public static void asyncBatchAnnotateImages(String inputImageUri, String outputUri)
      throws IOException, ExecutionException, InterruptedException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (ImageAnnotatorClient imageAnnotatorClient = ImageAnnotatorClient.create()) {

      // You can send multiple images to be annotated, this sample demonstrates how to do this with
      // one image. If you want to use multiple images, you have to create a `AnnotateImageRequest`
      // object for each image that you want annotated.
      // First specify where the vision api can find the image
      ImageSource source = ImageSource.newBuilder().setImageUri(inputImageUri).build();
      Image image = Image.newBuilder().setSource(source).build();

      // Set the type of annotation you want to perform on the image
      // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.Feature.Type
      Feature feature = Feature.newBuilder().setType(Feature.Type.LABEL_DETECTION).build();

      // Build the request object for that one image. Note: for additional images you have to create
      // additional `AnnotateImageRequest` objects and store them in a list to be used below.
      AnnotateImageRequest imageRequest =
          AnnotateImageRequest.newBuilder().setImage(image).addFeatures(feature).build();

      // Set where to store the results for the images that will be annotated.
      GcsDestination gcsDestination = GcsDestination.newBuilder().setUri(outputUri).build();
      OutputConfig outputConfig =
          OutputConfig.newBuilder()
              .setGcsDestination(gcsDestination)
              .setBatchSize(2) // The max number of responses to output in each JSON file
              .build();

      // Add each `AnnotateImageRequest` object to the batch request and add the output config.
      AsyncBatchAnnotateImagesRequest request =
          AsyncBatchAnnotateImagesRequest.newBuilder()
              .addRequests(imageRequest)
              .setOutputConfig(outputConfig)
              .build();

      // Make the asynchronous batch request.
      AsyncBatchAnnotateImagesResponse response =
          imageAnnotatorClient.asyncBatchAnnotateImagesAsync(request).get();

      // The output is written to GCS with the provided output_uri as prefix
      String gcsOutputUri = response.getOutputConfig().getGcsDestination().getUri();
      System.out.format("Output written to GCS with prefix: %s%n", gcsOutputUri);
    }
  }
}

Node.js

Bevor Sie dieses Beispiel ausprobieren, folgen Sie der Node.js-Einrichtungsanleitung in der Vision-Kurzanleitung zur Verwendung von Clientbibliotheken. Weitere Informationen finden Sie in der Node.js-Referenzdokumentation zur Vision API.

Richten Sie zur Authentifizierung bei Vision die Standardanmeldedaten für Anwendungen ein. Weitere Informationen finden Sie unter ADC für eine lokale Entwicklungsumgebung einrichten.

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const inputImageUri = 'gs://cloud-samples-data/vision/label/wakeupcat.jpg';
// const outputUri = 'gs://YOUR_BUCKET_ID/path/to/save/results/';

// Imports the Google Cloud client libraries
const {ImageAnnotatorClient} = require('@google-cloud/vision').v1;

// Instantiates a client
const client = new ImageAnnotatorClient();

// You can send multiple images to be annotated, this sample demonstrates how to do this with
// one image. If you want to use multiple images, you have to create a request object for each image that you want annotated.
async function asyncBatchAnnotateImages() {
  // Set the type of annotation you want to perform on the image
  // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.Feature.Type
  const features = [{type: 'LABEL_DETECTION'}];

  // Build the image request object for that one image. Note: for additional images you have to create
  // additional image request objects and store them in a list to be used below.
  const imageRequest = {
    image: {
      source: {
        imageUri: inputImageUri,
      },
    },
    features: features,
  };

  // Set where to store the results for the images that will be annotated.
  const outputConfig = {
    gcsDestination: {
      uri: outputUri,
    },
    batchSize: 2, // The max number of responses to output in each JSON file
  };

  // Add each image request object to the batch request and add the output config.
  const request = {
    requests: [
      imageRequest, // add additional request objects here
    ],
    outputConfig,
  };

  // Make the asynchronous batch request.
  const [operation] = await client.asyncBatchAnnotateImages(request);

  // Wait for the operation to complete
  const [filesResponse] = await operation.promise();

  // The output is written to GCS with the provided output_uri as prefix
  const destinationUri = filesResponse.outputConfig.gcsDestination.uri;
  console.log(`Output written to GCS with prefix: ${destinationUri}`);
}

asyncBatchAnnotateImages();

Python

Bevor Sie dieses Beispiel ausprobieren, folgen Sie der Python-Einrichtungsanleitung in der Vision-Kurzanleitung zur Verwendung von Clientbibliotheken. Weitere Informationen finden Sie in der Python-Referenzdokumentation zur Vision API.

Richten Sie zur Authentifizierung bei Vision die Standardanmeldedaten für Anwendungen ein. Weitere Informationen finden Sie unter ADC für eine lokale Entwicklungsumgebung einrichten.


from google.cloud import vision_v1


def sample_async_batch_annotate_images(
    input_image_uri="gs://cloud-samples-data/vision/label/wakeupcat.jpg",
    output_uri="gs://your-bucket/prefix/",
):
    """Perform async batch image annotation."""
    client = vision_v1.ImageAnnotatorClient()

    source = {"image_uri": input_image_uri}
    image = {"source": source}
    features = [
        {"type_": vision_v1.Feature.Type.LABEL_DETECTION},
        {"type_": vision_v1.Feature.Type.IMAGE_PROPERTIES},
    ]

    # Each requests element corresponds to a single image.  To annotate more
    # images, create a request element for each image and add it to
    # the array of requests
    requests = [{"image": image, "features": features}]
    gcs_destination = {"uri": output_uri}

    # The max number of responses to output in each JSON file
    batch_size = 2
    output_config = {"gcs_destination": gcs_destination, "batch_size": batch_size}

    operation = client.async_batch_annotate_images(
        requests=requests, output_config=output_config
    )

    print("Waiting for operation to complete...")
    response = operation.result(90)

    # The output is written to GCS with the provided output_uri as prefix
    gcs_output_uri = response.output_config.gcs_destination.uri
    print(f"Output written to GCS with prefix: {gcs_output_uri}")

Antwort

Eine erfolgreiche Anfrage gibt JSON-Antwortdateien in dem Cloud Storage-Bucket zurück, den Sie im Codebeispiel angegeben haben. Die Anzahl der Antworten pro JSON-Datei wird durch den Wert für batch_size im Codebeispiel bestimmt.

Die zurückgegebene Antwort ähnelt den regulären Antworten der Vision API-Features, je nachdem, welche Features Sie für ein Bild anfordern.

Die folgenden Antworten enthalten LABEL_DETECTION- und TEXT_DETECTION-Anmerkungen für image1.png, IMAGE_PROPERTIES-Anmerkungen für image2.jpg und OBJECT_LOCALIZATION-Anmerkungen für image3.jpg.

Die Antwort enthält außerdem ein context-Feld mit dem URI der Datei.

`offline_batch_output/output-1-to-2.json`

{
  "responses": [
    {
      "labelAnnotations": [
        {
          "mid": "/m/07s6nbt",
          "description": "Text",
          "score": 0.93413997,
          "topicality": 0.93413997
        },
        {
          "mid": "/m/0dwx7",
          "description": "Logo",
          "score": 0.8733531,
          "topicality": 0.8733531
        },
        ...
        {
          "mid": "/m/03bxgrp",
          "description": "Company",
          "score": 0.5682425,
          "topicality": 0.5682425
        }
      ],
      "textAnnotations": [
        {
          "locale": "en",
          "description": "Google\n",
          "boundingPoly": {
            "vertices": [
              {
                "x": 72,
                "y": 40
              },
              {
                "x": 613,
                "y": 40
              },
              {
                "x": 613,
                "y": 233
              },
              {
                "x": 72,
                "y": 233
              }
            ]
          }
        },
        ...
                ],
                "blockType": "TEXT"
              }
            ]
          }
        ],
        "text": "Google\n"
      },
      "context": {
        "uri": "gs://cloud-samples-data/vision/document_understanding/image1.png"
      }
    },
    {
      "imagePropertiesAnnotation": {
        "dominantColors": {
          "colors": [
            {
              "color": {
                "red": 229,
                "green": 230,
                "blue": 238
              },
              "score": 0.2744754,
              "pixelFraction": 0.075339235
            },
            ...
            {
              "color": {
                "red": 86,
                "green": 87,
                "blue": 95
              },
              "score": 0.025770646,
              "pixelFraction": 0.13109145
            }
          ]
        }
      },
      "cropHintsAnnotation": {
        "cropHints": [
          {
            "boundingPoly": {
              "vertices": [
                {},
                {
                  "x": 1599
                },
                {
                  "x": 1599,
                  "y": 1199
                },
                {
                  "y": 1199
                }
              ]
            },
            "confidence": 0.79999995,
            "importanceFraction": 1
          }
        ]
      },
      "context": {
        "uri": "gs://cloud-samples-data/vision/document_understanding/image2.jpg"
      }
    }
  ]
}

`offline_batch_output/output-3-to-3.json`

{
  "responses": [
    {
      "context": {
        "uri": "gs://cloud-samples-data/vision/document_understanding/image3.jpg"
      },
      "localizedObjectAnnotations": [
        {
          "mid": "/m/0bt9lr",
          "name": "Dog",
          "score": 0.9669734,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.6035543,
                "y": 0.1357359
              },
              {
                "x": 0.98546547,
                "y": 0.1357359
              },
              {
                "x": 0.98546547,
                "y": 0.98426414
              },
              {
                "x": 0.6035543,
                "y": 0.98426414
              }
            ]
          }
        },
        ...
        {
          "mid": "/m/0jbk",
          "name": "Animal",
          "score": 0.58003056,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.014534635,
                "y": 0.1357359
              },
              {
                "x": 0.37197515,
                "y": 0.1357359
              },
              {
                "x": 0.37197515,
                "y": 0.98426414
              },
              {
                "x": 0.014534635,
                "y": 0.98426414
              }
            ]
          }
        }
      ]
    }
  ]
}