Method: files.asyncBatchAnnotate

Run asynchronous image detection and annotation for a list of generic files, such as PDF files, which may contain multiple pages and multiple images per page. Progress and results can be retrieved through the google.longrunning.Operations interface. Operation.metadata contains OperationMetadata (metadata). Operation.response contains AsyncBatchAnnotateFilesResponse (results).

HTTP request

POST https://vision.googleapis.com/v1p3beta1/files:asyncBatchAnnotate

The URL uses gRPC Transcoding syntax.

Request body

The request body contains data with the following structure:

JSON representation
{
  "requests": [
    {
      object(AsyncAnnotateFileRequest)
    }
  ]
}
Fields
requests[]

object(AsyncAnnotateFileRequest)

Individual async file annotation requests for this batch.

Response body

If successful, the response body contains an instance of Operation.
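
As an illustration of how a caller might interpret the returned Operation, here is a minimal sketch in Python. It assumes the Operation has already been fetched and decoded into a dict, and that it carries the standard google.longrunning.Operation fields (name, done, metadata, response, error). The helper name summarize_operation and the sample values are hypothetical.

```python
def summarize_operation(operation):
    """Classify a google.longrunning.Operation given as a decoded JSON dict.

    Returns a (state, payload) tuple. This is a hypothetical helper,
    not part of any client library.
    """
    # "done" is omitted or false while the operation is still in progress.
    if not operation.get("done", False):
        return ("running", operation.get("metadata"))
    # A finished operation carries either "error" or "response".
    if "error" in operation:
        return ("failed", operation["error"])
    return ("succeeded", operation.get("response"))


# Sample operations with made-up names and payloads.
pending = {"name": "operations/example-id", "done": False, "metadata": {"state": "RUNNING"}}
finished = {"name": "operations/example-id", "done": True, "response": {"responses": []}}

print(summarize_operation(pending))
print(summarize_operation(finished))
```

While the operation is running, the payload is whatever Operation.metadata holds. Once it is done, the payload is either the error or the AsyncBatchAnnotateFilesResponse.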

Authorization Scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-platform
  • https://www.googleapis.com/auth/cloud-vision

For more information, see the Authentication Overview.

AsyncAnnotateFileRequest

An offline file annotation request.

JSON representation
{
  "inputConfig": {
    object(InputConfig)
  },
  "features": [
    {
      object(Feature)
    }
  ],
  "imageContext": {
    object(ImageContext)
  },
  "outputConfig": {
    object(OutputConfig)
  }
}
Fields
inputConfig

object(InputConfig)

Required. Information about the input file.

features[]

object(Feature)

Required. Requested features.

imageContext

object(ImageContext)

Additional context that may accompany the image(s) in the file.

outputConfig

object(OutputConfig)

Required. The desired output location and metadata (e.g. format).
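
To show how the fields above fit together, the following Python sketch assembles a request body for one file. It only builds and prints the JSON. Sending it (an authenticated POST to the endpoint above) is left out. The helper name build_async_batch_request and the bucket and object names are hypothetical, and the feature type DOCUMENT_TEXT_DETECTION is an assumption, since this page does not list Feature types.

```python
import json


def build_async_batch_request(input_uri, output_uri,
                              mime_type="application/pdf", batch_size=20):
    """Assemble a files:asyncBatchAnnotate body with a single request.

    Hypothetical helper for illustration only.
    """
    return {
        "requests": [
            {
                "inputConfig": {
                    "gcsSource": {"uri": input_uri},
                    "mimeType": mime_type,
                },
                # Assumption: DOCUMENT_TEXT_DETECTION is a valid Feature type.
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
                "outputConfig": {
                    "gcsDestination": {"uri": output_uri},
                    "batchSize": batch_size,
                },
            }
        ]
    }


# Made-up bucket and object names.
body = build_async_batch_request(
    "gs://example-bucket/input.pdf",
    "gs://example-bucket/output/",
)
print(json.dumps(body, indent=2))
```

The optional imageContext field is omitted here. If it is needed, it sits alongside features inside each request.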

InputConfig

The desired input location and metadata.

JSON representation
{
  "gcsSource": {
    object(GcsSource)
  },
  "mimeType": string
}
Fields
gcsSource

object(GcsSource)

The Google Cloud Storage location to read the input from.

mimeType

string

The type of the file. Currently only "application/pdf" and "image/tiff" are supported. Wildcards are not supported.

GcsSource

The Google Cloud Storage location where the input will be read from.

JSON representation
{
  "uri": string
}
Fields
uri

string

Google Cloud Storage URI for the input file. The URI must point to a single Google Cloud Storage object. Wildcards are not currently supported.
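
The InputConfig constraints can be checked on the client before a request is sent. The Python sketch below is one possible pre-flight check, not the validation the service performs. The helper name validate_input_config is hypothetical, and treating only "*" as a wildcard character is an assumption.

```python
SUPPORTED_MIME_TYPES = {"application/pdf", "image/tiff"}


def validate_input_config(input_config):
    """Return a list of problems found in an InputConfig dict (empty if none).

    Hypothetical client-side check for illustration only.
    """
    problems = []
    uri = input_config.get("gcsSource", {}).get("uri", "")
    if not uri.startswith("gs://"):
        problems.append("uri must start with gs://")
    else:
        # Split gs://bucket/object into its bucket and object parts.
        bucket, _, object_name = uri[len("gs://"):].partition("/")
        if not bucket or not object_name:
            problems.append("uri must name a single object: gs://bucket/object")
        # Assumption: "*" is the only wildcard character checked here.
        if "*" in uri:
            problems.append("wildcards are not supported in uri")
    if input_config.get("mimeType") not in SUPPORTED_MIME_TYPES:
        problems.append("mimeType must be application/pdf or image/tiff")
    return problems


# Made-up bucket and object names.
print(validate_input_config(
    {"gcsSource": {"uri": "gs://example-bucket/scan.pdf"}, "mimeType": "application/pdf"}
))
```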

OutputConfig

The desired output location and metadata.

JSON representation
{
  "gcsDestination": {
    object(GcsDestination)
  },
  "batchSize": number
}
Fields
gcsDestination

object(GcsDestination)

The Google Cloud Storage location to write the output(s) to.

batchSize

number

The maximum number of response protos to put into each output JSON file on Google Cloud Storage. The valid range is [1, 100]. If not specified, the default value is 20.

For example, for one PDF file with 100 pages, 100 response protos will be generated. If batchSize = 20, then 5 JSON files, each containing 20 response protos, will be written under the prefix gcsDestination.uri.

Currently, batchSize only applies to GcsDestination, with potential future support for other output configurations.
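
The batchSize arithmetic can be sketched in Python as follows. The helper name expected_output_files is hypothetical. The sketch assumes one response proto per page, as in the example above, and that a final partial batch is written as its own file. The actual number of files may be higher if the output is sharded for size.

```python
import math


def expected_output_files(page_count, batch_size=20):
    """Estimate how many output JSON files a file with page_count pages yields.

    Hypothetical helper; assumes one response proto per page.
    """
    if not 1 <= batch_size <= 100:
        raise ValueError("batchSize must be in the range [1, 100]")
    # Assumption: a final partial batch still produces its own file.
    return math.ceil(page_count / batch_size)


print(expected_output_files(100, 20))  # the 100-page example above: 5 files
```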

GcsDestination

The Google Cloud Storage location where the output will be written to.

JSON representation
{
  "uri": string
}
Fields
uri

string

Google Cloud Storage URI where the results will be stored. Results will be in JSON format and preceded by their corresponding input URI. This field can represent either a single file or a prefix for multiple outputs. Prefixes must end in a /.

Examples:

If there are multiple outputs, each response is still an AnnotateFileResponse, and each contains some subset of the full list of AnnotateImageResponse objects. Multiple outputs can occur if, for example, the output JSON is too large and overflows into multiple sharded files.
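
The trailing-slash rule for gcsDestination.uri can be expressed as a small Python check. This is a sketch only. The helper name destination_kind and the bucket names are hypothetical, and it inspects nothing beyond the gs:// scheme and the final character.

```python
def destination_kind(uri):
    """Classify a gcsDestination.uri as a "prefix" or a single "file".

    Hypothetical helper: a URI ending in "/" is treated as a prefix,
    anything else as a single output file.
    """
    if not uri.startswith("gs://"):
        raise ValueError("uri must start with gs://")
    return "prefix" if uri.endswith("/") else "file"


# Made-up bucket and path names.
print(destination_kind("gs://example-bucket/results/"))
print(destination_kind("gs://example-bucket/results.json"))
```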