Method: projects.locations.recognizers.batchRecognize

Performs batch asynchronous speech recognition: send a request with N audio files and receive a long-running operation that can be polled to see when the transcriptions are finished.

HTTP request

POST https://{endpoint}/v2/{recognizer=projects/*/locations/*/recognizers/*}:batchRecognize

Where {endpoint} is one of the supported service endpoints.

The URLs use gRPC Transcoding syntax.

Path parameters

Parameters
`recognizer`	`string` Required. The name of the Recognizer to use during recognition. The expected format is `projects/{project}/locations/{location}/recognizers/{recognizer}`. The {recognizer} segment may be set to `_` to use an empty implicit Recognizer.
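
As a minimal sketch of the name format and URL template above, the helpers below assemble the `recognizer` path parameter and the request URL. The default endpoint `speech.googleapis.com` and the function names are assumptions for illustration; substitute whichever supported service endpoint applies.

```python
import re

# Assumed default endpoint; use the supported service endpoint for your location.
DEFAULT_ENDPOINT = "speech.googleapis.com"


def recognizer_name(project: str, location: str, recognizer: str = "_") -> str:
    """Build projects/{project}/locations/{location}/recognizers/{recognizer}.

    The recognizer segment defaults to "_", the empty implicit Recognizer.
    """
    for label, value in (("project", project),
                         ("location", location),
                         ("recognizer", recognizer)):
        # Each segment must be non-empty and free of slashes and whitespace.
        if not re.fullmatch(r"[^/\s]+", value):
            raise ValueError(f"invalid {label} segment: {value!r}")
    return f"projects/{project}/locations/{location}/recognizers/{recognizer}"


def batch_recognize_url(name: str, endpoint: str = DEFAULT_ENDPOINT) -> str:
    """Build the POST URL for recognizers.batchRecognize."""
    return f"https://{endpoint}/v2/{name}:batchRecognize"
```

For example, `batch_recognize_url(recognizer_name("my-project", "global"))` targets the implicit recognizer of a hypothetical project named `my-project`.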

Request body

The request body contains data with the following structure:

JSON representation

{
  "config": {
    object (RecognitionConfig)
  },
  "configMask": string,
  "files": [
    {
      object (BatchRecognizeFileMetadata)
    }
  ],
  "recognitionOutputConfig": {
    object (RecognitionOutputConfig)
  },
  "processingStrategy": enum (ProcessingStrategy)
}

Fields
`config`	`object (RecognitionConfig)` Features and audio metadata to use for the Automatic Speech Recognition. This field in combination with the `configMask` field can be used to override parts of the `defaultRecognitionConfig` of the Recognizer resource.
`configMask`	`string (FieldMask format)` The list of fields in `config` that override the values in the `defaultRecognitionConfig` of the recognizer during this recognition request. If no mask is provided, all given fields in `config` override the values in the recognizer for this recognition request. If a mask is provided, only the fields listed in the mask override the config in the recognizer for this recognition request. If a wildcard (`*`) is provided, `config` completely overrides and replaces the config in the recognizer for this recognition request. This is a comma-separated list of fully qualified names of fields. Example: `"user.displayName,photo"`.
`files[]`	`object (BatchRecognizeFileMetadata)` Audio files with file metadata for ASR. The maximum number of files allowed to be specified is 5.
`recognitionOutputConfig`	`object (RecognitionOutputConfig)` Configuration options for where to output the transcripts of each file.
`processingStrategy`	`enum (ProcessingStrategy)` Processing strategy to use for this request.
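
The fields above can be assembled into a request body along the following lines. This is a sketch, not a client library: the helper name and its parameters are hypothetical, and the `RecognitionConfig` field names in the usage example (`languageCodes`, `model`) are assumptions, since that type is not defined in this section.

```python
import json
from typing import Any, Dict, Iterable, Optional, Sequence

MAX_FILES = 5  # limit stated for the files[] field


def build_batch_recognize_body(
    uris: Iterable[str],
    config: Optional[Dict[str, Any]] = None,
    config_mask: Optional[Sequence[str]] = None,
    gcs_output_prefix: Optional[str] = None,
    processing_strategy: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble the JSON body for recognizers.batchRecognize."""
    uris = list(uris)
    if len(uris) > MAX_FILES:
        raise ValueError(f"at most {MAX_FILES} files are allowed, got {len(uris)}")

    # Each entry is a BatchRecognizeFileMetadata with only the uri set.
    body: Dict[str, Any] = {"files": [{"uri": uri} for uri in uris]}
    if config is not None:
        body["config"] = config
    if config_mask:
        # FieldMask format: comma-separated field names.
        body["configMask"] = ",".join(config_mask)
    if gcs_output_prefix is not None:
        body["recognitionOutputConfig"] = {
            "gcsOutputConfig": {"uri": gcs_output_prefix}
        }
    if processing_strategy is not None:
        body["processingStrategy"] = processing_strategy
    return body


# Usage with placeholder bucket names and assumed RecognitionConfig fields.
body = build_batch_recognize_body(
    ["gs://example-bucket/audio/a.wav"],
    config={"languageCodes": ["en-US"], "model": "long"},
    config_mask=["languageCodes", "model"],
    gcs_output_prefix="gs://example-bucket/transcripts/",
    processing_strategy="DYNAMIC_BATCHING",
)
print(json.dumps(body, indent=2))
```

Optional fields are omitted rather than sent as null, so the recognizer's own defaults apply wherever the caller does not specify a value.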

Response body

If successful, the response body contains an instance of Operation.
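
Because the method returns an Operation rather than the transcripts themselves, a caller typically polls until the operation finishes. The sketch below assumes the standard long-running operation shape (`done`, `error`, `response` keys), which this section does not spell out, and takes a caller-supplied `get_operation` callable (hypothetical) that fetches the current Operation as a dict.

```python
import time
from typing import Any, Callable, Dict


def wait_for_operation(
    get_operation: Callable[[], Dict[str, Any]],
    interval_s: float = 10.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll until the operation reports done, then return its response.

    Raises RuntimeError if the finished operation carries an error,
    and TimeoutError if it does not finish within timeout_s seconds.
    """
    deadline = clock() + timeout_s
    while True:
        operation = get_operation()
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError(f"operation failed: {operation['error']}")
            return operation.get("response", {})
        if clock() >= deadline:
            raise TimeoutError("operation did not finish before the timeout")
        sleep(interval_s)
```

Injecting `get_operation`, `sleep`, and `clock` keeps the loop independent of any particular HTTP client and makes it easy to exercise with canned responses.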

Authorization scopes

Requires the following OAuth scope:

https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the recognizer resource:

speech.recognizers.recognize

For more information, see the IAM documentation.

BatchRecognizeFileMetadata

Metadata about a single file in a batch for recognizers.batchRecognize.

JSON representation

{
  "config": {
    object (RecognitionConfig)
  },
  "configMask": string,

  // Union field audio_source can be only one of the following:
  "uri": string
  // End of list of possible types for union field audio_source.
}

Fields
`config`	`object (RecognitionConfig)` Features and audio metadata to use for the Automatic Speech Recognition. This field in combination with the `configMask` field can be used to override parts of the `defaultRecognitionConfig` of the Recognizer resource as well as the `config` at the request level.
`configMask`	`string (FieldMask format)` The list of fields in `config` that override the values in the `defaultRecognitionConfig` of the recognizer during this recognition request. If no mask is provided, all non-default valued fields in `config` override the values in the recognizer for this recognition request. If a mask is provided, only the fields listed in the mask override the config in the recognizer for this recognition request. If a wildcard (`*`) is provided, `config` completely overrides and replaces the config in the recognizer for this recognition request. This is a comma-separated list of fully qualified names of fields. Example: `"user.displayName,photo"`.
Union field `audio_source`. The audio source, which is a Google Cloud Storage URI. `audio_source` can be only one of the following:
`uri`	`string` Cloud Storage URI for the audio file.
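
The mask rules described for `config` and `configMask` can be illustrated with a small client-side model. This is only one reading of the rules, not the service's implementation: it handles top-level field names only (no nested paths), it does not model the "non-default valued" distinction made for the per-file mask, and it assumes that a field listed in the mask but absent from `config` is cleared. The function names are hypothetical.

```python
from typing import Any, Dict, Optional


def apply_override(
    base: Dict[str, Any],
    override: Optional[Dict[str, Any]],
    mask: Optional[str] = None,
) -> Dict[str, Any]:
    """Return base with override applied according to a FieldMask string."""
    if override is None:
        return dict(base)
    if not mask:
        # No mask: every field given in the override replaces the base value.
        merged = dict(base)
        merged.update(override)
        return merged
    if mask.strip() == "*":
        # Wildcard: the override completely replaces the base config.
        return dict(override)
    merged = dict(base)
    for path in (part.strip() for part in mask.split(",")):
        if not path:
            continue
        if path in override:
            merged[path] = override[path]
        else:
            # Assumption: a masked field that is unset in the override is cleared.
            merged.pop(path, None)
    return merged


def effective_config(
    recognizer_default: Dict[str, Any],
    request: Dict[str, Any],
    file_metadata: Dict[str, Any],
) -> Dict[str, Any]:
    """Layer request-level and then file-level overrides on the recognizer default."""
    after_request = apply_override(
        recognizer_default, request.get("config"), request.get("configMask")
    )
    return apply_override(
        after_request, file_metadata.get("config"), file_metadata.get("configMask")
    )
```

Under this model, a file entry that carries only a `uri` inherits the request-level result unchanged, while a file with its own `config` and `configMask` overrides just the listed fields for that file.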

RecognitionOutputConfig

Configuration options for the output(s) of recognition.

JSON representation

{
  "outputFormatConfig": {
    object (OutputFormatConfig)
  },

  // Union field output can be only one of the following:
  "gcsOutputConfig": {
    object (GcsOutputConfig)
  },
  "inlineResponseConfig": {
    object (InlineOutputConfig)
  }
  // End of list of possible types for union field output.
}

Fields
`outputFormatConfig`	`object (OutputFormatConfig)` Optional. Configuration for the format of the results stored to `output`. If unspecified, transcripts are written in the `NATIVE` format only.
Union field `output`. `output` can be only one of the following:
`gcsOutputConfig`	`object (GcsOutputConfig)` If this message is populated, recognition results are written to the provided Google Cloud Storage URI.
`inlineResponseConfig`	`object (InlineOutputConfig)` If this message is populated, recognition results are provided in the `BatchRecognizeResponse` message of the Operation when completed. This is only supported when calling `recognizers.batchRecognize` with just one audio file.
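
The constraints above (at most one member of the `output` union, and inline results only for a single audio file) can be checked on the client before sending a request. The helper below is a sketch with hypothetical names; the format keys it accepts mirror the `OutputFormatConfig` fields `native`, `vtt`, and `srt`.

```python
from typing import Any, Dict, Optional, Sequence

ALLOWED_FORMATS = ("native", "vtt", "srt")


def build_output_config(
    file_count: int,
    gcs_uri_prefix: Optional[str] = None,
    inline: bool = False,
    formats: Sequence[str] = (),
) -> Dict[str, Any]:
    """Build a RecognitionOutputConfig dict, enforcing the documented limits."""
    if gcs_uri_prefix is not None and inline:
        raise ValueError("output is a union: set gcs_uri_prefix or inline, not both")
    if inline and file_count != 1:
        raise ValueError("inline results are only supported with exactly one audio file")
    unknown = [name for name in formats if name not in ALLOWED_FORMATS]
    if unknown:
        raise ValueError(f"unknown output formats: {unknown}")

    config: Dict[str, Any] = {}
    if gcs_uri_prefix is not None:
        config["gcsOutputConfig"] = {"uri": gcs_uri_prefix}
    elif inline:
        # InlineOutputConfig has no fields, so an empty object selects it.
        config["inlineResponseConfig"] = {}
    if formats:
        # The format config types have no fields either; presence enables each one.
        config["outputFormatConfig"] = {name: {} for name in formats}
    return config
```

For instance, `build_output_config(3, gcs_uri_prefix="gs://example-bucket/out/", formats=["native", "srt"])` requests both native and SRT transcripts under a placeholder bucket prefix.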

GcsOutputConfig

Output configurations for Cloud Storage.

JSON representation
{ "uri": string }

Fields
`uri`	`string` The Cloud Storage URI prefix with which recognition results will be written.

InlineOutputConfig

This type has no fields.

Output configurations for inline response.

OutputFormatConfig

Configuration for the format of the results stored to output.

JSON representation

{
  "native": {
    object (NativeOutputFileFormatConfig)
  },
  "vtt": {
    object (VttOutputFileFormatConfig)
  },
  "srt": {
    object (SrtOutputFileFormatConfig)
  }
}

Fields
`native`	`object (NativeOutputFileFormatConfig)` Configuration for the native output format. If this field is set or if no other output format field is set, then transcripts will be written to the sink in the native format.
`vtt`	`object (VttOutputFileFormatConfig)` Configuration for the VTT output format. If this field is set, then transcripts will be written to the sink in the VTT format.
`srt`	`object (SrtOutputFileFormatConfig)` Configuration for the SRT output format. If this field is set, then transcripts will be written to the sink in the SRT format.

NativeOutputFileFormatConfig

This type has no fields.

Output configurations for serialized BatchRecognizeResults protos.

VttOutputFileFormatConfig

This type has no fields.

Output configurations for a WebVTT-formatted subtitle file.

SrtOutputFileFormatConfig

This type has no fields.

Output configurations for a SubRip Text-formatted subtitle file.

ProcessingStrategy

Possible processing strategies for batch requests.

Enums
`PROCESSING_STRATEGY_UNSPECIFIED`	Default value for the processing strategy. The request is processed as soon as it is received.
`DYNAMIC_BATCHING`	If selected, processes the request during lower utilization periods for a price discount. The request is fulfilled within 24 hours.