This document lists the quotas and system limits that apply to Speech-to-Text. Quotas specify the amount of a countable, shared resource that you can use, and they are defined by Google Cloud services such as Speech-to-Text. System limits are fixed values that cannot be changed.
Google Cloud uses quotas to help ensure fairness and reduce spikes in resource use and availability. A quota restricts how much of a Google Cloud resource your Google Cloud project can use. Quotas apply to a range of resource types, including hardware, software, and network components. For example, quotas can restrict the number of API calls to a service, the number of load balancers used concurrently by your project, or the number of projects that you can create. Quotas protect the community of Google Cloud users by preventing the overloading of services. Quotas also help you to manage your own Google Cloud resources.
The Cloud Quotas system does the following:
- Monitors your consumption of Google Cloud products and services
- Restricts your consumption of those resources
- Provides a way to request changes to the quota value
In most cases, when you attempt to consume more of a resource than its quota allows, the system blocks access to the resource, and the task that you're trying to perform fails.
Quotas generally apply at the Google Cloud project level. Your use of a resource in one project doesn't affect your available quota in another project. Within a Google Cloud project, quotas are shared across all applications and IP addresses.
To adjust most quotas, use the Google Cloud console. For more information, see Request a quota adjustment.
There are also system limits on Speech-to-Text resources. System limits can't be changed.
This document contains the current API restrictions and usage limits for Cloud Speech-to-Text. This page will be updated to reflect any changes to these restrictions and usage limits. We reserve the right to change these limits.
Content Limits
Content is provided to Speech-to-Text as audio data, either directly within the content field of the request or referenced by a Google Cloud Storage URI in the uri field of the request. There is a limit of 10 MB on any single request sent to the API using local files. For the Recognize and LongRunningRecognize methods, this limit applies to the size of the request sent. For the StreamingRecognize method, the 10 MB limit applies both to the initial StreamingRecognize request and to each individual message in the stream. Exceeding this limit produces an error. There is no size limit on requests that use audio data stored in a Google Cloud Storage bucket.
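As a rough illustration of the size rule above, the following sketch decides whether audio can be sent inline in the content field or should be referenced through the uri field. The function name, the overhead parameter, and the reading of "10 MB" as 10 * 1024 * 1024 bytes are assumptions made for this example, not values confirmed by the API. The actual request can be larger than the raw audio because of encoding and other request fields, so pass an overhead estimate if you have one.

```python
# Assumption: "10 MB" is read as 10 * 1024 * 1024 bytes. The page does not
# say whether the limit is decimal or binary megabytes.
MAX_INLINE_REQUEST_BYTES = 10 * 1024 * 1024


def audio_source_field(audio_size_bytes: int, overhead_bytes: int = 0) -> str:
    """Return "content" if the audio plausibly fits in one inline request,
    otherwise "uri" (audio referenced from Cloud Storage).

    Hypothetical helper: overhead_bytes is the caller's own estimate of the
    non-audio part of the request (encoding expansion, config fields).
    """
    if audio_size_bytes < 0 or overhead_bytes < 0:
        raise ValueError("sizes must be non-negative")
    if audio_size_bytes + overhead_bytes <= MAX_INLINE_REQUEST_BYTES:
        return "content"
    return "uri"
```

For a local file, the size could be obtained with os.path.getsize(path) before calling the helper.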
The API also enforces the following limits on audio length, which are subject to change.
Content Limit | Audio Length |
---|---|
Synchronous Requests | ~1 Minute |
Asynchronous Requests | ~480 Minutes* |
Streaming Requests | ~5 Minutes** |
* Audio longer than ~1 minute must use the uri field to reference an audio file in Google Cloud Storage.
** If you need to stream content for more than 5 minutes, see the endless streaming tutorial.
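The length limits in the table can be turned into a small pre-flight check. The sketch below is illustrative only: the function name pick_method and the constants are hypothetical, and the approximate values from the table are treated as hard cutoffs, which the service itself may not do. Synchronous and asynchronous requests are mapped to the Recognize and LongRunningRecognize methods named earlier on this page.

```python
# Approximate limits from the table, expressed in seconds. The table marks
# them with "~", so treating them as exact cutoffs is an assumption.
SYNC_MAX_SECONDS = 60
ASYNC_MAX_SECONDS = 480 * 60
STREAMING_MAX_SECONDS = 5 * 60


def pick_method(duration_seconds: float, streaming: bool = False) -> str:
    """Suggest a recognition method for audio of the given length."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    if streaming:
        if duration_seconds <= STREAMING_MAX_SECONDS:
            return "StreamingRecognize"
        raise ValueError("stream exceeds ~5 minutes; see the endless streaming tutorial")
    if duration_seconds <= SYNC_MAX_SECONDS:
        return "Recognize"
    if duration_seconds <= ASYNC_MAX_SECONDS:
        # Audio longer than ~1 minute must be referenced through the uri field.
        return "LongRunningRecognize"
    raise ValueError("audio exceeds ~480 minutes; split it before submitting")
```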
For StreamingRecognize requests, audio must be sent at a rate that approximates real time.
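One way to approximate real time when streaming pre-recorded audio is to split it into short chunks and pause between them for the duration each chunk represents. The generator below is a minimal sketch under stated assumptions: paced_chunks is a hypothetical helper, bytes_per_second must be supplied by the caller (for example, 32000 for 16 kHz, 16-bit, mono uncompressed audio), and the 100 ms default chunk length is an arbitrary choice, not a documented requirement.

```python
import time
from typing import Callable, Iterator


def paced_chunks(
    audio: bytes,
    bytes_per_second: int,
    chunk_seconds: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[bytes]:
    """Yield audio in chunks, pausing so the overall rate is roughly real time.

    The sleep function is injectable so the pacing can be tested without
    actually waiting.
    """
    if bytes_per_second <= 0 or chunk_seconds <= 0:
        raise ValueError("bytes_per_second and chunk_seconds must be positive")
    chunk_size = max(1, round(bytes_per_second * chunk_seconds))
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]
        # Pause only if more audio remains, for the time this chunk covers.
        if start + chunk_size < len(audio):
            sleep(chunk_size / bytes_per_second)
```

Each yielded chunk would become the audio payload of one message in the stream. Live microphone input is already paced by the capture device and does not need this delay.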
Attempting to process content in excess of these content limits will produce an error. For more information, see Error messages and Troubleshooting.
Within any request, you may also supply a PhraseSet resource containing a list of phrases specific to the request. (A single word counts as a phrase in this context.) The following limits apply to such a context:
Speech Adaptation Limit | Value |
---|---|
Phrases per request | 5000 |
Total characters per request | 100,000 |
Characters per phrase | 100 |
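The adaptation limits in the table can be checked on the client before a request is sent. The following is a sketch, not an official validator: check_phrases is a hypothetical helper, and it assumes that "characters" correspond to what Python's len() counts for a string (Unicode code points), which may differ from how the service counts them.

```python
from typing import List, Sequence

# Values taken from the speech adaptation limits table.
MAX_PHRASES_PER_REQUEST = 5000
MAX_TOTAL_CHARS_PER_REQUEST = 100_000
MAX_CHARS_PER_PHRASE = 100


def check_phrases(phrases: Sequence[str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if len(phrases) > MAX_PHRASES_PER_REQUEST:
        problems.append(
            f"{len(phrases)} phrases exceeds the limit of {MAX_PHRASES_PER_REQUEST}"
        )
    total_chars = sum(len(p) for p in phrases)
    if total_chars > MAX_TOTAL_CHARS_PER_REQUEST:
        problems.append(
            f"{total_chars} total characters exceeds the limit of "
            f"{MAX_TOTAL_CHARS_PER_REQUEST}"
        )
    for index, phrase in enumerate(phrases):
        if len(phrase) > MAX_CHARS_PER_PHRASE:
            problems.append(
                f"phrase {index} has {len(phrase)} characters "
                f"(limit {MAX_CHARS_PER_PHRASE})"
            )
    return problems
```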
Request Limits
The current API usage limits for Speech-to-Text are as follows (and are subject to change):
Type of Limit | Usage Limit |
---|---|
Recognition Requests per 60 seconds* | 900 |
Adaptation resource requests per 60 seconds* | 10 |
Processing per day | 480 hours of audio |
* Each StreamingRecognize session is considered a single request even though it includes multiple frames of StreamingRecognizeRequest audio within the stream.
Requests and/or attempts at audio processing in excess of these limits will produce an error. For more information, see Error messages and Troubleshooting.
These limits apply to each Speech-to-Text developer project, and are shared across all applications and IP addresses using a given developer project.
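To stay under the limit of 900 recognition requests per 60 seconds, a client can throttle itself before sending. The class below is a minimal sliding-window sketch with hypothetical names. It assumes a single process, while the quota is shared across every application in the project, and it does not claim to match how the service itself measures the 60-second window.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls within any window_seconds span."""

    def __init__(
        self,
        max_requests: int = 900,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_requests = max_requests
        self._window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._timestamps: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Record a request and return True, or return False if the window is full."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self._window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self._max_requests:
            self._timestamps.append(now)
            return True
        return False
```

A caller would check try_acquire() before each recognition request and wait or queue the request when it returns False. Because each StreamingRecognize session counts as a single request, one acquisition per session is enough.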