This document lists the quotas and system limits that apply to Gemini for Google Cloud.
- Quotas specify the amount of a countable, shared resource that you can use. Quotas are defined by Google Cloud services such as Gemini for Google Cloud.
- System limits are fixed values that cannot be changed.
Google Cloud uses quotas to help ensure fairness and reduce spikes in resource use and availability. A quota restricts how much of a Google Cloud resource your Google Cloud project can use. Quotas apply to a range of resource types, including hardware, software, and network components. For example, quotas can restrict the number of API calls to a service, the number of load balancers used concurrently by your project, or the number of projects that you can create. Quotas protect the community of Google Cloud users by preventing the overloading of services. Quotas also help you to manage your own Google Cloud resources.
The Cloud Quotas system does the following:
- Monitors your consumption of Google Cloud products and services
- Restricts your consumption of those resources
- Provides a way to request changes to the quota value
In most cases, when you attempt to consume more of a resource than its quota allows, the system blocks access to the resource, and the task that you're trying to perform fails.
Quotas generally apply at the Google Cloud project level. Your use of a resource in one project doesn't affect your available quota in another project. Within a Google Cloud project, quotas are shared across all applications and IP addresses.
There are also system limits on Gemini resources. System limits can't be changed.
Requests per second
Gemini for Google Cloud enforces quotas on requests per second for each user in a project.
Quota | Value |
---|---|
Requests per second | 2 |
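If you call Gemini for Google Cloud from your own tooling, client-side pacing can help you stay under the per-user limit. The following is a minimal sketch, not an official client feature: the `MinIntervalThrottle` class and its parameters are hypothetical names, and it assumes that spacing requests at least 0.5 seconds apart is enough to respect a quota of 2 requests per second. How the service measures the one-second window isn't stated in this document.

```python
import time


class MinIntervalThrottle:
    """Spaces calls at least 1 / rate_per_second seconds apart (hypothetical helper)."""

    def __init__(self, rate_per_second=2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate_per_second
        self._clock = clock      # injectable so the class can be tested without real delays
        self._sleep = sleep
        self._next_allowed = None

    def wait(self):
        """Block until the next call is allowed and return the seconds waited."""
        now = self._clock()
        if self._next_allowed is None or now >= self._next_allowed:
            waited = 0.0
            start = now
        else:
            waited = self._next_allowed - now
            self._sleep(waited)
            start = self._next_allowed
        # Reserve the next slot relative to when this call actually starts.
        self._next_allowed = start + self.min_interval
        return waited
```

Call `wait()` immediately before each request. The first call returns without delay, and later calls sleep only as long as needed to keep the spacing.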
Requests per day
Gemini for Google Cloud enforces quotas for the total number of requests per day for each user in a project.
Quota | Value |
---|---|
Code requests per day for Gemini Code Assist or Gemini in BigQuery, such as code generation and code completion. | 6,000 |
Requests per day for chat, visualization, data insight table scans, and other requests that display responses in the Gemini pane in the Google Cloud console and IDEs. | 240 |
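To avoid running into the daily limits unexpectedly, a client can keep its own count of requests per category. The following sketch is illustrative only: the `DailyBudget` class and the category names `code` and `chat` are hypothetical, and because this document doesn't state when the daily window resets, the caller passes in the current date.

```python
from datetime import date

# Per-user daily limits from the table above; the category names are assumptions.
DAILY_LIMITS = {"code": 6000, "chat": 240}


class DailyBudget:
    """Counts requests per category and resets when the date changes."""

    def __init__(self, limits=DAILY_LIMITS):
        self.limits = dict(limits)
        self._day = None
        self._used = {}

    def try_consume(self, kind, today):
        """Record one request; return False if the local budget is exhausted."""
        if today != self._day:
            # New day: start counting from zero again.
            self._day = today
            self._used = {name: 0 for name in self.limits}
        if self._used[kind] >= self.limits[kind]:
            return False
        self._used[kind] += 1
        return True

    def remaining(self, kind):
        """Return how many requests of this kind are left in the local budget."""
        return self.limits[kind] - self._used.get(kind, 0)
```

This count is only a local estimate. The service's own accounting is authoritative, and requests made from other tools by the same user also count against the quota.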
Quotas for Gemini Code Assist
Gemini Code Assist enforces quotas for certain features.
Quota | Value |
---|---|
Local codebase awareness | 128,000 token context window |
Code customization repositories | 950 |
Quotas for Gemini in BigQuery
For customers using Gemini in BigQuery with BigQuery Enterprise Plus edition, quotas are based on the average daily use of Enterprise Plus slot-hours over the last full calendar month. This quota applies at the organization level and is available to all projects in that organization that have Enterprise Plus edition slots assigned. When quotas are calculated, usage is rounded up to the nearest 100 slot-hours.
Quotas per 100 slot-hours (Enterprise Plus edition daily average usage) | Value |
---|---|
Code completion requests per day | 150 |
Code generation requests per day | 10 |
Requests per day for chat, visualization, table scans, and other requests that display responses in the Gemini pane in the Google Cloud console. | 5 |
Example: An organization that has an Enterprise Plus edition reservation with 100 slots as its baseline uses an average of 2,400 slot-hours each day (100 slots * 24 hours = 2,400 slot-hours). That is 24 units of 100 slot-hours, so each per-unit value in the table is multiplied by 24. As a result, in the following month the organization gets the following daily quotas:
- 3,600 code completion requests per day
- 240 code generation requests per day
- 120 chat, visualization, and data insights table scans per day
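The arithmetic in this example can be sketched as a short calculation. The function name `daily_quotas` and the dictionary keys are hypothetical, and the sketch assumes that rounding up means rounding the average daily usage up to the next multiple of 100 slot-hours before multiplying.

```python
import math

# Per-100-slot-hour values from the table above; the key names are assumptions.
PER_100_SLOT_HOURS = {
    "code_completion": 150,
    "code_generation": 10,
    "chat_and_other": 5,
}


def daily_quotas(avg_daily_slot_hours):
    """Return daily quotas for a given average daily slot-hour usage."""
    # Round usage up to whole units of 100 slot-hours (assumed interpretation).
    units = math.ceil(avg_daily_slot_hours / 100)
    return {name: units * per_unit for name, per_unit in PER_100_SLOT_HOURS.items()}


# 100 baseline slots running for 24 hours is 2,400 slot-hours, or 24 units.
quotas = daily_quotas(100 * 24)
print(quotas)
# {'code_completion': 3600, 'code_generation': 240, 'chat_and_other': 120}
```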
If your organization hasn't previously purchased any BigQuery Enterprise Plus edition reservations, then after you purchase one, you receive the following default quotas for the first full calendar month:
- 7,500 code completion requests per day
- 500 code generation requests per day
- 250 chat, visualization, and data insights table scans per day
If you start using Enterprise Plus edition reservations mid-month, then the default quota applies until the end of the following month.
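As a rough illustration of the mid-month rule, the following sketch computes the last day on which the default quota would apply. The function name `default_quota_end` is hypothetical, and the sketch assumes that "the end of the following month" means the last calendar day of the month after the start date. It doesn't handle the case where a reservation starts on the first day of a month, which this document doesn't specify.

```python
import calendar
from datetime import date


def default_quota_end(start):
    """Return the last day of the calendar month after `start` (assumed reading)."""
    if start.month == 12:
        year, month = start.year + 1, 1
    else:
        year, month = start.year, start.month + 1
    # monthrange returns (weekday of the first day, number of days in the month).
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, last_day)


# A reservation started on 15 January 2024 keeps the default quota
# through the end of February 2024 under this reading.
print(default_quota_end(date(2024, 1, 15)))  # 2024-02-29
```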
Request a quota increase
To adjust most quotas, use the Google Cloud console. For more information, see Request a quota adjustment.