This document lists the quotas and system limits that apply to Vertex AI Agent Builder. Quotas specify the amount of a countable, shared resource that you can use, and they are defined by Google Cloud services such as Vertex AI Agent Builder. System limits are fixed values that cannot be changed.
Google Cloud uses quotas to help ensure fairness and reduce spikes in resource use and availability. A quota restricts how much of a Google Cloud resource your Google Cloud project can use. Quotas apply to a range of resource types, including hardware, software, and network components. For example, quotas can restrict the number of API calls to a service, the number of load balancers used concurrently by your project, or the number of projects that you can create. Quotas protect the community of Google Cloud users by preventing the overloading of services. Quotas also help you to manage your own Google Cloud resources.
The Cloud Quotas system does the following:
- Monitors your consumption of Google Cloud products and services
- Restricts your consumption of those resources
- Provides a way to request changes to the quota value
In most cases, when you attempt to consume more of a resource than its quota allows, the system blocks access to the resource, and the task that you're trying to perform fails.
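Because a request that exceeds its quota fails, client code usually needs to decide what to do with that failure. One common approach is to retry with exponential backoff. The following is a minimal sketch only: `QuotaExceededError` and `call_with_backoff` are hypothetical names, not part of any Google Cloud client library, and you would replace the exception with whatever error your client actually raises when a quota is exhausted.

```python
import random
import time


class QuotaExceededError(Exception):
    """Hypothetical stand-in for the error your client raises on a quota failure."""


def call_with_backoff(request_fn, max_attempts=5, base_delay=1.0,
                      max_delay=32.0, sleep=time.sleep):
    """Call request_fn, retrying with exponential backoff and jitter on quota errors."""
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except QuotaExceededError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the failure to the caller.
            # Double the delay on each retry, cap it at max_delay, and add
            # up to 50% random jitter so parallel clients don't retry in lockstep.
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay + random.uniform(0, delay / 2))
```

Backoff only helps with per-minute and per-day request quotas, which free up over time. It does not help with allocation quotas, which are released only when you release the resource.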
Quotas generally apply at the Google Cloud project level. Your use of a resource in one project doesn't affect your available quota in another project. Within a Google Cloud project, quotas are shared across all applications and IP addresses.
There are also system limits on Vertex AI Agent Builder resources. System limits can't be changed.
Allocation quotas
The following quotas don't reset over time and instead are released when you release the resource:
Quotas | Value |
---|---|
Number of documents per project | 1,000,000 |
Number of documents per organization | 5,000,000 |
Number of pending import long running operations per project | 300 |
Number of pending import long running operations per organization | 1,500 |
Number of pending purge documents long running operations per project | 100 |
Number of pending purge documents long running operations per organization | 500 |
Number of user events per project | 40,000,000,000 |
Number of user events per organization | 200,000,000,000 |
Request quotas
The following quotas apply to Vertex AI Agent Builder requests:
Quotas | Value |
---|---|
Complete query requests per minute per project | 300 |
Complete query requests per minute per organization | 1,500 |
Conversational search read requests per minute per project | 300 |
Conversational search read requests per minute per organization | 500 |
Conversational search write requests per minute per project | 300 |
Conversational search write requests per minute per organization | 500 |
Document batch requests per minute per project | 100 |
Document batch requests per minute per organization | 500 |
Document read requests per minute per project | 300 |
Document read requests per minute per organization | 1,500 |
Document write requests per minute per project | 12,000 |
Document write requests per minute per organization | 60,000 |
Evaluation create requests per day per organization | 5 |
Evaluation create requests per day per project | 5 |
Evaluation read requests per minute per organization | 500 |
Evaluation read requests per minute per project | 100 |
Evaluation write requests per minute per organization | 500 |
Evaluation write requests per minute per project | 100 |
LLM query requests (search summarization, multi-turn search) per minute per project | 15 |
LLM query requests (search summarization, multi-turn search) per minute per organization | 75 |
Number of pending FHIR/BigQuery streaming writes per minute | 6,000 |
Number of sample query sets per organization | 500 |
Number of sample query sets per project | 100 |
Ranking API requests per minute per project | 500 |
Recommend requests per minute per project | 60,000 |
Recommend requests per minute per organization | Unlimited |
Sample query read requests per minute per organization | 1,000 |
Sample query read requests per minute per project | 200 |
Sample query set read requests per minute per organization | 500 |
Sample query set read requests per minute per project | 100 |
Sample query set write requests per minute per organization | 500 |
Sample query set write requests per minute per project | 100 |
Sample query write requests per minute per organization | 1,000 |
Sample query write requests per minute per project | 200 |
Schema read requests per minute per project | 100 |
Schema read requests per minute per organization | 500 |
Schema write requests per minute per project | 100 |
Schema write requests per minute per organization | 500 |
Search requests per minute per project | 300 |
Search requests per minute per organization | Unlimited |
User event batch requests (such as import and purge) per minute per project | 100 |
User event batch requests (such as import and purge) per minute per organization | 500 |
User event collect requests per minute per project per user | 240 |
User event collect requests per minute per organization per user | 1,200 |
User event write requests per minute per project | 60,000 |
User event write requests per minute per organization | 300,000 |
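To stay under a per-minute request quota, a client can throttle itself before sending requests. The following sketch shows one way to do that with a sliding 60-second window. `PerMinuteLimiter` is a hypothetical helper, not part of any Google Cloud library, and it assumes a single process; how the service itself measures each minute may differ, so treat the limit as a conservative client-side guard.

```python
import collections
import time


class PerMinuteLimiter:
    """Sliding-window limiter: allow at most `limit` calls in any window."""

    def __init__(self, limit, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.calls = collections.deque()  # Timestamps of recent calls, oldest first.

    def try_acquire(self):
        """Record the call and return True if under the limit, else return False."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.limit:
            self.calls.append(now)
            return True
        return False


# Example: guard search calls with the per-project value listed in the table.
search_limiter = PerMinuteLimiter(limit=300)
```

A caller would check `search_limiter.try_acquire()` before each search request and wait or queue the request when it returns `False`.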
Quota for web page indexing
When a data store has Advanced website indexing turned on, every web page that you index counts toward the "Number of documents per project" quota in the Allocation quotas list. You can also see the number of pages in your project and the page quota for that project in the Project pages vs quota field on the Data page for a data store.
If you add websites to a data store and their web pages would push the project over its quota, those websites are not indexed. Websites in the data store that are already indexed continue to be indexed as before. You can request a quota increase at any time.
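Before adding a website, it can help to estimate whether its pages fit in the remaining document quota. The following sketch does that arithmetic. The function name and its inputs are hypothetical: you supply the current page count (for example, from the Project pages vs quota field) and your own estimate of the new site's page count. The default of 1,000,000 is the "Number of documents per project" value listed in the allocation quotas, so pass a different `quota` if yours has been adjusted.

```python
# Default "Number of documents per project" value from the allocation quotas list.
DOCUMENTS_PER_PROJECT_QUOTA = 1_000_000


def pages_fit_in_quota(current_documents, new_pages,
                       quota=DOCUMENTS_PER_PROJECT_QUOTA):
    """Return (fits, remaining), where remaining is the headroom after adding new_pages.

    A negative remaining value is the number of pages over the quota.
    """
    remaining = quota - current_documents - new_pages
    return remaining >= 0, remaining


fits, remaining = pages_fit_in_quota(current_documents=950_000, new_pages=60_000)
print(fits, remaining)  # False -10000
```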
Request a quota increase
To adjust most quotas, use the Google Cloud console. For more information, see Request a quota adjustment.