This document lists the quotas and limits that apply to Vertex AI Agent Builder.
A quota restricts how much of a shared Google Cloud resource your Google Cloud project can use, including hardware, software, and network components. Quotas are part of a system that does the following:
- Monitors your use or consumption of Google Cloud products and services.
- Restricts your consumption of those resources, for reasons that include ensuring fairness and reducing spikes in usage.
- Maintains configurations that automatically enforce prescribed restrictions.
- Provides a means to request or make changes to the quota.
In most cases, when a quota is exceeded, the system immediately blocks access to the relevant Google resource, and the task that you're trying to perform fails. Quotas typically apply to each Google Cloud project and are shared across all applications and IP addresses that use that Google Cloud project.
There are also limits on Vertex AI Agent Builder resources. These limits are unrelated to the quota system. Limits cannot be changed unless otherwise stated.
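Because a request that exceeds a quota fails immediately, callers often retry it after a delay. The following is a minimal sketch of exponential backoff with jitter, not an official client pattern. `QuotaExceededError` and `call_with_backoff` are hypothetical names introduced for illustration; substitute whatever exception your client library actually raises when a quota is exceeded.

```python
import random
import time


class QuotaExceededError(Exception):
    """Hypothetical stand-in for the error your client raises on a quota rejection."""


def call_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call func(), retrying with exponential backoff when a quota error occurs.

    The delay ceiling doubles after each failed attempt (1s, 2s, 4s, ...),
    capped at max_delay. The actual wait is a random value between 0 and
    that ceiling, so that many clients do not retry at the same instant.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return func()
        except QuotaExceededError:
            # Out of attempts: surface the error to the caller.
            if attempt == max_attempts - 1:
                raise
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))
```

Retrying only helps with quotas that reset over time. For a quota that is released only when you release the resource, waiting does not free capacity.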
Allocation quotas
The following quotas don't reset over time and instead are released when you release the resource:
Quotas | Value |
---|---|
Number of documents per project | 1,000,000 |
Number of documents per organization | 5,000,000 |
Number of pending import long running operations per project | 300 |
Number of pending import long running operations per organization | 1,500 |
Number of pending purge documents long running operations per project | 100 |
Number of pending purge documents long running operations per organization | 500 |
Number of user events per project | 40,000,000,000 |
Number of user events per organization | 200,000,000,000 |
Request quotas
The following quotas apply to Vertex AI Agent Builder requests:
Quotas | Value |
---|---|
Complete query requests per minute per project | 300 |
Complete query requests per minute per organization | 1,500 |
Conversational search read requests per minute per project | 300 |
Conversational search read requests per minute per organization | 500 |
Conversational search write requests per minute per project | 300 |
Conversational search write requests per minute per organization | 500 |
Document batch requests per minute per project | 100 |
Document batch requests per minute per organization | 500 |
Document read requests per minute per project | 300 |
Document read requests per minute per organization | 1,500 |
Document write requests per minute per project | 12,000 |
Document write requests per minute per organization | 60,000 |
LLM query requests (search summarization, multi-turn search) per minute per project | 15 |
LLM query requests (search summarization, multi-turn search) per minute per organization | 75 |
Number of pending FHIR/BQ streaming writes per minute | 6,000 |
Recommend requests per minute per project | 60,000 |
Recommend requests per minute per organization | Unlimited |
Schema read requests per minute per project | 100 |
Schema read requests per minute per organization | 500 |
Schema write requests per minute per project | 100 |
Schema write requests per minute per organization | 500 |
Search requests per minute per project | 300 |
Search requests per minute per organization | Unlimited |
User event batch requests (such as import and purge) per minute per project | 100 |
User event batch requests (such as import and purge) per minute per organization | 500 |
User event collect requests per minute per project per user | 240 |
User event collect requests per minute per organization per user | 1,200 |
User event write requests per minute per project | 60,000 |
User event write requests per minute per organization | 300,000 |
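One way to avoid hitting a per-minute request quota is to throttle requests on the client side. The sketch below keeps a sliding 60-second window of send times and refuses a request once the window is full. `PerMinuteThrottle` is a hypothetical helper, not part of any Google Cloud library, and the sliding window is an assumption: the service may count requests differently. Because quotas are shared across all applications that use the project, a throttle in a single process can reduce quota errors but cannot rule them out.

```python
import time
from collections import deque


class PerMinuteThrottle:
    """Client-side sliding-window limiter for a requests-per-minute quota."""

    def __init__(self, max_per_minute, clock=time.monotonic):
        self.max_per_minute = max_per_minute
        self.clock = clock      # injectable so the class can be tested
        self._sent = deque()    # timestamps of requests in the last 60 seconds

    def try_acquire(self):
        """Return True and record the request if it fits in the window."""
        now = self.clock()
        # Drop timestamps that are 60 seconds old or older.
        while self._sent and now - self._sent[0] >= 60.0:
            self._sent.popleft()
        if len(self._sent) < self.max_per_minute:
            self._sent.append(now)
            return True
        return False


# Example: 300 matches the "Search requests per minute per project" row.
search_throttle = PerMinuteThrottle(max_per_minute=300)
```

Call `search_throttle.try_acquire()` before each search request, and delay or queue the request when it returns `False`.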
Quota for web page indexing
When you have a data store with Advanced website indexing turned on, every web page that you index counts towards the "Number of documents per project" quota in the Allocation quotas list. You can also see the number of pages in your project and the page quota for that project in the Project pages vs quota field in the Data page for a data store.
If you add websites to a data store in a project and the web pages in those websites cause the quota for the project to be exceeded, the websites are not indexed. If you have websites in your data store that are already indexed, those websites continue to be indexed as before. You can request a quota increase at any time.
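The check described above is simple arithmetic, and it can be worth running before you add a large website. The following sketch assumes that you read the current page count yourself, for example from the Project pages vs quota field; it makes no API calls. The function name is hypothetical, and the default of 1,000,000 comes from the "Number of documents per project" row, so pass a different `quota` if yours has been changed.

```python
# Default taken from the "Number of documents per project" allocation quota.
DOCUMENTS_PER_PROJECT_QUOTA = 1_000_000


def fits_within_document_quota(current_documents, new_pages,
                               quota=DOCUMENTS_PER_PROJECT_QUOTA):
    """Return True if adding new_pages keeps the project at or under quota."""
    if current_documents < 0 or new_pages < 0:
        raise ValueError("document and page counts must be non-negative")
    return current_documents + new_pages <= quota


def remaining_document_quota(current_documents,
                             quota=DOCUMENTS_PER_PROJECT_QUOTA):
    """Return how many more documents fit before the quota is reached."""
    return max(0, quota - current_documents)
```

For example, a project that already holds 999,000 documents has room for a 1,000-page website but not for a 1,001-page one.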
Request a quota increase
To increase or decrease most quotas, use the Google Cloud console. For more information, see Request a higher quota.