Quotas and limits

This document lists the quotas and limits that apply to Vertex AI Agent Builder. For more information on quotas, see Virtual Private Cloud quotas.

A quota restricts how much of a shared Google Cloud resource, such as hardware, software, or network components, your Google Cloud project can use. Quotas are part of a system that does the following:

  • Monitors your use or consumption of Google Cloud products and services.
  • Restricts your consumption of those resources, for reasons that include ensuring fairness and reducing spikes in usage.
  • Maintains configurations that automatically enforce prescribed restrictions.
  • Provides a means to request or make changes to the quota.

In most cases, when a quota is exceeded, the system immediately blocks access to the relevant Google Cloud resource, and the task that you're trying to perform fails. Quotas generally apply to each Google Cloud project and are shared across all applications and IP addresses that use that project.
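Because a request fails as soon as a quota is exceeded, callers of per-minute quotas often retry after a pause instead of failing outright. The following is a minimal sketch of exponential backoff with jitter, not an official client. `QuotaExceededError` and `call_with_backoff` are hypothetical names introduced for illustration; a real client library raises its own error type for an exceeded quota (Google Cloud APIs typically report it as HTTP 429), so substitute that type.

```python
import random
import time


class QuotaExceededError(Exception):
    """Hypothetical stand-in for the error a client raises on an exceeded quota."""


def call_with_backoff(request, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call request(); on a quota error, wait and retry with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return request()
        except QuotaExceededError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # The delay ceiling doubles on each attempt, capped at max_delay.
            # Random jitter keeps many clients from retrying in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
```

Retrying only helps with quotas that refill over time, such as per-minute request quotas. An exceeded allocation quota stays exceeded until a resource is released or the quota is raised.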

There are also limits on Vertex AI Agent Builder resources. These limits are unrelated to the quota system. Limits cannot be changed unless otherwise stated.

Allocation quotas

The following quotas don't reset over time and instead are released when you release the resource:

Quotas Value
Number of documents per project 1,000,000
Number of documents per organization 5,000,000
Number of pending import long running operations per project 300
Number of pending import long running operations per organization 1,500
Number of pending purge documents long running operations per project 100
Number of pending purge documents long running operations per organization 500
Number of user events per project 40,000,000,000
Number of user events per organization 200,000,000,000

Request quotas

The following quotas apply to Vertex AI Agent Builder requests:

Quotas Value
Complete query requests per minute per project 300
Complete query requests per minute per organization 1,500
Conversational search read requests per minute per project 300
Conversational search read requests per minute per organization 500
Conversational search write requests per minute per project 300
Conversational search write requests per minute per organization 500
Document batch requests per minute per project 100
Document batch requests per minute per organization 500
Document read requests per minute per project 300
Document read requests per minute per organization 1,500
Document write requests per minute per project 12,000
Document write requests per minute per organization 60,000
LLM query requests (search summarization, multi-turn search) per minute per project 15
LLM query requests (search summarization, multi-turn search) per minute per organization 75
Number of pending FHIR/BQ streaming writes per minute 6,000
Recommend requests per minute per project 60,000
Recommend requests per minute per organization Unlimited
Schema read requests per minute per project 100
Schema read requests per minute per organization 500
Schema write requests per minute per project 100
Schema write requests per minute per organization 500
Search requests per minute per project 300
Search requests per minute per organization Unlimited
User event batch requests (such as import and purge) per minute per project 100
User event batch requests (such as import and purge) per minute per organization 500
User event collect requests per minute per project per user 240
User event collect requests per minute per organization per user 1,200
User event write requests per minute per project 60,000
User event write requests per minute per organization 300,000
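
One way to stay under a per-minute request quota is to throttle on the client before sending. The sketch below is a sliding-window limiter under stated assumptions: `PerMinuteLimiter` and `search_limiter` are hypothetical names, and the service's own accounting window may not line up exactly with the client's, so treat it as a safeguard, not a guarantee.

```python
import time
from collections import deque


class PerMinuteLimiter:
    """Client-side sliding-window limiter: at most `limit` calls per window."""

    def __init__(self, limit, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window  # window length in seconds
        self.clock = clock
        self.sent = deque()  # timestamps of calls inside the current window

    def try_acquire(self):
        """Return True and record the call if it fits in the window, else False."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            self.sent.append(now)
            return True
        return False


# 300 matches "Search requests per minute per project" in the list above.
search_limiter = PerMinuteLimiter(limit=300)
```

Call `search_limiter.try_acquire()` before each search request and delay or queue the request when it returns False. Setting the limit somewhat below the published value leaves headroom for other applications that share the same project quota.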

Quota for web page indexing

When you have a data store with Advanced website indexing turned on, every web page that you index counts towards the "Number of documents per project" quota in the Allocation quotas list. You can also see the number of pages in your project and the page quota for that project in the Project pages vs quota field on the Data page for a data store.

If you add websites to a data store in a project and the web pages in those websites cause the project to exceed its quota, those websites are not indexed. Websites in your data store that are already indexed continue to be indexed as before. You can request a quota increase at any time.
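
The check described above is simple arithmetic against the "Number of documents per project" value. The following is a minimal sketch with hypothetical names (`pages_fit_quota`, `DOCUMENTS_PER_PROJECT_QUOTA`). It assumes you already know the project's current document count and an estimate of the new page count, for example from the Project pages vs quota field.

```python
DOCUMENTS_PER_PROJECT_QUOTA = 1_000_000  # value from the Allocation quotas list


def pages_fit_quota(current_documents, new_pages,
                    quota=DOCUMENTS_PER_PROJECT_QUOTA):
    """Return (fits, remaining): whether new_pages fit, and the headroom left now."""
    remaining = quota - current_documents
    return new_pages <= remaining, remaining


# A project holding 950,000 documents has room for 50,000 more pages.
print(pages_fit_quota(950_000, 40_000))  # (True, 50000)
print(pages_fit_quota(950_000, 60_000))  # (False, 50000)
```

If the project's quota has been raised, pass the new value as `quota` instead of relying on the default.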

Request a quota increase

To increase or decrease most quotas, use the Google Cloud console. For more information, see Request a higher quota.