Quotas and limits

This document describes the commonly encountered quotas and limits for Dialogflow and explains where to find the complete list. These constraints are subject to change, and this page will be updated to reflect any changes.

Quotas

For quotas, see Vertex AI Agents quotas.

Limits

Limits are fixed constraints that cannot be increased. Many resources and fields have count, duration, or length limits that are inherent to the service implementation.

The following table lists commonly encountered limits. Feature-specific documentation and API reference documentation may provide additional limits.

| Description | Limit |
|---|---|
| Maximum number of agents per virtual assistant app | No limit |
| Maximum number of LLM tokens | 15,000 input tokens and 500 output tokens for one conversational turn |
| Maximum number of LLM calls per conversational turn | 5 |
| Maximum number of examples per agent | With the default example retrieval strategy, the system automatically limits the number of examples to fit in the token limit of the model being used, based on the relevance of the examples to the session context. |
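Because the token limits apply per conversational turn, long prompts or many lengthy examples can exceed the 15,000-input-token ceiling. The following Python sketch shows one way a client could pre-check an approximate token count before sending a turn. It is illustrative only: the 4-characters-per-token heuristic and the estimate_tokens and fits_input_budget helpers are assumptions for this example, not part of the Dialogflow API or its tokenizer.

```python
# Illustrative sketch only: these helpers are not provided by Dialogflow.
# The 4-characters-per-token ratio is a rough heuristic, not the model's
# actual tokenizer, so treat the result as an approximation.

INPUT_TOKEN_LIMIT = 15_000   # input tokens per conversational turn (see table above)
OUTPUT_TOKEN_LIMIT = 500     # output tokens per conversational turn (see table above)


def estimate_tokens(text: str) -> int:
    """Roughly estimate the token count from character length."""
    return max(1, len(text) // 4)


def fits_input_budget(prompt: str, examples: list[str]) -> bool:
    """Check whether a prompt plus its examples stays under the input token limit."""
    total = estimate_tokens(prompt) + sum(estimate_tokens(e) for e in examples)
    return total <= INPUT_TOKEN_LIMIT


if __name__ == "__main__":
    prompt = "Summarize the customer's last three orders."
    examples = ["Example conversation 1 ...", "Example conversation 2 ..."]
    if not fits_input_budget(prompt, examples):
        print("Trim or drop examples before sending this turn.")
```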