Dynamic shared quota

This page explains dynamic shared quota (DSQ) and how DSQ is different from Provisioned Throughput.

Introduction to dynamic shared quota

Dynamic shared quota (DSQ) distributes available on-demand capacity among all queries being processed by Google Cloud services for specific models. This capability eliminates the need to set quota limits and to submit quota increase requests (QIRs).

With DSQ, requests from all customers to the same regional or multi-regional endpoint draw on a shared pool of capacity. There are no predefined quotas; instead, the available capacity is distributed among projects.

Provisioned Throughput is the only way to ensure high availability for your application and to get predictable service levels for your production workloads. For more information about Provisioned Throughput, see Provisioned Throughput.

Supported models

This section lists the models that support dynamic shared quota (DSQ). DSQ is enabled by default for these models.

Google models

The following table lists the Google models (and versions) that support DSQ:

Model | DSQ release date | Status
Gemini 1.5 Flash (gemini-1.5-flash-002) | September 24, 2024 | Live
Gemini 1.5 Pro (gemini-1.5-pro-002) | September 24, 2024 | Live

DSQ quotas aren't listed on the Quotas & System Limits page in the Google Cloud console.

Troubleshoot DSQ errors

When there isn't enough capacity to serve your query, you might receive a 429 error. To troubleshoot this error, see Error code 429.
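
Because capacity under DSQ is shared, a 429 error can be transient. One common client-side response, shown here only as a minimal sketch and not as a documented requirement of DSQ, is to retry the request with exponential backoff and jitter. The names `ResourceExhaustedError`, `call_with_backoff`, and `send_request` are hypothetical placeholders; substitute the exception type and request call that your client library actually provides.

```python
import random
import time


class ResourceExhaustedError(Exception):
    """Hypothetical stand-in for whatever your client raises on HTTP 429."""


def call_with_backoff(send_request, max_attempts=5, base_delay=1.0,
                      max_delay=32.0, sleep=time.sleep):
    """Call send_request(), retrying on 429-style errors with backoff.

    send_request is any zero-argument callable (an assumption of this
    sketch). The sleep parameter is injectable so the function can be
    tested without real waiting.
    """
    for attempt in range(max_attempts):
        try:
            return send_request()
        except ResourceExhaustedError:
            # Give up after the final attempt and surface the error.
            if attempt == max_attempts - 1:
                raise
            # Upper bound doubles each attempt, capped at max_delay.
            upper = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: wait a random time in [0, upper] so that many
            # clients hitting the same shared capacity don't retry in sync.
            sleep(random.uniform(0, upper))
```

For example, `call_with_backoff(lambda: client_call(prompt))` would wrap a single request, where `client_call` is again a placeholder for your own code. Retrying does not guarantee capacity; workloads that need predictable service levels are the case for Provisioned Throughput described earlier on this page.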

What's next