Quotas and limits

This document lists the quotas and system limits that apply to Dataflow.

Quotas specify the amount of a countable, shared resource that you can use. Quotas are defined by Google Cloud services such as Dataflow.
System limits are fixed values that cannot be changed.

Google Cloud uses quotas to help ensure fairness and reduce spikes in resource use and availability. A quota restricts how much of a Google Cloud resource your Google Cloud project can use. Quotas apply to a range of resource types, including hardware, software, and network components. For example, quotas can restrict the number of API calls to a service, the number of load balancers used concurrently by your project, or the number of projects that you can create. Quotas protect the community of Google Cloud users by preventing the overloading of services. Quotas also help you to manage your own Google Cloud resources.

The Cloud Quotas system does the following:

Monitors your consumption of Google Cloud products and services
Restricts your consumption of those resources
Provides a way to request changes to the quota value

In most cases, when you attempt to consume more of a resource than its quota allows, the system blocks access to the resource, and the task that you're trying to perform fails.

Quotas generally apply at the Google Cloud project level. Your use of a resource in one project doesn't affect your available quota in another project. Within a Google Cloud project, quotas are shared across all applications and IP addresses.

To adjust most quotas, use the Google Cloud console. For more information, see Request a quota adjustment.

There are also system limits on Dataflow resources. System limits can't be changed.

The Dataflow managed service has the following quotas and limits:

Each Google Cloud project can make up to 3,000,000 requests per minute.
Each Dataflow job can use a maximum of 2,000 Compute Engine instances. Without specifying a worker zone, each streaming job using Streaming Engine or batch job using service-based Dataflow Shuffle can use a maximum of 4,000 Compute Engine instances.
Each Google Cloud project can run at most 25 concurrent Dataflow jobs by default.
Each Dataflow worker has a maximum limit of logs that it can output in a time interval. See logging documentation for the exact limit.
If you opt-in to organization level quotas, each organization can run at most 125 concurrent Dataflow jobs by default.
Each user can make up to 15,000 monitoring requests per minute.
Each user can make up to 60 job creation requests per minute.
Each user can make up to 60 job template requests per minute.
Each user can make up to 60 job update requests per minute.
Each Google Cloud project gets the following shuffle slots in each region:
- asia-east1: 48 slots
- asia-northeast1: 24 slots
- asia-northeast3: 32 slots
- asia-south1: 64 slots
- asia-southeast1: 64 slots
- australia-southeast1: 24 slots
- europe-west1: 640 slots
- europe-west2: 32 slots
- europe-west3: 40 slots
- europe-west4: 512 slots
- northamerica-northeast1: 512 slots
- us-central1: 640 slots
- us-east1: 640 slots
- us-east4: 64 slots
- us-west1: 384 slots
- us-west2: 24 slots
- us-west3: 24 slots
- others: 16 slots
16 slots are sufficient to shuffle approximately 10 TB of data concurrently.
Dataflow batch jobs will be cancelled after 10 days.

Compute Engine quotas

When you run your pipeline on the Dataflow service, Dataflow creates Compute Engine instances to run your pipeline code.

Compute Engine quota is specified per region. Review your project's Compute Engine quota and request the following adjustments if needed:

CPUs: In the following regions, the default machine types for Dataflow are n1-standard-1 for batch, n1-standard-2 for jobs that use Streaming Engine, n1-standard-4 for streaming jobs that don't use Streaming Engine, and n1-standard-2 for jobs that use Flexible Resource Scheduling (FlexRS). FlexRS uses 90% preemptible VMs and 10% regular VMs.
- asia-east1
- asia-east2
- asia-northeast1
- asia-northeast2
- asia-northeast3
- asia-south1
- asia-south2
- asia-southeast1
- asia-southeast2
- australia-southeast1
- australia-southeast2
- europe-central2
- europe-north1
- europe-west1
- europe-west2
- europe-west3
- europe-west4
- europe-west5
- europe-west6
- northamerica-northeast1
- northamerica-northeast2
- southamerica-east1
- us-central1
- us-central2
- us-east1
- us-east4
- us-west1
- us-west2
- us-west3
- us-west4
For other regions, the default machine types are e2-standard-2 for batch, e2-standard-2 for jobs that use Streaming Engine, e2-standard-4 for streaming jobs that don't use Streaming Engine, and e2-standard-2 for jobs that use FlexRS.

Compute Engine calculates the number of CPUs by summing each instance's total CPU count. For example, running 10 n1-standard-4 instances counts as 40 CPUs. For a mapping of machine types to CPU count, see Compute Engine machine types.
In-Use IP Addresses: The number of in-use IP addresses in your project must be sufficient to accommodate the desired number of instances. To use 10 Compute Engine instances, you'll need 10 in-use IP addresses.
Persistent Disk: Dataflow attaches Persistent Disk to each instance.
- The default disk size is 250 GB for batch and 400 GB for streaming pipelines. For 10 instances, by default you need 2,500 GB of Persistent Disk for a batch job.
- The default disk size is 25 GB for Dataflow Shuffle batch pipelines.
- The default disk size is 30 GB for Streaming Engine streaming pipelines.
- The Dataflow service is currently limited to 15 persistent disks per worker instance when running a streaming job. Each persistent disk is local to an individual Compute Engine virtual machine. A 1:1 ratio between workers and disks is the minimum resource allotment.
- Compute Engine usage is based on the average number of workers, whereas Persistent Disk usage is based on the exact value of --maxNumWorkers. Persistent Disks are redistributed such that each worker has an equal number of attached disks.
Regional Managed Instance Groups: Dataflow deploys your Compute Engine instances as a Regional Managed Instance Group. You'll need to ensure you have the following related quota available:
- One Instance Group per Dataflow job
- One Instance Template per Dataflow job
- One Regional Managed Instance Group per Dataflow job
If Managed Instance Groups is missing for a streaming job for more than 7 days, the job is cancelled.
If Managed Instance Groups is missing for a batch job for more than 1 hour, the job is cancelled.

Note: If you're using a Free Trial project, your project has a maximum of 8 cores available. You must specify a combination of numWorkers, workerMachineType, and maxNumWorkers that fits within your trial limit.

Additional quotas

Depending on which sources and sinks you are using, you might also need additional quota.

Pub/Sub: If you are using Pub/Sub, you might need additional quota. When planning for quota, note that processing 1 message from Pub/Sub involves 3 operations. If you use custom timestamps, you should double your expected number of operations, since Dataflow will create a separate subscription to track custom timestamps.
BigQuery: If you are using the streaming API for BigQuery, quota limits and other restrictions apply.

Find and increase quotas

You can check your current usage of Dataflow-specific quota:

In the Google Cloud console, go to the APIs & services.
Go to API & Services
To check your current Shuffle slots quota usage, on the Quotas tab, find the Shuffle slots line in the table, and in the Usage Chart column, click Show usage chart.

If you want to increase your job quota, contact Google Cloud Support, and we will increase the limit to a value that better suits your needs. The default quota is 25 concurrent Dataflow jobs for your project or 125 concurrent Dataflow jobs for your organization.

Additionally, you can increase your Shuffle slots quota for batch jobs by submitting a support request and specifying the expected maximum concurrent Shuffle dataset size for all jobs in your project. Prior to requesting additional Shuffle quota, please run your pipeline using Dataflow Shuffle and check the actual Shuffle quota usage.

For streaming jobs, you can increase your Streaming Engine throughput by submitting a support request to Google Cloud Platform Support. In your request, specify the maximum amount of data you want to shuffle between workers every minute for each region in which your job runs.

The Dataflow service also exercises various components of the Google Cloud, such as BigQuery, Cloud Storage, Pub/Sub, and Compute Engine. These (and other Google Cloud services) employ quotas to cap the maximum number of resources you can use within a project. When you use Dataflow, you might need to adjust your quota settings for these services.

Dataflow Prime

Quotas and limits are the same for Dataflow and Dataflow Prime. If you have quotas for Dataflow, then you don't need additional quota to run your jobs using Dataflow Prime.

Limits

This section describes practical production limits for Dataflow.

Limit	Amount
Maximum number of workers per pipeline.	2,000
Maximum size for a job creation request. Pipeline descriptions with a lot of steps and very verbose names may reach this limit.	10 MB
Maximum size for a template launch request.	1 MB
Maximum number of side input shards.	20,000
Maximum size for a single element (except where stricter conditions apply, for example Streaming Engine).	2 GB
Maximum key size in batch pipelines.	1.5 MB
Maximum number of log entries in a given time period, per worker.	15,000 messages every 30 seconds
Maximum number of custom metrics per project.	100
Length of time that recommendations will be stored.	30 days

Streaming Engine Limits	Amount
Maximum bytes for Pub/Sub messages.	7 MB
Maximum size for a single element value.	80 MB
Maximum size of a large key. Keys over 64 KB cause decreased performance.	2 MB
Maximum size for a side input.	80 MB
Maximum length for state tags used by `TagValue` and `TagBag`.	64 KB