Cloud TPU quotas

This document lists the quotas that apply to Cloud TPU. For information about Cloud TPU pricing, see Cloud TPU pricing.

Google Cloud uses quotas to help ensure fairness and reduce spikes in resource use and availability. A quota restricts how much of a Google Cloud resource your Google Cloud project can use. Quotas apply to a range of resource types, including hardware, software, and network components. For example, quotas can restrict the number of API calls to a service, the number of load balancers used concurrently by your project, or the number of projects that you can create. Quotas protect the community of Google Cloud users by preventing the overloading of services. Quotas also help you to manage your own Google Cloud resources.

The Cloud Quotas system does the following:

  • Monitors your consumption of Google Cloud products and services
  • Restricts your consumption of those resources
  • Provides a way to request changes to the quota value

In most cases, when you attempt to consume more of a resource than its quota allows, the system blocks access to the resource, and the task that you're trying to perform fails.

Quotas generally apply at the Google Cloud project level. Your use of a resource in one project doesn't affect your available quota in another project. Within a Google Cloud project, quotas are shared across all applications and IP addresses.

TPU quota

TPU quotas are limits on the number of Cloud TPU cores you can use with a Google Cloud project. Each version of TPU is associated with its own quota. In addition, each Cloud TPU version quota is divided into on-demand quota and preemptible (or spot) quota.

By default, the Cloud TPU resources you create are on-demand resources. To create preemptible resources, pass the --spot flag when you create resources with the gcloud command. For more information, see Manage TPU resources.
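
As a rough illustration of the difference between the two request types, the following Python sketch builds the argument list for a gcloud create command, with and without --spot. Only the --spot flag comes from this document. The subcommand (gcloud compute tpus tpu-vm create), the other flags, and all example values (the name my-tpu, the zone, the accelerator type, and the runtime version) are assumptions made for illustration; check them against gcloud compute tpus tpu-vm create --help before use.

```python
import shlex


def build_tpu_create_command(name, zone, accelerator_type, runtime_version, spot=False):
    """Return an argument list for creating a TPU VM with gcloud.

    Assumption: the subcommand and the --zone, --accelerator-type, and
    --version flags are not taken from this document; only --spot is.
    """
    cmd = [
        "gcloud", "compute", "tpus", "tpu-vm", "create", name,
        f"--zone={zone}",
        f"--accelerator-type={accelerator_type}",
        f"--version={runtime_version}",
    ]
    if spot:
        # Without this flag the request counts against on-demand quota.
        cmd.append("--spot")
    return cmd


# Hypothetical example values, for illustration only.
on_demand = build_tpu_create_command(
    "my-tpu", "us-west4-a", "v5litepod-8", "v2-alpha-tpuv5-lite")
preemptible = build_tpu_create_command(
    "my-tpu", "us-west4-a", "v5litepod-8", "v2-alpha-tpuv5-lite", spot=True)

print(shlex.join(on_demand))
print(shlex.join(preemptible))
```

The function only builds the command. To run it, you could pass the list to subprocess.run in an environment where gcloud is installed and authenticated.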

Default Cloud TPU quotas

The following tables show the default preemptible and on-demand quota values for each available zone. Each value is the maximum number of TPU cores of that version that your project can use in a single zone.

Preemptible quota

TPU version | Default quota (number of cores) | Quota name
v6e | 64 cores | Preemptible TPU v6e cores per project per zone
v5p | 768 cores | Preemptible TPU v5p cores per project per zone
v5e | 64 cores | Preemptible TPU v5 lite pod cores per project per zone
v4 | 0 cores | Preemptible TPU v4 pod cores per project per zone
v3 Pod | 32 cores | Preemptible TPU v3 pod cores per project per zone
v3 | 120 cores | Preemptible TPU v3 cores per project per zone
v2 Pod | 32 cores | Preemptible TPU v2 pod cores per project per zone
v2 | 120 cores | Preemptible TPU v2 cores per project per zone

On-demand quota

TPU version | Default quota (number of cores) | Quota name
v6e | 32 cores | TPU v6e cores per project per zone
v5p | 128 cores | TPU v5p cores per project per zone
v5e | 32 cores | TPU v5 lite pod cores per project per zone
v4 | 0 cores | TPU v4 pod cores per project per zone
v3 Pod | 32 cores | TPU v3 pod cores per project per zone
v3 | 40 cores | TPU v3 cores per project per zone
v2 Pod | 32 cores | TPU v2 pod cores per project per zone
v2 | 40 cores | TPU v2 cores per project per zone
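
To make the default values concrete, here is a minimal Python sketch that checks whether a request fits within the default quota for one zone. The core counts are copied from the tables above. The dictionary, the function name fits_default_quota, and the cores_in_use parameter are hypothetical helpers, not part of any Google Cloud API, and the quota actually assigned to your project may differ from these defaults.

```python
# Default per-project, per-zone quota in TPU cores, copied from the tables above.
DEFAULT_QUOTA_CORES = {
    "preemptible": {
        "v6e": 64, "v5p": 768, "v5e": 64, "v4": 0,
        "v3 Pod": 32, "v3": 120, "v2 Pod": 32, "v2": 120,
    },
    "on-demand": {
        "v6e": 32, "v5p": 128, "v5e": 32, "v4": 0,
        "v3 Pod": 32, "v3": 40, "v2 Pod": 32, "v2": 40,
    },
}


def fits_default_quota(version, cores_requested, cores_in_use=0, spot=False):
    """Return True if the request stays within the default quota for one zone.

    cores_in_use is the number of cores of the same version and quota type
    that the project already uses in that zone (a hypothetical input that
    you would supply yourself).
    """
    kind = "preemptible" if spot else "on-demand"
    limit = DEFAULT_QUOTA_CORES[kind][version]
    return cores_in_use + cores_requested <= limit


print(fits_default_quota("v5e", 32))              # on-demand v5e, limit 32
print(fits_default_quota("v5e", 64))              # exceeds the on-demand limit
print(fits_default_quota("v5e", 64, spot=True))   # preemptible v5e, limit 64
```

Because each TPU version and each quota type has its own limit, using on-demand v5e cores does not reduce the preemptible v5e quota, which is why the sketch keeps the two tables separate.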

View and request additional quota

You can view the quota allocated for your Google Cloud project on the Quotas page in the Google Cloud console.

You can request additional Cloud TPU quota from the Quotas page. Find the quota you want to increase, click the three vertical dots, and choose Edit quota. For more information, see Request a higher quota limit. If you request a quota value below the auto-approve threshold, your request is approved automatically.

Preemptible quota

TPU version | Auto-approve threshold | Quota name
v6e | All zones: 0 cores | Preemptible TPU v6e cores per project per zone
v5p | All zones: 0 cores | Preemptible TPU v5p cores per project per zone
v5e | us-east5-b: 800 cores; us-west4-a: 1600 cores; us-west4-b: 3968 cores; us-west1-c: 576 cores; us-central1-a: 3264 cores; europe-west4-a: 4032 cores | Preemptible TPU v5 lite pod cores per project per zone
v4 | All zones: 0 cores | Preemptible TPU v4 pod cores per project per zone
v3 Pod | europe-west4-a: 512 cores; us-east1-d: 320 cores | Preemptible TPU v3 pod cores per project per zone
v3 | us-central1-a: 64 cores; europe-west4-a: 128 cores | Preemptible TPU v3 cores per project per zone
v2 Pod | us-central1-a: 64 cores; europe-west4-a: 64 cores | Preemptible TPU v2 pod cores per project per zone
v2 | us-central1-b: 128 cores; us-central1-c: 64 cores; us-central1-f: 128 cores; europe-west4-a: 32 cores | Preemptible TPU v2 cores per project per zone

On-demand quota

TPU version | Auto-approve threshold | Quota name
v6e | All zones: 0 cores | TPU v6e cores per project per zone
v5p | us-east5-a: 64 cores | TPU v5p cores per project per zone
v5e | All zones: 64 cores | TPU v5 lite pod cores per project per zone
v4 | All zones: 0 cores | TPU v4 pod cores per project per zone
v3 Pod | europe-west4-a: 128 cores; us-east1-d: 64 cores | TPU v3 pod cores per project per zone
v3 | us-central1-a: 64 cores; europe-west4-a: 128 cores | TPU v3 cores per project per zone
v2 Pod | us-central1-a: 64 cores; europe-west4-a: 64 cores | TPU v2 pod cores per project per zone
v2 | us-central1-b: 128 cores; us-central1-c: 64 cores; us-central1-f: 128 cores; europe-west4-a: 32 cores | TPU v2 cores per project per zone
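
The auto-approve rule can be sketched in Python as a lookup followed by a comparison. The thresholds are copied from the tables above. The names AUTO_APPROVE_CORES, auto_approve_threshold, and is_auto_approved are hypothetical. The sketch makes two assumptions that the tables do not state: "below" is read as strictly less than the threshold, and any zone that is not listed for a version has a threshold of 0.

```python
# Auto-approve thresholds in TPU cores, copied from the tables above.
# "*" stands for "All zones".
_V2_ZONES = {"us-central1-b": 128, "us-central1-c": 64,
             "us-central1-f": 128, "europe-west4-a": 32}

AUTO_APPROVE_CORES = {
    "preemptible": {
        "v6e": {"*": 0},
        "v5p": {"*": 0},
        "v5e": {"us-east5-b": 800, "us-west4-a": 1600, "us-west4-b": 3968,
                "us-west1-c": 576, "us-central1-a": 3264, "europe-west4-a": 4032},
        "v4": {"*": 0},
        "v3 Pod": {"europe-west4-a": 512, "us-east1-d": 320},
        "v3": {"us-central1-a": 64, "europe-west4-a": 128},
        "v2 Pod": {"us-central1-a": 64, "europe-west4-a": 64},
        "v2": dict(_V2_ZONES),
    },
    "on-demand": {
        "v6e": {"*": 0},
        "v5p": {"us-east5-a": 64},
        "v5e": {"*": 64},
        "v4": {"*": 0},
        "v3 Pod": {"europe-west4-a": 128, "us-east1-d": 64},
        "v3": {"us-central1-a": 64, "europe-west4-a": 128},
        "v2 Pod": {"us-central1-a": 64, "europe-west4-a": 64},
        "v2": dict(_V2_ZONES),
    },
}


def auto_approve_threshold(version, zone, spot=False):
    """Look up the threshold for a version and zone.

    Assumption: zones that are not listed fall back to the "All zones"
    entry if there is one, and otherwise to 0.
    """
    kind = "preemptible" if spot else "on-demand"
    zones = AUTO_APPROVE_CORES[kind].get(version, {})
    return zones.get(zone, zones.get("*", 0))


def is_auto_approved(version, zone, requested_quota, spot=False):
    """Assumption: a request qualifies only when strictly below the threshold."""
    return requested_quota < auto_approve_threshold(version, zone, spot=spot)


print(is_auto_approved("v5e", "us-west4-a", 1024, spot=True))
print(is_auto_approved("v6e", "us-east5-b", 8))
```

A threshold of 0 cores means that, under this reading, no request for that version is approved automatically.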

You will receive an email stating whether your quota request is approved or denied. Google Cloud applies service quota increases gradually, which can result in ongoing rollouts across different regions or resources. During a rollout, the quota value that appears in the Google Cloud console or the Cloud Quotas API doesn't reflect the new, increased value until the rollout completes. For more information, see View ongoing rollouts.