Cloud TPU quotas
This document lists the quotas that apply to Cloud TPU. For information about Cloud TPU pricing, see Cloud TPU pricing.
A quota restricts how much of a shared Google Cloud resource your Google Cloud project can use, including hardware, software, and network components. Quotas are part of a system that does the following:
- Monitors your use or consumption of Google Cloud products and services.
- Restricts your consumption of those resources, for reasons that include ensuring fairness and reducing spikes in usage.
- Maintains configurations that automatically enforce prescribed restrictions.
- Provides a means to request or make changes to the quota.
In most cases, when a quota is exceeded, the system immediately blocks access to the relevant Google resource, and the task that you're trying to perform fails. Quotas generally apply to each Google Cloud project and are shared across all applications and IP addresses that use that project.
TPU quota
Each TPU version has its own quotas. For example, TPU v2 and TPU v3 have separate quotas. Within each version, there are two types of quota: on-demand and preemptible (spot). The following table describes each type.
Quota type | Description | Default value | How to request | Flags for TPU creation |
---|---|---|---|---|
On-demand | The number of on-demand resources for which you have access. On-demand resources won't be preempted, but on-demand quota does not guarantee there will be enough available Cloud TPU resources to satisfy your request. | v3-8 and v2-8: 16 TensorCores. All others: 0 | See Request additional quota. | No flags needed, selected by default. |
Preemptible | The number of preemptible Cloud TPU resources for which you have access. This quota applies to both preemptible TPUs and TPU Spot VMs. Preemptible resources may be preempted to make room for higher priority jobs. Preemptible quota does not guarantee there will be enough available Cloud TPU resources to satisfy your request. For more information, see Preemptible TPUs and Manage TPU Spot VMs. | v3-8 and v2-8: 48 TensorCores. All others: 0 | See Request additional quota. | |
TPU quotas are specified in terms of TPU cores per project per zone or TPU cores per project per region.
TPU v5p quotas
You can use your TPU v5p quota in any combination of cores. For example, if you have quota for 32 cores, you can use this quota to create four TPU slices each with 8 cores.
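The arithmetic behind this can be sketched in a few lines of Python. This is only an illustration, not part of any Cloud TPU tooling: the function names `fits_quota` and `remaining_cores` are hypothetical, and the sketch assumes that quota is consumed simply as the sum of the cores in each slice. It does not check whether a given slice size is a valid TPU topology.

```python
def fits_quota(quota_cores: int, slice_cores: list[int]) -> bool:
    """Return True if the requested slices together stay within the core quota.

    Hypothetical helper: assumes quota usage is the plain sum of slice cores.
    """
    if any(cores <= 0 for cores in slice_cores):
        raise ValueError("each slice must request a positive number of cores")
    return sum(slice_cores) <= quota_cores


def remaining_cores(quota_cores: int, slice_cores: list[int]) -> int:
    """Return how many cores of quota are left after the requested slices."""
    return quota_cores - sum(slice_cores)


# The example from the text: a 32-core quota covers four 8-core slices.
print(fits_quota(32, [8, 8, 8, 8]))
# A fifth 8-core slice would exceed the same quota.
print(fits_quota(32, [8, 8, 8, 8, 8]))
```

Under the same assumption, a mixed request such as one 16-core slice and two 8-core slices also fits a 32-core quota, because only the total matters.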
Preemptible quotas:
- Preemptible TPU v5p cores per project per region
- Preemptible TPU v5p cores per project per zone
On-demand quotas:
- TPU v5p cores per project per region
- TPU v5p cores per project per zone
TPU v5e quotas
TPU v5e can be used for training and serving. There are separate quotas for training and for serving, and for single-host (lite cores) and multi-host (lite pod cores) configurations.
Serving quotas
Preemptible serving quotas:
- Preemptible TPU v5 lite pod cores for serving per project per region
- Preemptible TPU v5 lite pod cores for serving per project per zone
On-demand serving quotas:
- TPU v5 lite pod cores for serving per project per region
- TPU v5 lite pod cores for serving per project per zone
Training quotas
Preemptible training quotas:
- Preemptible TPU v5 lite cores per project per region
- Preemptible TPU v5 lite cores per project per zone
- Preemptible TPU v5 lite pod cores per project per region
- Preemptible TPU v5 lite pod cores per project per zone
On-demand training quotas:
- TPU v5 lite cores per project per region
- TPU v5 lite cores per project per zone
- TPU v5 lite pod cores per project per region
- TPU v5 lite pod cores per project per zone
TPU v4 quotas
You can use your TPU v4 quota in any combination of cores. For example, if you have quota for 32 cores, you can use this quota to create four TPU slices each with 8 cores.
Preemptible quotas:
- Preemptible TPU v4 pod cores per project per region
- Preemptible TPU v4 pod cores per project per zone
On-demand quotas:
- TPU v4 pod cores per project per region
- TPU v4 pod cores per project per zone
TPU v3 quotas
There are separate TPU v3 quotas for single-host TPUs (core) and multi-host TPUs (pod). You must use v3 pod quotas to create TPUs with more than 8 cores.
Preemptible quotas:
- Preemptible TPU v3 cores per project per region
- Preemptible TPU v3 cores per project per zone
- Preemptible TPU v3 pod cores per project per region
- Preemptible TPU v3 pod cores per project per zone
On-demand quotas:
- TPU v3 cores per project per region
- TPU v3 cores per project per zone
- TPU v3 pod cores per project per region
- TPU v3 pod cores per project per zone
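As a rough illustration of the rule above, the following Python sketch picks which of the listed TPU v3 quotas a request would draw from. The function `v3_quota_name` is hypothetical and not part of any Google Cloud API. It assumes that TPUs with 8 or fewer cores use the core quotas, and it only builds the quota names exactly as they are listed in this section.

```python
def v3_quota_name(cores: int, preemptible: bool = False, scope: str = "zone") -> str:
    """Return the name of the TPU v3 quota that a request would count against.

    Hypothetical helper. Assumption: requests with more than 8 cores use the
    pod quotas, and requests with 8 or fewer cores use the core quotas.
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    if scope not in ("zone", "region"):
        raise ValueError("scope must be 'zone' or 'region'")
    kind = "pod cores" if cores > 8 else "cores"
    prefix = "Preemptible " if preemptible else ""
    return f"{prefix}TPU v3 {kind} per project per {scope}"


# A single-host, on-demand request counted per zone.
print(v3_quota_name(8))
# A multi-host, preemptible request counted per region.
print(v3_quota_name(32, preemptible=True, scope="region"))
```

The returned strings are meant to match the entries in the lists above, which can help when looking for the right quota to raise.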
TPU v2 quotas
There are separate TPU v2 quotas for single-host TPUs (core) and multi-host TPUs (pod).
Preemptible quotas:
- Preemptible TPU v2 cores per project per region
- Preemptible TPU v2 cores per project per zone
- Preemptible TPU v2 pod cores per project per region
- Preemptible TPU v2 pod cores per project per zone
On-demand quotas:
- TPU v2 cores per project per region
- TPU v2 cores per project per zone
- TPU v2 pod cores per project per region
- TPU v2 pod cores per project per zone
For more information about TPU chips and TensorCores, see TPU System architecture.
View and request additional quota
You can view the quota allocated for your Google Cloud project on the Quotas page in the Google Cloud console. If you need additional Cloud TPU quota, you can request it from the Quotas page. For more information, see Request a higher quota limit.