Quota Policy

Quota allocation

Quota is granted differently depending on the TPU version you are using. For TPU v2 and v3, quota is defined in terms of Cloud TPU cores. A single Cloud TPU device comprises 4 TPU chips and 8 cores (2 cores per TPU chip). TPU v2 and v3 have separate quotas for single devices and for TPU Pods, so you cannot use a v2 or v3 TPU Pod quota for single devices. For example, if you have quota for a v3-16 Pod slice, you cannot use it to create two v3-8 single devices.

For TPU v4, quota is defined solely in terms of Cloud TPU chips. All TPU v4s are treated as Pod slices, so there is no concept of a single TPU device. You can divide your v4 quota among Pod slices in any way you wish. For example, if you have quota for a v4-32 Pod slice, you can use this quota to create four v4-8 Pod slices.
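The two rules above can be sketched as a small Python check. This is an illustration only, not an official tool: the `Quota` class and the `can_create` function are hypothetical names, and sizes are the numbers that appear in the type name (16 for v3-16, 32 for v4-32).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quota:
    """Hypothetical model of a quota grant, for illustration only."""
    version: str  # "v2", "v3", or "v4"
    kind: str     # "single" or "pod"; v4 quota is always "pod"
    size: int     # number from the type name, e.g. 16 for v3-16


def can_create(quota: Quota, kind: str, size: int, count: int = 1) -> bool:
    """Return True if `count` TPUs of the given kind and size fit in `quota`."""
    if quota.version in ("v2", "v3") and quota.kind != kind:
        # v2/v3 Pod quota and single-device quota are separate.
        return False
    return size * count <= quota.size


# A v3-16 Pod slice quota cannot be used for two v3-8 single devices.
print(can_create(Quota("v3", "pod", 16), kind="single", size=8, count=2))

# A v4-32 Pod slice quota can be used for four v4-8 Pod slices.
print(can_create(Quota("v4", "pod", 32), kind="pod", size=8, count=4))
```

The first call prints False and the second prints True, matching the two examples in the text.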

  • A Cloud TPU v2 Pod consists of 64 TPU devices containing 256 TPU chips (512 cores).
  • A Cloud TPU v3 Pod consists of 256 TPU devices containing 1024 TPU chips (2048 cores).
  • A Cloud TPU v4 Pod consists of 1024 TPU devices containing 4096 TPU chips (8192 cores).
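The chip and core counts in this list follow from the device counts and the ratios stated earlier (4 chips per device, 2 cores per chip). A minimal Python sketch of that arithmetic follows; the constant and function names are hypothetical, and applying the same ratios to v4 is an assumption taken from the figures in the list itself.

```python
# Ratios stated in the text: 4 chips per device, 2 cores per chip.
CHIPS_PER_DEVICE = 4
CORES_PER_CHIP = 2

# Devices per full Pod, as given in the list above.
POD_DEVICES = {"v2": 64, "v3": 256, "v4": 1024}


def pod_size(version: str) -> tuple:
    """Return (devices, chips, cores) for a full Pod of the given version."""
    devices = POD_DEVICES[version]
    chips = devices * CHIPS_PER_DEVICE
    cores = chips * CORES_PER_CHIP
    return devices, chips, cores


for version in POD_DEVICES:
    devices, chips, cores = pod_size(version)
    print(f"{version} Pod: {devices} devices, {chips} chips, {cores} cores")
```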

Quota for a particular Cloud TPU is expressed as a number of cores. For example, a quota of 8 permits the use of up to 8 cores, a quota of 16 permits the use of up to 16 cores, and so forth.

The notation version-cores, for example v2-8, indicates the Cloud TPU version and the number of cores. Because the number of cores is also used to specify quota, this notation also describes a Cloud TPU quota allocation. For example, v2-32 indicates a TPU v2 type with 32 cores.
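As a rough illustration of the version-cores notation, the following Python sketch splits a type name into its version and core count and compares the core count against a quota. The helpers `parse_tpu_type` and `fits_quota` are hypothetical and are not part of any Cloud TPU API.

```python
import re
from typing import Tuple

# Matches names such as "v2-8" or "v3-32": a version, a hyphen, a core count.
_TPU_TYPE = re.compile(r"^(v\d+)-(\d+)$")


def parse_tpu_type(tpu_type: str) -> Tuple[str, int]:
    """Split a name like 'v2-8' into ('v2', 8)."""
    match = _TPU_TYPE.match(tpu_type)
    if match is None:
        raise ValueError(f"not in version-cores form: {tpu_type!r}")
    return match.group(1), int(match.group(2))


def fits_quota(tpu_type: str, quota_cores: int) -> bool:
    """Return True if the cores named by tpu_type fit within quota_cores."""
    _, cores = parse_tpu_type(tpu_type)
    return cores <= quota_cores


print(parse_tpu_type("v2-32"))   # version "v2", 32 cores
print(fits_quota("v2-8", 16))    # 8 cores fit in a quota of 16
print(fits_quota("v2-32", 16))   # 32 cores do not
```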

When you create a new Google Cloud project, Cloud TPU allocates a default quota to the project.

TPU v4 quota

To receive quota for TPU v4, please contact your sales representative or fill in this sign-up form.

Quota for single device v2 and v3 TPU types

For single device TPU types, there are separate quotas for on-demand TPU cores and for preemptible TPU cores.

  • On-demand TPUs: Default quota is 16 cores (2 TPU devices).
  • Preemptible TPUs: Default quota is at least 48 cores (6 TPU devices).
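The device counts in parentheses follow from 8 cores per single device. A small Python sketch of that conversion is shown below; the names are hypothetical, and the preemptible figure is a lower bound because the default is stated as "at least" 48 cores.

```python
# A single v2 or v3 device has 8 cores (4 chips, 2 cores per chip).
CORES_PER_DEVICE = 8

# Default single-device quotas in cores, as listed above.
# The preemptible value is a minimum, so the result is a lower bound.
DEFAULT_QUOTA_CORES = {"on-demand": 16, "preemptible": 48}


def max_single_devices(kind: str) -> int:
    """Return how many single devices the default quota of this kind covers."""
    return DEFAULT_QUOTA_CORES[kind] // CORES_PER_DEVICE


for kind in DEFAULT_QUOTA_CORES:
    print(f"{kind}: {max_single_devices(kind)} devices")
```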

Quota for v2 and v3 TPU Pod types

The default quota for Cloud TPU Pods is 0. To use TPU Pod types, you must request evaluation quota or request additional quota.

Evaluation quota

Request access to evaluation quota so that you can test the performance of TPU Pod types. TPU nodes that you create using evaluation quota are billed in one-second increments but do not guarantee the same level of service as on-demand TPU devices or devices that you create using commitment quota. Evaluation quota persists only for a limited amount of time on your project.

Request queueing quota

There is a quota for the number of requests in the queue, shared across all TPU types. A default quota is available to all projects and additional quota may be requested.

Requesting additional quota

The quota allocated to your Google Cloud project is displayed in the Google Cloud console. If you need additional Cloud TPU quota, you can request it from the Quotas page in the Google Cloud console using the following procedure:

  1. Go to the Quotas page.

  2. In the Filter box, select Quota from the dropdown list. This opens a new Properties dropdown menu.
  3. From the Properties menu, select TPU tpu-type Pod cores per project per region, where tpu-type is the type of TPU you are using (for example, v2 or v3).

    Alternatively, you can select TPU tpu-type Pod cores per project per zone.

    For a complete list of TPU types available in each zone, see TPU types and zones.

  4. Select one or more regions or zones where you want to use Cloud TPU Pods.
  5. Click Edit Quotas.
  6. Fill out your name, email, and phone number and click Next.
  7. Enter your request to increase your quota and click Next.
  8. Submit your request.

You will receive a response from the Cloud TPU team within 1 to 2 business days of your request.