Cloud TPU pricing
v2 and v3 TPU pricing and quota are divided into two systems:
- Single device TPU type pricing for individual TPU devices that are available either on-demand or as preemptible devices. You cannot combine multiple single device TPU types to collaborate on a single workload.
- TPU Pod type pricing for clusters of TPU devices that are connected to each other over dedicated high-speed networks. These TPU types are available only if you have evaluation quota or purchase a 1-year or 3-year commitment.
With Cloud TPU v4, all configurations consist of Pod slices, so there is only one v4 pricing system.
See the TPU System Architecture documentation for architectural details and the differences between v2, v3, and v4.
Charges for Cloud TPU accrue while your TPU node is in a READY state.
You receive a bill at the end of each billing cycle that lists usage and charges for that billing cycle.
Cloud TPU v4 pricing
Cloud TPU v4 is the latest generation of Google’s custom silicon for machine learning and is now available in Preview. It retains backwards compatibility with Cloud TPU v2 and v3, but delivers more than twice the raw compute performance per chip of Cloud TPU v3. Each TPU v4 chip also contains a single logical core, so one program can use the full 32 GiB of memory, compared to 8 GiB on v2 and 16 GiB on v3. Cloud TPU v4 Pod slices are connected with a custom interconnect that uses a 3D mesh topology, an upgrade from the 2D mesh in v2 and v3, and are available in configurations ranging from four chips (one TPU VM) to thousands of chips.
TPU v4 Pods are available in us-central2-b, Google's datacenter operating at 90% carbon-free energy on an hourly basis within the same grid.
Use the Cloud TPU v4 sign up form to learn more about Cloud TPU v4 Pods and to get pre-GA access.
The following table shows the pricing in place for Cloud TPU v4 configurations. The v4 pricing is based on the number of chips in the topology. There are 2 cores in each chip.
TPU v4 Pricing | Price per chip-hour | % discount from on-demand |
---|---|---|
on-demand / evaluation | $3.22 | |
1Y CUD | $2.03 | 37% |
3Y CUD | $1.45 | 55% |
preemptible | $0.97 | 70% |
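Because v4 pricing is per chip-hour, the cost of a slice is the chip count times the hours used times the rate for the chosen plan. The following is a minimal sketch of that arithmetic using the rates from the table above; the function names and plan keys are hypothetical and are not part of any Cloud TPU API, and the result ignores taxes, currency conversion, and any other charges.

```python
# Per-chip-hour rates in USD, copied from the TPU v4 pricing table above.
# The dictionary keys are illustrative labels, not official SKU names.
V4_PRICE_PER_CHIP_HOUR = {
    "on-demand": 3.22,
    "1y-cud": 2.03,
    "3y-cud": 1.45,
    "preemptible": 0.97,
}


def v4_slice_cost(chips: int, hours: float, plan: str = "on-demand") -> float:
    """Estimate the USD cost of running `chips` v4 chips for `hours` hours."""
    if chips <= 0 or hours < 0:
        raise ValueError("chips must be positive and hours must be non-negative")
    return round(chips * hours * V4_PRICE_PER_CHIP_HOUR[plan], 2)


def discount_from_on_demand(plan: str) -> int:
    """Return the discount relative to on-demand, rounded to a whole percent."""
    on_demand = V4_PRICE_PER_CHIP_HOUR["on-demand"]
    return round(100 * (1 - V4_PRICE_PER_CHIP_HOUR[plan] / on_demand))


if __name__ == "__main__":
    # Smallest configuration mentioned in the text: four chips (one TPU VM).
    for plan in V4_PRICE_PER_CHIP_HOUR:
        print(plan, v4_slice_cost(chips=4, hours=10, plan=plan),
              f"{discount_from_on_demand(plan)}% off")
```

The discount helper reproduces the percentages listed in the table (37%, 55%, and 70%) from the per-chip-hour rates.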
Cloud TPU v3 and Cloud TPU v4 features and price comparison
Feature | Cloud TPU v3 Pod | Cloud TPU v4 Pod |
---|---|---|
Key specifications | | |
Peak compute per chip | 123 teraflops (bf16) | 275 teraflops (bf16 or int8) |
HBM2 capacity and bandwidth | 32 GiB, 900 GB/s | 32 GiB, 1200 GB/s |
Measured min/mean/max power | 123/220/262 W | 90/170/192 W |
TPU Pod size | 1024 chips | 4096 chips |
Interconnect topology | 2D torus | 3D torus |
Peak compute per Pod | 126 petaflops (bf16) | 1.1 exaflops (bf16 or int8) |
All-reduce bandwidth per Pod | 340 TB/s | 1.1 PB/s |
Bisection bandwidth per Pod | 6.4 TB/s | 24 TB/s |
Pricing per chip-hour | | |
Evaluation | $2.00 | $3.22 |
1Y CUD (37%) | $1.26 | $2.03 |
3Y CUD (55%) | $0.90 | $1.45 |
Preemptible | $0.60 | $0.97 |
Pricing comparison notes
- Cloud TPU v4 Pod pricing is shown for us-central2-b location.
- Cloud TPU v3 Pod pricing is shown for us-east1-d location.
- Each TPU v3 chip has two cores. The pricing per-chip is shown for comparative purposes.
- CUD stands for "committed use discount".
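One way to read the comparison table is to relate price to peak compute. The sketch below derives the per-Pod peak figures from the per-chip figures and computes the evaluation price per peak petaflop-hour for each generation. The names are hypothetical, and the numbers are peak specifications from the table, not measured throughput, so treat the result as a rough indicator only.

```python
# Figures copied from the v3/v4 comparison table above.
# "eval_price" is the evaluation price in USD per chip-hour.
SPECS = {
    "v3": {"peak_tflops_per_chip": 123, "pod_chips": 1024, "eval_price": 2.00},
    "v4": {"peak_tflops_per_chip": 275, "pod_chips": 4096, "eval_price": 3.22},
}


def pod_peak_petaflops(version: str) -> float:
    """Peak compute of a full Pod in petaflops (chips x teraflops per chip / 1000)."""
    spec = SPECS[version]
    return spec["pod_chips"] * spec["peak_tflops_per_chip"] / 1000


def usd_per_peak_petaflop_hour(version: str) -> float:
    """Evaluation price of one petaflop of peak compute for one hour."""
    spec = SPECS[version]
    return spec["eval_price"] / (spec["peak_tflops_per_chip"] / 1000)


if __name__ == "__main__":
    for version in SPECS:
        print(version,
              f"{pod_peak_petaflops(version):.1f} PFLOPS per Pod,",
              f"${usd_per_peak_petaflop_hour(version):.2f} per peak PFLOP-hour")
```

Under these assumptions, a v4 chip costs more per hour than a v3 chip but less per unit of peak compute.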
How to purchase v4 quota
Contact your sales team or fill in this order form.
Single device pricing
Single device TPU types are billed in one-second increments and are available at either an on-demand or preemptible price.
Single device TPU types are independent TPU devices without direct network connections to other TPU devices in a Google data center. If your workloads require more TPU cores and a larger pool of memory, use a TPU Pod type instead.
A preemptible TPU is one that Cloud TPU can shut down (preempt) at any time if Cloud TPU requires access to the resources for another task. The charges for a preemptible TPU are much lower than those for an on-demand TPU. You are not charged for preemptible TPUs if they are preempted in the first minute after you create them.
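The billing rules described above (one-second increments, and no charge for a preemptible TPU that is preempted in its first minute) can be sketched as follows. The function name is hypothetical and the hourly rate is a placeholder you supply from the price list. The sketch also assumes that "the first minute" means fewer than 60 seconds of runtime, which the text does not spell out.

```python
def single_device_charge(hourly_rate_usd: float,
                         seconds_running: int,
                         preemptible: bool = False,
                         was_preempted: bool = False) -> float:
    """Estimate the USD charge for one single device TPU.

    hourly_rate_usd is a placeholder rate taken from the price list.
    seconds_running is the whole number of seconds the node was billable.
    """
    if hourly_rate_usd < 0 or seconds_running < 0:
        raise ValueError("rate and runtime must be non-negative")
    # Assumption: "preempted in the first minute" means under 60 seconds.
    if preemptible and was_preempted and seconds_running < 60:
        return 0.0
    # One-second increments: charge the hourly rate pro rata per second.
    return hourly_rate_usd * seconds_running / 3600


if __name__ == "__main__":
    rate = 1.00  # placeholder hourly rate in USD, not an actual list price
    print(single_device_charge(rate, 45, preemptible=True, was_preempted=True))
    print(single_device_charge(rate, 5400))  # 90 minutes of on-demand use
```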
You can configure your TPU nodes with the following single device TPU types:
If you pay in a currency other than USD, the prices listed in your currency on Cloud Platform SKUs apply.
TPU Pod type pricing
TPU Pod types provide access to multiple TPU devices that are all connected on a dedicated high-speed network. These TPU types provide greater compute capacity and a larger pool of TPU memory to a single TPU node. To use TPU Pod types, you must request quota using one of the following options:
- Request access to evaluation quota so that you can test the performance of TPU Pod types. TPU nodes that you create using evaluation quota are billed in one-second increments but do not guarantee the same level of service as on-demand TPU devices or devices that you create using commitment quota. Evaluation quota persists only for a limited amount of time on your project.
- Purchase a 1-year or 3-year commitment and create TPU nodes with up to 2048 cores. Commitments are not billed incrementally. A commitment gives you access to reserved cores at all hours of the day, month after month, for the duration of the contract. You are billed a monthly fee for the duration of your commitment term even if you do not use any TPU resources.
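Because a commitment is billed whether or not the reserved cores are used, it only pays off above a certain utilization. The sketch below estimates that break-even point by comparing a committed per-chip-hour rate with a pay-as-you-go rate, using the v3 Pod figures from the comparison table ($1.26 for a 1-year commitment versus $2.00 for evaluation). The function names are hypothetical, and the 730-hour month is an assumed average, not a documented billing rule.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month


def monthly_commitment_fee(chips: int, committed_rate_usd: float) -> float:
    """Flat monthly fee for reserving `chips` chips, used or not."""
    return chips * committed_rate_usd * HOURS_PER_MONTH


def break_even_utilization(committed_rate_usd: float,
                           pay_as_you_go_rate_usd: float) -> float:
    """Fraction of hours you must use for the commitment to cost less."""
    if pay_as_you_go_rate_usd <= 0:
        raise ValueError("pay-as-you-go rate must be positive")
    return committed_rate_usd / pay_as_you_go_rate_usd


if __name__ == "__main__":
    # v3 Pod rates per chip-hour from the comparison table.
    one_year_rate, evaluation_rate = 1.26, 2.00
    print(f"Monthly fee for 32 chips: ${monthly_commitment_fee(32, one_year_rate):,.2f}")
    print(f"Break-even utilization: {break_even_utilization(one_year_rate, evaluation_rate):.0%}")
```

With these rates, the 1-year commitment is cheaper only if the reserved chips are busy for more than about 63% of the hours in the term. Evaluation quota is time-limited, so this comparison is illustrative rather than a choice between two long-term options.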
You can configure your TPU nodes with the following TPU types:
If you pay in a currency other than USD, the prices listed in your currency on Cloud Platform SKUs apply.
To learn about the differences between different TPU versions and configurations, read the TPU System Architecture documentation.
What's next
- Read the Cloud TPU quota policy page to learn how to request quota for different TPU types.
- Estimate the cost of using Cloud TPU with the Compute Engine pricing calculator.
- Check the regions and zones in which Cloud TPU is available.
- Check the release notes for future updates on pricing.
- Read the Cloud TPU documentation.
- Get started with Cloud TPU.
- Learn about Cloud TPU solutions and use cases.
- If you're enrolled in the TRC program, you are granted access to Cloud TPU v2 and v3 free of charge for a limited period of time.