Cloud TPU pricing
Cloud TPUs are custom supercomputers designed to run cutting-edge machine learning models on Google Cloud, with industry-leading price-performance. The exaflops of compute power can help you transform your business or create the next research breakthrough.
Learn more about how Cloud TPU v4 Pods help ML researchers and developers push the boundaries of AI in a sustainable and efficient manner.
Learn which Cloud TPU product works best for your unique project needs.
How Cloud TPU pricing works
Cloud TPU v4 Pods are the latest generation of Google's custom ML accelerators and are now available in GA. All TPU v4 Pod slice shapes use the same v4 pricing system.
v3 TPU pricing and quota, however, are divided into two systems:
- Single device TPU pricing for individual TPU devices that are available either on-demand or as preemptible devices (70% discount on the evaluation list price). Single device TPU types are independent TPU devices without direct network connections to other TPU devices in a Google data center. If your workloads require more TPU cores and a larger pool of memory, use a TPU Pod slice instead.
- TPU Pod pricing for clusters of TPU devices that are connected to each other over dedicated high-speed networks. These TPU types are available if you have an evaluation quota, preemptible quota (70% discount on the evaluation list price) or purchase a 1 year or 3 year commitment.
For more information about TPU v2, v3, and v4, see TPU System Architecture.
Charges for Cloud TPU accrue while your TPU node is in a READY state. You receive a bill at the end of each billing cycle that lists usage and charges for that billing cycle.
| Type | TPU products | Billing | Best fit for |
|---|---|---|---|
| 3-year commitment (3Y CUD) | TPU v4 Pods, TPU v3 Pods, TPU v2 Pods | Monthly, based on reserved quota | ML users who need consistent access to 512+ cores of capacity |
| 1-year commitment (1Y CUD) | TPU v4 Pods, TPU v3 Pods, TPU v2 Pods | Monthly, based on reserved quota | ML users who need consistent access to 32-512 cores of capacity |
| Evaluation (on-demand) | TPU v4 Pods, TPU v3 Pods, TPU v2 Pods, TPU v3*, TPU v2* | Hourly, based on actual usage | ML users who want to run some short-term experiments or benchmarks |
| Preemptible | TPU v4 Pods, TPU v3*, TPU v3 Pods, TPU v2*, TPU v2 Pods | Hourly, based on actual usage | ML users who want to run batch / fault-tolerant workloads |
*Single devices with 8 cores. On the SKU page, these devices are listed as Tpu-v2 or Tpu-v3 accelerators.
Chips vs Cores vs VMs
1 TPU VM (TPU Virtual Machine) has 4 chips and 8 cores. The billing in the Google Cloud console is displayed in VM-hours (for example, the on-demand price for a single Cloud TPU v4 host, which includes four TPU v4 chips, is displayed as $12.88 per hour). Usage data in the Google Cloud console is also measured in VM-hours.
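The VM-hour to chip-hour relationship above can be sketched as a small calculation. This is illustrative arithmetic only, using the $12.88/hour v4 host price quoted on this page:

```python
# A single Cloud TPU v4 host (one TPU VM) contains 4 chips, each with 2 cores.
CHIPS_PER_VM = 4
CORES_PER_CHIP = 2

def per_chip_hour(vm_hour_price: float) -> float:
    """Convert a billed VM-hour price to the equivalent chip-hour price."""
    return vm_hour_price / CHIPS_PER_VM

# The $12.88/hour on-demand v4 VM price implies the per-chip-hour rate:
print(f"${per_chip_hour(12.88):.2f} per chip-hour")  # $3.22 per chip-hour
print(f"{CHIPS_PER_VM * CORES_PER_CHIP} cores per VM")  # 8 cores per VM
```

This matches the $3.22 per-chip-hour on-demand rate listed in the v4 pricing table below.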
Free access via TRC
New customers get $300 in free credits to spend on Google Cloud.
If you are a researcher, student, tinkerer, artist, or entrepreneur, consider applying to the TPU Research Cloud program. TRC members are granted free access to a large cluster of Cloud TPUs, and they share their work with the world via peer-reviewed publications, open source code, blog posts, videos, and other media. (Here are examples of TRC-supported publications.)
Apply to accelerate your research today!
Cloud TPU v4 pricing
Cloud TPU v4 is the latest generation of Google's custom machine learning accelerators and is now available in GA. It retains backwards compatibility with Cloud TPU v2 and v3, but has a >2x increase over Cloud TPU v3 in raw compute performance per chip. Each TPU v4 chip also contains a single logical core, enabling utilization of a full 32 GiB of memory from one program, compared to 8 GiB on v2 and 16 GiB on v3. Cloud TPU v4 Pod slices are connected with a custom interconnect that uses a 3D mesh topology, an upgrade from the 2D mesh in v2 and v3, and are available in configurations ranging from four chips (one TPU VM) to thousands of chips.
Cloud TPU v4 Pods are available in us-central2-b, Google's data center operating at 90% carbon-free energy on an hourly basis within the same grid. This is the world's largest publicly available ML hub with up to 9 exaflops of peak aggregate performance.
The following table shows the pricing for Cloud TPU v4 configurations. The
v4 pricing is based on the number of chips in the topology. There are 2 cores
in each chip.
| TPU v4 pricing | Price per chip-hour | % discount from on-demand |
|---|---|---|
| On-demand / evaluation | $3.22 | 0% |
| 1Y CUD (committed use discount) reservation | $2.03 | 37% |
| 3Y CUD (committed use discount) reservation | $1.45 | 55% |
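The discount percentages in the table follow directly from the listed chip-hour prices. A minimal check, using only the rates quoted above:

```python
# Verify the CUD discount percentages against the listed v4 chip-hour prices.
ON_DEMAND = 3.22  # on-demand / evaluation price per chip-hour

def discount_pct(cud_price: float) -> int:
    """Percentage discount of a CUD rate relative to the on-demand rate."""
    return round((ON_DEMAND - cud_price) / ON_DEMAND * 100)

print(discount_pct(2.03))  # 37  (1Y CUD)
print(discount_pct(1.45))  # 55  (3Y CUD)
```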
Cloud TPU v2 & v3 pricing
Cloud TPU v2 and v3 TPU pricing and quota are divided into two systems:
- Single device TPU Type pricing for individual TPU devices that are available either on-demand or as preemptible devices. You cannot combine multiple single device TPU types to collaborate on a single workload.
- TPU Pod Type pricing for clusters of TPU devices that are connected to each other over dedicated high-speed networks. These TPU types are available only if you have evaluation quota or purchase a 1 year or 3 year commitment.
See the TPU System Architecture documentation for architectural details and the differences between v2, v3, and v4.
Single device pricing
Single device TPU types are billed in one-second increments and are available at either an on-demand or preemptible price.
Single device TPU types are independent TPU devices without direct network connections to other TPU devices in a Google data center. If your workloads require more TPU cores and a larger pool of memory, use a TPU Pod type instead.
A preemptible TPU is a TPU that can be preempted at any time if Cloud TPU requires access to the resources for another task. The charges for a preemptible TPU are much lower than those for a normal TPU. You are not charged for preemptible TPUs if they are preempted in the first minute after you create them.
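Earlier on this page, preemptible devices are described as a 70% discount on the evaluation list price. A minimal sketch of that calculation; the $8.00/hour input is a hypothetical on-demand rate, not a quoted price:

```python
# Preemptible TPUs are billed at a 70% discount from the evaluation
# (on-demand) list price, per the quota description on this page.
PREEMPTIBLE_DISCOUNT = 0.70

def preemptible_price(on_demand_hourly: float) -> float:
    """Hourly preemptible price given the on-demand hourly list price."""
    return on_demand_hourly * (1 - PREEMPTIBLE_DISCOUNT)

# A hypothetical device listed at $8.00/hour on-demand:
print(f"${preemptible_price(8.00):.2f} per hour")  # $2.40 per hour
```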
TPU Pod type pricing
TPU Pod types provide access to multiple TPU devices that are connected on a dedicated high-speed network. These TPU types provide greater compute capacity and a larger pool of TPU memory than a single TPU node. To use TPU Pod types, you must request quota using one of the following options:
- Request access to evaluation quota so that you can test the performance of TPU Pod types. TPU nodes that you create using evaluation quota are billed in one-second increments but do not guarantee the same level of service as on-demand TPU devices or devices that you create using commitment quota. Evaluation quota persists only for a limited amount of time in your project.
- Purchase a 1 year or 3 year commitment and create TPU nodes with up to 2048 cores. Commitments allow for access to reserved cores at any time during the duration of the contract. You are billed a set monthly fee for the duration of your commitment term even if you do not use any TPU resources.
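As a rough sketch of how a flat monthly commitment fee could be estimated, assuming the fee is simply the reserved chips times the CUD chip-hour rate times the hours in an average month (an assumption; actual commitment billing terms are set in your contract):

```python
# Rough estimate of a flat monthly commitment fee: reserved chips x CUD
# chip-hour rate x average hours per month. The 730 hours/month figure
# and the exact billing formula are assumptions for illustration.
HOURS_PER_MONTH = 730

def monthly_commitment_fee(chips: int, chip_hour_rate: float) -> float:
    return chips * chip_hour_rate * HOURS_PER_MONTH

# A hypothetical 256-chip (512-core) v4 slice at the $2.03 1Y CUD rate:
print(f"${monthly_commitment_fee(256, 2.03):,.2f}")  # $379,366.40
```

The fee accrues for the full commitment term regardless of actual usage, as noted above.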
You can configure your TPU nodes with the following TPU types:
If you pay in a currency other than USD, the prices listed in your currency on Cloud Platform SKUs apply.
To learn about the differences between different TPU versions and configurations, read the TPU System Architecture documentation.
Optimize your cost
Cloud TPU v4 provides up to 35% savings on Transformer-based models and up to 50% on ResNet compared to A100 on Azure.
The savings are especially meaningful given that real-world models such as GPT-3 and PaLM are much larger than the BERT and ResNet models used in the MLPerf benchmark: PaLM is a 540 billion parameter model, while the BERT model used in the MLPerf benchmark has only 340 million parameters, a more than 1,000x difference in scale. Based on our experience, the benefits of TPUs grow significantly with scale, making the case all the more compelling for training on Cloud TPU v4.
Estimate your cost
To estimate the cost of using Cloud TPU, open the Compute Engine pricing calculator and choose "Cloud TPU" on the top bar.
Take the next step
- Use the Cloud TPU sign up form to purchase quota and/or learn more about Cloud TPU. Alternatively, contact our sales team!
- Read the Cloud TPU v4 launch blog post
- Watch TPU v4 announcement from Google I/O 2022
- Learn more about TPU v4 record setting MLPerf 2.0 results
- Read the Cloud TPU quota policy page to learn how to request quota for different TPU types.
- Check the regions and zones in which Cloud TPU is available.
- Check the release notes for future updates on pricing.
- Read the Cloud TPU documentation.
- Get started with Cloud TPU.
- Learn about Cloud TPU solutions and use cases.