[[["容易理解","easyToUnderstand","thumb-up"],["確實解決了我的問題","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["難以理解","hardToUnderstand","thumb-down"],["資訊或程式碼範例有誤","incorrectInformationOrSampleCode","thumb-down"],["缺少我需要的資訊/範例","missingTheInformationSamplesINeed","thumb-down"],["翻譯問題","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["上次更新時間:2025-09-04 (世界標準時間)。"],[],[],null,["# TPU v3\n======\n\nThis document describes the architecture and supported configurations of\nCloud TPU v3.\n\nSystem architecture\n-------------------\n\nEach v3 TPU chip contains two TensorCores. Each TensorCore has two matrix-multiply units (MXUs), a\nvector unit, and a scalar unit. The following table shows the key specifications\nand their values for a v3 TPU Pod.\n\nThe following diagram illustrates a TPU v3 chip.\n\nArchitectural details and performance characteristics of TPU v3 are available in\n[A Domain Specific Supercomputer for Training Deep Neural Networks](https://dl.acm.org/doi/pdf/10.1145/3360307).\n\n### Performance benefits of TPU v3 over v2\n\nThe increased FLOPS per TensorCore and memory capacity in TPU v3 configurations\ncan improve the performance of your models in the following ways:\n\n- TPU v3 configurations provide significant performance benefits per\n TensorCore for compute-bound models. Memory-bound models on TPU v2\n configurations might not achieve this same performance improvement if they\n are also memory-bound on TPU v3 configurations.\n\n- In cases where data does not fit into memory on TPU v2 configurations, TPU\n v3 can provide improved performance and reduced recomputation of\n intermediate values (rematerialization).\n\n- TPU v3 configurations can run new models with batch sizes that did not fit\n on TPU v2 configurations. For example, TPU v3 might allow deeper ResNet models and\n larger images with RetinaNet.\n\nModels that are nearly input-bound (\"infeed\") on TPU v2 because training steps\nare waiting for input might also be input-bound with Cloud TPU v3. The\npipeline performance guide can help you resolve infeed issues.\n\nConfigurations\n--------------\n\nA TPU v3 Pod is composed of 1024 chips interconnected with high-speed links. To\ncreate a TPU v3 device or slice, use the `--accelerator-type`\nflag in the TPU creation command (`gcloud compute tpus tpu-vm`). You specify the accelerator type by specifying the\nTPU version and the number of TPU cores. For example, for a single v3 TPU, use\n`--accelerator-type=v3-8`. For a v3 slice with 128 TensorCores, use\n`--accelerator-type=v3-128`.\n\nThe following table lists the supported v3 TPU types:\n\nThe following command shows how to create a v3 TPU slice with 128 TensorCores: \n\n```bash\n $ gcloud compute tpus tpu-vm create tpu-name \\\n --zone=europe-west4-a \\\n --accelerator-type=v3-128 \\\n --version=tpu-ubuntu2204-base\n```\n\nFor more information about managing TPUs, see [Manage TPUs](/tpu/docs/managing-tpus-tpu-vm).\nFor more information about the system architecture of Cloud TPU, see\n[System architecture](/tpu/docs/system-architecture)."]]