[[["易于理解","easyToUnderstand","thumb-up"],["解决了我的问题","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["很难理解","hardToUnderstand","thumb-down"],["信息或示例代码不正确","incorrectInformationOrSampleCode","thumb-down"],["没有我需要的信息/示例","missingTheInformationSamplesINeed","thumb-down"],["翻译问题","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["最后更新时间 (UTC):2025-09-04。"],[],[],null,["# TPU v6e\n=======\n\nThis document describes the architecture and supported configurations of\nCloud TPU v6e (Trillium).\n\nTrillium is Cloud TPU's latest generation AI accelerator. On all technical\nsurfaces, such as the API and logs, and throughout this document, Trillium will\nbe referred to as v6e.\n\nWith a 256-chip footprint per Pod, v6e shares many similarities with\n[v5e](/tpu/docs/v5e). This system is optimized to be the highest value product for\ntransformer, text-to-image, and convolutional neural network (CNN) training,\nfine-tuning, and serving.\n\nSystem architecture\n-------------------\n\nEach v6e chip contains one TensorCore. Each TensorCore has 2 matrix-multiply\nunits (MXU), a vector unit, and a scalar unit. The following table shows the key\nspecifications and their values for TPU v6e compared to TPU v5e.\n\nSupported configurations\n------------------------\n\nThe following table shows the 2D slice shapes that are supported for v6e:\n\n| **Note:** The 8-chip (2x4) configuration attached to 2 VMs is only supported when using the GKE API.\n\nSlices with 8 chips (`v6e-8`) attached to a single VM are optimized for\ninference, allowing all 8 chips to be used in a single serving workload. You can\nperform multi-host inference using Pathways on Cloud. For more information, see\n[Perform multihost inference using Pathways](/ai-hypercomputer/docs/workloads/pathways-on-cloud/multihost-inference)\n\nFor information about the number of VMs for each topology, see\n[VM Types](#vm-types).\n\n### VM types\n\nEach TPU v6e VM can contain 1, 4, or 8 chips. 4-chip and smaller\nslices have the same non-uniform memory access (NUMA) node. For more information\nabout NUMA nodes, see [Non-uniform memory\naccess](https://en.wikipedia.org/wiki/Non-uniform_memory_access) on Wikipedia.\n\nv6e slices are created using half-host VMs, each with 4 TPU chips. There are two\nexceptions to this rule:\n\n- `v6e-1`: A VM with only a single chip, primarily intended for testing\n- `v6e-8`: A full-host VM that has been optimized for an inference use case with all 8 chips attached to a single VM.\n\nThe following table shows a comparison of TPU v6e VM types:\n\n| **Note:** We don't recommend using a full-host VM (`v6e-8` with one VM) for dual networks due to performance impacts.\n\nSpecify v6e configuration\n-------------------------\n\nWhen you allocate a TPU v6e slice using the TPU API, you specify its size and\nshape using the [`AcceleratorType`](#accelerator-type) parameter.\n\nIf you're using GKE, use the `--machine-type` flag to specify a\nmachine type that supports the TPU you want to use. For more information, see\n[Plan TPUs in GKE](/kubernetes-engine/docs/concepts/plan-tpus) in the GKE\ndocumentation.\n\n### Use `AcceleratorType`\n\nWhen you allocate TPU resources, you use `AcceleratorType` to specify the number\nof TensorCores in a slice. 
### Use `AcceleratorType`

When you allocate TPU resources, you use `AcceleratorType` to specify the
number of TensorCores in a slice. The value you specify for `AcceleratorType`
is a string with the format: `v$VERSION-$TENSORCORE_COUNT`. For example,
`v6e-8` specifies a v6e TPU slice with 8 TensorCores.

The following example shows how to create a TPU v6e slice with 32 TensorCores
using `AcceleratorType`:

### gcloud

```bash
$ gcloud compute tpus tpu-vm create tpu-name \
    --zone=zone \
    --accelerator-type=v6e-32 \
    --version=v2-alpha-tpuv6e
```

### Console

1. In the Google Cloud console, go to the **TPUs** page:

   [Go to TPUs](https://console.cloud.google.com/compute/tpus)

2. Click **Create TPU**.

3. In the **Name** field, enter a name for your TPU.

4. In the **Zone** box, select the zone where you want to create the TPU.

5. In the **TPU type** box, select `v6e-32`.

6. In the **TPU software version** box, select `v2-alpha-tpuv6e`. When
   creating a Cloud TPU VM, the TPU software version specifies the version of
   the TPU runtime to install. For more information, see
   [TPU VM images](/tpu/docs/runtimes).

7. Click the **Enable queueing** toggle. (A `gcloud` equivalent of this queued
   flow is sketched at the end of this page.)

8. In the **Queued resource name** field, enter a name for your queued
   resource request.

9. Click **Create**.

What's next
-----------

- Run [training and inference using TPU v6e](/tpu/docs/v6e-intro)
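The console steps above submit the TPU as a queued resource request. A rough
`gcloud` equivalent is sketched below; `my-queued-resource`, `tpu-name`, and
`zone` are placeholders, and you should verify the flags with
`gcloud compute tpus queued-resources create --help`:

```bash
# Sketch: request the same v6e-32 slice as a queued resource.
# my-queued-resource, tpu-name, and zone are placeholder values.
$ gcloud compute tpus queued-resources create my-queued-resource \
    --node-id=tpu-name \
    --zone=zone \
    --accelerator-type=v6e-32 \
    --runtime-version=v2-alpha-tpuv6e
```

The request stays queued until capacity is available; you can monitor its
status with `gcloud compute tpus queued-resources describe my-queued-resource
--zone=zone`.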