Connect a TPU to a Shared VPC network
How you connect a TPU host to a Shared VPC network depends on which TPU architecture (TPU VM or TPU Node) you are using. For more information about TPU architectures, see TPU System Architecture.
Connect a TPU VM to a Shared VPC network
Configure a VPC host project
When you use the TPU VM architecture, you must grant the TPU Service Account in your service project permission to manage resources in the host project. You do this by granting the TPU Shared VPC Agent role (roles/tpu.xpnAgent) in the host project. Run the following gcloud command to create this role binding:
gcloud projects add-iam-policy-binding host-project-id \
  --member=serviceAccount:service-your-service-project-number@gcp-sa-tpu.iam.gserviceaccount.com \
  --role=roles/tpu.xpnAgent
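For example, with a hypothetical host project ID of shared-vpc-host and a service project number of 123456789012 (placeholder values, not from this guide), the binding would look like this:
# Grants the service project's TPU Service Account the Shared VPC Agent role
# on the host project. Project names here are examples only.
gcloud projects add-iam-policy-binding shared-vpc-host \
  --member=serviceAccount:service-123456789012@gcp-sa-tpu.iam.gserviceaccount.com \
  --role=roles/tpu.xpnAgent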
Create a TPU VM connected to a Shared VPC Network
First, determine which accelerator types and runtime versions are available in your zone:
gcloud compute tpus accelerator-types list --zone zone
gcloud compute tpus versions list --zone zone
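For example, to see what is available in us-central1-b (an arbitrary example zone):
# Lists the TPU accelerator types and runtime versions offered in one zone.
gcloud compute tpus accelerator-types list --zone us-central1-b
gcloud compute tpus versions list --zone us-central1-b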
You connect a TPU VM to a shared VPC network when you create your TPU. Specify your shared VPC using the --network flag:
gcloud compute tpus tpu-vm create tpu-name \
  --zone zone \
  --accelerator-type accelerator-type \
  --network projects/host-project-id/global/networks/host-network \
  --version runtime-version \
  --project your-service-project-id
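As an illustrative sketch, here is a hypothetical invocation with every placeholder filled in (my-tpu, shared-vpc-host, shared-net, and my-service-project are example values only):
# Creates a v3-8 TPU VM in the service project, attached to the host
# project's shared network. All names below are examples.
gcloud compute tpus tpu-vm create my-tpu \
  --zone us-central1-b \
  --accelerator-type v3-8 \
  --network projects/shared-vpc-host/global/networks/shared-net \
  --version tpu-vm-tf-2.8.0 \
  --project my-service-project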
You can verify your TPU VM is connected to your shared VPC using the gcloud compute tpus tpu-vm describe
command:
$ gcloud compute tpus tpu-vm describe tpu-name --zone zone
The response includes the network to which your TPU VM is attached:
acceleratorType: v3-8
apiVersion: V2
cidrBlock: 10.128.0.0/20
createTime: '2022-06-17T21:32:13.859274143Z'
health: HEALTHY
id: '0000000000000000000'
name: projects/my-project/locations/us-central1-b/nodes/my-tpu
networkConfig:
  enableExternalIps: true
  network: projects/my-project/global/networks/default
  subnetwork: projects/my-project/regions/us-central1/subnetworks/default
networkEndpoints:
- accessConfig:
    externalIp: 000.000.000.000
  ipAddress: 10.128.0.104
  port: 8470
runtimeVersion: tpu-vm-tf-2.8.0
schedulingConfig: {}
serviceAccount:
  email: 00000000000-compute@developer.gserviceaccount.com
  scope:
  - https://www.googleapis.com/auth/devstorage.read_write
  - https://www.googleapis.com/auth/logging.write
  - https://www.googleapis.com/auth/service.management
  - https://www.googleapis.com/auth/servicecontrol
  - https://www.googleapis.com/auth/cloud-platform
  - https://www.googleapis.com/auth/pubsub
shieldedInstanceConfig: {}
state: READY
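If you want to check only the attached network in a script, gcloud's --format flag can project a single field from the response; a minimal sketch, assuming the same tpu-name and zone placeholders:
# Prints just the network the TPU VM is attached to, e.g.
# projects/host-project-id/global/networks/host-network
$ gcloud compute tpus tpu-vm describe tpu-name \
  --zone zone \
  --format='value(networkConfig.network)'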
Delete the TPU VM
When you are done with the TPU VM, make sure to delete it.
gcloud compute tpus tpu-vm delete tpu-name \
  --zone zone
Connect a TPU Node to a Shared VPC network
Configure Private Service Access
Before you use TPU Nodes with Shared VPCs, you need to establish a private service access connection.
Enable the Service Networking API using the following Google Cloud CLI command. You only need to do this once per Google Cloud project.
gcloud services enable servicenetworking.googleapis.com
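To confirm the API is enabled, you can list the project's enabled services and filter for it, for example:
# Prints a matching row only if the Service Networking API is enabled.
gcloud services list --enabled | grep servicenetworking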
Allocate a reserved address range for use by Service Networking. The prefix-length must be 24 or less. For example:
gcloud compute addresses create sn-range-1 --global \
  --addresses=10.110.0.0 \
  --prefix-length=16 \
  --purpose=VPC_PEERING \
  --network=network-name
Establish a private service access connection.
$ gcloud services vpc-peerings connect --service=servicenetworking.googleapis.com \
  --ranges=sn-range-1 \
  --network=network-name
Verify the VPC peering has been created. The following command lists all VPC peerings for the specified network.
gcloud services vpc-peerings list --network=network-name
Connect a TPU Node to a Shared VPC network
You connect a TPU Node to a shared VPC network when you create your TPU. Specify your shared VPC using the --network flag:
$ gcloud compute tpus execution-groups create \
--name=tpu-name \
--zone=zone \
--tf-version=2.12.0 \
--machine-type=n1-standard-1 \
--accelerator-type=v3-8 \
--network=network-name
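For example, a hypothetical invocation with the placeholders filled in (my-tpu-node, us-central1-a, and shared-net are example values only):
# Creates a TPU Node execution group attached to the shared network.
# All names below are examples.
$ gcloud compute tpus execution-groups create \
  --name=my-tpu-node \
  --zone=us-central1-a \
  --tf-version=2.12.0 \
  --machine-type=n1-standard-1 \
  --accelerator-type=v3-8 \
  --network=shared-net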
Retrieve information about a specific TPU Node
$ gcloud compute tpus describe tpu-name --zone zone
The response contains the following information:
acceleratorType: v3-8
apiVersion: V1
cidrBlock: 00.0.000.000/29
createTime: '2022-11-30T18:59:20.655858097Z'
health: HEALTHY
ipAddress: 00.000.0.000
name: projects/ml-writers/locations/us-central1-a/nodes/mikegre-vcp
network: global/networks/mikegre-vpc
networkEndpoints:
- ipAddress: 00.0.000.000
  port: 8470
port: '8470'
schedulingConfig: {}
serviceAccount: service-00000000000@cloud-tpu.iam.gserviceaccount.com
state: READY
tensorflowVersion: 2.10.0
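If a script needs the node's address (for example, to assemble the gRPC endpoint on port 8470 shown above), the --format flag can extract it; a minimal sketch with placeholder tpu-name and zone:
# Prints just the TPU Node's IP address from the describe response.
$ gcloud compute tpus describe tpu-name --zone zone \
  --format='value(ipAddress)'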
Delete the TPU Node
When you are done with the TPU Node, make sure to delete it.
$ gcloud compute tpus execution-groups delete tpu-name \
--zone=zone
Delete the VPC peering
A peering connection can be disconnected using the Compute Engine networking API. Run these commands in the Shared VPC host project.
1. List all VPC peerings to find the name of the peering to delete.
$ gcloud compute networks peerings list --network=network-name
2. Delete the VPC peering.
$ gcloud compute networks peerings delete peering-name --network=network-name
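To confirm the peering was removed, list the peerings again; the deleted entry should no longer appear:
# Re-lists peerings on the host network to verify the deletion.
$ gcloud compute networks peerings list --network=network-name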