Connect a TPU to a shared VPC network

How you connect a TPU host to a Shared VPC network depends on which TPU architecture (TPU VM or TPU Node) you are using. For more information about TPU architectures, see TPU System Architecture.

Connect a TPU VM to a Shared VPC Network

Configure a VPC host project

When you use the TPU VM architecture, you must grant the TPU service account in your service project permission to manage resources in the host project. You do this by granting the TPU Shared VPC Agent role (roles/tpu.xpnAgent). Run the following gcloud command to add the role binding:

gcloud projects add-iam-policy-binding host-project-id \
--member=serviceAccount:service-your-service-project-number@gcp-sa-tpu.iam.gserviceaccount.com \
--role=roles/tpu.xpnAgent 
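
For example, assuming a hypothetical host project ID of my-host-project and a service project number of 123456789012, the command would look like this:

gcloud projects add-iam-policy-binding my-host-project \
--member=serviceAccount:service-123456789012@gcp-sa-tpu.iam.gserviceaccount.com \
--role=roles/tpu.xpnAgent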

Create a TPU VM connected to a Shared VPC Network

First, determine which accelerator types and runtime versions are available in your zone:

gcloud compute tpus accelerator-types list --zone zone
gcloud compute tpus versions list --zone zone
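
For example, to check availability in the us-central1-b zone (the zone here is purely illustrative):

gcloud compute tpus accelerator-types list --zone us-central1-b
gcloud compute tpus versions list --zone us-central1-b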

You connect a TPU VM to a shared VPC network when you create your TPU. Specify your shared VPC network using the --network flag:

gcloud compute tpus tpu-vm create tpu-name \
--zone zone \
--accelerator-type accelerator-type \
--network projects/host-project-id/global/networks/host-network \
--version runtime-version \
--project your-service-project-id
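
For example, the following sketch creates a v3-8 TPU VM named my-tpu in us-central1-b, attached to a network named host-network in a hypothetical host project my-host-project (all names are placeholders):

gcloud compute tpus tpu-vm create my-tpu \
--zone us-central1-b \
--accelerator-type v3-8 \
--network projects/my-host-project/global/networks/host-network \
--version tpu-vm-tf-2.8.0 \
--project my-service-project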

You can verify that your TPU VM is connected to your shared VPC network using the gcloud compute tpus tpu-vm describe command:

$ gcloud compute tpus tpu-vm describe tpu-name --zone zone

The response includes the network to which your TPU VM is attached:

acceleratorType: v3-8
apiVersion: V2
cidrBlock: 10.128.0.0/20
createTime: '2022-06-17T21:32:13.859274143Z'
health: HEALTHY
id: '0000000000000000000'
name: projects/my-project/locations/us-central1-b/nodes/my-tpu
networkConfig:
  enableExternalIps: true
  network: projects/my-project/global/networks/default
  subnetwork: projects/my-project/regions/us-central1/subnetworks/default
networkEndpoints:
- accessConfig:
    externalIp: 000.000.000.000
  ipAddress: 10.128.0.104
  port: 8470
runtimeVersion: tpu-vm-tf-2.8.0
schedulingConfig: {}
serviceAccount:
  email: 00000000000-compute@developer.gserviceaccount.com
  scope:
  - https://www.googleapis.com/auth/devstorage.read_write
  - https://www.googleapis.com/auth/logging.write
  - https://www.googleapis.com/auth/service.management
  - https://www.googleapis.com/auth/servicecontrol
  - https://www.googleapis.com/auth/cloud-platform
  - https://www.googleapis.com/auth/pubsub
shieldedInstanceConfig: {}
state: READY
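
If you only want the network field, you can narrow the output with the standard gcloud --format flag, for example:

gcloud compute tpus tpu-vm describe tpu-name --zone zone \
--format='value(networkConfig.network)'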

Delete the TPU VM

When you are done with the TPU VM, make sure to delete it.

gcloud compute tpus tpu-vm delete tpu-name \
--zone zone
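
For example, when scripting the cleanup you can skip the confirmation prompt with the standard gcloud --quiet flag:

gcloud compute tpus tpu-vm delete tpu-name \
--zone zone \
--quiet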

Connect a TPU Node to a Shared VPC Network

Configure Private Service Access

Before you use TPU Nodes with a Shared VPC network, you need to establish a private service access connection by completing the following steps. A combined sketch follows the list.

  1. Enable the Service Networking API using the following Google Cloud CLI command. You only need to do this once per Google Cloud project.

    gcloud services enable servicenetworking.googleapis.com
    
  2. Allocate a reserved address range for use by Service Networking. The prefix-length needs to be 24 or less. For example:

    gcloud compute addresses create sn-range-1 --global \
    --addresses=10.110.0.0 \
    --prefix-length=16 \
    --purpose=VPC_PEERING \
    --network=network-name

  3. Establish a private service access connection.

    $ gcloud services vpc-peerings connect --service=servicenetworking.googleapis.com \
    --ranges=sn-range-1 \
    --network=network-name
    
  4. Verify the VPC peering has been created. The following command lists all VPC peerings for the specified network.

    gcloud services vpc-peerings list --network=network-name
    
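Putting the steps together, a minimal end-to-end sketch might look like the following, assuming a shared VPC network named my-shared-vpc (all names here are placeholders):

gcloud services enable servicenetworking.googleapis.com

gcloud compute addresses create sn-range-1 --global \
--addresses=10.110.0.0 \
--prefix-length=16 \
--purpose=VPC_PEERING \
--network=my-shared-vpc

gcloud services vpc-peerings connect --service=servicenetworking.googleapis.com \
--ranges=sn-range-1 \
--network=my-shared-vpc

gcloud services vpc-peerings list --network=my-shared-vpc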

Create a TPU Node connected to a Shared VPC Network

You connect a TPU Node to a shared VPC network when you create your TPU. Specify your shared VPC network using the --network flag:

$ gcloud compute tpus execution-groups create \
  --name=tpu-name \
  --zone=zone \
  --tf-version=2.12.0 \
  --machine-type=n1-standard-1 \
  --accelerator-type=v3-8 \
  --network=network-name 
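
For example, with concrete placeholder values (a node named my-tpu-node in us-central1-a, attached to a shared network named my-shared-vpc):

$ gcloud compute tpus execution-groups create \
  --name=my-tpu-node \
  --zone=us-central1-a \
  --tf-version=2.12.0 \
  --machine-type=n1-standard-1 \
  --accelerator-type=v3-8 \
  --network=my-shared-vpc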

Retrieve information about a specific TPU Node

$ gcloud compute tpus describe tpu-name --zone zone

The response contains the following information:

acceleratorType: v3-8
apiVersion: V1
cidrBlock: 00.0.000.000/29
createTime: '2022-11-30T18:59:20.655858097Z'
health: HEALTHY
ipAddress: 00.000.0.000
name: projects/ml-writers/locations/us-central1-a/nodes/mikegre-vcp
network: global/networks/mikegre-vpc
networkEndpoints:
- ipAddress: 00.0.000.000
  port: 8470
port: '8470'
schedulingConfig: {}
serviceAccount: service-00000000000@cloud-tpu.iam.gserviceaccount.com
state: READY
tensorflowVersion: 2.10.0
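
As with TPU VMs, you can narrow the output to just the attached network with the standard gcloud --format flag, for example:

$ gcloud compute tpus describe tpu-name --zone zone --format='value(network)'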

Delete the TPU Node

When you are done with the TPU Node, make sure to delete it.

$ gcloud compute tpus execution-groups delete tpu-name \
  --zone=zone 
Delete the VPC peering

A peering connection can be disconnected using the Compute Engine networking API. Make these calls in the Shared VPC host project.

  1. List all VPC peerings to find the name of the peering to delete.

    $ gcloud compute networks peerings list --network=network-name

  2. Delete a VPC peering.

    $ gcloud compute networks peerings delete peering-name --network=network-name
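
For example, the private service access peering created earlier typically appears with the name servicenetworking-googleapis-com (check the list output for the actual name; the network name here is a placeholder):

$ gcloud compute networks peerings delete servicenetworking-googleapis-com --network=my-shared-vpc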