GKE Cluster with Cloud TPU using a Shared VPC

This guide describes how to:

  • Set up a Cloud TPU GKE cluster using a Shared VPC network.
  • Set up the required APIs and IP ranges to ensure communication between the cluster, the Shared VPC, and Google managed services.
  • Create secondary CIDR ranges for cluster pods and services.

Concepts

This guide uses the following concepts throughout:

  • Host Project: A project that contains one or more Shared VPC networks. In this guide, this project will contain your Shared VPC.

  • Service Project: A project attached to a Host Project by a Shared VPC administrator. This attachment allows it to participate in the Shared VPC. In this guide, this project will contain your Cloud TPU cluster.

Requirements

Enable APIs

  1. Enable the following APIs in the Google Cloud console for your Host Project:

  2. Enable the following APIs in the Google Cloud console for your Service Project:

Set up an IP range for VPC Peering to Google managed services

Follow these steps to reserve an IP range in the Shared VPC network in the Host Project. All Google managed services in this VPC network, including Cloud TPU, use this range.

  1. List existing IP ranges in the Shared VPC network.

    $ gcloud beta compute networks list-ip-addresses network \
    --project=host-project-id
    
  2. Choose an available range and reserve it in the Shared VPC network.

    $ gcloud beta compute addresses create peering-name \
      --global \
      --prefix-length=16 \
      --network=network \
      --purpose=VPC_PEERING \
      --project=host-project-id
    

    peering-name is the name of the reserved IP range. The next step passes this name to --ranges when it creates the VPC Peering connection.

  3. Create a VPC Network Peering connection between the Host Project and Google managed services.

    $ gcloud beta services vpc-peerings connect \
      --service=servicenetworking.googleapis.com \
      --network=network \
      --ranges=peering-name \
      --project=host-project-id
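
To confirm that the connection was created, you can list the peerings on the network. The following is a minimal sketch rather than a required step. It reuses the network and host-project-id placeholders from the commands above, and list_peerings is a hypothetical wrapper that only prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: placeholder values match the names used in this guide.
HOST_PROJECT="host-project-id"
NETWORK="network"

# Hypothetical helper: lists the service networking peerings on NETWORK.
list_peerings() {
  set -- gcloud services vpc-peerings list \
    --network="$NETWORK" \
    --service=servicenetworking.googleapis.com \
    --project="$HOST_PROJECT"
  # Print the command for review by default; set RUN=1 to execute it.
  if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

list_peerings
```

If the peering exists, the output should include the reserved range name (peering-name) among the ranges for the connection.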
    

Create secondary IP ranges for the cluster

In your Shared VPC network, select or create a subnetwork and add two secondary CIDR ranges to it: one for the cluster's pods and one for its services. The following steps refer to these names:

  • subnet will be the subnetwork in the network of your Host Project.

  • tier-1-name will be the name of the secondary range used by GKE Pods in subnet.

  • tier-2-name will be the name of the secondary range used by GKE Services in subnet.
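
If the subnetwork already exists, the two secondary ranges can be added with gcloud. The following is a minimal sketch, not a definitive procedure. The region and both CIDR blocks are assumptions chosen for illustration; choose blocks that do not overlap each other, the subnetwork's primary range, or the range reserved for VPC Peering. The add_secondary_ranges function is a hypothetical wrapper that only prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only. REGION and both CIDR blocks are illustrative assumptions.
HOST_PROJECT="host-project-id"
REGION="us-central1"
SUBNET="subnet"
PODS_RANGE="tier-1-name=10.4.0.0/14"       # secondary range for GKE Pods
SERVICES_RANGE="tier-2-name=10.0.32.0/20"  # secondary range for GKE Services

# Hypothetical helper: adds both secondary ranges to the existing subnetwork.
add_secondary_ranges() {
  set -- gcloud compute networks subnets update "$SUBNET" \
    --project="$HOST_PROJECT" \
    --region="$REGION" \
    --add-secondary-ranges="$PODS_RANGE,$SERVICES_RANGE"
  # Print the command for review by default; set RUN=1 to execute it.
  if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

add_secondary_ranges
```

Run the script once to review the generated command, then rerun it with RUN=1 to apply the change in the Host Project.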

Create a GKE cluster with Cloud TPU

The following command creates a GKE cluster with Cloud TPU enabled, using the existing secondary CIDR ranges in your Shared VPC network:

$ gcloud beta container clusters create cluster-name \
  --enable-ip-alias \
  --network projects/host-project-id/global/networks/network \
  --subnetwork projects/host-project-id/regions/region/subnetworks/subnet \
  --cluster-secondary-range-name tier-1-name \
  --services-secondary-range-name tier-2-name \
  --scopes=cloud-platform \
  --enable-tpu \
  --enable-tpu-service-networking \
  --project=service-project-id
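
After the cluster is created, one way to check that Cloud TPU was enabled is to inspect the cluster resource. This is a sketch under stated assumptions: it relies on a default compute zone or region being configured, as the create command above does, and it assumes that the cluster resource exposes the enableTpu and tpuIpv4CidrBlock fields, which you should verify against your gcloud version. The describe_tpu_settings function is a hypothetical wrapper that only prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: placeholder values match the names used in this guide.
SERVICE_PROJECT="service-project-id"
CLUSTER="cluster-name"

# Hypothetical helper: shows the TPU-related fields of the cluster.
# The field names in --format are assumptions; adjust them if they differ.
describe_tpu_settings() {
  set -- gcloud beta container clusters describe "$CLUSTER" \
    --project="$SERVICE_PROJECT" \
    --format="yaml(enableTpu,tpuIpv4CidrBlock)"
  # Print the command for review by default; set RUN=1 to execute it.
  if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

describe_tpu_settings
```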

Follow the Pod Spec steps in the guide Run Cloud TPU applications on GKE to build a job that uses Cloud TPU resources.

Clean Up

When you've finished with Cloud TPU on GKE, clean up the resources to avoid incurring extra charges to your Cloud Billing account.

  1. Delete the reserved peering IP range.

    $ gcloud beta compute addresses delete peering-name \
      --global \
      --project=host-project-id
    
  2. Follow the Cleaning up instructions in Setting up Clusters with Shared VPC to delete the cluster and the network resources.