AI Platform Pipelines makes it easier to get started with Kubeflow Pipelines with TensorFlow Extended on Google Kubernetes Engine (GKE) by handling the following setup tasks for you:
- Creating a GKE cluster
- Deploying Kubeflow Pipelines to your GKE cluster
- Creating a Cloud Storage bucket to use to store pipeline artifacts
If you prefer, you can use AI Platform Pipelines to deploy Kubeflow Pipelines on an existing cluster that does not already have Kubeflow Pipelines installed. Use this guide to ensure that your cluster is configured correctly to deploy and run Kubeflow Pipelines.
Ensure that your GKE cluster has enough resources for AI Platform Pipelines
To use Google Cloud Marketplace to deploy Kubeflow Pipelines on a GKE cluster, the following must be true:
- Your cluster must have at least 3 nodes. Each node must have at least 2 CPUs and 4 GB of memory available.
- The cluster's access scope must grant full access to all Cloud APIs, or your cluster must use a custom service account.
- The cluster must not already have Kubeflow Pipelines installed.
Use the following instructions to check if your cluster has sufficient resources to install AI Platform Pipelines.
Open AI Platform Pipelines in the Google Cloud console.
In the AI Platform Pipelines toolbar, click New instance. Kubeflow Pipelines opens in Google Cloud Marketplace.
Click Configure. The Deploy Kubeflow Pipelines form opens.
Click Cluster to expand the list. GKE clusters that do not have enough resources or permissions are listed as Ineligible clusters. Each ineligible cluster includes a description of why Kubeflow Pipelines cannot be installed, such as:
- Cluster does not fit the application: Your cluster does not have sufficient resources available to install Kubeflow Pipelines. Allocate more resources to your cluster.
- Insufficient OAuth scope: Your cluster does not have sufficient access to Google Cloud resources and APIs to install Kubeflow Pipelines. Grant more permissions to your cluster.
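The resource requirements listed earlier in this section can also be checked roughly from Cloud Shell. The following is a minimal sketch rather than part of the console procedure: meets_pipelines_minimum is a hypothetical helper that only encodes the thresholds stated in this guide (3 nodes, 2 CPUs and 4 GB of memory per node), and the commented kubectl command assumes that kubectl is already connected to your cluster.

```shell
# Hypothetical helper (not part of AI Platform Pipelines): succeeds when the
# given node count, CPUs per node, and memory per node (in GB) meet the
# minimums stated in this guide: 3 nodes, 2 CPUs, 4 GB.
meets_pipelines_minimum() {
  node_count="$1"
  cpus_per_node="$2"
  memory_gb_per_node="$3"
  [ "$node_count" -ge 3 ] &&
    [ "$cpus_per_node" -ge 2 ] &&
    [ "$memory_gb_per_node" -ge 4 ]
}

# To read the real values, list each node's allocatable CPU and memory
# (assumes kubectl already points at your cluster):
#   kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory

# Example with made-up numbers: 3 nodes with 2 CPUs and 7 GB each.
if meets_pipelines_minimum 3 2 7; then
  echo "meets the minimum"
else
  echo "below the minimum"
fi
```

kubectl reports CPU and memory as Kubernetes quantities (for example, millicores), so convert them to whole CPUs and GB before you pass them to the helper. The Ineligible clusters list in the console remains the authoritative check.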
Allocate more resources to your GKE cluster
To install Kubeflow Pipelines from Google Cloud Marketplace to an existing GKE cluster, your cluster must have at least 3 nodes, and each node must have at least 2 CPUs and 4 GB of memory available.
Use the following instructions to replace the node pool in your cluster with one that has enough CPU and memory resources for AI Platform Pipelines.
Open Google Kubernetes Engine clusters in the Google Cloud console.
Click your cluster name. The cluster's details appear.
In the GKE toolbar, click Add node pool. The Add a new node pool form opens.
Supply the following information to the Add a new node pool form.
- Number of nodes: Specify the number of nodes in your node pool. Your cluster must have 3 or more nodes to install Kubeflow Pipelines using Google Cloud Marketplace.
- Machine type: Specify the Compute Engine machine type to use for instances in the node pool. Select a machine type with at least 2 CPUs and 4 GB of memory, such as n1-standard-2.
- Access scopes: Click Allow full access to all Cloud APIs in Access scopes.
Otherwise, configure your node pool as desired. Learn more about adding node pools to a cluster.
Click Create node pool. Creating the node pool takes several minutes to complete.
For each node pool in the Node pools section, except for the node pool you created in the previous step, click the delete icon. The Delete a node pool dialog appears to confirm that you want to delete this node pool.
Click Delete. Deleting the node pool takes several minutes.
Once you have deleted the old node pools, check that your cluster has sufficient resources and access to install Kubeflow Pipelines from Google Cloud Marketplace.
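If you prefer the command line, the same node pool replacement can be sketched with gcloud. This is an illustrative sketch, not the documented console procedure: the run wrapper and both function names are hypothetical, the cluster, zone, and node pool names are placeholders, and by default the script only prints the commands so that you can review them first.

```shell
# Dry-run wrapper (hypothetical helper): prints each command by default.
# Set DRY_RUN=0 to execute the commands for real.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Creates a replacement node pool that meets the minimums in this guide.
# Arguments: cluster name, zone, new node pool name.
create_pipelines_node_pool() {
  run gcloud container node-pools create "$3" \
    --cluster="$1" \
    --zone="$2" \
    --num-nodes=3 \
    --machine-type=n1-standard-2 \
    --scopes=cloud-platform
}

# Deletes an old node pool once the replacement is ready.
# Arguments: cluster name, zone, old node pool name.
delete_node_pool() {
  run gcloud container node-pools delete "$3" \
    --cluster="$1" \
    --zone="$2"
}

# Example with placeholder names; prints the commands without running them.
create_pipelines_node_pool my-cluster us-central1-a pipelines-pool
delete_node_pool my-cluster us-central1-a default-pool
```

Printing first is a deliberate choice here, because deleting a node pool disrupts the workloads running on it.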
Grant your GKE cluster access to Google Cloud resources and APIs
There are three ways to grant your ML pipelines access to Google Cloud resources and APIs:
- Grant your Google Kubernetes Engine cluster full access to all Google Cloud APIs. Learn how to configure your cluster with full access to Google Cloud resources in your project.
- Grant your Google Kubernetes Engine cluster granular access to Google Cloud APIs using a service account. Learn how to configure your cluster with granular access to Google Cloud resources.
- Grant your GKE cluster access to Google Cloud resources using service accounts stored as Kubernetes secrets. Learn more about granting your pipelines access to Google Cloud resources with a Kubernetes secret.
When deploying AI Platform Pipelines, you must grant your GKE cluster full access to Google Cloud resources and APIs or grant your cluster access to Google Cloud using a service account.
Configuring your GKE cluster with full access to Google Cloud APIs
To make it easier for your ML pipelines and other GKE cluster workloads to access your project's Google Cloud resources, configure your cluster to use the https://www.googleapis.com/auth/cloud-platform access scope. This access scope provides full access to the Google Cloud resources and APIs that you have enabled in your project. If this access scope grants more access to Google Cloud than you want, configure granular access using a service account.
Use the following instructions to replace your cluster's node pool with one that allows all workloads on this cluster to access all Google Cloud APIs that are enabled in your project. Before you change your GKE cluster, discuss these changes with your GKE administrator.
Open Google Kubernetes Engine clusters in the Google Cloud console.
Click your cluster name. The cluster's details appear.
In the GKE toolbar, click Add node pool. The Add a new node pool form opens.
Supply the following information to the Add a new node pool form.
- Number of nodes: Specify the number of nodes in your node pool. Your cluster must have 3 or more nodes to install Kubeflow Pipelines using Google Cloud Marketplace.
- Machine type: Specify the Compute Engine machine type to use for instances in the node pool. Select a machine type with at least 2 CPUs and 4 GB of memory, such as n1-standard-2.
- Access scopes: Click Allow full access to all Cloud APIs in Access scopes.
Otherwise, configure your node pool as desired. Learn more about adding node pools to a cluster.
Click Create node pool. Creating the node pool takes several minutes to complete.
For each node pool in the Node pools section, except for the node pool you created in the previous step, click the delete icon. The Delete a node pool dialog appears to confirm that you want to delete this node pool.
Click Delete. Deleting the node pool takes several minutes.
Once you have deleted the old node pools, check that your cluster has sufficient resources and access to install Kubeflow Pipelines from Google Cloud Marketplace.
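To confirm from Cloud Shell that a node pool carries the full-access scope, you can inspect its OAuth scopes. The sketch below is illustrative only: has_cloud_platform_scope is a hypothetical helper, the names in capitals are placeholders, and it assumes that gcloud separates the scopes with semicolons, spaces, commas, or newlines.

```shell
# Hypothetical helper: succeeds when a list of OAuth scopes contains exactly
# the cloud-platform scope named in this section. Separators are normalized
# to ";" so that similar scopes (such as cloud-platform.read-only) do not match.
has_cloud_platform_scope() {
  normalized=";$(printf '%s' "$1" | tr ' ,\n' ';;;');"
  case "$normalized" in
    *";https://www.googleapis.com/auth/cloud-platform;"*) return 0 ;;
    *) return 1 ;;
  esac
}

# To fetch a node pool's scopes from a live cluster:
#   scopes="$(gcloud container node-pools describe POOL_NAME \
#     --cluster=CLUSTER_NAME --zone=ZONE --format='value(config.oauthScopes)')"

# Example with a hard-coded scope list instead of a live cluster.
scopes="https://www.googleapis.com/auth/devstorage.read_only;https://www.googleapis.com/auth/logging.write"
if has_cloud_platform_scope "$scopes"; then
  echo "full access scope present"
else
  echo "full access scope missing"
fi
```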
Configuring your GKE cluster with granular access to Google Cloud APIs
Use the following instructions to configure a service account for your GKE cluster and replace your cluster's node pool with one that uses your service account. By creating a service account, you can granularly manage which Google Cloud resources the workloads on your cluster have access to. Before you change your GKE cluster, discuss these changes with your GKE administrator.
Open a Cloud Shell session.
Cloud Shell opens in a frame at the bottom of the Google Cloud console.
Run the following commands in Cloud Shell to create your service account and grant it sufficient access to run AI Platform Pipelines. Learn more about the roles required to run AI Platform Pipelines with a user-managed service account.
export PROJECT=PROJECT_ID
export SERVICE_ACCOUNT=SERVICE_ACCOUNT_NAME
gcloud iam service-accounts create $SERVICE_ACCOUNT \
  --display-name=$SERVICE_ACCOUNT \
  --project=$PROJECT
gcloud projects add-iam-policy-binding $PROJECT \
  --member="serviceAccount:$SERVICE_ACCOUNT@$PROJECT.iam.gserviceaccount.com" \
  --role=roles/logging.logWriter
gcloud projects add-iam-policy-binding $PROJECT \
  --member="serviceAccount:$SERVICE_ACCOUNT@$PROJECT.iam.gserviceaccount.com" \
  --role=roles/monitoring.metricWriter
gcloud projects add-iam-policy-binding $PROJECT \
  --member="serviceAccount:$SERVICE_ACCOUNT@$PROJECT.iam.gserviceaccount.com" \
  --role=roles/monitoring.viewer
gcloud projects add-iam-policy-binding $PROJECT \
  --member="serviceAccount:$SERVICE_ACCOUNT@$PROJECT.iam.gserviceaccount.com" \
  --role=roles/storage.objectViewer
Replace the following:
- SERVICE_ACCOUNT_NAME: The name of the service account to create.
- PROJECT_ID: The Google Cloud project that the service account is created in.
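The four role bindings above differ only in the role name, so they can be expressed as a loop. The following is a minimal sketch under stated assumptions: run, service_account_email, and grant_pipelines_roles are hypothetical helper names, the project and account names in the example are placeholders, and the script prints the commands by default instead of running them.

```shell
# Dry-run wrapper (hypothetical helper): prints each command by default.
# Set DRY_RUN=0 to execute the commands for real.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Builds a service account's email address from its name and project ID.
service_account_email() {
  printf '%s@%s.iam.gserviceaccount.com' "$1" "$2"
}

# Grants the four roles listed in this step to the service account.
# Arguments: project ID, service account name.
grant_pipelines_roles() {
  for role in roles/logging.logWriter roles/monitoring.metricWriter \
    roles/monitoring.viewer roles/storage.objectViewer; do
    run gcloud projects add-iam-policy-binding "$1" \
      --member="serviceAccount:$(service_account_email "$2" "$1")" \
      --role="$role"
  done
}

# Example with placeholder names; prints four commands without running them.
grant_pipelines_roles my-project pipelines-sa
```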
Grant your service account access to any Google Cloud resources or APIs that your ML pipelines require. Learn more about Identity and Access Management roles and managing service accounts.
Grant your user account the Service Account User (iam.serviceAccountUser) role on your service account.
gcloud iam service-accounts add-iam-policy-binding \
  "SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com" \
  --member=user:USERNAME \
  --role=roles/iam.serviceAccountUser
Replace the following:
- SERVICE_ACCOUNT_NAME: The name of your service account.
- PROJECT_ID: Your Google Cloud project.
- USERNAME: Your username on Google Cloud.
Open Google Kubernetes Engine clusters in the Google Cloud console.
Click your cluster name. The cluster's details appear.
In the GKE toolbar, click Add node pool. The Add a new node pool form opens.
Supply the following information to the Add a new node pool form.
- Number of nodes: Specify the number of nodes in your node pool. Your cluster must have 3 or more nodes to install Kubeflow Pipelines using Google Cloud Marketplace.
- Machine type: Specify the Compute Engine machine type to use for instances in the node pool. Select a machine type with at least 2 CPUs and 4 GB of memory, such as n1-standard-2.
- Service account: Select the service account that you created in an earlier step.
Otherwise, configure your node pool as desired. Learn more about adding node pools to a cluster.
Click Create node pool. Creating the node pool takes several minutes to complete.
For each node pool in the Node pools section, except for the node pool you created in the previous step, click the delete icon. The Delete a node pool dialog appears to confirm that you want to delete this node pool.
Click Delete. Deleting the node pool takes several minutes.
Once you have deleted the old node pools, check that your cluster has sufficient resources and access to install Kubeflow Pipelines from Google Cloud Marketplace.
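As a command-line alternative to the console form, a node pool that runs as your service account can be created with gcloud. This is a hedged sketch, not the documented procedure: run and create_node_pool_with_service_account are hypothetical helper names, every name in the example is a placeholder, and the command is printed rather than executed unless you set DRY_RUN=0.

```shell
# Dry-run wrapper (hypothetical helper): prints each command by default.
# Set DRY_RUN=0 to execute the commands for real.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Creates a node pool whose nodes run as the given service account.
# Arguments: cluster name, zone, node pool name, service account email.
create_node_pool_with_service_account() {
  run gcloud container node-pools create "$3" \
    --cluster="$1" \
    --zone="$2" \
    --num-nodes=3 \
    --machine-type=n1-standard-2 \
    --service-account="$4"
}

# Example with placeholder names; prints the command without running it.
create_node_pool_with_service_account my-cluster us-central1-a pipelines-pool \
  pipelines-sa@my-project.iam.gserviceaccount.com
```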
Use a Kubernetes secret to grant your cluster access to Google Cloud resources and APIs
Pipelines that are developed using the use_gcp_secret operator in the Kubeflow Pipelines SDK authenticate to Google Cloud resources using a Kubernetes secret.
Use these instructions to create a service account, grant the account access to the resources used by your pipelines, and then add the service account to your cluster as a Kubernetes secret.
Open Google Kubernetes Engine clusters in the Google Cloud console.
In the row for your cluster, find the cluster name and zone.
Open a Cloud Shell session.
Cloud Shell opens in a frame at the bottom of the Google Cloud console. Use Cloud Shell to complete the rest of this process.
Set the following environment variables.
export PROJECT_ID=PROJECT_ID
export ZONE=ZONE
export CLUSTER=CLUSTER_NAME
export NAMESPACE=NAMESPACE
export SA_NAME=SERVICE_ACCOUNT_NAME
Replace the following:
- PROJECT_ID: The Google Cloud project that your GKE cluster was created in.
- ZONE: The Google Cloud zone that your GKE cluster was created in.
- CLUSTER_NAME: The name of your GKE cluster.
- NAMESPACE: The namespace in your GKE cluster where Kubeflow Pipelines is installed. Namespaces are used to manage resources in large Kubernetes clusters. If your cluster does not use namespaces, enter default as the NAMESPACE.
- SERVICE_ACCOUNT_NAME: The name of the service account to create for your Kubeflow Pipelines cluster to access Google Cloud resources and APIs.
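The commands in the rest of this procedure depend on all five environment variables, so it can help to confirm that they are set before you continue. The following is a small sketch; require_vars is a hypothetical helper, not a gcloud or kubectl feature.

```shell
# Hypothetical helper: reports each named variable that is unset or empty,
# and fails if at least one is missing.
require_vars() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}

# Check the five variables that this procedure relies on.
if require_vars PROJECT_ID ZONE CLUSTER NAMESPACE SA_NAME; then
  echo "all variables set"
fi
```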
Create a service account for your cluster.
gcloud iam service-accounts create $SA_NAME \
  --display-name $SA_NAME --project "$PROJECT_ID"
To grant your service account access to Google Cloud resources, bind Identity and Access Management roles to the service account. Use the following instructions to grant IAM roles to your service account. Call this command once for each role that you want to grant to your service account.
gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member=serviceAccount:$SA_NAME@$PROJECT_ID.iam.gserviceaccount.com \
  --role=iam-role
Replace iam-role with the IAM role to grant to your service account. For example, roles/storage.admin grants full control of Cloud Storage buckets and objects in your project.
To learn more about IAM roles, read the guide to understanding IAM roles.
Create a private key for your service account in the current directory.
gcloud iam service-accounts keys create ./service-account-key.json \
  --iam-account $SA_NAME@$PROJECT_ID.iam.gserviceaccount.com
Configure kubectl to connect to your cluster, then create the user-gcp-sa Kubernetes secret.
gcloud container clusters get-credentials "$CLUSTER" --zone "$ZONE" \
  --project "$PROJECT_ID"
kubectl create secret generic user-gcp-sa \
  --from-file=user-gcp-sa.json=./service-account-key.json \
  -n $NAMESPACE --dry-run=client -o yaml | kubectl apply -f -
Clean up the service account's private key.
rm ./service-account-key.json
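Because the private key should not outlive the secret-creation step, one option is to wrap the key-creation, secret-creation, and cleanup steps above so that the key file is deleted even when a command fails. The following is a sketch under stated assumptions, not the documented procedure: with_temp_key and create_user_gcp_sa_secret are hypothetical helper names, the key is written to a temporary directory instead of the current directory, and the environment variables from the earlier step are assumed to be set.

```shell
# Hypothetical helper: runs the given command with a temporary key path
# appended as its last argument, then deletes the key whether or not the
# command succeeded. Returns the command's exit status.
with_temp_key() {
  key_dir="$(mktemp -d)" || return 1
  "$@" "$key_dir/service-account-key.json" && status=0 || status=$?
  rm -rf "$key_dir"
  return "$status"
}

# Creates the key and the user-gcp-sa secret, as in the steps above.
# Assumes SA_NAME, PROJECT_ID, and NAMESPACE are set and that kubectl is
# already connected to the cluster. Argument: path for the key file.
create_user_gcp_sa_secret() {
  gcloud iam service-accounts keys create "$1" \
    --iam-account "$SA_NAME@$PROJECT_ID.iam.gserviceaccount.com" &&
    kubectl create secret generic user-gcp-sa \
      --from-file=user-gcp-sa.json="$1" \
      -n "$NAMESPACE" --dry-run=client -o yaml | kubectl apply -f -
}

# Usage (not run here because it needs a live project and cluster):
#   with_temp_key create_user_gcp_sa_secret
```

Afterward, kubectl get secret user-gcp-sa -n $NAMESPACE shows whether the secret exists in the namespace.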