Create, upload, and use a pipeline template


A pipeline template is a resource that you can use to publish a workflow definition so that it can be reused multiple times, by a single user or by multiple users.

The Kubeflow Pipelines SDK registry client is a new client interface that you can use with a compatible registry server, such as Artifact Registry, for version control of your Kubeflow Pipelines (KFP) templates. For more information, see Use the template in a Kubeflow Pipelines SDK registry client.

This page shows you how to:

  • Create a KFP pipeline template
  • Use the Kubeflow Pipelines SDK registry client to upload the template to a pipeline template repository
  • Use the template in the Kubeflow Pipelines client

Before you begin

Before you build and run your pipeline, use the following instructions to set up your Google Cloud project and development environment in the Google Cloud console.

  1. Install v2.0.0b1 or higher of the Kubeflow Pipelines SDK.
    (Optional) Before installing, run the following command to see which version of the Kubeflow Pipelines SDK is currently installed:

      pip freeze | grep kfp
    
  2. Install v1.15.0 or higher of the Vertex AI SDK for Python.
    (Optional) Before installing, run the following command to see which version of the Vertex AI SDK for Python is currently installed:

      pip freeze | grep google-cloud-aiplatform
    
  3. (Optional) Install version 390.0.0 or higher of the Google Cloud CLI.

  4. Enable the Artifact Registry API.

Configuring permissions

If you have not already set up your Google Cloud project for Vertex AI Pipelines, follow the instructions in Configure your Google Cloud project for Vertex AI Pipelines. Make sure that you have the IAM roles or permissions required to run Vertex AI pipelines and to administer Artifact Registry repositories.

Create a repository in Artifact Registry

Next you'll create a repository in Artifact Registry for your pipeline templates.

Console

  1. In the Google Cloud console, open Repositories.

    Go to Repositories

  2. Click Create Repository.

  3. Specify quickstart-kfp-repo as the repository name.

  4. Under Format, select Kubeflow Pipelines.

  5. Under Location Type, select Region.

  6. Select us-central1.

  7. Click Create.

Google Cloud CLI

  1. If needed, authenticate the Google Cloud CLI with your account and set your default project by running the following commands:

    Linux, macOS, or Cloud Shell

    gcloud auth application-default login
    gcloud auth login
    gcloud config set project PROJECT_ID
    

    Windows (PowerShell)

    gcloud auth application-default login
    gcloud auth login
    gcloud config set project PROJECT_ID
    

    Windows (cmd.exe)

    gcloud auth application-default login
    gcloud auth login
    gcloud config set project PROJECT_ID
    
     

  2. Run the following command to create a repository.

    gcloud artifacts repositories create quickstart-kfp-repo --location=us-central1 --repository-format=KFP
    

Create a template

Create a "hello world" pipeline and compile it into a local YAML file named hello_world_pipeline.yaml:

from kfp import compiler
from kfp import dsl

@dsl.component()
def hello_world(text: str) -> str:
    print(text)
    return text

@dsl.pipeline(name='hello-world', description='A simple intro pipeline')
def pipeline_hello_world(text: str = 'hi there'):
    """Pipeline that passes small pipeline parameter string to consumer op."""

    consume_task = hello_world(
        text=text)  # Passing pipeline parameter as argument to consumer op

compiler.Compiler().compile(
    pipeline_func=pipeline_hello_world,
    package_path='hello_world_pipeline.yaml')
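
When you later upload the compiled file, the package name in Artifact Registry is taken from the spec's pipelineInfo.name entry (hello-world here). As a rough sketch of where that value lives, the following reads it back out of a compiled spec; the inline YAML is a stand-in for the compiled file, and a real YAML parser would be more robust than this regex:

```python
import re

# Stand-in for a compiled hello_world_pipeline.yaml; only the
# pipelineInfo block matters for the package name.
compiled_yaml = """\
pipelineInfo:
  description: A simple intro pipeline
  name: hello-world
"""

# Look for the name entry nested under pipelineInfo (two-space
# indentation, as in the compiler's default layout).
match = re.search(r"^pipelineInfo:\n(?:  .*\n)*?  name: (\S+)",
                  compiled_yaml, re.MULTILINE)
package_name = match.group(1)
print(package_name)  # hello-world
```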

Upload the template

  1. If needed, authenticate the Google Cloud CLI with your account and set your default project by running the following commands:

    Linux, macOS, or Cloud Shell

    gcloud auth application-default login
    gcloud auth login
    gcloud config set project PROJECT_ID
    

    Windows (PowerShell)

    gcloud auth application-default login
    gcloud auth login
    gcloud config set project PROJECT_ID
    

    Windows (cmd.exe)

    gcloud auth application-default login
    gcloud auth login
    gcloud config set project PROJECT_ID
    
     

  2. To configure your Kubeflow Pipelines SDK registry client, run the following commands:

    from kfp.registry import RegistryClient

    client = RegistryClient(host="https://us-central1-kfp.pkg.dev/PROJECT_ID/quickstart-kfp-repo")

    Replace PROJECT_ID with the ID of your Google Cloud project.
    
  3. Upload the compiled YAML file to your repository in Artifact Registry.

    templateName, versionName = client.upload_pipeline(
      file_name="hello_world_pipeline.yaml",
      tags=["v1", "latest"],
      extra_headers={"description":"This is an example pipeline template."})
    
  4. To verify that the template was uploaded:

    1. Open Vertex AI Pipelines in the Google Cloud console.

      Go to Vertex AI Pipelines

    2. Click SELECT REPOSITORY.

    3. From the list, select the quickstart-kfp-repo repository, and then click Select.

    4. You should see the uploaded template package hello-world in the list. Click hello-world.

    5. You should see the version 4f245e8f9605 under the hello-world package. To view the pipeline topology, click the version.

Use the template in Vertex AI

After you've uploaded your pipeline template to your repository in Artifact Registry, it is ready to be used in Vertex AI Pipelines.

Create a staging bucket for your template

Before you can use your pipeline template, you'll need to create a Cloud Storage bucket for staging pipeline runs.

To create the bucket, follow the instructions in Configure a Cloud Storage bucket for pipeline artifacts and then run the following command:

STAGING_BUCKET="gs://BUCKET_NAME"

Replace BUCKET_NAME with the name of the bucket you just created.

Create a pipeline run from your template

You can use the Vertex AI SDK for Python or the Google Cloud console to create a pipeline run from your template in Artifact Registry.

Console

  1. Open Vertex AI Pipelines in the Google Cloud console.

    Go to Vertex AI Pipelines

  2. Click SELECT REPOSITORY.

  3. From the list, select the quickstart-kfp-repo repository, and then click Select.

  4. Click the hello-world package.

  5. Next to the 4f245e8f9605 version, click Create Run.

  6. Click Runtime Configuration and enter the following into GCS Output Directory.

    gs://BUCKET_NAME
    
  7. Click Submit.

Vertex AI SDK for Python

Run the following commands, replacing PROJECT_ID with the Google Cloud project that this pipeline runs in.

from google.cloud import aiplatform

# Initialize the aiplatform package
aiplatform.init(
    project="PROJECT_ID",
    location='us-central1',
    staging_bucket=STAGING_BUCKET)

# Create a job via version id.
job = aiplatform.PipelineJob(
    display_name="hello-world-latest",
    template_path="https://us-central1-kfp.pkg.dev/PROJECT_ID/quickstart-kfp-repo/hello-world/" + \
        versionName)
# Or via tag.
job = aiplatform.PipelineJob(
    display_name="hello-world-latest",
    template_path="https://us-central1-kfp.pkg.dev/PROJECT_ID/quickstart-kfp-repo/hello-world/v1")

job.submit()
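
The template_path strings above follow a fixed pattern. A small helper like the following (hypothetical, not part of the SDK) makes the version-versus-tag choice explicit; the region, repository, and package values shown are the quickstart ones:

```python
# Hypothetical helper for building an Artifact Registry template_path.
# version_or_tag can be a version name ("sha256:...") or a tag ("v1").
def template_path(project_id: str, repo: str, package: str,
                  version_or_tag: str, region: str = "us-central1") -> str:
    return (f"https://{region}-kfp.pkg.dev/"
            f"{project_id}/{repo}/{package}/{version_or_tag}")

path = template_path("my-project", "quickstart-kfp-repo", "hello-world", "v1")
print(path)
```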

View created pipeline runs

You can use the Google Cloud console or the Vertex AI SDK for Python to view the runs created from a specific pipeline version.

Console

  1. Open Vertex AI Pipelines in the Google Cloud console.

    Go to Vertex AI Pipelines

  2. Click SELECT REPOSITORY.

  3. From the list, select the quickstart-kfp-repo repository, and then click Select.

  4. Click the hello-world package.

  5. Click the 4f245e8f9605 version.

  6. Click View Runs.

Vertex AI SDK for Python

To list the pipeline runs created from your template, call the aiplatform.PipelineJob.list method, as shown in one or more of the following examples:

  from google.cloud import aiplatform

  # To filter all runs created from a specific version
  filter = 'template_uri:"https://us-central1-kfp.pkg.dev/PROJECT_ID/quickstart-kfp-repo/hello-world/*" AND ' + \
           'template_metadata.version="%s"' % versionName
  aiplatform.PipelineJob.list(filter=filter)

  # To filter all runs created from a specific version tag
  filter = 'template_uri="https://us-central1-kfp.pkg.dev/PROJECT_ID/quickstart-kfp-repo/hello-world/latest"'
  aiplatform.PipelineJob.list(filter=filter)

  # To filter all runs created from a package
  filter = 'template_uri:"https://us-central1-kfp.pkg.dev/PROJECT_ID/quickstart-kfp-repo/hello-world/*"'
  aiplatform.PipelineJob.list(filter=filter)

  # To filter all runs created from a repo
  filter = 'template_uri:"https://us-central1-kfp.pkg.dev/PROJECT_ID/quickstart-kfp-repo/*"'
  aiplatform.PipelineJob.list(filter=filter)
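
The filter strings above can also be composed programmatically; note that the examples pair : (a partial match) with wildcard URIs and = with exact values. A minimal sketch, using a placeholder digest for versionName:

```python
# Placeholder values; in practice versionName comes from upload_pipeline().
base = "https://us-central1-kfp.pkg.dev/PROJECT_ID/quickstart-kfp-repo"
version_name = "sha256:abcdef123456"

# Runs created from a specific version of the hello-world package.
version_filter = (f'template_uri:"{base}/hello-world/*" AND '
                  f'template_metadata.version="{version_name}"')

# Runs created from any package in the repository.
repo_filter = f'template_uri:"{base}/*"'
```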

Use the template in a Kubeflow Pipelines SDK registry client

You can use a Kubeflow Pipelines SDK registry client together with Artifact Registry to download and use your pipeline template.

  • To list the resources in the repository, run the following commands:

    templatePackages = client.list_packages()
    templatePackage = client.get_package(package_name = "hello-world")
    
    versions = client.list_versions(package_name="hello-world")
    version = client.get_version(package_name="hello-world", version=versionName)
    
    tags = client.list_tags(package_name = "hello-world")
    tag = client.get_tag(package_name = "hello-world", tag="latest")
    

    For the complete list of available methods and their documentation, see the proto files in the Artifact Registry GitHub repository.

  • To download the template to your local file system, run the following commands:

    # Sample 1
    filename = client.download_pipeline(
      package_name = "hello-world",
      version = versionName)
    # Sample 2
    filename = client.download_pipeline(
      package_name = "hello-world",
      tag = "v1")
    # Sample 3
    filename = client.download_pipeline(
      package_name = "hello-world",
      tag = "v1",
      file_name = "hello-world-template.yaml")
    

Use the Artifact Registry REST API

The following sections summarize how to use the Artifact Registry REST API to manage your pipeline templates in your Artifact Registry repository.

Upload a pipeline template using the Artifact Registry REST API

You can upload a pipeline template by creating an HTTP request using the parameter values described in this section, where:

  • PROJECT_ID is the Google Cloud project that this pipeline runs in.
  • REPO_ID is the ID of your Artifact Registry repository.

Example cURL request

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -F tags=v1,latest \
    -F content=@pipeline_spec.yaml \
    https://us-central1-kfp.pkg.dev/PROJECT_ID/REPO_ID

Constructing the upload request

The request is an HTTP or HTTPS multipart request. It must include the authentication token in the request header. For more information, see gcloud auth print-access-token.

The payload of the request is the contents of the pipeline_spec.yaml file (or .zip package). The recommended size limit is 10 MiB.

The package name is taken from the pipeline_spec.pipeline_info.name entry in the pipeline_spec.yaml file. The package name uniquely identifies the package and is immutable across versions. It can be between 4 and 128 characters long and must match the following regular expression: ^[a-z0-9][a-z0-9-]{3,127}$.

The package tags are a list of up to eight comma-separated tags. Each tag must match the following regular expression: ^[a-zA-Z0-9\-._~:@+]{1,128}$.
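
The name and tag constraints above can be checked client-side before attempting an upload. A minimal sketch using the regular expressions from this section (the helper itself is hypothetical):

```python
import re

# Regexes and limits as documented for Artifact Registry KFP uploads.
NAME_RE = re.compile(r"^[a-z0-9][a-z0-9-]{3,127}$")
TAG_RE = re.compile(r"^[a-zA-Z0-9\-._~:@+]{1,128}$")
MAX_TAGS = 8

def validate_upload(package_name: str, tags: list[str]) -> None:
    """Raise ValueError if the package name or tags are invalid."""
    if not NAME_RE.fullmatch(package_name):
        raise ValueError(f"invalid package name: {package_name!r}")
    if len(tags) > MAX_TAGS:
        raise ValueError(f"too many tags: {len(tags)} > {MAX_TAGS}")
    for tag in tags:
        if not TAG_RE.fullmatch(tag):
            raise ValueError(f"invalid tag: {tag!r}")

validate_upload("hello-world", ["v1", "latest"])  # valid: no exception
```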

If a tag exists and points to a pipeline that's already been uploaded, the tag is updated to point to the pipeline that you're currently uploading. For example, if the latest tag points to a pipeline you've already uploaded, and you upload a new version with --tag=latest, the latest tag is removed from the previously uploaded pipeline and assigned to the new pipeline you're uploading.

If the pipeline you're uploading is identical to a pipeline you've previously uploaded, the upload succeeds. The uploaded pipeline's metadata, including its version tags, is updated to match the parameter values of your upload request.

Upload response

If the upload request succeeds, it returns an HTTP OK status. The body of the response is as follows:

{packageName}/{versionName=sha256:abcdef123456...}

where versionName is the sha256 digest of pipeline_spec.yaml formatted as a hex string.
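
Because the version name is the sha256 digest of pipeline_spec.yaml, it can be computed locally and compared against the response. A sketch with stand-in spec content:

```python
import hashlib

# Stand-in spec content; in practice, read the compiled YAML file's bytes.
spec_bytes = b"pipelineInfo:\n  name: hello-world\n"
expected_version = "sha256:" + hashlib.sha256(spec_bytes).hexdigest()

# Parse a response body of the form "{packageName}/{versionName}".
response_body = f"hello-world/{expected_version}"
package_name, version_name = response_body.split("/", 1)

assert package_name == "hello-world"
assert version_name == expected_version
```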

Download a pipeline template using the Artifact Registry REST API

You can download a pipeline template by creating an HTTP request using the parameter values described in this section, where:

  • PROJECT_ID is the Google Cloud project that this pipeline runs in.
  • REPO_ID is the ID of your Artifact Registry repository.
  • PACKAGE_ID is the package ID of your uploaded template.
  • TAG is the version tag.
  • VERSION is the template version in the format of sha256:abcdef123456....

For a standard Artifact Registry download, form the download URL as follows:

url = https://us-central1-kfp.pkg.dev/PROJECT_ID/REPO_ID/PACKAGE_ID/VERSION
url = https://us-central1-kfp.pkg.dev/PROJECT_ID/REPO_ID/PACKAGE_ID/TAG

Example cURL requests

curl -X GET \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    https://us-central1-kfp.pkg.dev/PROJECT_ID/REPO_ID/PACKAGE_ID/VERSION

You can replace VERSION with TAG and download the same template, as shown in the following example:

curl -X GET \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    https://us-central1-kfp.pkg.dev/PROJECT_ID/REPO_ID/PACKAGE_ID/TAG

Download response

If the download request succeeds, it returns an HTTP OK status. The body of the response is the contents of the pipeline_spec.yaml file.