Run a pipeline

Vertex AI Pipelines lets you run machine learning (ML) pipelines that were built using the Kubeflow Pipelines SDK or TensorFlow Extended in a serverless manner. This document describes how to run an ML pipeline and how to schedule a recurring pipeline run.

If you have not yet built an ML pipeline, refer to Build a pipeline.

Before you begin

Before you run a pipeline with Vertex AI Pipelines, use the following instructions to set up your Google Cloud project and development environment.

  1. To get your Cloud project ready to run ML pipelines, follow the instructions in the guide to configuring your Cloud project.

  2. To author a pipeline using Python, use one of the SDKs that Vertex AI Pipelines supports: the Kubeflow Pipelines SDK or TensorFlow Extended (TFX).

  3. To run a pipeline using the Vertex AI SDK for Python, install the Vertex AI SDK for Python.
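    For example, you can install the SDK from PyPI with pip (the package is published as google-cloud-aiplatform):

    ```shell
    # Install or upgrade the Vertex AI SDK for Python
    pip install --upgrade google-cloud-aiplatform
    ```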

Create a pipeline run

Use the following instructions to run an ML pipeline using Google Cloud Console or Python.

Console

Use the following instructions to run an ML pipeline using Cloud Console.

  1. In the Cloud Console, in the Vertex AI section, go to the Pipelines page.

    Go to Pipelines

  2. In the Region drop-down list, select the region that you want to create a pipeline run in.

  3. Click Create run to open the Create pipeline run pane.

  4. Specify the following Run details.

    • In the File field, click Choose to open the file selector. Navigate to the compiled pipeline JSON file that you want to run, select the pipeline, and click Open.

    • The Pipeline name defaults to the name that you specified in the pipeline definition. Optionally, specify a different Pipeline name.

    • Specify a Run name to uniquely identify this pipeline run.

  5. To specify that this pipeline run uses a custom service account, a customer-managed encryption key, or a peered VPC network, click Advanced options.

    Use the following instructions to configure advanced options such as a custom service account.

    • To specify a service account, select a service account from the Service account drop-down list.

      If you do not specify a service account, Vertex AI Pipelines runs your pipeline using the default Compute Engine service account.

      Learn more about configuring a service account for use with Vertex AI Pipelines.

    • To use a customer-managed encryption key (CMEK), select Use a customer-managed encryption key. The Select a customer-managed key drop-down list appears. In the Select a customer-managed key drop-down list, select the key that you want to use.

    • To use a peered VPC network in this pipeline run, enter the VPC network name in the Peered VPC network box.

  6. Click Continue.

    The pipeline run parameters pane appears.

  7. If your pipeline has parameters, specify your pipeline run parameters.

  8. Click Submit to create your pipeline run.

Vertex AI SDK for Python

Use the following instructions to run an ML pipeline using the Vertex AI SDK for Python. Before you run the following code sample, you must set up authentication.

To run a Vertex AI PipelineJob, create a PipelineJob object and then invoke its submit method.

from google.cloud import aiplatform

job = aiplatform.PipelineJob(display_name = DISPLAY_NAME,
                             template_path = COMPILED_PIPELINE_PATH,
                             job_id = JOB_ID,
                             pipeline_root = PIPELINE_ROOT_PATH,
                             parameter_values = PIPELINE_PARAMETERS,
                             enable_caching = ENABLE_CACHING,
                             encryption_spec_key_name = CMEK,
                             labels = LABELS,
                             credentials = CREDENTIALS,
                             project = PROJECT_ID,
                             location = LOCATION)

job.submit(service_account=SERVICE_ACCOUNT,
           network=NETWORK)

Replace the following:

  • DISPLAY_NAME: The name of the pipeline. This name appears in the Google Cloud console.
  • COMPILED_PIPELINE_PATH: The path to your compiled pipeline JSON file. It can be a local path or a Google Cloud Storage URI.
  • JOB_ID: (optional) A unique identifier for this pipeline run. If the job ID is not specified, Vertex AI Pipelines creates a job ID for you using the pipeline name and the timestamp of when the pipeline run was started.
  • PIPELINE_ROOT_PATH: (optional) To override the pipeline root path specified in the pipeline definition, specify a path that your pipeline job can access, such as a Cloud Storage bucket URI.
  • PIPELINE_PARAMETERS: (optional) The pipeline parameters to pass to this run. For example, create a dict() with the parameter names as the dictionary keys and the parameter values as the dictionary values.
  • ENABLE_CACHING: (optional) Specifies whether this pipeline run uses execution caching. Execution caching reduces costs by skipping pipeline steps where the output is already known for the current set of inputs. If enable_caching is not specified, execution caching is used for this pipeline run. Learn more about execution caching.
  • CMEK: (optional) The name of the customer-managed encryption key that you want to use for this pipeline run.
  • LABELS: (optional) The user-defined labels to organize this PipelineJob.
  • CREDENTIALS: (optional) Custom credentials to use to create this PipelineJob. Overrides credentials set in aiplatform.init.
  • PROJECT_ID: The project that you want to run the pipeline in.
  • LOCATION: The region that you want to run the pipeline in. For more information about the regions that Vertex AI Pipelines is available in, see the Vertex AI locations guide. If this variable is not set, the default location set in aiplatform.init is used.
  • SERVICE_ACCOUNT: (optional) The name of the service account to use for this pipeline run. If you do not specify a service account, Vertex AI Pipelines runs your pipeline using the default Compute Engine service account.
  • NETWORK: (optional) The name of the peered VPC network to use for this pipeline run.
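PIPELINE_PARAMETERS and LABELS in the sample above are plain Python dictionaries. The following is a minimal sketch of how you might define them; the parameter names (learning_rate, epochs) and label keys are hypothetical, and must match what your compiled pipeline actually declares:

```python
# Hypothetical values for illustration; replace them with the parameters
# that your compiled pipeline definition actually declares.
PIPELINE_PARAMETERS = {
    "learning_rate": 0.01,  # key must match a parameter name in the pipeline
    "epochs": 10,
}

# Labels are free-form key-value pairs used to organize pipeline jobs.
LABELS = {
    "team": "ml-platform",
    "env": "dev",
}
```

You would then pass these dictionaries as the parameter_values and labels arguments when constructing the PipelineJob.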