Migrate from Kubeflow Pipelines to Vertex AI Pipelines

If you have experience building Kubeflow pipelines, it is important to understand the following differences between Vertex AI Pipelines and Kubeflow Pipelines.

Data passing (inputs/outputs)

  • Data passing via inputs and outputs differs between Kubeflow Pipelines SDK v1 and Kubeflow Pipelines SDK v2. Kubeflow Pipelines SDK v2 separates parameters from artifacts, and one can't be passed where the other is expected. For more detailed information, see Kubeflow Pipelines Pipelines Basics and Kubeflow Pipelines Data Types.

Domain-specific language (DSL) version usage

  • Vertex AI Pipelines can run pipelines that were built using TFX v0.30.0 or later, or the Kubeflow Pipelines SDK v2 domain-specific language (DSL).

    The Kubeflow Pipelines SDK v2 DSL is available in Kubeflow Pipelines SDK v1.6 or later.

    Kubeflow Pipelines can run pipelines that were built using the Kubeflow Pipelines SDK. Kubeflow Pipelines v1.6 or later can also run pipelines built using the Kubeflow Pipelines SDK v2 DSL.

Storage

  • Kubeflow Pipelines and Vertex AI Pipelines handle storage differently. In Kubeflow Pipelines, you can use Kubernetes resources such as persistent volume claims. In Vertex AI Pipelines, your data is stored in Cloud Storage and mounted into your components using Cloud Storage FUSE.

    In Vertex AI Pipelines, you can use Google Cloud services to make resources available. For example, you can use Cloud Storage FUSE to access a Cloud Storage bucket as a mounted volume in a pipeline step. If your Cloud Storage URI is gs://example-bucket/example-pipeline, then your pipeline component's container can access that URI at the following path: /gcs/example-bucket/example-pipeline.

  • When you run a pipeline using Vertex AI Pipelines, you must specify the pipeline root, either in the @pipeline decorator or when you create the pipeline run.

    In Kubeflow Pipelines, specifying the pipeline root is optional. The artifacts of a pipeline run are stored using MinIO by default.
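
The Cloud Storage FUSE path mapping described above (gs://BUCKET/OBJECT becomes /gcs/BUCKET/OBJECT) can be sketched as a small helper. This is a minimal illustration, not part of any SDK: the function name gcs_uri_to_fuse_path is hypothetical, and the sketch assumes the bucket is mounted under /gcs as in the example above.

```python
GCS_SCHEME = "gs://"   # prefix of a Cloud Storage URI
FUSE_ROOT = "/gcs/"    # mount point assumed from the example path above


def gcs_uri_to_fuse_path(uri: str) -> str:
    """Translate gs://BUCKET/OBJECT into /gcs/BUCKET/OBJECT.

    Hypothetical helper for illustration only; it performs a plain
    prefix substitution and does not check that the bucket exists.
    """
    if not uri.startswith(GCS_SCHEME) or len(uri) == len(GCS_SCHEME):
        raise ValueError(f"not a Cloud Storage URI: {uri!r}")
    return FUSE_ROOT + uri[len(GCS_SCHEME):]


if __name__ == "__main__":
    print(gcs_uri_to_fuse_path("gs://example-bucket/example-pipeline"))
    # prints /gcs/example-bucket/example-pipeline
```

Inside a pipeline component, a path produced this way could then be opened with ordinary file operations, because Cloud Storage FUSE presents the bucket as a mounted volume.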

Features not supported in Vertex AI Pipelines

  • The following Kubeflow Pipelines features are not currently supported in Vertex AI Pipelines.

    • Cache Expiration: In Kubeflow Pipelines, you can specify that cached component executions expire after a specified amount of time using the Kubeflow Pipelines SDK v1 DSL.

      Currently, you cannot specify that component executions expire after a specified amount of time using the Kubeflow Pipelines SDK v2 DSL.

      In Vertex AI Pipelines, when you run a pipeline using create_run_from_job_spec, you can set the enable_caching argument to False so that the pipeline run does not use caching.

    • Recursion: In Kubeflow Pipelines, you can specify pipeline components that are called recursively.

      Currently, Vertex AI Pipelines does not support pipeline components that are called recursively.