Quickstart: Google Cloud Pipeline Components

This quickstart shows you how to install the Google Cloud Pipeline Components (GCPC) SDK and use one of its prebuilt components in a pipeline.

Install latest release

Use the following command to install the Google Cloud Pipeline Components SDK from the Python Package Index (PyPI):

pip install --upgrade google-cloud-pipeline-components
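
To confirm that the installation succeeded, you can check which version of the package is visible to your Python environment. The following is a minimal sketch that uses only the standard library; the helper name installed_version is hypothetical and is not part of the GCPC SDK.

```python
from importlib import metadata
from typing import Optional


def installed_version(package_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed version string, or None if the package is not installed.
print(installed_version("google-cloud-pipeline-components"))
```

If the output is None, the package is not installed in the interpreter that ran the check, which usually means pip installed it into a different environment.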

Use a prebuilt component via the GCPC SDK

After you install the Google Cloud Pipeline Components SDK, you can use it to import a prebuilt component.

For reference information about the supported components, see the google_cloud_pipeline_components SDK documentation.

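For example, you can use the following code to import the Dataflow component and use it in a pipeline. The uppercase names (PIPELINE_NAME, PROJECT_ID, LOCATION, PIPELINE_ROOT, and OUTPUT_FILE) are placeholders; the values shown for them are illustrative only, so replace them with your own.

from google_cloud_pipeline_components.v1.dataflow import DataflowPythonJobOp
from kfp import dsl

# Placeholder values (illustrative only): replace these with your own settings.
PIPELINE_NAME = 'dataflow-launch-python-pipeline'
PROJECT_ID = 'your-project-id'
LOCATION = 'us-central1'
PIPELINE_ROOT = 'gs://your-bucket/pipeline-root'
OUTPUT_FILE = 'gs://your-bucket/wc/wordcount.out'

@dsl.pipeline(
    name=PIPELINE_NAME,
    description='Dataflow launch python pipeline',
)
def pipeline(
    python_file_path: str = 'gs://ml-pipeline-playground/samples/dataflow/wc/wc.py',
    project_id: str = PROJECT_ID,
    location: str = LOCATION,
    staging_dir: str = PIPELINE_ROOT,
    requirements_file_path: str = 'gs://ml-pipeline-playground/samples/dataflow/wc/requirements.txt',
):
    # Launch the Python file as a Dataflow job, using staging_dir for temporary files.
    dataflow_python_op = DataflowPythonJobOp(
        project=project_id,
        location=location,
        python_module_path=python_file_path,
        temp_location=staging_dir,
        requirements_file_path=requirements_file_path,
        args=['--output', OUTPUT_FILE],
    )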
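
A pipeline function like the one above is typically compiled into a pipeline definition file before it is submitted for execution. The following is a minimal sketch that assumes the KFP v2 SDK is installed; the helper name compile_pipeline and the output file name dataflow_pipeline.yaml are hypothetical choices for illustration.

```python
def compile_pipeline(pipeline_func, package_path: str = "dataflow_pipeline.yaml") -> str:
    """Compile a KFP pipeline function to a pipeline definition file.

    Assumes the KFP v2 SDK is installed. The import is done inside the
    function so that this helper can be defined even where kfp is absent.
    """
    from kfp import compiler

    compiler.Compiler().compile(
        pipeline_func=pipeline_func,
        package_path=package_path,
    )
    return package_path


# Example usage with the pipeline function defined above:
# compile_pipeline(pipeline)
```

The compiled file can then be passed to whichever service runs your pipelines.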

What's next