You can run your Python component on Vertex AI Pipelines by using Google Cloud-specific machine resources offered by Vertex AI custom training.
You can use the create_custom_training_job_from_component method from the Google Cloud Pipeline Components SDK to transform a Python component into a Vertex AI custom training job. Learn how to create a custom job.
Create a custom training job from a component using Vertex AI Pipelines
The following sample shows how to use the create_custom_training_job_from_component method to transform a Python component into a custom training job with user-defined Google Cloud machine resources, and then run the compiled pipeline on Vertex AI Pipelines:
```python
import kfp
from kfp.v2 import dsl
from kfp.v2.dsl import component
from google_cloud_pipeline_components.v1.custom_job import create_custom_training_job_from_component

# Create a Python component
@component
def my_python_component():
    import time
    time.sleep(1)

# Convert the above component into a custom training job
custom_training_job = create_custom_training_job_from_component(
    my_python_component,
    display_name='DISPLAY_NAME',
    machine_type='MACHINE_TYPE',
    accelerator_type='ACCELERATOR_TYPE',
    accelerator_count='ACCELERATOR_COUNT',
)

# Define a pipeline that runs the custom training job
@dsl.pipeline(
    name="resource-spec-request",
    description="A simple pipeline that requests a Google Cloud machine resource",
    pipeline_root='PIPELINE_ROOT',
)
def pipeline():
    training_job_task = custom_training_job(
        project='PROJECT_ID',
        location='LOCATION',
    ).set_display_name('training-job-task')
```
Replace the following:
DISPLAY_NAME: The name of the custom job.
MACHINE_TYPE: The type of machine for running the custom job, for example e2-standard-4. For more information about machine types, see Machine types. If you specified a TPU as the accelerator_type, set this to cloud-tpu.
ACCELERATOR_TYPE: The type of accelerator attached to the machine. For more information about the available GPUs and how to configure them, see GPUs. For more information about the available TPU types and how to configure them, see TPUs.
ACCELERATOR_COUNT: The number of accelerators attached to the machine running the custom job.
PIPELINE_ROOT: Specify a Cloud Storage URI that your pipeline's service account can access. The artifacts of your pipeline runs are stored within the pipeline root.
PROJECT_ID: The Google Cloud project that this pipeline runs in.
LOCATION: The location or region that this pipeline runs in.
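As an illustration of how the placeholders might be filled in for a GPU-backed job, the call in the sample could look like the following. The machine and accelerator values here are hypothetical examples, not recommendations; confirm that the accelerator type you choose is compatible with the machine type and available in your region:

```python
# Hypothetical example values for the placeholders in the sample above.
custom_training_job = create_custom_training_job_from_component(
    my_python_component,
    display_name='resource-spec-request-job',  # DISPLAY_NAME
    machine_type='n1-standard-8',              # MACHINE_TYPE
    accelerator_type='NVIDIA_TESLA_T4',        # ACCELERATOR_TYPE
    accelerator_count='1',                     # ACCELERATOR_COUNT
)
```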
For a complete list of arguments supported by the create_custom_training_job_from_component method, see the Google Cloud Pipeline Components SDK Reference.