You can run your Python component on Vertex AI Pipelines using the Google Cloud-specific machine resources offered by Vertex AI custom training.
You can use the create_custom_training_job_from_component method from the Google Cloud Pipeline Components SDK to transform a Python component into a Vertex AI custom training job. Learn how to create a custom job.
Create a custom training job from a component using Vertex AI Pipelines
The following sample shows how to use the create_custom_training_job_from_component method to transform a Python component into a custom training job with user-defined Google Cloud machine resources, and then run the compiled pipeline on Vertex AI Pipelines:
import kfp
from kfp.v2 import dsl
from kfp.v2.dsl import component
from google_cloud_pipeline_components.v1.custom_job import create_custom_training_job_from_component

# Create a Python component
@component
def my_python_component():
    import time
    time.sleep(1)

# Convert the above component into a custom training job
custom_training_job = create_custom_training_job_from_component(
    my_python_component,
    display_name = 'DISPLAY_NAME',
    machine_type = 'MACHINE_TYPE',
    accelerator_type='ACCELERATOR_TYPE',
    accelerator_count='ACCELERATOR_COUNT'
)

# Define a pipeline that runs the custom training job
@dsl.pipeline(
    name="resource-spec-request",
    description="A simple pipeline that requests Google Cloud machine resources",
    pipeline_root='PIPELINE_ROOT',
)
def pipeline():
    training_job_task = custom_training_job(
        project='PROJECT_ID',
        location='LOCATION',
    ).set_display_name('training-job-task')
Replace the following:
DISPLAY_NAME: The name of the custom job.
MACHINE_TYPE: The type of the machine for running the custom job, for example, e2-standard-4. For more information about machine types, see Machine types.
ACCELERATOR_TYPE: The type of accelerator attached to the machine. For more information about accelerator types, see Accelerator types.
ACCELERATOR_COUNT: The number of accelerators attached to the machine running the custom job.
PIPELINE_ROOT: Specify a Cloud Storage URI that your pipelines service account can access. The artifacts of your pipeline runs are stored within the pipeline root.
PROJECT_ID: The Google Cloud project that this pipeline runs in.
LOCATION: The location or region that this pipeline runs in.
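The sample above defines the pipeline but does not compile or submit it. The following is a minimal sketch of those remaining steps using the KFP compiler and the Vertex AI SDK for Python; the package path custom_job_pipeline.json and the job display name are illustrative placeholders, and you can replace submit() with run() if you want the call to block until the run completes.

from kfp.v2 import compiler
from google.cloud import aiplatform

# Compile the pipeline defined above into a pipeline spec file.
compiler.Compiler().compile(
    pipeline_func=pipeline,
    package_path='custom_job_pipeline.json',  # illustrative file name
)

# Submit the compiled pipeline to Vertex AI Pipelines.
aiplatform.init(project='PROJECT_ID', location='LOCATION')

pipeline_job = aiplatform.PipelineJob(
    display_name='resource-spec-request',  # illustrative display name
    template_path='custom_job_pipeline.json',
    pipeline_root='PIPELINE_ROOT',
)
pipeline_job.submit()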