Request Google Cloud machine resources with Vertex AI Pipelines

You can run your Python component on Vertex AI Pipelines by using Google Cloud-specific machine resources offered by Vertex AI custom training.

You can use the create_custom_training_job_from_component method from the Google Cloud Pipeline Components to transform a Python component into a Vertex AI custom training job. Learn how to create a custom job.

Create a custom training job from a component using Vertex AI Pipelines

The following sample shows how to use the create_custom_training_job_from_component method to transform a Python component into a custom training job with user-defined Google Cloud machine resources, and then run the compiled pipeline on Vertex AI Pipelines:


import kfp
from kfp import dsl
from google_cloud_pipeline_components.v1.custom_job import create_custom_training_job_from_component

# Create a Python component
@dsl.component
def my_python_component():
  import time
  time.sleep(1)

# Convert the above component into a custom training job
custom_training_job = create_custom_training_job_from_component(
    my_python_component,
    display_name='DISPLAY_NAME',
    machine_type='MACHINE_TYPE',
    accelerator_type='ACCELERATOR_TYPE',
    accelerator_count='ACCELERATOR_COUNT',
    boot_disk_type='BOOT_DISK_TYPE',
    boot_disk_size_gb='BOOT_DISK_SIZE',
    network='NETWORK',
    reserved_ip_ranges='RESERVED_IP_RANGES',
    nfs_mounts='NFS_MOUNTS'
)

# Define a pipeline that runs the custom training job
@dsl.pipeline(
  name="resource-spec-request",
  description="A simple pipeline that requests a Google Cloud machine resource",
  pipeline_root='PIPELINE_ROOT',
)
def pipeline():
  training_job_task = custom_training_job(
      project='PROJECT_ID',
      location='LOCATION',
  ).set_display_name('training-job-task')
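
The sample above defines the pipeline but stops short of compiling it and running it on Vertex AI Pipelines. A minimal sketch of those two steps, assuming the kfp and google-cloud-aiplatform packages are installed (the file name pipeline.yaml is illustrative):

```python
from kfp import compiler
from google.cloud import aiplatform

# Compile the pipeline defined above into a pipeline spec file.
compiler.Compiler().compile(
    pipeline_func=pipeline,
    package_path='pipeline.yaml',
)

# Submit the compiled pipeline to Vertex AI Pipelines.
aiplatform.init(project='PROJECT_ID', location='LOCATION')
job = aiplatform.PipelineJob(
    display_name='resource-spec-request',
    template_path='pipeline.yaml',
    pipeline_root='PIPELINE_ROOT',
)
job.submit()
```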

Replace the following:

  • DISPLAY_NAME: The name of the custom job. If you don't specify a name, the component name is used by default.

  • MACHINE_TYPE: The type of the machine for running the custom job—for example, e2-standard-4. For more information about machine types, see Machine types. If you specified a TPU as the accelerator_type, set this to cloud-tpu. For more information, see the machine_type parameter reference.

  • ACCELERATOR_TYPE: The type of accelerator attached to the machine. For more information about the available GPUs and how to configure them, see GPUs. For more information about the available TPU types and how to configure them, see TPUs. For more information, see the accelerator_type parameter reference.

  • ACCELERATOR_COUNT: The number of accelerators attached to the machine running the custom job. If you specify the accelerator type, the accelerator count defaults to 1.

  • BOOT_DISK_TYPE: The type of boot disk. For more information, see the boot_disk_type parameter reference.

  • BOOT_DISK_SIZE: The size of the boot disk in GB. For more information, see the boot_disk_size_gb parameter reference.

  • NETWORK: If the custom job is peered to a Compute Engine network that has private services access configured, specify the full name of the network. For more information, see the network parameter reference.

  • RESERVED_IP_RANGES: A list of names for the reserved IP ranges under the VPC network used to deploy the custom job. For more information, see the reserved_ip_ranges parameter reference.

  • NFS_MOUNTS: A list of NFS mount resources in JSON dict format. For more information, see the nfs_mounts parameter reference.

  • PIPELINE_ROOT: Specify a Cloud Storage URI that your pipelines service account can access. The artifacts of your pipeline runs are stored within the pipeline root.

  • PROJECT_ID: The Google Cloud project that this pipeline runs in.

  • LOCATION: The location or region that this pipeline runs in.
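
The NFS_MOUNTS placeholder above expects a list of NFS mount resources in JSON dict format. A hypothetical sketch of building one, where the field names (server, path, mountPoint) follow the Vertex AI NfsMount message; check the nfs_mounts parameter reference for the exact shape expected:

```python
import json

# Hypothetical NFS mount spec for the nfs_mounts parameter (a sketch;
# the server address and paths below are illustrative only).
nfs_mounts = [
    {
        "server": "10.0.0.2",        # IP address of the NFS server
        "path": "/exported/share",   # directory exported by the server
        "mountPoint": "data",        # destination mount path inside the job
    }
]

# Serialize to JSON to confirm the structure is valid.
print(json.dumps(nfs_mounts))
```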

API Reference

For a complete list of arguments supported by the create_custom_training_job_from_component method, see the Google Cloud Pipeline Components SDK Reference.