Configure a pipeline run on a persistent resource

A Vertex AI persistent resource is a long-running cluster that you can use to run custom training jobs and pipeline runs. By using a persistent resource for a pipeline run, you can help ensure compute resource availability and reduce pipeline task startup time. Persistent resources support all VMs and GPUs that are supported by custom training jobs. To learn more about persistent resources, see Overview of persistent resources.

This page shows you how to do the following:

Create a persistent resource
Create a pipeline run using the persistent resource

Before you begin

Before you can create a pipeline run with a persistent resource, you must first complete the following prerequisites.

Define and compile a pipeline

Define your pipeline and then compile the pipeline definition into a YAML file. For more information about defining and compiling a pipeline, see Build a pipeline.

Required IAM roles

To get the permission that you need to create a persistent resource, ask your administrator to grant you the Vertex AI Administrator (roles/aiplatform.admin) IAM role on your project. For more information about granting roles, see Manage access to projects, folders, and organizations.

This predefined role contains the aiplatform.persistentResources.create permission, which is required to create a persistent resource.

You might also be able to get this permission with custom roles or other predefined roles.

Create a persistent resource

Use the following samples to create a persistent resource that you can associate with a pipeline run. For more information about creating persistent resources, see Create a persistent resource.

gcloud

To create a persistent resource that you can associate with a pipeline run, use the gcloud ai persistent-resources create command along with the --enable-custom-service-account flag.

A persistent resource can have one or more resource pools. To create multiple resource pools in a persistent resource, specify multiple --resource-pool-spec flags.

You can specify all resource pool configurations on the command line, or use the --config flag to specify the path to a YAML file that contains the configurations.

Before using any of the command data below, make the following replacements:

PROJECT_ID: The project ID of the Google Cloud project where you want to create the persistent resource.
LOCATION: The region where you want to create the persistent resource. For a list of supported regions, see Feature availability.
PERSISTENT_RESOURCE_ID: A unique, user-defined ID for the persistent resource. It must start with a letter, end with a letter or number, and contain only lowercase letters, numbers, and hyphens (-).
DISPLAY_NAME: Optional. The display name of the persistent resource.
MACHINE_TYPE: The type of virtual machine (VM) to use. For a list of supported VMs, see Machine types. This field corresponds to the machineSpec.machineType field in the ResourcePool API message.
REPLICA_COUNT: Optional. The number of replicas to create for the resource pool, if you don't want to use autoscaling. This field corresponds to the replicaCount field in the ResourcePool API message. You must specify the replica count if you don't specify the MIN_REPLICA_COUNT and MAX_REPLICA_COUNT fields.
MIN_REPLICA_COUNT: Optional. The minimum number of replicas if you're using autoscaling for the resource pool. You must specify both MIN_REPLICA_COUNT and MAX_REPLICA_COUNT to use autoscaling.
MAX_REPLICA_COUNT: Optional. The maximum number of replicas if you're using autoscaling for the resource pool. You must specify both MIN_REPLICA_COUNT and MAX_REPLICA_COUNT to use autoscaling.
CONFIG: Path to the persistent resource YAML configuration file, containing a list of ResourcePool specs. If an option is specified in both the configuration file and the command-line arguments, the command-line arguments override the configuration file. Note that keys with underscores are considered invalid.
Example YAML configuration file:
```
resourcePoolSpecs:
- machineSpec:
    machineType: n1-standard-4
  replicaCount: 1
```
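
The note about underscore keys can be checked before you run the command. The following is a minimal sketch, not part of the gcloud CLI: find_underscore_keys is a hypothetical helper that walks an already-parsed configuration (for example, the result of yaml.safe_load) and reports the path of every key that contains an underscore. The sample configuration is made up for illustration.

```python
def find_underscore_keys(node, path=""):
    """Return the paths of mapping keys that contain an underscore.

    Hypothetical helper for illustration; it is not part of the gcloud CLI.
    """
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            key_path = f"{path}.{key}" if path else str(key)
            if "_" in str(key):
                found.append(key_path)
            # Keep walking so that nested keys are reported too.
            found.extend(find_underscore_keys(value, key_path))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            found.extend(find_underscore_keys(item, f"{path}[{index}]"))
    return found


# Made-up configuration that mixes camelCase and snake_case keys.
config = {
    "resourcePoolSpecs": [
        {"machineSpec": {"machine_type": "n1-standard-4"}, "replica_count": 1}
    ]
}
print(find_underscore_keys(config))
```

An empty list means that no key in the configuration contains an underscore.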

Execute the following command:

Linux, macOS, or Cloud Shell

gcloud ai persistent-resources create \
    --persistent-resource-id=PERSISTENT_RESOURCE_ID \
    --display-name=DISPLAY_NAME \
    --project=PROJECT_ID \
    --region=LOCATION \
    --resource-pool-spec="replica-count=REPLICA_COUNT,machine-type=MACHINE_TYPE,min-replica-count=MIN_REPLICA_COUNT,max-replica-count=MAX_REPLICA_COUNT" \
    --enable-custom-service-account

Windows (PowerShell)

gcloud ai persistent-resources create `
    --persistent-resource-id=PERSISTENT_RESOURCE_ID `
    --display-name=DISPLAY_NAME `
    --project=PROJECT_ID `
    --region=LOCATION `
    --resource-pool-spec="replica-count=REPLICA_COUNT,machine-type=MACHINE_TYPE,min-replica-count=MIN_REPLICA_COUNT,max-replica-count=MAX_REPLICA_COUNT" `
    --enable-custom-service-account

Windows (cmd.exe)

gcloud ai persistent-resources create ^
    --persistent-resource-id=PERSISTENT_RESOURCE_ID ^
    --display-name=DISPLAY_NAME ^
    --project=PROJECT_ID ^
    --region=LOCATION ^
    --resource-pool-spec="replica-count=REPLICA_COUNT,machine-type=MACHINE_TYPE,min-replica-count=MIN_REPLICA_COUNT,max-replica-count=MAX_REPLICA_COUNT" ^
    --enable-custom-service-account

You should receive a response similar to the following:

Using endpoint [https://us-central1-aiplatform.googleapis.com/]
Operation to create PersistentResource [projects/PROJECT_NUMBER/locations/us-central1/persistentResources/mypersistentresource/operations/OPERATION_ID] is submitted successfully.

You can view the status of your PersistentResource create operation with the command

  $ gcloud ai operations describe projects/sample-project/locations/us-central1/operations/OPERATION_ID

Example gcloud command:

gcloud ai persistent-resources create \
    --persistent-resource-id=my-persistent-resource \
    --region=us-central1 \
    --resource-pool-spec="replica-count=4,machine-type=n1-standard-4" \
    --enable-custom-service-account
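
If you call the gcloud CLI from a script, you can assemble the same flags programmatically. The following is a sketch under stated assumptions: pool_spec and build_create_args are hypothetical helpers, they only build the argument list without running anything, and they use only the flags and --resource-pool-spec keys shown on this page.

```python
def pool_spec(machine_type, replica_count=None, min_replicas=None, max_replicas=None):
    """Format one --resource-pool-spec value (hypothetical helper)."""
    autoscaling = min_replicas is not None or max_replicas is not None
    if autoscaling and (min_replicas is None or max_replicas is None):
        raise ValueError("Autoscaling needs both min and max replica counts.")
    if not autoscaling and replica_count is None:
        raise ValueError("Set replica_count or both autoscaling bounds.")
    parts = [f"machine-type={machine_type}"]
    if replica_count is not None:
        parts.append(f"replica-count={replica_count}")
    if autoscaling:
        parts.append(f"min-replica-count={min_replicas}")
        parts.append(f"max-replica-count={max_replicas}")
    return ",".join(parts)


def build_create_args(resource_id, region, pool_specs, project=None):
    """Build the argument list for `gcloud ai persistent-resources create`."""
    args = [
        "gcloud", "ai", "persistent-resources", "create",
        f"--persistent-resource-id={resource_id}",
        f"--region={region}",
    ]
    if project:
        args.append(f"--project={project}")
    # One --resource-pool-spec flag per resource pool.
    args.extend(f"--resource-pool-spec={spec}" for spec in pool_specs)
    # Required so that the persistent resource can be used by pipeline runs.
    args.append("--enable-custom-service-account")
    return args


args = build_create_args(
    "my-persistent-resource",
    "us-central1",
    [pool_spec("n1-standard-4", replica_count=4)],
)
print(" ".join(args))
```

To run the command, you could pass the list to subprocess.run(args, check=True). Passing a list avoids shell quoting problems with the comma-separated resource pool value.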

Advanced `gcloud` configurations

If you want to specify configuration options that are not available in the preceding examples, you can use the --config flag to specify the path to a config.yaml file in your local environment that contains the fields of persistentResources. For example:

gcloud ai persistent-resources create \
    --persistent-resource-id=PERSISTENT_RESOURCE_ID \
    --project=PROJECT_ID \
    --region=LOCATION \
    --config=CONFIG \
    --enable-custom-service-account

Python

Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

To create a persistent resource that you can use with a pipeline run, set the enable_custom_service_account parameter to True in the ResourceRuntimeSpec object while creating the persistent resource.

from google.cloud.aiplatform.preview import persistent_resource
from google.cloud.aiplatform_v1beta1.types.persistent_resource import ResourcePool
from google.cloud.aiplatform_v1beta1.types.machine_resources import MachineSpec

my_example_resource = persistent_resource.PersistentResource.create(
    persistent_resource_id='PERSISTENT_RESOURCE_ID',
    display_name='DISPLAY_NAME',
    resource_pools=[
        ResourcePool(
            machine_spec=MachineSpec(
                machine_type='MACHINE_TYPE'
            ),
            replica_count=REPLICA_COUNT
        )
    ],
    enable_custom_service_account=True,
)

Replace the following:

PERSISTENT_RESOURCE_ID: A unique, user-defined ID for the persistent resource. The ID must contain only lowercase letters, numbers, and hyphens (-). The first character must be a lowercase letter, and the last character must be either a lowercase letter or a number.
DISPLAY_NAME: Optional. The display name of the persistent resource.
MACHINE_TYPE: The type of virtual machine (VM) to use. For a list of supported VMs, see Machine types. This field corresponds to the machineSpec.machineType field in the ResourcePool API message.
REPLICA_COUNT: The number of replicas to create when creating this resource pool.
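
The naming rule for PERSISTENT_RESOURCE_ID can be checked locally before you call the API. The following is a minimal sketch that encodes only the rule stated above; is_valid_persistent_resource_id is a hypothetical helper, and it doesn't check any length limit or uniqueness requirement that the service might enforce.

```python
import re

# Starts with a lowercase letter, ends with a lowercase letter or digit,
# and contains only lowercase letters, digits, and hyphens in between.
_ID_PATTERN = re.compile(r"[a-z]([a-z0-9-]*[a-z0-9])?")


def is_valid_persistent_resource_id(resource_id):
    """Return True if resource_id follows the naming rule described above."""
    return _ID_PATTERN.fullmatch(resource_id) is not None


for candidate in ("my-persistent-resource", "My-Resource", "resource-"):
    print(candidate, is_valid_persistent_resource_id(candidate))
```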

REST

To create a PersistentResource resource that you can associate with a pipeline run, send a POST request by using the persistentResources/create method with the enable_custom_service_account parameter set to true in the request body.

A persistent resource can have one or more resource pools. You can configure each resource pool to use either a fixed number of replicas or autoscaling.

Before using any of the request data, make the following replacements:

PROJECT_ID: The project ID of the Google Cloud project where you want to create the persistent resource.
LOCATION: The region where you want to create the persistent resource. For a list of supported regions, see Feature availability.
PERSISTENT_RESOURCE_ID: A unique, user-defined ID for the persistent resource. It must start with a letter, end with a letter or number, and contain only lowercase letters, numbers, and hyphens (-).
DISPLAY_NAME: Optional. The display name of the persistent resource.
MACHINE_TYPE: The type of virtual machine (VM) to use. For a list of supported VMs, see Machine types. This field corresponds to the machineSpec.machineType field in the ResourcePool API message.
REPLICA_COUNT: Optional. The number of replicas to create for the resource pool, if you don't want to use autoscaling. This field corresponds to the replicaCount field in the ResourcePool API message. You must specify the replica count if you don't specify the MIN_REPLICA_COUNT and MAX_REPLICA_COUNT fields.
MIN_REPLICA_COUNT: Optional. The minimum number of replicas if you're using autoscaling for the resource pool. You must specify both MIN_REPLICA_COUNT and MAX_REPLICA_COUNT to use autoscaling.
MAX_REPLICA_COUNT: Optional. The maximum number of replicas if you're using autoscaling for the resource pool. You must specify both MIN_REPLICA_COUNT and MAX_REPLICA_COUNT to use autoscaling.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/persistentResources?persistent_resource_id=PERSISTENT_RESOURCE_ID

Request JSON body:

{
  "display_name": "DISPLAY_NAME",
  "resource_pools": [
    {
      "machine_spec": {
        "machine_type": "MACHINE_TYPE"
      },
      "replica_count": REPLICA_COUNT,
      "autoscaling_spec": {
        "min_replica_count": MIN_REPLICA_COUNT,
        "max_replica_count": MAX_REPLICA_COUNT
      }
    }
  ],
  "resource_runtime_spec": {
    "service_account_spec": {
      "enable_custom_service_account": true
    }
  }
}
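
Because each resource pool uses either a fixed replica count or autoscaling, it can help to build the request body in code and omit the fields that you don't need. The following is a sketch under stated assumptions: build_request_body is a hypothetical helper, it handles a single resource pool, and it uses only the field names shown in the request body above.

```python
import json


def build_request_body(machine_type, display_name=None, replica_count=None,
                       min_replicas=None, max_replicas=None):
    """Build the JSON body for creating a persistent resource with one pool."""
    autoscaling = min_replicas is not None or max_replicas is not None
    if autoscaling and (min_replicas is None or max_replicas is None):
        raise ValueError("Autoscaling needs both min and max replica counts.")
    if not autoscaling and replica_count is None:
        raise ValueError("Set replica_count or both autoscaling bounds.")

    pool = {"machine_spec": {"machine_type": machine_type}}
    if replica_count is not None:
        pool["replica_count"] = replica_count
    if autoscaling:
        pool["autoscaling_spec"] = {
            "min_replica_count": min_replicas,
            "max_replica_count": max_replicas,
        }

    body = {
        "resource_pools": [pool],
        # Required so that the persistent resource can be used by pipeline runs.
        "resource_runtime_spec": {
            "service_account_spec": {"enable_custom_service_account": True}
        },
    }
    if display_name:
        body["display_name"] = display_name
    return body


print(json.dumps(build_request_body("n1-standard-4", replica_count=2), indent=2))
```

You can write the printed JSON to request.json and send it with any of the options that follow.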

To send your request, expand one of these options:

curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:

cat > request.json << 'EOF'
{
  "display_name": "DISPLAY_NAME",
  "resource_pools": [
    {
      "machine_spec": {
        "machine_type": "MACHINE_TYPE"
      },
      "replica_count": REPLICA_COUNT,
      "autoscaling_spec": {
        "min_replica_count": MIN_REPLICA_COUNT,
        "max_replica_count": MAX_REPLICA_COUNT
      }
    }
  ],
  "resource_runtime_spec": {
    "service_account_spec": {
      "enable_custom_service_account": true
    }
  }
}
EOF

Then execute the following command to send your REST request:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/persistentResources?persistent_resource_id=PERSISTENT_RESOURCE_ID"

PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:

@'
{
  "display_name": "DISPLAY_NAME",
  "resource_pools": [
    {
      "machine_spec": {
        "machine_type": "MACHINE_TYPE"
      },
      "replica_count": REPLICA_COUNT,
      "autoscaling_spec": {
        "min_replica_count": MIN_REPLICA_COUNT,
        "max_replica_count": MAX_REPLICA_COUNT
      }
    }
  ],
  "resource_runtime_spec": {
    "service_account_spec": {
      "enable_custom_service_account": true
    }
  }
}
'@  | Out-File -FilePath request.json -Encoding utf8

Then execute the following command to send your REST request:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/persistentResources?persistent_resource_id=PERSISTENT_RESOURCE_ID" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/persistentResources/mypersistentresource/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.CreatePersistentResourceOperationMetadata",
    "genericMetadata": {
      "createTime": "2023-02-08T21:17:15.009668Z",
      "updateTime": "2023-02-08T21:17:15.009668Z"
    }
  }
}
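
The name field of the response identifies the long-running create operation. The following is a minimal sketch for splitting that name into its parts, assuming that the name follows the pattern shown in the sample response above. parse_operation_name is a hypothetical helper, and the project number and operation ID in the sample are made-up values.

```python
import json
import re

# Pattern taken from the sample response on this page.
_OPERATION_NAME = re.compile(
    r"projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/persistentResources/(?P<persistent_resource>[^/]+)"
    r"/operations/(?P<operation>[^/]+)"
)


def parse_operation_name(response_text):
    """Extract the components of the operation name from a JSON response."""
    name = json.loads(response_text)["name"]
    match = _OPERATION_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"Unexpected operation name: {name}")
    return match.groupdict()


# Made-up sample values for illustration.
sample = json.dumps({
    "name": (
        "projects/123456789/locations/us-central1"
        "/persistentResources/mypersistentresource/operations/987654321"
    )
})
print(parse_operation_name(sample))
```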

Create a pipeline run using the persistent resource

To create a pipeline job, you must first create a pipeline spec. A pipeline spec is an in-memory object that you create by converting a compiled pipeline definition.

Create a pipeline spec

Follow these instructions to create an in-memory pipeline spec that you can use to create the pipeline run:

Define a pipeline and compile it into a YAML file. For more information about defining and compiling a pipeline, see Build a pipeline.

Use the following code sample to convert the compiled pipeline YAML file to an in-memory pipeline spec.

import yaml

# Load the compiled pipeline YAML file into an in-memory pipeline spec
with open("COMPILED_PIPELINE_PATH", "r") as stream:
    try:
        pipeline_spec = yaml.safe_load(stream)
        print(pipeline_spec)
    except yaml.YAMLError as exc:
        print(exc)

Replace COMPILED_PIPELINE_PATH with the local path to your compiled pipeline YAML file.
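
The sample above prints a parse error and continues, which leaves pipeline_spec undefined for the next step. The following variation is a sketch that raises instead, so a bad file stops the script early. load_pipeline_spec is a hypothetical helper. The JSON branch is an assumption that isn't covered on this page; it is included only so that the helper can be tried without PyYAML installed.

```python
import json
from pathlib import Path


def load_pipeline_spec(path):
    """Load a compiled pipeline file into a dict, raising on any problem."""
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    if path.suffix == ".json":
        # Assumption: the pipeline was compiled to JSON instead of YAML.
        spec = json.loads(text)
    else:
        # PyYAML is imported lazily so that the JSON branch has no dependency.
        import yaml
        spec = yaml.safe_load(text)
    # An empty file or a non-mapping document can't be a pipeline spec.
    if not isinstance(spec, dict) or not spec:
        raise ValueError(f"{path} does not contain a pipeline spec mapping.")
    return spec


# Usage: pipeline_spec = load_pipeline_spec("COMPILED_PIPELINE_PATH")
```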

Create a pipeline run

Use the following Python code sample to create a pipeline run that uses the persistent resource:

# Import aiplatform and the appropriate API version v1beta1
from google.cloud import aiplatform, aiplatform_v1beta1
from google.cloud.aiplatform_v1beta1.types import pipeline_job as pipeline_job_types

# Initialize the Vertex SDK using PROJECT_ID and LOCATION
aiplatform.init(project='PROJECT_ID', location='LOCATION')

# Set the regional API endpoint
client_options = {
    "api_endpoint": "LOCATION-aiplatform.googleapis.com"
}

# Initialize the PipelineServiceClient
client = aiplatform_v1beta1.PipelineServiceClient(client_options=client_options)

# Construct the runtime detail
pr_runtime_detail = pipeline_job_types.PipelineJob.RuntimeConfig.PersistentResourceRuntimeDetail(
    persistent_resource_name=(
        f"projects/PROJECT_NUMBER/"
        f"locations/LOCATION/"
        f"persistentResources/PERSISTENT_RESOURCE_ID"
    ),
    task_resource_unavailable_wait_time_ms=WAIT_TIME,
    task_resource_unavailable_timeout_behavior='TIMEOUT_BEHAVIOR',
)

# Construct the default runtime configuration block
default_runtime = pipeline_job_types.PipelineJob.RuntimeConfig.DefaultRuntime(
    persistent_resource_runtime_detail=pr_runtime_detail
)

# Construct the main runtime configuration
runtime_config = pipeline_job_types.PipelineJob.RuntimeConfig(
    gcs_output_directory='PIPELINE_ROOT',
    parameter_values={
        'project_id': 'PROJECT_ID'
    },
    default_runtime=default_runtime
)

# Construct the pipeline job object
pipeline_job = pipeline_job_types.PipelineJob(
    display_name='PIPELINE_DISPLAY_NAME',
    pipeline_spec=PIPELINE_SPEC,
    runtime_config=runtime_config,
)

# Construct the request
parent_path = f"projects/PROJECT_ID/locations/LOCATION"
request = aiplatform_v1beta1.CreatePipelineJobRequest(
    parent=parent_path,
    pipeline_job=pipeline_job,
)

# Make the API Call to create the pipeline job
response = client.create_pipeline_job(request=request)

# Construct the Google Cloud console link
job_id = response.name.split('/')[-1]
console_link = (
    f"https://console.cloud.google.com/vertex-ai/locations/LOCATION"
    f"/pipelines/runs/{job_id}"
    f"?project=PROJECT_ID"
)

# Print the Google Cloud console link to the pipeline run
print(f"View Pipeline Run in Google Cloud console: {console_link}")
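
The sample repeats the project, location, and persistent resource ID in several strings. The following is a minimal sketch that derives those strings from variables in one place, using the same formats as the sample. build_names and console_link are hypothetical helpers, and all values in the usage example are made up.

```python
def build_names(project_id, project_number, location, persistent_resource_id):
    """Derive the endpoint and resource names used when creating a pipeline run."""
    return {
        "api_endpoint": f"{location}-aiplatform.googleapis.com",
        "parent": f"projects/{project_id}/locations/{location}",
        "persistent_resource_name": (
            f"projects/{project_number}/locations/{location}"
            f"/persistentResources/{persistent_resource_id}"
        ),
    }


def console_link(project_id, location, job_name):
    """Build the Google Cloud console link for a pipeline job resource name."""
    # The job ID is the last path segment of the resource name.
    job_id = job_name.split("/")[-1]
    return (
        f"https://console.cloud.google.com/vertex-ai/locations/{location}"
        f"/pipelines/runs/{job_id}?project={project_id}"
    )


# Made-up sample values for illustration.
names = build_names("sample-project", "123456789", "us-central1", "my-persistent-resource")
print(names["persistent_resource_name"])
print(console_link(
    "sample-project",
    "us-central1",
    "projects/123456789/locations/us-central1/pipelineJobs/my-job-1",
))
```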