A Vertex AI persistent resource is a long-running cluster that you can use to run custom training jobs and pipeline runs. By using a persistent resource for a pipeline run, you can help ensure compute resource availability and reduce pipeline task startup time. Persistent resources support all VMs and GPUs that are supported by custom training jobs. Learn more about persistent resources.
This page shows you how to create a persistent resource and then create a pipeline run that uses it.
Before you begin
Before you can create a pipeline run with a persistent resource, you must first complete the following prerequisites.
Define and compile a pipeline
Define your pipeline and then compile the pipeline definition into a YAML file. For more information about defining and compiling a pipeline, see Build a pipeline.
Required IAM roles
To get the permission that you need to create a persistent resource, ask your administrator to grant you the Vertex AI Administrator (roles/aiplatform.admin) IAM role on your project.
For more information about granting roles, see Manage access to projects, folders, and organizations.
This predefined role contains the aiplatform.persistentResources.create permission, which is required to create a persistent resource.
You might also be able to get this permission with custom roles or other predefined roles.
Create a persistent resource
Use the following samples to create a persistent resource that you can associate with a pipeline run. For more information about creating persistent resources, see Create a persistent resource.
gcloud
To create a persistent resource that you can associate with a pipeline run, use the gcloud ai persistent-resources create command along with the --enable-custom-service-account flag.
A persistent resource can have one or more resource pools. To create multiple resource pools in a persistent resource, specify multiple --resource-pool-spec flags.
You can specify all resource pool configurations as part of the command-line or use the --config flag to specify the path to a YAML file that contains the configurations.
Before using any of the command data below, make the following replacements:
- PROJECT_ID: The project ID of the Google Cloud project where you want to create the persistent resource.
- LOCATION: The region where you want to create the persistent resource. For a list of supported regions, see Feature availability.
- PERSISTENT_RESOURCE_ID: The ID of the persistent resource.
- DISPLAY_NAME: Optional. The display name of the persistent resource.
- MACHINE_TYPE: The type of virtual machine (VM) to use. For a list of supported VMs, see Machine types. This field corresponds to the machineSpec.machineType field in the ResourcePool API message.
- REPLICA_COUNT: Optional. The number of replicas to create for the resource pool, if you don't want to use autoscaling. This field corresponds to the replicaCount field in the ResourcePool API message. You must specify the replica count if you don't specify the MIN_REPLICA_COUNT and MAX_REPLICA_COUNT fields.
- MIN_REPLICA_COUNT: Optional. The minimum number of replicas if you're using autoscaling for the resource pool. You must specify both MIN_REPLICA_COUNT and MAX_REPLICA_COUNT to use autoscaling.
- MAX_REPLICA_COUNT: Optional. The maximum number of replicas if you're using autoscaling for the resource pool. You must specify both MIN_REPLICA_COUNT and MAX_REPLICA_COUNT to use autoscaling.
- CONFIG: Path to the persistent resource YAML configuration file, containing a list of ResourcePool specs. If an option is specified in both the configuration file and the command-line arguments, the command-line arguments override the configuration file. Note that keys with underscores are considered invalid.
  Example YAML configuration file:
    resourcePoolSpecs:
      machineSpec:
        machineType: n1-standard-4
      replicaCount: 1
Execute the following command:
Linux, macOS, or Cloud Shell
gcloud ai persistent-resources create \
    --persistent-resource-id=PERSISTENT_RESOURCE_ID \
    --display-name=DISPLAY_NAME \
    --project=PROJECT_ID \
    --region=LOCATION \
    --resource-pool-spec="replica-count=REPLICA_COUNT,machine-type=MACHINE_TYPE,min-replica-count=MIN_REPLICA_COUNT,max-replica-count=MAX_REPLICA_COUNT" \
    --enable-custom-service-account
Windows (PowerShell)
gcloud ai persistent-resources create `
    --persistent-resource-id=PERSISTENT_RESOURCE_ID `
    --display-name=DISPLAY_NAME `
    --project=PROJECT_ID `
    --region=LOCATION `
    --resource-pool-spec="replica-count=REPLICA_COUNT,machine-type=MACHINE_TYPE,min-replica-count=MIN_REPLICA_COUNT,max-replica-count=MAX_REPLICA_COUNT" `
    --enable-custom-service-account
Windows (cmd.exe)
gcloud ai persistent-resources create ^
    --persistent-resource-id=PERSISTENT_RESOURCE_ID ^
    --display-name=DISPLAY_NAME ^
    --project=PROJECT_ID ^
    --region=LOCATION ^
    --resource-pool-spec="replica-count=REPLICA_COUNT,machine-type=MACHINE_TYPE,min-replica-count=MIN_REPLICA_COUNT,max-replica-count=MAX_REPLICA_COUNT" ^
    --enable-custom-service-account
You should receive a response similar to the following:
Using endpoint [https://us-central1-aiplatform.googleapis.com/]
Operation to create PersistentResource [projects/PROJECT_NUMBER/locations/us-central1/persistentResources/mypersistentresource/operations/OPERATION_ID] is submitted successfully.

You can view the status of your PersistentResource create operation with the command

  $ gcloud ai operations describe projects/sample-project/locations/us-central1/operations/OPERATION_ID
Example gcloud command:
gcloud ai persistent-resources create \
    --persistent-resource-id=my-persistent-resource \
    --region=us-central1 \
    --resource-pool-spec="replica-count=4,machine-type=n1-standard-4" \
    --enable-custom-service-account
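The replica-count rules described for the --resource-pool-spec flag can be checked before the command is run. The following is a minimal sketch, not part of the gcloud CLI or the Vertex AI SDK: build_resource_pool_spec is a hypothetical helper that assembles the flag value and rejects combinations that the parameter descriptions rule out (neither a fixed replica count nor a complete autoscaling range).

```python
from typing import Optional


def build_resource_pool_spec(
    machine_type: str,
    replica_count: Optional[int] = None,
    min_replica_count: Optional[int] = None,
    max_replica_count: Optional[int] = None,
) -> str:
    """Return a value for --resource-pool-spec (hypothetical helper)."""
    # Autoscaling needs both bounds; one without the other is rejected.
    if (min_replica_count is None) != (max_replica_count is None):
        raise ValueError("Specify both min and max replica counts, or neither.")
    # Without autoscaling bounds, a fixed replica count is required.
    if replica_count is None and min_replica_count is None:
        raise ValueError("Specify replica_count or both autoscaling bounds.")

    parts = []
    if replica_count is not None:
        parts.append(f"replica-count={replica_count}")
    parts.append(f"machine-type={machine_type}")
    if min_replica_count is not None:
        parts.append(f"min-replica-count={min_replica_count}")
        parts.append(f"max-replica-count={max_replica_count}")
    return ",".join(parts)


# Produces the same flag value as the example gcloud command.
print(build_resource_pool_spec("n1-standard-4", replica_count=4))
```

The returned string can be passed as the value of --resource-pool-spec, once per resource pool.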
Advanced gcloud configurations
If you want to specify configuration options that are not available in the preceding examples, you can use the --config flag to specify the path to a config.yaml file in your local environment that contains the fields of persistentResources. For example:
gcloud ai persistent-resources create \
    --persistent-resource-id=PERSISTENT_RESOURCE_ID \
    --project=PROJECT_ID \
    --region=LOCATION \
    --config=CONFIG \
    --enable-custom-service-account
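Because keys with underscores are considered invalid in the configuration file, it can help to check a configuration before passing it to --config. The following is a rough sketch under stated assumptions: find_underscore_keys is a hypothetical helper, and the config dictionary stands in for a YAML file that has already been parsed into Python objects (the parsing step itself is not shown).

```python
from typing import Any, List


def find_underscore_keys(config: Any, path: str = "") -> List[str]:
    """Return dotted paths of mapping keys that contain an underscore."""
    found: List[str] = []
    if isinstance(config, dict):
        for key, value in config.items():
            key_path = f"{path}.{key}" if path else str(key)
            if "_" in str(key):
                found.append(key_path)
            # Recurse so that nested mappings and lists are checked too.
            found.extend(find_underscore_keys(value, key_path))
    elif isinstance(config, list):
        for index, item in enumerate(config):
            found.extend(find_underscore_keys(item, f"{path}[{index}]"))
    return found


# Assumed parsed form of a config file; "replica_count" is deliberately
# misspelled with an underscore (the valid key is "replicaCount").
config = {
    "resourcePoolSpecs": {
        "machineSpec": {"machineType": "n1-standard-4"},
        "replica_count": 1,
    }
}
print(find_underscore_keys(config))
```

An empty result means no key in the configuration contains an underscore.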
Python
Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
To create a persistent resource that you can use with a pipeline run, set the enable_custom_service_account parameter to True in the ResourceRuntimeSpec object while creating the persistent resource.
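# persistent_resource, resource_pool, machine_spec, and resource_runtime_spec
# are assumed to be imported from the Vertex AI SDK for Python; the import
# paths aren't shown in this sample.
my_example_resource = persistent_resource.PersistentResource.create(
    persistent_resource_id=PERSISTENT_RESOURCE_ID,
    display_name=DISPLAY_NAME,
    resource_pools=[
        resource_pool.ResourcePool(
            machine_spec=machine_spec.MachineSpec(
                machine_type=MACHINE_TYPE,
            ),
            replica_count=REPLICA_COUNT,
        )
    ],
    # Required so that the persistent resource can be used with a pipeline run.
    resource_runtime_spec=resource_runtime_spec.ResourceRuntimeSpec(
        enable_custom_service_account=True,
    ),
)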
Replace the following:
- PERSISTENT_RESOURCE_ID: The ID of the persistent resource.
- DISPLAY_NAME: Optional. The display name of the persistent resource.
- MACHINE_TYPE: The type of virtual machine (VM) to use. For a list of supported VMs, see Machine types. This field corresponds to the machineSpec.machineType field in the ResourcePool API message.
- REPLICA_COUNT: The number of replicas to create when creating this resource pool.
REST
To create a PersistentResource resource that you can associate with a pipeline run, send a POST request by using the persistentResources/create method with the enable_custom_service_account parameter set to true in the request body.
A persistent resource can have one or more resource pools. You can configure each resource pool to use either a fixed number of replicas or autoscaling.
Before using any of the request data, make the following replacements:
- PROJECT_ID: The project ID of the Google Cloud project where you want to create the persistent resource.
- LOCATION: The region where you want to create the persistent resource. For a list of supported regions, see Feature availability.
- PERSISTENT_RESOURCE_ID: The ID of the persistent resource.
- DISPLAY_NAME: Optional. The display name of the persistent resource.
- MACHINE_TYPE: The type of virtual machine (VM) to use. For a list of supported VMs, see Machine types. This field corresponds to the machineSpec.machineType field in the ResourcePool API message.
- REPLICA_COUNT: Optional. The number of replicas to create for the resource pool, if you don't want to use autoscaling. This field corresponds to the replicaCount field in the ResourcePool API message. You must specify the replica count if you don't specify the MIN_REPLICA_COUNT and MAX_REPLICA_COUNT fields.
- MIN_REPLICA_COUNT: Optional. The minimum number of replicas if you're using autoscaling for the resource pool. You must specify both MIN_REPLICA_COUNT and MAX_REPLICA_COUNT to use autoscaling.
- MAX_REPLICA_COUNT: Optional. The maximum number of replicas if you're using autoscaling for the resource pool. You must specify both MIN_REPLICA_COUNT and MAX_REPLICA_COUNT to use autoscaling.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/persistentResources?persistent_resource_id=PERSISTENT_RESOURCE_ID
Request JSON body:
{
  "display_name": "DISPLAY_NAME",
  "resource_pools": [
    {
      "machine_spec": {
        "machine_type": "MACHINE_TYPE"
      },
      "replica_count": REPLICA_COUNT,
      "autoscaling_spec": {
        "min_replica_count": MIN_REPLICA_COUNT,
        "max_replica_count": MAX_REPLICA_COUNT
      }
    }
  ],
  "resource_runtime_spec": {
    "enable_custom_service_account": true
  }
}
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/persistentResources/mypersistentresource/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.CreatePersistentResourceOperationMetadata",
    "genericMetadata": {
      "createTime": "2023-02-08T21:17:15.009668Z",
      "updateTime": "2023-02-08T21:17:15.009668Z"
    }
  }
}
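The REST request described above can also be assembled programmatically. The following sketch uses only the Python standard library and does not send anything: build_create_request is a hypothetical helper that returns the URL and JSON body, assuming the regional endpoint host is derived from the same region as the locations segment of the path. It includes either replica_count or autoscaling_spec depending on which arguments are supplied.

```python
import json
from typing import Any, Dict, Optional, Tuple
from urllib.parse import urlencode


def build_create_request(
    project_id: str,
    location: str,
    persistent_resource_id: str,
    machine_type: str,
    replica_count: Optional[int] = None,
    min_replica_count: Optional[int] = None,
    max_replica_count: Optional[int] = None,
    display_name: Optional[str] = None,
) -> Tuple[str, Dict[str, Any]]:
    """Return (url, body) for the create request (hypothetical helper)."""
    # Assumption: the endpoint host uses the same region as the resource path.
    query = urlencode({"persistent_resource_id": persistent_resource_id})
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/persistentResources?{query}"
    )

    pool: Dict[str, Any] = {"machine_spec": {"machine_type": machine_type}}
    if replica_count is not None:
        pool["replica_count"] = replica_count
    if min_replica_count is not None and max_replica_count is not None:
        pool["autoscaling_spec"] = {
            "min_replica_count": min_replica_count,
            "max_replica_count": max_replica_count,
        }

    body: Dict[str, Any] = {
        "resource_pools": [pool],
        # Needed so the persistent resource can be used with a pipeline run.
        "resource_runtime_spec": {"enable_custom_service_account": True},
    }
    if display_name is not None:
        body["display_name"] = display_name
    return url, body


url, body = build_create_request(
    "my-project", "us-central1", "my-persistent-resource",
    "n1-standard-4", replica_count=2,
)
print(url)
print(json.dumps(body, indent=2))
```

The resulting URL and body can then be sent with any HTTP client together with an OAuth 2.0 access token; authentication is outside the scope of this sketch.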
Create a pipeline run using the persistent resource
Use the following code sample to create a pipeline run that uses the persistent resource:
from google.cloud import aiplatform

# Configure a pipeline run whose tasks use the persistent resource by default.
job = aiplatform.PipelineJob(
    display_name='DISPLAY_NAME',
    template_path='COMPILED_PIPELINE_PATH',
    pipeline_root='PIPELINE_ROOT',
    project='PROJECT_ID',
    location='LOCATION',
    default_runtime={
        "persistentResourceRuntimeDetail": {
            "persistentResourceName": "PERSISTENT_RESOURCE_ID",
            "taskResourceUnavailableWaitTimeMs": WAIT_TIME,
            "taskResourceUnavailableTimeoutBehavior": TIMEOUT_BEHAVIOR,
        }
    },
)
Replace the following:
- DISPLAY_NAME: The name of the pipeline. This appears in the Google Cloud console.
- COMPILED_PIPELINE_PATH: The path to your compiled pipeline YAML file. It can be a local path or a Cloud Storage URI.
- PIPELINE_ROOT: Specify a Cloud Storage URI to store the artifacts of your pipeline run.
- PROJECT_ID: The Google Cloud project that this pipeline runs in.
- LOCATION: The region where the pipeline run is executed. For more information about the regions where Vertex AI Pipelines is available, see the Vertex AI locations guide. If you don't set this parameter, Vertex AI Pipelines uses the default location set in aiplatform.init.
- PERSISTENT_RESOURCE_ID: The ID of the persistent resource that you created.
- WAIT_TIME: The time in milliseconds to wait if the persistent resource is unavailable.
- TIMEOUT_BEHAVIOR: Specify the fallback behavior of the pipeline task in case the WAIT_TIME is exceeded. Possible values include the following:
  - FAIL: The pipeline task fails after exceeding the wait time.
  - FALL_BACK_TO_ON_DEMAND: The pipeline task continues to run using the default Vertex AI training resources, without using the persistent resource.
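The default_runtime dictionary can be built and validated before it is passed to aiplatform.PipelineJob. The following is a minimal sketch with some assumptions: build_default_runtime is a hypothetical helper, the wait time is accepted in seconds and converted to milliseconds, and the timeout behavior is passed as a string holding one of the two values listed above.

```python
from typing import Any, Dict

# The two timeout behaviors listed in the parameter descriptions.
ALLOWED_TIMEOUT_BEHAVIORS = ("FAIL", "FALL_BACK_TO_ON_DEMAND")


def build_default_runtime(
    persistent_resource_id: str,
    wait_seconds: float,
    timeout_behavior: str,
) -> Dict[str, Any]:
    """Return a default_runtime dictionary (hypothetical helper)."""
    if timeout_behavior not in ALLOWED_TIMEOUT_BEHAVIORS:
        raise ValueError(
            f"timeout_behavior must be one of {ALLOWED_TIMEOUT_BEHAVIORS}"
        )
    if wait_seconds < 0:
        raise ValueError("wait_seconds must not be negative")
    return {
        "persistentResourceRuntimeDetail": {
            "persistentResourceName": persistent_resource_id,
            # The field expects milliseconds.
            "taskResourceUnavailableWaitTimeMs": int(wait_seconds * 1000),
            "taskResourceUnavailableTimeoutBehavior": timeout_behavior,
        }
    }


# Wait up to five minutes, then fall back to on-demand resources.
default_runtime = build_default_runtime(
    "my-persistent-resource", 300, "FALL_BACK_TO_ON_DEMAND"
)
print(default_runtime)
```

The resulting dictionary can be supplied as the default_runtime argument in the pipeline run sample.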
What's next
- Learn how to run a pipeline.