You can configure private connectivity for your pipeline run using the Private Service Connect interface (PSC-I). We recommend using Vertex AI Private Service Connect for private connectivity, since it reduces the chances of IP exhaustion and supports transitive peering.
Vertex AI Pipelines uses the underlying PSC-I infrastructure for training to pass the connection details to the custom training job. To learn more about the limitations and pricing of using PSC-I with custom training, see Use Private Service Connect interface for Vertex AI Training.
Limitations
Private Service Connect interfaces don't support external IP addresses.
Pricing
Pricing for Private Service Connect interfaces is described on the All networking pricing page.
Before you begin
To use PSC-I with Vertex AI Pipelines, you must first Set up a Private Service Connect interface for Vertex AI resources.
Create a pipeline run with PSC-I
To create a pipeline job, you must first create a pipeline spec. A pipeline spec is an in-memory object that you create by converting a compiled pipeline definition.
Create a pipeline spec
Follow these instructions to create an in-memory pipeline spec that you can use to create the pipeline run:
Define a pipeline and compile it into a YAML file. For more information about defining and compiling a pipeline, see Build a pipeline.
Use the following code sample to convert the compiled pipeline YAML file to an in-memory pipeline spec.
import yaml

with open("COMPILED_PIPELINE_PATH", "r") as stream:
    try:
        pipeline_spec = yaml.safe_load(stream)
        print(pipeline_spec)
    except yaml.YAMLError as exc:
        print(exc)
Replace COMPILED_PIPELINE_PATH with the local path to your compiled pipeline YAML file.
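As a hedged variant of the conversion step, the loading logic can be wrapped in a small helper that returns the spec instead of printing it. The function name `load_pipeline_spec` is hypothetical and not part of any SDK. The JSON branch assumes the pipeline was compiled to a `.json` file instead of YAML.

```python
import json
from pathlib import Path


def load_pipeline_spec(path: str) -> dict:
    """Load a compiled pipeline definition into an in-memory dict.

    Hypothetical helper for illustration; not part of the Vertex AI SDK.
    """
    text = Path(path).read_text(encoding="utf-8")
    if path.endswith(".json"):
        # Assumption: the pipeline was compiled to JSON rather than YAML.
        spec = json.loads(text)
    else:
        # PyYAML is imported lazily so the JSON path needs no extra package.
        import yaml

        spec = yaml.safe_load(text)
    if not isinstance(spec, dict):
        raise ValueError(f"{path} does not contain a mapping at the top level")
    return spec
```

Raising an error on a non-mapping result surfaces an empty or malformed file before the spec is passed to the pipeline service.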
Create the pipeline run
Use the following samples to create a pipeline run using PSC-I:
Python
To create a pipeline run with PSC-I using the Vertex AI SDK for Python, configure the run using the aiplatform_v1beta1/services/pipeline_service definition.
# Import aiplatform and the appropriate API version v1beta1
from google.cloud import aiplatform, aiplatform_v1beta1

# Initialize the Vertex SDK using PROJECT_ID and LOCATION
aiplatform.init(project="PROJECT_ID", location="LOCATION")

# Create the API endpoint
client_options = {
    "api_endpoint": "LOCATION-aiplatform.googleapis.com"
}

# Initialize the PipelineServiceClient
client = aiplatform_v1beta1.PipelineServiceClient(client_options=client_options)

# Construct the request
request = aiplatform_v1beta1.CreatePipelineJobRequest(
    parent="projects/PROJECT_ID/locations/LOCATION",
    pipeline_job=aiplatform_v1beta1.PipelineJob(
        display_name="DISPLAY_NAME",
        pipeline_spec=PIPELINE_SPEC,
        runtime_config=aiplatform_v1beta1.PipelineJob.RuntimeConfig(
            gcs_output_directory="OUTPUT_DIRECTORY",
        ),
        psc_interface_config=aiplatform_v1beta1.PscInterfaceConfig(
            network_attachment="NETWORK_ATTACHMENT_NAME"
        ),
    ),
)

# Make the API call
response = client.create_pipeline_job(request=request)

# Print the response
print(response)
Replace the following:
- PROJECT_ID: The project ID of the project where you want to create the pipeline run.
- LOCATION: The region where you want to create the pipeline run.
- DISPLAY_NAME: The name of the pipeline job. The maximum length for a display name is 128 UTF-8 characters.
- PIPELINE_SPEC: The pipeline spec you created in Create a pipeline spec.
- OUTPUT_DIRECTORY: The URI of the Cloud Storage bucket for storing output artifacts. This path is the root output directory for the pipeline and is used to generate the paths of output artifacts.
- NETWORK_ATTACHMENT_NAME: The name of the Compute Engine network attachment to attach to the PipelineJob resource. To obtain the network attachment, you must have completed the steps in the Before you begin section. For more information about the network attachment, see Set up a VPC network, subnet, and network attachment.
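The placeholder values above can be checked before they are substituted into the request. The following is a minimal sketch; the function name `build_request_fields` is hypothetical, and it interprets the 128-character display name limit as a count of Unicode characters, which is an assumption.

```python
def build_request_fields(project_id: str, location: str, display_name: str) -> dict:
    """Derive the endpoint and parent strings used by the sample above.

    Hypothetical helper for illustration; not part of the Vertex AI SDK.
    """
    if not project_id or not location:
        raise ValueError("project_id and location must be non-empty")
    # Assumption: the documented limit is counted in characters, not bytes.
    if len(display_name) > 128:
        raise ValueError("display_name exceeds 128 characters")
    return {
        "api_endpoint": f"{location}-aiplatform.googleapis.com",
        "parent": f"projects/{project_id}/locations/{location}",
        "display_name": display_name,
    }
```

Deriving the endpoint and the parent from the same `location` value avoids a mismatch between the regional endpoint and the resource path.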
REST
To create a pipeline run, send a POST request by using the pipelineJobs.create method.
Before using any of the request data, make the following replacements:
- PROJECT_ID: The project ID of the project where you want to create the pipeline run.
- LOCATION: The region where you want to create the pipeline run.
- DISPLAY_NAME: The name of the pipeline job. The maximum length for a display name is 128 UTF-8 characters.
- PIPELINE_SPEC: The pipeline spec you created in Create a pipeline spec.
- OUTPUT_DIRECTORY: The URI of the Cloud Storage bucket for storing output artifacts. This path is the root output directory for the pipeline and is used to generate the paths of output artifacts.
- NETWORK_ATTACHMENT_NAME: The name of the Compute Engine network attachment to attach to the PipelineJob resource. To obtain the network attachment, you must have completed the steps in the Before you begin section. For more information about the network attachment, see Set up a VPC network, subnet, and network attachment.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/pipelineJobs
Request JSON body:
{
  "display_name": "DISPLAY_NAME",
  "pipeline_spec": PIPELINE_SPEC,
  "runtime_config": {
    "gcs_output_directory": "OUTPUT_DIRECTORY"
  },
  "psc_interface_config": {
    "network_attachment": "NETWORK_ATTACHMENT_NAME"
  }
}
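If the pipeline spec is already loaded in Python, the request body can be generated instead of edited by hand. This is a sketch only: the helper names `build_request_body` and `write_request_file` are hypothetical, and the spec dictionary is placed in the `pipeline_spec` field as a nested JSON object.

```python
import json


def build_request_body(display_name: str, pipeline_spec: dict,
                       output_directory: str, network_attachment: str) -> dict:
    """Assemble the request body fields shown above (hypothetical helper)."""
    return {
        "display_name": display_name,
        "pipeline_spec": pipeline_spec,
        "runtime_config": {"gcs_output_directory": output_directory},
        "psc_interface_config": {"network_attachment": network_attachment},
    }


def write_request_file(body: dict, path: str = "request.json") -> None:
    """Serialize the body as JSON so it can be sent with curl or PowerShell."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(body, f, indent=2)
```

Serializing with `json.dump` rules out syntax errors such as trailing commas, which a hand-edited request file can easily pick up.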
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/pipelineJobs"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/pipelineJobs" | Select-Object -Expand Content
You should see output similar to the following. PIPELINE_JOB_ID represents the ID of the pipeline run and SERVICE_ACCOUNT_NAME represents the service account used to run the pipeline.
{
  "name": "projects/PROJECT_ID/locations/LOCATION/pipelineJobs/PIPELINE_JOB_ID",
  "displayName": "DISPLAY_NAME",
  "createTime": "20xx-01-01T00:00:00.000000Z",
  "updateTime": "20xx-01-01T00:00:00.000000Z",
  "pipelineSpec": PIPELINE_SPEC,
  "state": "PIPELINE_STATE_PENDING",
  "labels": {
    "vertex-ai-pipelines-run-billing-id": "VERTEX_AI_PIPELINES_RUN_BILLING_ID"
  },
  "runtimeConfig": {
    "gcsOutputDirectory": "OUTPUT_DIRECTORY"
  },
  "serviceAccount": "SERVICE_ACCOUNT_NAME",
  "pscInterfaceConfig": {
    "networkAttachment": "NETWORK_ATTACHMENT_NAME"
  }
}
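To use the response in a script, the pipeline job ID and the PSC-I settings can be read from the returned JSON. The sketch below assumes the response text has been captured as a valid JSON string with the field names shown above; `summarize_pipeline_job` is a hypothetical helper name.

```python
import json


def summarize_pipeline_job(response_text: str) -> dict:
    """Extract a few fields from a pipelineJobs.create response (illustrative)."""
    job = json.loads(response_text)
    # The job ID is the last path segment of the resource name.
    job_id = job["name"].rsplit("/", 1)[-1]
    psc_config = job.get("pscInterfaceConfig", {})
    return {
        "pipeline_job_id": job_id,
        "state": job.get("state"),
        "network_attachment": psc_config.get("networkAttachment"),
    }
```

A missing `networkAttachment` value in the summary suggests that the run was created without the PSC-I configuration.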