Configure Private Service Connect interface for a pipeline

You can configure private connectivity for a pipeline run by using a Private Service Connect interface. Google recommends using Vertex AI Private Service Connect for private connectivity because it reduces the chance of IP address exhaustion and supports transitive peering.

Vertex AI Pipelines uses the underlying Private Service Connect interface infrastructure for training to pass the connection details to the custom training job. To learn more about the limitations and pricing of using Private Service Connect interfaces with custom training, see Use Private Service Connect interface for Vertex AI Training.

Limitations

Private Service Connect interfaces don't support external IP addresses.

Pricing

Pricing for Private Service Connect interfaces is described on the All networking pricing page.

Before you begin

To use a Private Service Connect interface with Vertex AI Pipelines, you must first Set up a Private Service Connect interface for Vertex AI resources.

Create a pipeline run with Private Service Connect interfaces

To create a pipeline job, you must first create a pipeline spec. A pipeline spec is an in-memory object that you create by converting a compiled pipeline definition.

Create a pipeline spec

Follow these instructions to create an in-memory pipeline spec that you can use to create the pipeline run:

  1. Define a pipeline and compile it into a YAML file. For more information about defining and compiling a pipeline, see Build a pipeline.

  2. Use the following code sample to convert the compiled pipeline YAML file to an in-memory pipeline spec.

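    import yaml

    # Load the compiled pipeline definition into an in-memory dictionary.
    with open("COMPILED_PIPELINE_PATH", "r") as stream:
        try:
            pipeline_spec = yaml.safe_load(stream)
        except yaml.YAMLError as exc:
            # Stop here: without a valid spec, the pipeline run can't be created.
            raise SystemExit(f"Could not parse the compiled pipeline: {exc}")

    print(pipeline_spec)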

    Replace COMPILED_PIPELINE_PATH with the local path to your compiled pipeline YAML file.
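Before you pass the spec to the API, it can help to confirm that the file parsed into the expected shape. The following is a minimal sketch, not part of the Vertex AI SDK: the helper name check_pipeline_spec is hypothetical, and the expected top-level keys (pipelineInfo and root) are an assumption based on pipeline definitions compiled by the Kubeflow Pipelines SDK v2. Adjust the keys if your compiled file differs.

```python
# Assumption: a pipeline definition compiled by the KFP SDK v2 has these top-level keys.
EXPECTED_KEYS = ("pipelineInfo", "root")


def check_pipeline_spec(pipeline_spec):
    """Return the pipeline name, or raise ValueError if the spec looks malformed."""
    if not isinstance(pipeline_spec, dict):
        raise ValueError(
            f"The pipeline spec must be a dictionary; found {type(pipeline_spec).__name__}."
        )
    missing = [key for key in EXPECTED_KEYS if key not in pipeline_spec]
    if missing:
        raise ValueError(f"The pipeline spec is missing keys: {missing}")
    return pipeline_spec["pipelineInfo"].get("name")


# Hand-written stand-in for the dictionary returned by yaml.safe_load.
example_spec = {"pipelineInfo": {"name": "demo-pipeline"}, "root": {"dag": {}}}
print(check_pipeline_spec(example_spec))
```

In practice you would call check_pipeline_spec(pipeline_spec) right after loading the YAML file, so that a truncated or wrong file fails locally instead of at request time.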

Create the pipeline run

Use the following samples to create a pipeline run using Private Service Connect interfaces:

Python

To create a pipeline run with Private Service Connect interfaces using the Vertex AI SDK for Python, configure the run using the aiplatform_v1beta1/services/pipeline_service definition.

# Import aiplatform and the appropriate API version v1beta1
from google.cloud import aiplatform, aiplatform_v1beta1

# Initialize the Vertex SDK using PROJECT_ID and LOCATION
aiplatform.init(project="PROJECT_ID", location="LOCATION")

# Create the API endpoint
client_options = {
    "api_endpoint": "LOCATION-aiplatform.googleapis.com"
}

# Initialize the PipelineServiceClient
client = aiplatform_v1beta1.PipelineServiceClient(client_options=client_options)

# Construct the request
request = aiplatform_v1beta1.CreatePipelineJobRequest(
    parent="projects/PROJECT_ID/locations/LOCATION",
    pipeline_job=aiplatform_v1beta1.PipelineJob(
        display_name="DISPLAY_NAME",
        pipeline_spec=PIPELINE_SPEC,
        runtime_config=aiplatform_v1beta1.PipelineJob.RuntimeConfig(
            gcs_output_directory="OUTPUT_DIRECTORY",
        ),
        psc_interface_config=aiplatform_v1beta1.PscInterfaceConfig(
            network_attachment="NETWORK_ATTACHMENT_NAME"
        ),
    ),
)

# Make the API call
response = client.create_pipeline_job(request=request)

# Print the response
print(response)

Replace the following:

  • PROJECT_ID: The project ID of the project where you want to create the pipeline run.
  • LOCATION: The region where you want to create the pipeline run.
  • DISPLAY_NAME: The name of the pipeline job. The maximum length for a display name is 128 UTF-8 characters.
  • PIPELINE_SPEC: The pipeline spec you created in Create a pipeline spec.
  • OUTPUT_DIRECTORY: The URI of the Cloud Storage bucket for storing output artifacts. This path is the root output directory for the pipeline and is used to generate the paths of output artifacts.
  • NETWORK_ATTACHMENT_NAME: The name of the Compute Engine network attachment to attach to the PipelineJob resource. To obtain the network attachment, you must have completed the steps in the Before you begin section. For more information about the network attachment, see Set up a VPC network, subnet, and network attachment.
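Because PROJECT_ID and LOCATION each appear in more than one place, it can be less error-prone to derive the endpoint and parent strings from a single set of values. The following is a minimal sketch using only the standard library: the helper name build_run_settings and the example values are hypothetical, while the endpoint format, the parent format, and the 128-character limit come from the sample and the placeholder list above.

```python
MAX_DISPLAY_NAME_LENGTH = 128  # Limit stated in the placeholder list above.


def build_run_settings(project_id, location, display_name):
    """Derive the endpoint and parent strings used by the sample from one set of inputs."""
    # len() counts Unicode code points, which matches a limit expressed in characters.
    if len(display_name) > MAX_DISPLAY_NAME_LENGTH:
        raise ValueError(
            f"The display name has {len(display_name)} characters; "
            f"the maximum is {MAX_DISPLAY_NAME_LENGTH}."
        )
    return {
        "api_endpoint": f"{location}-aiplatform.googleapis.com",
        "parent": f"projects/{project_id}/locations/{location}",
        "display_name": display_name,
    }


# Hypothetical example values.
settings = build_run_settings("my-project", "us-central1", "psc-interface-run")
print(settings["api_endpoint"])
print(settings["parent"])
```

You could then pass settings["api_endpoint"] in client_options and settings["parent"] as the parent of the request, instead of editing each string by hand.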

REST

To create a pipeline run, send a POST request by using the pipelineJobs.create method.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: The project ID of the project where you want to create the pipeline run.
  • LOCATION: The region where you want to create the pipeline run.
  • DISPLAY_NAME: The name of the pipeline job. The maximum length for a display name is 128 UTF-8 characters.
  • PIPELINE_SPEC: The pipeline spec you created in Create a pipeline spec.
  • OUTPUT_DIRECTORY: The URI of the Cloud Storage bucket for storing output artifacts. This path is the root output directory for the pipeline and is used to generate the paths of output artifacts.
  • NETWORK_ATTACHMENT_NAME: The name of the Compute Engine network attachment to attach to the PipelineJob resource. To obtain the network attachment, you must have completed the steps in the Before you begin section. For more information about the network attachment, see Set up a VPC network, subnet, and network attachment.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/pipelineJobs

Request JSON body:

{
  "display_name": "DISPLAY_NAME",
  "pipeline_spec": PIPELINE_SPEC,
  "runtime_config": {
    "gcs_output_directory": "OUTPUT_DIRECTORY"
  },
  "psc_interface_config": {
    "network_attachment": "NETWORK_ATTACHMENT_NAME"
  }
}
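Because the pipeline spec is a nested object, writing the request body by hand is error-prone. The following is a minimal sketch that assembles the body shown above from an in-memory pipeline spec and saves it as request.json, the file name that the curl and PowerShell commands in this section read. The helper name build_request_body, the stand-in spec, and the bucket name are hypothetical.

```python
import json


def build_request_body(pipeline_spec, display_name, output_directory, network_attachment):
    """Assemble the JSON body for the pipelineJobs.create request."""
    return {
        "display_name": display_name,
        "pipeline_spec": pipeline_spec,  # Embedded as a JSON object, not as a string.
        "runtime_config": {"gcs_output_directory": output_directory},
        "psc_interface_config": {"network_attachment": network_attachment},
    }


# Hand-written stand-in for the spec loaded from your compiled pipeline file.
example_spec = {"pipelineInfo": {"name": "demo-pipeline"}, "root": {"dag": {}}}

body = build_request_body(
    pipeline_spec=example_spec,
    display_name="DISPLAY_NAME",
    output_directory="gs://example-bucket/pipeline-root",  # Hypothetical bucket.
    network_attachment="NETWORK_ATTACHMENT_NAME",
)

# Save the body where the curl and PowerShell commands expect it.
with open("request.json", "w", encoding="utf-8") as handle:
    json.dump(body, handle, indent=2)
```

Serializing with json.dump also guarantees that the file is valid JSON, which avoids problems such as trailing commas in a hand-edited body.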

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/pipelineJobs"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/pipelineJobs" | Select-Object -Expand Content

You should see output similar to the following. PIPELINE_JOB_ID represents the ID of the pipeline run and SERVICE_ACCOUNT_NAME represents the service account used to run the pipeline.

{
  "name": "projects/PROJECT_ID/locations/LOCATION/pipelineJobs/PIPELINE_JOB_ID",
  "displayName": "DISPLAY_NAME",
  "createTime": "20xx-01-01T00:00:00.000000Z",
  "updateTime": "20xx-01-01T00:00:00.000000Z",
  "pipelineSpec": PIPELINE_SPEC,
  "state": "PIPELINE_STATE_PENDING",
  "labels": {
    "vertex-ai-pipelines-run-billing-id": "VERTEX_AI_PIPELINES_RUN_BILLING_ID"
  },
  "runtimeConfig": {
    "gcsOutputDirectory": "OUTPUT_DIRECTORY"
  },
  "serviceAccount": "SERVICE_ACCOUNT_NAME",
  "pscInterfaceConfig": {
    "networkAttachment": "NETWORK_ATTACHMENT_NAME"
  }
}
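If you script the REST call, you will likely want the pipeline run ID for later requests. The following is a minimal sketch that extracts the ID and state from a response shaped like the output above. The helper name summarize_response and the sample values are hypothetical; the field names (name and state) are taken from that output.

```python
import json


def summarize_response(response_text):
    """Return (pipeline_job_id, state) from a pipelineJobs.create response body."""
    response = json.loads(response_text)
    # The run ID is the last segment of the resource name.
    job_id = response["name"].rsplit("/", 1)[-1]
    return job_id, response["state"]


# Hypothetical response, trimmed to the fields that this helper reads.
sample_response = json.dumps({
    "name": "projects/my-project/locations/us-central1/pipelineJobs/1234567890",
    "displayName": "psc-interface-run",
    "state": "PIPELINE_STATE_PENDING",
})

job_id, state = summarize_response(sample_response)
print(job_id, state)
```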