Understand pipeline run costs

You can review the cost of your pipeline runs using Cloud Billing export to BigQuery.

You can also use a pipeline run's unique billing ID to review the costs of the resources that the run created, as follows:

  1. Vertex AI Pipelines automatically attaches the vertex-ai-pipelines-run-billing-id label to your pipeline run. The value of this label is your unique pipeline run billing ID.

  2. Vertex AI Pipelines propagates this label to Google Cloud resources generated by pipeline components during the pipeline run. Note that for some components and resources, you need to either upgrade the Google Cloud Pipeline Components SDK or update your component code to propagate the labels. For more information about labeling Google Cloud resources, see Resource labeling by Vertex AI Pipelines.

  3. In billing reports, the vertex-ai-pipelines-run-billing-id label ties together the usage of all Google Cloud resources generated by the pipeline run. Using the value of this label, you can review the cost of that resource usage with Cloud Billing export to BigQuery.

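This page shows you how to do the following:

  • List your ten most expensive pipeline runs

  • Use the billing ID to locate a pipeline run

  • View the costs of Google Cloud resources in a pipeline run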

Before you begin

Before you use Cloud Billing to understand the cost of a pipeline run, use the following instructions to set up your Google Cloud project and development environment:

For more information about the schema of Cloud Billing standard usage cost data, see Schema of the standard usage cost data.

List your ten most expensive pipeline runs

Run the following query to view a list of your ten most expensive pipeline runs over a specified time period:

Standard SQL

SELECT
  project.id AS project_id,
  location.region AS region,
  L.value AS pipeline_run_billing_id,
  SUM(cost) AS total_cost
FROM
  `project.dataset.gcp_billing_export_v1_XXXXXX_XXXXXX_XXXXXX` B,
  UNNEST (B.labels) L
WHERE
  DATE(_PARTITIONTIME) >= "START_DATE"
  AND DATE(_PARTITIONTIME) < "END_DATE"
  AND L.key = "vertex-ai-pipelines-run-billing-id"
GROUP BY
  project.id,
  location.region,
  L.value
ORDER BY
  total_cost DESC
LIMIT
  10;

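Replace the following:

  • project.dataset.gcp_billing_export_v1_XXXXXX_XXXXXX_XXXXXX: Fully qualified name of the BigQuery table that contains your Cloud Billing export.

  • START_DATE: Start date of the time period (inclusive), in YYYY-MM-DD format.

  • END_DATE: End date of the time period (exclusive), in YYYY-MM-DD format.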

You should see the following columns in the query results:

  • project_id

  • region

  • pipeline_run_billing_id

  • total_cost
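If you prefer to run this query from a script, the following is a minimal sketch that uses the google-cloud-bigquery client library. The function names build_top_runs_query and run_top_runs_query are hypothetical helpers introduced for illustration, and the sketch assumes that Application Default Credentials are configured. The dates are passed as query parameters rather than pasted into the query text; the table name cannot be parameterized, so it is formatted into the string.

```python
import datetime

LABEL_KEY = "vertex-ai-pipelines-run-billing-id"


def build_top_runs_query(billing_table: str, limit: int = 10) -> str:
    """Return the query text; the dates stay as named parameters."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return f"""
SELECT
  project.id AS project_id,
  location.region AS region,
  L.value AS pipeline_run_billing_id,
  SUM(cost) AS total_cost
FROM
  `{billing_table}` B,
  UNNEST (B.labels) L
WHERE
  DATE(_PARTITIONTIME) >= @start_date
  AND DATE(_PARTITIONTIME) < @end_date
  AND L.key = '{LABEL_KEY}'
GROUP BY
  project.id,
  location.region,
  L.value
ORDER BY
  total_cost DESC
LIMIT {int(limit)}
"""


def run_top_runs_query(billing_table: str,
                       start_date: datetime.date,
                       end_date: datetime.date) -> list:
    """Run the query and return one dict per pipeline run (hypothetical helper)."""
    # Imported here so the query builder works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("start_date", "DATE", start_date),
            bigquery.ScalarQueryParameter("end_date", "DATE", end_date),
        ]
    )
    sql = build_top_runs_query(billing_table)
    rows = client.query(sql, job_config=job_config).result()
    return [dict(row.items()) for row in rows]
```

For example, calling run_top_runs_query with your billing export table name and two datetime.date values should return up to ten dictionaries keyed by the column names listed above.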

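You can now use the unique pipeline run billing ID from the pipeline_run_billing_id column of the query results to do the following:

  • Use the billing ID to locate a pipeline run

  • View the costs of Google Cloud resources in a pipeline run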

Use the billing ID to locate a pipeline run

You can use the unique pipeline run billing ID from the query results in List your ten most expensive pipeline runs to locate a pipeline run.

Console

Use the following instructions to retrieve a pipeline run in the Google Cloud console.

  1. In the Google Cloud console, in the Vertex AI section, go to the Pipelines page.

    Go to Pipelines

  2. To locate the pipeline run, filter the list using a unique pipeline run billing ID from the query results in List your ten most expensive pipeline runs. To do this:

    1. Click Filter and then click Labels.

    2. Enter the unique pipeline run billing ID in the following format and press Enter:
      labels.vertex-ai-pipelines-run-billing-id=PIPELINE_RUN_BILLING_ID
      where PIPELINE_RUN_BILLING_ID is the unique pipeline run billing ID.

Vertex AI SDK for Python

Use the following code sample to retrieve the pipeline run:

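from google.cloud import aiplatform as aip

# List the pipeline runs whose billing ID label matches the given value.
runs = aip.PipelineJob.list(
    project="PROJECT_ID",
    location="LOCATION",
    filter="labels.vertex-ai-pipelines-run-billing-id=PIPELINE_RUN_BILLING_ID",
)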

Replace the following:

  • PROJECT_ID: The Google Cloud project that this pipeline runs in.

  • LOCATION: The region that the pipeline runs in. For more information about the regions that Vertex AI Pipelines is available in, see the Vertex AI locations guide.

  • PIPELINE_RUN_BILLING_ID: Unique pipeline run billing ID from the query results in List your ten most expensive pipeline runs.
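When you look up several billing IDs from the query results, it can help to build the label filter in one place. The following is a minimal sketch, not an official pattern: billing_id_filter and find_runs are hypothetical helper names, and the sketch assumes that the Vertex AI SDK for Python is installed and authenticated.

```python
LABEL_KEY = "vertex-ai-pipelines-run-billing-id"


def billing_id_filter(billing_id: str) -> str:
    """Build the label filter string for one pipeline run billing ID."""
    billing_id = billing_id.strip()
    if not billing_id:
        raise ValueError("billing_id must not be empty")
    return f"labels.{LABEL_KEY}={billing_id}"


def find_runs(project_id: str, location: str, billing_id: str) -> list:
    """Return (display name, resource name) pairs for the matching runs."""
    # Imported here so the filter helper works without the SDK installed.
    from google.cloud import aiplatform as aip

    runs = aip.PipelineJob.list(
        project=project_id,
        location=location,
        filter=billing_id_filter(billing_id),
    )
    return [(run.display_name, run.resource_name) for run in runs]
```

The resource name returned for each run identifies it in later SDK calls, and the display name is what appears in the Pipelines page of the console.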

View the costs of Google Cloud resources in a pipeline run

You can use the unique pipeline run billing ID from the query results in List your ten most expensive pipeline runs to view the costs of Google Cloud resources generated by the pipeline run.

Run the following query to view the list of Google Cloud resources generated in a pipeline run, along with the cost of each resource:

Standard SQL

SELECT
  service,
  sku,
  cost
FROM
  `project.dataset.gcp_billing_export_v1_XXXXXX_XXXXXX_XXXXXX` B,
  UNNEST (B.labels) L
WHERE
  DATE(_PARTITIONTIME) >= "START_DATE"
  AND DATE(_PARTITIONTIME) < "END_DATE"
  AND L.key = "vertex-ai-pipelines-run-billing-id"
  AND L.value = "PIPELINE_RUN_BILLING_ID";

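Replace the following:

  • START_DATE: Start date of the time period (inclusive), in YYYY-MM-DD format.

  • END_DATE: End date of the time period (exclusive), in YYYY-MM-DD format.

  • PIPELINE_RUN_BILLING_ID: Unique pipeline run billing ID from the query results in List your ten most expensive pipeline runs.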

You should see the following columns in the query results:

  • service.id

  • service.description

  • sku.id

  • sku.description

  • cost

The cost column represents the cost of a resource corresponding to the sku.id in the pipeline run.
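Because this query does not aggregate, the same SKU can appear in several rows. The following is a minimal sketch of totaling the cost per service and SKU after you fetch the results. It assumes that each row is available as a mapping in which service and sku are nested mappings with a description field; summarize_costs is a hypothetical helper name, and the sample rows are made up for illustration.

```python
from collections import defaultdict


def summarize_costs(rows):
    """Sum cost per (service description, SKU description), highest first."""
    totals = defaultdict(float)
    for row in rows:
        key = (row["service"]["description"], row["sku"]["description"])
        totals[key] += row["cost"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up rows in the assumed shape, for illustration only.
sample_rows = [
    {"service": {"id": "svc-a", "description": "Service A"},
     "sku": {"id": "sku-1", "description": "SKU 1"}, "cost": 1.5},
    {"service": {"id": "svc-b", "description": "Service B"},
     "sku": {"id": "sku-2", "description": "SKU 2"}, "cost": 1.0},
    {"service": {"id": "svc-a", "description": "Service A"},
     "sku": {"id": "sku-1", "description": "SKU 1"}, "cost": 0.25},
]

for (service, sku), total in summarize_costs(sample_rows):
    print(f"{service} / {sku}: {total}")
```

Floating-point sums are adequate for a quick overview; for exact accounting, aggregating with SUM(cost) and GROUP BY in the query itself is the safer choice.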