You can review the cost of your pipeline runs using Cloud Billing export to BigQuery.
You can also use a pipeline run's unique billing ID to review the costs of the resources created by that run, as follows:
Vertex AI Pipelines automatically attaches the vertex-ai-pipelines-run-billing-id label to your pipeline run. The value of this label is your unique pipeline run billing ID.
Vertex AI Pipelines propagates this label to Google Cloud resources generated by pipeline components during the pipeline run. For some components and resources, you need to either upgrade the Google Cloud Pipeline Components SDK or update your component code to propagate the labels. For more information about labeling Google Cloud resources, see Resource labeling by Vertex AI Pipelines.
The vertex-ai-pipelines-run-billing-id label connects the usage of Google Cloud resources generated by the pipeline run in billing reports. Using the value of this label, you can review the cost of the resource usage in the pipeline run with Cloud Billing export to BigQuery.
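As a rough illustration of the label described above, the following sketch reads the billing ID from the labels of an existing pipeline run. The helper names billing_id_from_labels and billing_id_for_run are hypothetical, and the sketch assumes that the label is visible in the labels of the PipelineJob resource returned by the Vertex AI SDK for Python.

```python
from typing import Mapping, Optional

# Label key that Vertex AI Pipelines attaches to each pipeline run.
BILLING_LABEL_KEY = "vertex-ai-pipelines-run-billing-id"


def billing_id_from_labels(labels: Optional[Mapping[str, str]]) -> Optional[str]:
    """Return the billing ID stored in a labels mapping, or None if absent."""
    return (labels or {}).get(BILLING_LABEL_KEY)


def billing_id_for_run(resource_name: str) -> Optional[str]:
    """Fetch a pipeline run and return its billing ID (hypothetical helper).

    Assumes the google-cloud-aiplatform package is installed and that
    application default credentials are configured.
    """
    # Imported lazily so the pure helper above works without the SDK.
    from google.cloud import aiplatform as aip

    job = aip.PipelineJob.get(resource_name)
    return billing_id_from_labels(dict(job.labels))
```

The resource_name argument would be the full name of a pipeline run, in the form projects/PROJECT_ID/locations/LOCATION/pipelineJobs/PIPELINE_JOB_ID.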
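This page shows you how to do the following:
List your ten most expensive pipeline runs
Use the billing ID to locate a pipeline run
View the costs of Google Cloud resources in a pipeline run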
Before you begin
Before you use Cloud Billing to understand the cost of a pipeline run, use the following instructions to set up your Google Cloud project and development environment:
For more information about the schema of Cloud Billing standard usage cost data, see Schema of the standard usage cost data.
List your ten most expensive pipeline runs
Run the following query to view a list of your ten most expensive pipeline runs over a specified time period:
Standard SQL
SELECT
  project.id AS project_id,
  location.region AS region,
  L.value AS pipeline_run_billing_id,
  SUM(cost) AS total_cost
FROM
  `project.dataset.gcp_billing_export_v1_XXXXXX_XXXXXX_XXXXXX` B,
  UNNEST(B.labels) L
WHERE
  DATE(_PARTITIONTIME) >= "START_DATE"
  AND DATE(_PARTITIONTIME) < "END_DATE"
  AND L.key = "vertex-ai-pipelines-run-billing-id"
GROUP BY
  project.id,
  location.region,
  L.value
ORDER BY
  total_cost DESC
LIMIT
  10;
Replace the following:
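START_DATE: Start date of the time period, in YYYY-MM-DD format. The start date is included in the results.
END_DATE: End date of the time period, in YYYY-MM-DD format. The end date is excluded from the results.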
You should see the following columns in the query results:
project_id
region
pipeline_run_billing_id
total_cost
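If you prefer to run this query from Python, the following is a minimal sketch that uses the BigQuery client library with named query parameters for the dates. The function names build_top_runs_query and top_runs are hypothetical. The sketch assumes that the google-cloud-bigquery package is installed, that application default credentials are configured, and that you pass the full name of your own billing export table.

```python
import datetime
from typing import Any, Dict, List

BILLING_LABEL_KEY = "vertex-ai-pipelines-run-billing-id"


def build_top_runs_query(table: str, limit: int = 10) -> str:
    """Return the query text; dates are supplied as @start_date and @end_date."""
    return f"""
SELECT
  project.id AS project_id,
  location.region AS region,
  L.value AS pipeline_run_billing_id,
  SUM(cost) AS total_cost
FROM
  `{table}` B,
  UNNEST(B.labels) L
WHERE
  DATE(_PARTITIONTIME) >= @start_date
  AND DATE(_PARTITIONTIME) < @end_date
  AND L.key = "{BILLING_LABEL_KEY}"
GROUP BY
  project.id,
  location.region,
  L.value
ORDER BY
  total_cost DESC
LIMIT {int(limit)}
"""


def top_runs(
    table: str,
    start_date: datetime.date,
    end_date: datetime.date,
    limit: int = 10,
) -> List[Dict[str, Any]]:
    """Run the query and return one dict per pipeline run (hypothetical helper)."""
    # Imported lazily so the query builder works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("start_date", "DATE", start_date),
            bigquery.ScalarQueryParameter("end_date", "DATE", end_date),
        ]
    )
    sql = build_top_runs_query(table, limit)
    rows = client.query(sql, job_config=job_config).result()
    return [dict(row.items()) for row in rows]
```

Table names cannot be passed as query parameters, so the sketch interpolates the table name into the query text. Pass only a table name that you trust.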
You can now use the unique pipeline run billing ID from the pipeline_run_billing_id column of the query results in the tasks described in the following sections.
Use the billing ID to locate a pipeline run
You can use the unique pipeline run billing ID from the query results in List your ten most expensive pipeline runs to locate a pipeline run.
Console
Use the following instructions to retrieve a pipeline run in the Google Cloud console.
In the Google Cloud console, in the Vertex AI section, go to the Pipelines page.
To locate the pipeline run, filter the list using a unique pipeline run billing ID from the query results in List your ten most expensive pipeline runs. To do this:
Click Filter and then click Labels.
Enter the unique pipeline run billing ID in the following format and press Enter:
labels.vertex-ai-pipelines-run-billing-id=PIPELINE_RUN_BILLING_ID
where PIPELINE_RUN_BILLING_ID is the unique pipeline run billing ID.
Vertex AI SDK for Python
Use the following code sample to retrieve the pipeline run:
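from google.cloud import aiplatform as aip

# List the pipeline runs that carry the given billing ID label.
runs = aip.PipelineJob.list(
    project=PROJECT_ID,
    location=LOCATION,
    filter="labels.vertex-ai-pipelines-run-billing-id=PIPELINE_RUN_BILLING_ID")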
Replace the following:
PROJECT_ID: The Google Cloud project that this pipeline runs in.
LOCATION: The region that the pipeline runs in. For more information about the regions that Vertex AI Pipelines is available in, see the Vertex AI locations guide.
PIPELINE_RUN_BILLING_ID: Unique pipeline run billing ID from the query results in List your ten most expensive pipeline runs.
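The lookup above can also be wrapped in a small helper that builds the label filter and returns the matching runs. This is only a sketch: billing_id_filter and find_runs are hypothetical names, and the code assumes that the google-cloud-aiplatform package is installed and that application default credentials are configured.

```python
BILLING_LABEL_KEY = "vertex-ai-pipelines-run-billing-id"


def billing_id_filter(billing_id: str) -> str:
    """Build the label filter expression for a pipeline run billing ID."""
    return f"labels.{BILLING_LABEL_KEY}={billing_id}"


def find_runs(project: str, location: str, billing_id: str):
    """Return the pipeline runs labeled with the billing ID (hypothetical helper)."""
    # Imported lazily so the filter builder works without the SDK.
    from google.cloud import aiplatform as aip

    return aip.PipelineJob.list(
        project=project,
        location=location,
        filter=billing_id_filter(billing_id),
    )
```

For example, you could loop over find_runs("PROJECT_ID", "LOCATION", "PIPELINE_RUN_BILLING_ID") and print the resource_name and display_name of each returned run.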
View the costs of Google Cloud resources in a pipeline run
You can use the unique pipeline run billing ID from the query results in List your ten most expensive pipeline runs to view the costs of Google Cloud resources generated by the pipeline run.
Run the following query to view the list of Google Cloud resources generated in a pipeline run, along with the cost of each resource:
Standard SQL
SELECT
  service,
  sku,
  cost
FROM
  `project.dataset.gcp_billing_export_v1_XXXXXX_XXXXXX_XXXXXX` B,
  UNNEST(B.labels) L
WHERE
  DATE(_PARTITIONTIME) >= "START_DATE"
  AND DATE(_PARTITIONTIME) < "END_DATE"
  AND L.key = "vertex-ai-pipelines-run-billing-id"
  AND L.value = "PIPELINE_RUN_BILLING_ID";
Replace the following:
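START_DATE: Start date of the time period, in YYYY-MM-DD format. The start date is included in the results.
END_DATE: End date of the time period, in YYYY-MM-DD format. The end date is excluded from the results.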
PIPELINE_RUN_BILLING_ID: Unique pipeline run billing ID from the query results in List your ten most expensive pipeline runs.
You should see the following columns in the query results:
service.id
service.description
sku.id
sku.description
cost
The cost column represents the cost of a resource corresponding to the sku.id in the pipeline run.
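The query returns one row per billing line item, so the same SKU can appear more than once. The following sketch shows one way to total the cost per service and SKU after you fetch the rows. The function name cost_by_sku is hypothetical, and the sample rows, including the service names, SKU names, and costs, are made up for illustration. The sketch assumes that each row exposes service and sku as nested mappings, which matches the columns listed above.

```python
from collections import defaultdict
from typing import Any, Iterable, List, Mapping, Tuple


def cost_by_sku(
    rows: Iterable[Mapping[str, Any]],
) -> List[Tuple[Tuple[str, str], float]]:
    """Sum cost per (service description, SKU description), highest first."""
    totals = defaultdict(float)
    for row in rows:
        key = (row["service"]["description"], row["sku"]["description"])
        totals[key] += row["cost"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up rows shaped like the query results; not real billing data.
sample_rows = [
    {"service": {"id": "svc-a", "description": "Service A"},
     "sku": {"id": "sku-1", "description": "SKU 1"}, "cost": 1.5},
    {"service": {"id": "svc-b", "description": "Service B"},
     "sku": {"id": "sku-2", "description": "SKU 2"}, "cost": 0.5},
    {"service": {"id": "svc-a", "description": "Service A"},
     "sku": {"id": "sku-1", "description": "SKU 1"}, "cost": 0.25},
]

for (service, sku), total in cost_by_sku(sample_rows):
    print(f"{service} / {sku}: {total}")
```

Grouping by description keeps the output readable. If descriptions are not unique in your data, group by service.id and sku.id instead.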