Prices are listed in US Dollars (USD).
If you pay in a currency other than USD, the prices listed in your currency on
Cloud Platform SKUs
apply.
Vertex AI pricing compared to legacy product pricing
The costs for Vertex AI remain the same as they are for the legacy
AI Platform and AutoML products that Vertex AI
supersedes, with the following exceptions:
Legacy AI Platform Prediction and AutoML Tables
predictions supported lower-cost, lower-performance machine types that aren't
supported for Vertex AI Prediction and AutoML tabular.
Legacy AI Platform Prediction supported scale-to-zero, which
isn't supported for Vertex AI Prediction.
Vertex AI also offers more ways to optimize costs, such as Spot VMs and Compute Engine reservations, which are described later on this page.
For Vertex AI AutoML models, you pay for three main activities:
Training the model
Deploying the model to an endpoint
Using the model to make predictions
Vertex AI uses predefined machine configurations for Vertex AI AutoML models,
and the hourly rate for these activities reflects the resource usage.
The time required to train your model depends on the size and complexity
of your training data. Models must be deployed before they can provide online
predictions or online explanations.
You pay for each model deployed to an endpoint, even if no prediction is made.
You must undeploy your model to stop incurring further charges.
Models that are not deployed or have failed to deploy are not charged.
You pay only for compute hours used; if training fails for any reason
other than a user-initiated cancellation, you are not billed for the
time. You are charged for training time if you cancel the operation.
Select a model type below for pricing information.
Compute associated with Vertex Explainable AI is charged at the same rate as prediction.
However, explanations take longer to process than normal predictions, so heavy
usage of Vertex Explainable AI along with auto-scaling could result in more nodes being
started, which would increase prediction charges.
The tables below provide the approximate price per hour of various training
configurations. You can choose a custom configuration of selected
machine types. To calculate pricing,
sum the costs of the virtual machines you use.
If you use Compute Engine machine types and attach
accelerators, the cost of the accelerators is separate. To calculate this cost,
multiply the prices in the table of accelerators below by how many
machine hours of each type of accelerator you use.
Machine types
You can use Spot VMs with Vertex AI custom training.
Spot VMs are billed according to
Compute Engine Spot VMs pricing. There are Vertex AI custom
training management fees in addition to your infrastructure usage, captured in the
following tables.
You can use Compute Engine reservations with Vertex AI custom training. When using
Compute Engine reservations, you're billed according to
Compute Engine Pricing, including any applicable committed
use discounts (CUDs). There are Vertex AI custom training management fees in
addition to your infrastructure usage, captured in the following tables.
*This amount includes the GPU price, since this instance type always requires a fixed number of GPU accelerators.
If you pay in a currency other than USD, the prices listed in your currency on
Cloud Platform SKUs
apply.
Accelerators
If you pay in a currency other than USD, the prices listed in your currency on
Cloud Platform SKUs
apply.
* The price for training using a Cloud TPU Pod is based on the
number of cores in the Pod. The number of cores in a Pod is always a multiple
of 32. To determine the price of training on a Pod that has more than 32 cores,
take the price for a 32-core Pod, and multiply it by the number of cores,
divided by 32. For example, for a 128-core Pod, the price is
(32-core Pod price) * (128/32). For information about which Cloud TPU
Pods are available for a specific region, see System Architecture
in the Cloud TPU documentation.
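As a quick illustration of that rule, here is a minimal sketch in Python; the 32-core Pod price is a hypothetical placeholder, so substitute the rate for your region from the accelerator table.

```python
# A minimal sketch of the Pod pricing rule; POD_32_CORE_PRICE is a
# hypothetical placeholder for the 32-core Pod price in your region.

POD_32_CORE_PRICE = 32.00  # USD per hour (hypothetical)

def tpu_pod_hourly_price(num_cores: int,
                         base_price: float = POD_32_CORE_PRICE) -> float:
    """Price of training on a Pod whose core count is a multiple of 32."""
    if num_cores < 32 or num_cores % 32 != 0:
        raise ValueError("Core count must be a positive multiple of 32")
    return base_price * (num_cores / 32)

# For a 128-core Pod, the price is (32-core Pod price) * (128 / 32):
print(tpu_pod_hourly_price(128))  # 32.00 * 4 = 128.0
```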
Disks
If you pay in a currency other than USD, the prices listed in your currency on
Cloud Platform SKUs
apply.
You are required to store your data and program files in
Cloud Storage buckets during the Vertex AI lifecycle.
See more about Cloud Storage usage.
You are charged for training your models from the moment when resources are
provisioned for a job until the job finishes.
Scale tiers for predefined configurations (AI Platform Training)
You can control the type of processing cluster to use when training your model.
The simplest way is to choose from one of the predefined configurations called
scale tiers. Read more about
scale tiers.
Machine types for custom configurations
If you use Vertex AI or select CUSTOM as your scale tier for
AI Platform Training, you have control over the number and
type of virtual machines to use for the cluster's master, worker, and parameter
servers. Read more about
machine types for Vertex AI and
machine types for AI Platform Training.
The cost of training with a custom processing cluster is the sum of all the
machines you specify. You are charged for the total time of the job, not for the
active processing time of individual machines.
Gen AI Evaluation Service
Vertex AI Gen AI Evaluation Service charges for string input and output fields per 1,000 characters. One character is defined as one Unicode character; white space is excluded from the count. Failed evaluation requests, including filtered responses, are not charged for input or output. At the end of each billing cycle, fractions of one cent ($0.01) are rounded to one cent.
Gen AI Evaluation Service is generally available (GA). Pricing took effect
on September 27, 2024.
Metric      Pricing
Pointwise   Input: $0.005 per 1k characters; Output: $0.015 per 1k characters
Pairwise    Input: $0.005 per 1k characters; Output: $0.015 per 1k characters
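The character-counting rule above can be sketched as follows. This is an illustrative sketch, not an official billing calculator; the rates are the pointwise/pairwise prices from the table.

```python
# A minimal sketch of the character-based billing rule: one Unicode
# character per character, white space excluded, priced per 1,000
# characters at the pointwise/pairwise rates above.

INPUT_RATE = 0.005   # USD per 1k characters
OUTPUT_RATE = 0.015  # USD per 1k characters

def billable_chars(text: str) -> int:
    """Count Unicode characters, excluding white space."""
    return sum(1 for ch in text if not ch.isspace())

def evaluation_cost(input_text: str, output_text: str) -> float:
    return (billable_chars(input_text) / 1000 * INPUT_RATE
            + billable_chars(output_text) / 1000 * OUTPUT_RATE)

# 2,000 billable input characters and 1,000 billable output characters
# cost 2 * $0.005 + 1 * $0.015 = $0.025 before end-of-cycle rounding.
```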
Computation-based metrics are charged at $0.00003 per 1k characters for input and $0.00009 per 1k characters for output. They are referred to as 'Automatic Metric' in the SKU.
Metric Name                Type
Exact Match                Computation-based
Bleu                       Computation-based
Rouge                      Computation-based
Tool Call Valid            Computation-based
Tool Name Match            Computation-based
Tool Parameter Key Match   Computation-based
Tool Parameter KV Match    Computation-based
Prices are listed in US Dollars (USD).
If you pay in a currency other than USD, the prices listed in your currency on
Cloud Platform SKUs
apply.
Legacy model-based metrics are charged at $0.005 per 1k characters for input and $0.015 per 1k characters for output.
Metric Name                           Type
Coherence                             Pointwise
Fluency                               Pointwise
Fulfillment                           Pointwise
Safety                                Pointwise
Groundedness                          Pointwise
Summarization Quality                 Pointwise
Summarization Helpfulness             Pointwise
Summarization Verbosity               Pointwise
Question Answering Quality            Pointwise
Question Answering Relevance          Pointwise
Question Answering Helpfulness        Pointwise
Question Answering Correctness        Pointwise
Pairwise Summarization Quality        Pairwise
Pairwise Question Answering Quality   Pairwise
Prices are listed in US Dollars (USD).
If you pay in a currency other than USD, the prices listed in your currency on
Cloud Platform SKUs
apply.
Ray on Vertex AI
Training
The tables below provide the approximate price per hour of various training
configurations. You can choose a custom configuration of selected
machine types. To calculate pricing,
sum the costs of the virtual machines you use.
If you use Compute Engine machine types and attach
accelerators, the cost of the accelerators is separate. To calculate this cost,
multiply the prices in the table of accelerators below by how many
machine hours of each type of accelerator you use.
Machine types
If you pay in a currency other than USD, the prices listed in your currency on
Cloud Platform SKUs
apply.
Accelerators
If you pay in a currency other than USD, the prices listed in your currency on
Cloud Platform SKUs
apply.
* The price for training using a Cloud TPU Pod is based on the
number of cores in the Pod. The number of cores in a Pod is always a multiple
of 32. To determine the price of training on a Pod that has more than 32 cores,
take the price for a 32-core Pod, and multiply it by the number of cores,
divided by 32. For example, for a 128-core Pod, the price is
(32-core Pod price) * (128/32). For information about which Cloud TPU
Pods are available for a specific region, see System Architecture
in the Cloud TPU documentation.
Disks
If you pay in a currency other than USD, the prices listed in your currency on
Cloud Platform SKUs
apply.
You are required to store your data and program files in
Cloud Storage buckets during the Vertex AI lifecycle.
See more about Cloud Storage usage.
You are charged for training your models from the moment when resources are
provisioned for a job until the job finishes.
Prediction and explanation
The following tables provide the prices of batch prediction, online prediction,
and online explanation per node hour. A node hour represents the time a virtual
machine spends running your prediction job or waiting in an active state (an
endpoint with one or more models deployed) to handle prediction or explanation
requests.
You can use Spot VMs with Vertex AI custom training.
Spot VMs are billed according to
Compute Engine Spot VMs pricing. There are Vertex AI custom
training management fees in addition to your infrastructure usage, captured in the
following tables.
You can use Compute Engine reservations with Vertex AI custom training. When using
Compute Engine reservations, you're billed according to
Compute Engine Pricing, including any applicable committed
use discounts (CUDs). There are Vertex AI custom training management fees in
addition to your infrastructure usage, captured in the following tables.
*When consuming from a reservation or spot capacity, billing is spread
across two SKUs: the GCE SKU with the label 'vertex-ai-online-prediction' and
the Vertex AI Management Fee SKU. This enables you to use your Committed Use
Discounts (CUDs) in Vertex AI.
TPU v5e
Machine type       Region     Approximate price per node hour
ct5lp-hightpu-1t   us-west1   $1.38
ct5lp-hightpu-4t   us-west1   $5.52
ct5lp-hightpu-8t   us-west1   $11.04
Pricing for Europe
The following tables provide the price per node hour for each machine type.
*When consuming from a reservation or spot capacity, billing is spread
across two SKUs: the GCE SKU with the label 'vertex-ai-online-prediction' and
the Vertex AI Management Fee SKU. This enables you to use your Committed Use
Discounts (CUDs) in Vertex AI.
Pricing for Asia Pacific
The following tables provide the price per node hour for each machine type.
*When consuming from a reservation or spot capacity, billing is spread
across two SKUs: the GCE SKU with the label 'vertex-ai-online-prediction' and
the Vertex AI Management Fee SKU. This enables you to use your Committed Use
Discounts (CUDs) in Vertex AI.
Each machine type
is charged as the following SKUs on your Google Cloud bill:
vCPU cost: measured in vCPU hours
RAM cost: measured in GB hours
GPU cost: if either built into the machine or optionally configured,
measured in GPU hours
The prices for machine types are used to approximate the
total hourly cost for each prediction node of a model version using that machine
type.
For example, a machine type of n1-highcpu-32 includes 32 vCPUs and 32 GB of RAM.
Therefore, the hourly pricing equals 32 vCPU hours + 32 GB hours.
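A minimal sketch of that decomposition follows; the per-hour rates below are hypothetical placeholders, so substitute the vCPU and RAM SKU prices for your region.

```python
# A minimal sketch of the SKU decomposition; the per-hour rates below
# are hypothetical placeholders, not actual SKU prices.

VCPU_RATE_PER_HOUR = 0.03   # USD per vCPU hour (hypothetical)
RAM_RATE_PER_HOUR = 0.004   # USD per GB hour (hypothetical)

def hourly_node_price(vcpus: int, ram_gb: float) -> float:
    """Approximate hourly cost of one prediction node."""
    return vcpus * VCPU_RATE_PER_HOUR + ram_gb * RAM_RATE_PER_HOUR

# n1-highcpu-32: 32 vCPUs and 32 GB of RAM, billed as
# 32 vCPU hours + 32 GB hours for each hour the node runs.
print(hourly_node_price(32, 32))
```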
The SKU pricing table is available by region. Each table shows the vCPU, RAM,
and built-in GPU pricing for prediction machine types, which more precisely
reflect the SKUs charged.
To view the SKU pricing per region, choose a region to view its pricing table:
*When consuming from a reservation or spot capacity, billing is spread
across two SKUs: the GCE SKU with the label 'vertex-ai-online-prediction' and
the Vertex AI Management Fee SKU. This enables you to use your Committed Use
Discounts (CUDs) in Vertex AI.
Some machine types allow you to add optional
GPU
accelerators for prediction. Optional GPUs incur an additional charge,
separate from those described in the previous table. View each pricing table,
which describes the pricing for each type of optional GPU.
Pricing is per GPU. If you use multiple GPUs per prediction node (or if your
version scales to use multiple nodes), the costs scale accordingly.
AI Platform Prediction serves predictions from your model by running a number of
virtual machines ("nodes"). By default, Vertex AI automatically
scales the number of nodes running at any time. For online prediction, the
number of nodes scales to meet demand. Each node can respond to multiple
prediction requests. For batch prediction, the number of nodes scales to reduce
the total time it takes to run a job. You can customize how prediction nodes
scale.
You are charged for the time that each node runs for your model, including:
When the node is processing a batch prediction job.
When the node is processing an online prediction request.
When the node is in a ready state for serving online predictions.
The cost of one node running for one hour is a node hour. The table of
prediction prices describes the
price of a node hour, which varies across regions and between online prediction
and batch prediction.
You can consume node hours in fractional increments. For example, one node
running for 30 minutes costs 0.5 node hours.
Cost calculations for Compute Engine (N1) machine types
The running time of a node is billed in 30-second increments. This means that
every 30 seconds, your project is billed for 30 seconds' worth of whatever
vCPU, RAM, and GPU resources your node is using at that moment.
More about automatic scaling of prediction nodes
Online prediction:
- The priority of scaling is to reduce the latency of individual requests. The service keeps your model in a ready state for a few idle minutes after servicing a request.
- Scaling affects your total charges each month: the more numerous and frequent your requests, the more nodes are used.
- You can let the service scale in response to traffic (automatic scaling) or specify a number of nodes to run constantly to avoid latency (manual scaling).
- If you choose automatic scaling, the number of nodes scales automatically. For AI Platform Prediction legacy (MLS1) machine type deployments, the number of nodes can scale down to zero during periods of no traffic. Vertex AI deployments and other types of AI Platform Prediction deployments cannot scale down to zero nodes.
- If you choose manual scaling, you specify a number of nodes to keep running all the time. You are charged for all of the time that these nodes are running, from the time of deployment until you delete the model version.

Batch prediction:
- The priority of scaling is to reduce the total elapsed time of the job.
- Scaling should have little effect on the price of your job, though there is some overhead involved in bringing up a new node.
- You can affect scaling by setting a maximum number of nodes to use for a batch prediction job, and by setting the number of nodes to keep running for a model when you deploy it.
Batch prediction jobs are charged after job completion
Batch prediction jobs are charged after job completion, not incrementally during
the job. Any Cloud Billing budget alerts that you have configured aren't
triggered while a job is running. Before starting a large job, consider
running some cost benchmark jobs with small input data first.
Example of a prediction calculation
A real-estate company in an Americas region runs a weekly prediction of
housing values in areas it serves. In one month, it runs predictions for
four weeks in batches of 3,920, 4,277, 3,849, and 3,961 instances. Jobs are
limited to one node, and each instance takes an average of 0.72
seconds of processing.
First calculate the length of time that each job ran:
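A minimal sketch of that calculation, assuming a hypothetical node-hour price; substitute the batch prediction price per node hour for your region from the pricing tables.

```python
# A minimal sketch of the calculation, assuming a hypothetical
# batch prediction price per node hour.

NODE_HOUR_PRICE = 1.00        # USD per node hour (hypothetical)
SECONDS_PER_INSTANCE = 0.72   # average processing time per instance
weekly_batches = [3920, 4277, 3849, 3961]

total_node_hours = 0.0
for instances in weekly_batches:
    job_hours = instances * SECONDS_PER_INSTANCE / 3600
    total_node_hours += job_hours
    print(f"{instances} instances -> {job_hours:.3f} node hours")

# 0.784 + 0.855 + 0.770 + 0.792 = about 3.20 node hours for the month
print(f"Monthly cost: ${total_node_hours * NODE_HOUR_PRICE:.2f}")
```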
This example assumed jobs ran on a single node and took a consistent amount of
time per input instance. In real usage, make sure to account for multiple nodes
and use the actual amount of time each node spends running for your
calculations.
Charges for Vertex Explainable AI
Feature-based explanations
Feature-based explanations
come at no extra charge beyond prediction prices. However, explanations take longer
to process than normal predictions, so heavy usage of Vertex Explainable AI along with
auto-scaling could result in more nodes being started, which would increase
prediction charges.
Example-based explanations
When you upload a model or update a model's dataset, you are billed:
per node hour for the batch prediction job that is used to generate the
latent space representations of examples. This is billed at the same
rate as prediction.
a cost for building or updating indexes. This cost is the same as the
indexing costs for Vector Search,
which is number of examples * number of dimensions * 4 bytes per float * $3.00 per GB.
For example, if you have 1 million examples and a 1,000-dimensional latent
space, the cost is $12 (1,000,000 * 1,000 * 4 * 3.00 / 1,000,000,000).
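That formula can be sketched as follows, using GB = 1,000,000,000 bytes as in the example above.

```python
# A minimal sketch of the index-building cost formula above
# (using GB = 1,000,000,000 bytes, as in the example).

PRICE_PER_GB = 3.00   # USD per GB of data processed
BYTES_PER_FLOAT = 4

def index_build_cost(num_examples: int, num_dimensions: int) -> float:
    data_gb = num_examples * num_dimensions * BYTES_PER_FLOAT / 1e9
    return data_gb * PRICE_PER_GB

# 1 million examples in a 1,000-dimensional latent space:
print(index_build_cost(1_000_000, 1_000))  # 4 GB * $3.00 = $12.00
```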
When you deploy to an endpoint, you are billed per node hour for each node
in your endpoint. All compute associated with the endpoint is charged at
the same rate as prediction. However, because example-based explanations
require additional compute resources to serve the Vector Search index,
more nodes are started, which increases prediction charges.
Vertex AI Neural Architecture Search
The following tables summarize the pricing in each region where
Neural Architecture Search is available.
Prices
The following tables provide the price per hour of various configurations.
You can choose a predefined scale tier or a custom configuration of selected
machine types. If you choose a custom
configuration, sum the costs of the virtual machines you use.
Accelerator-enabled legacy machine types include the cost of the accelerators in
their pricing. If you use Compute Engine machine types and attach
accelerators, the cost of the accelerators is separate. To calculate this cost,
multiply the prices in the following table of accelerators by the number of each type
of accelerator you use.
If you pay in a currency other than USD, the prices listed in your currency on
Cloud Platform SKUs
apply.
Notes:
All use is subject to the Neural Architecture Search quota policy.
You are required to store your data and program files in
Cloud Storage buckets during the Neural Architecture Search lifecycle.
See more about Cloud Storage usage.
The disk price is only charged when you configure the disk
size of each VM to be larger than 100 GB. There is no charge for the first
100 GB (the default disk size) of disk for each VM. For example, if you
configure each VM to have 105 GB of disk, then you are charged for 5 GB of
disk for each VM.
Required use of Cloud Storage
In addition to the costs described in this document, you are required to store
data and program files in Cloud Storage buckets during the
Neural Architecture Search lifecycle. This storage is subject to the
Cloud Storage pricing policy.
Required use of Cloud Storage includes:
Staging your training application package.
Storing your training input data.
Storing the output of your jobs.
Neural Architecture Search doesn't require long-term storage of these items.
You can remove the files as soon as the operation is complete.
Free operations for managing your resources
The resource management operations provided by Neural Architecture Search are
available free of charge. The Neural Architecture Search quota policy does limit
some of these operations.
Resource     Free operations
jobs         get, list, cancel
operations   get, list, cancel, delete
Vertex AI Pipelines
Vertex AI Pipelines charges a run execution fee of $0.03 per
Pipeline Run. You are not charged the execution fee during the Preview release.
You also pay for Google Cloud resources you use with
Vertex AI Pipelines, such as Compute Engine resources consumed by
pipeline components (charged at the same rate as for
Vertex AI training). Finally, you are responsible for the
cost of any services (such as Dataflow) called by your pipeline.
Vertex AI Feature Store
Vertex AI Feature Store has been Generally Available (GA) since November 2023. For
information about the previous version of the product, see Vertex AI Feature Store (Legacy).
New Vertex AI Feature Store
The new Vertex AI Feature Store supports two types of operations:
Offline operations are operations to transfer, store,
retrieve, and transform data in the offline store (BigQuery).
Online operations are operations to transfer data into the
online store(s) and to operate on data while it is in the online store(s).
Offline Operations Pricing
Because BigQuery is used for offline operations, refer to BigQuery pricing for
functionality such as ingestion to the offline store, querying the offline
store, and offline storage.
Online Operations Pricing
For online operations, Vertex AI Feature Store charges for any GA features used to
transfer data into the online store, serve data, or store data. A node-hour
represents the time a virtual machine spends completing an operation, charged
to the minute.
Optimized online serving and Bigtable online serving use different architectures, so their nodes are not comparable.
If you pay in a currency other than USD, the prices listed in your currency on
Cloud Platform SKUs
apply.
Online Operations Workload Estimates
Consider the following guidelines when estimating your workloads. The number of
nodes required for a given workload may differ across each serving approach.
Data processing:
Ingestion - One node can ingest a minimum of approximately 100 MB of data per hour into a Bigtable Online Store or an Optimized Online Store if no analytical functions are used.
Bigtable online serving: Each node can support approximately 15,000 QPS and up to 5 TB of storage.
Optimized online serving: Performance is based on the machine type and replicas, which are automatically configured to minimize costs subject to the workload. Each node can have
a minimum of 2 and a maximum of 6 replicas for high availability and autoscaling; you're charged for the number of replicas accordingly. For more details, see the example monthly scenarios.
For non-embeddings workloads, each node can support approximately 500 QPS and up to 200 GB of storage.
For embeddings workloads, each node can support approximately 500 QPS and up to 4 GB of storage of 512-dimensional data.
You can view the number of nodes (with replicas) in the Metric Explorer:
Example Monthly Scenarios (assuming us-central1)
Data streaming workload - Bigtable online serving with 2.5 TB of data
(1 GB refreshed daily) and 1,200 QPS
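The node counts implied by this scenario can be roughly sanity-checked against the sizing guidelines above; this is only an illustration, since the service configures nodes automatically.

```python
import math

# A rough sanity check for this scenario, derived only from the sizing
# guidelines above (about 15,000 QPS and up to 5 TB per Bigtable serving
# node; about 100 MB ingested per node hour). The service sizes nodes
# automatically, so actual billing may differ.

qps, storage_tb = 1_200, 2.5
refreshed_mb_per_day = 1_000  # 1 GB refreshed daily

serving_nodes = max(math.ceil(qps / 15_000), math.ceil(storage_tb / 5))
ingest_node_hours_per_day = refreshed_mb_per_day / 100

print(serving_nodes)              # 1 node covers both QPS and storage
print(ingest_node_hours_per_day)  # about 10 node hours/day of ingestion
```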
Vertex AI Feature Store (Legacy)
Prices for Vertex AI Feature Store (Legacy) are based on the amount of feature
data in online and offline storage, as well as the availability of online
serving. A node hour represents the time a virtual machine spends serving
feature data or waiting in a ready state to handle feature data requests.
If you pay in a currency other than USD, the prices listed in your currency on
Cloud Platform SKUs
apply.
When you enable feature value monitoring, billing includes applicable charges above in addition to applicable charges that follow:
$3.50 per GB for all data analyzed. With snapshot analysis enabled, snapshots taken for data in Vertex AI Feature Store (Legacy) are included. With import feature analysis enabled, batches of ingested data are included.
Additional charges for other Vertex AI Feature Store (Legacy) operations used with feature value monitoring include the following:
The snapshot analysis feature periodically takes a snapshot of the feature values based on your configuration for the monitoring interval.
The charge for a snapshot export is the same as a regular batch export operation.
Snapshot Analysis Example
A data scientist enables feature value monitoring for their Vertex AI Feature Store (Legacy) and turns on monitoring for a daily snapshot analysis.
A pipeline runs daily for the entity types monitoring. The pipeline scans 2GB of data in Vertex AI Feature Store (Legacy) and exports a snapshot containing 0.1GB of data.
The total charge for one day's analysis is:
(0.1 GB * $3.50) + (2 GB * $0.005) = $0.36
Ingestion Analysis Example
A data scientist enables feature value monitoring for their Vertex AI Feature Store (Legacy) and turns on monitoring for ingestion operations.
An ingestion operation imports 1GB of data into Vertex AI Feature Store (Legacy).
The total charge for feature value monitoring is:
(1 GB * $3.50) = $3.50
Vertex ML Metadata
Metadata storage is measured in binary gigabytes (GiB), where 1 GiB is
1,073,741,824 bytes. This unit of measurement is also known as a gibibyte.
Vertex ML Metadata charges $10 per gibibyte (GiB) per month for
metadata storage. Prices are pro-rated per megabyte (MB). For example, if
you store 10 MB of metadata, you are charged $0.10 per month for that 10 MB
of metadata.
Prices are the same in all regions where Vertex ML Metadata is
supported.
Vertex AI TensorBoard
To use Vertex AI TensorBoard, request that the IAM administrator of the
project assign you to the role
"Vertex AI TensorBoard Web App User". The Vertex AI Administrator role
also has access.
Beginning in August 2023, Vertex AI TensorBoard pricing changed
from a per-user monthly license of $300 per month to $10 per GiB per month for
storage of logs and metrics. This means no more subscription fees; you pay only
for the storage you've used. See the
Vertex AI TensorBoard: Delete Outdated TensorBoard Experiments
tutorial for how to manage storage.
Vertex AI Vizier
Vertex AI Vizier is a black-box optimization service inside Vertex AI.
The Vertex AI Vizier pricing model consists of the following:
The first 100 Vertex AI Vizier trials per calendar month are available at no charge (trials using RANDOM_SEARCH and GRID_SEARCH do not count against this total).
After 100 Vertex AI Vizier trials, subsequent trials during the same calendar month are charged at $1 per trial (trials that use RANDOM_SEARCH or GRID_SEARCH incur no charges).
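A minimal sketch of that billing rule, as a monthly charge calculator:

```python
# A minimal sketch of the Vizier billing rule: the first 100 billable
# trials per calendar month are free, then $1 per trial. Trials using
# RANDOM_SEARCH or GRID_SEARCH are never billed.

FREE_TRIALS_PER_MONTH = 100
PRICE_PER_TRIAL = 1.00  # USD

def vizier_monthly_charge(billable_trials: int) -> float:
    """billable_trials excludes RANDOM_SEARCH and GRID_SEARCH trials."""
    return max(0, billable_trials - FREE_TRIALS_PER_MONTH) * PRICE_PER_TRIAL

print(vizier_monthly_charge(250))  # (250 - 100) * $1 = $150.00
```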
Vector Search
Pricing for Vector Search Approximate Nearest Neighbor service consists of:
Per node hour pricing for each VM used to host a deployed index.
A cost for building new indexes, updating existing indexes, and using streaming index updates.
Data processed during building and updating indexes is measured in binary
gigabytes (GiB), where 1 GiB is 1,073,741,824 bytes. This unit of measurement
is also known as a gibibyte.
Vector Search charges $3.00 per gibibyte (GiB) of data
processed in all regions. Vector Search charges $0.45/GiB ingested
for Streaming Update inserts.
The following tables summarize the pricing of index serving in each region where
Vector Search is available. The price depends on the machine type and region,
and is charged per node hour.
Vector Search pricing is determined by the size of your data, the
number of queries per second (QPS) you want to run, and the number of nodes you use.
To get your estimated serving cost, you need to calculate your total data size:
the number of embeddings/vectors * the number of dimensions * 4 bytes per dimension.
After you have the size of your data, you can calculate
the serving cost and the building cost. The serving cost plus the building cost
equals your monthly total cost.
Building cost: data size (in GiB) * $3/GiB * number of updates per month
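A minimal sketch of the data-size and building-cost arithmetic, using GiB = 2^30 bytes:

```python
# A minimal sketch of the data-size and building-cost arithmetic above
# (using GiB = 2**30 bytes).

BYTES_PER_DIMENSION = 4
BUILD_PRICE_PER_GIB = 3.00  # USD per GiB of data processed

def data_size_gib(num_vectors: int, num_dimensions: int) -> float:
    return num_vectors * num_dimensions * BYTES_PER_DIMENSION / 2**30

def monthly_building_cost(num_vectors: int, num_dimensions: int,
                          updates_per_month: int) -> float:
    return (data_size_gib(num_vectors, num_dimensions)
            * BUILD_PRICE_PER_GIB * updates_per_month)

# 20 million 256-dimensional vectors is about 19.07 GiB; rebuilding the
# index four times in a month costs about 19.07 * $3.00 * 4 = $228.88.
print(monthly_building_cost(20_000_000, 256, 4))
```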
Streaming update: Vector Search uses heuristics-based metrics to determine when to trigger compaction. If the oldest uncompacted data is five days old, compaction is always triggered. You are billed for the cost of rebuilding the index at the same rate of a batch update, in addition to the streaming update costs.
Number of embeddings/vectors   Number of dimensions   Queries per second (QPS)   Machine type     Nodes   Estimated monthly serving cost
2 million                      128                    100                        e2-standard-2    1       $68
20 million                     256                    1,000                      e2-standard-16   1       $547
20 million                     256                    3,000                      e2-standard-16   3       $1,642
100 million                    256                    500                        e2-highmem-16    2       $1,477
1 billion                      100                    500                        e2-highmem-16    8       $5,910
All examples are based on machine types in us-central1.
The cost you incur varies with your recall and latency requirements. The estimated monthly
serving cost is directly related to the number of nodes used, which is shown in the console.
To learn more about configuration parameters that affect cost, see
Configuration parameters which affect recall and latency.
If you have a high query rate (QPS), batching these queries can
reduce total costs by up to 30%-40%.
Vertex AI Model Registry
The Vertex AI Model Registry is a central repository that tracks and lists your
models and model versions. You can import models into Vertex AI and they appear in
the Vertex AI Model Registry. There is no cost associated with having your models in the
Model Registry. Cost is only incurred when you deploy the model to an endpoint or
perform a batch prediction on the model. This cost is determined by the type of model you are deploying.
To learn more about pricing for deploying custom models from the Vertex AI Model Registry, see
Custom-trained models. To learn more about pricing for
deploying AutoML models, see Pricing for AutoML models.
Vertex AI Model Monitoring
Vertex AI enables you to monitor the continued effectiveness of your
model after you deploy it to production. For more information, see
Introduction to Vertex AI Model Monitoring.
When you use Vertex AI Model Monitoring, you are billed for the following:
$3.50 per GB for all data analyzed, including
the training data provided and prediction data logged in a BigQuery table.
Charges for other Google Cloud products that you use with Model Monitoring, such as BigQuery
storage or Batch Explain when attribution monitoring is enabled.
Vertex AI Model Monitoring is supported in the following regions: us-central1,
europe-west4, asia-east1, and asia-southeast1. Prices are the same for all
regions.
Data sizes are measured after they are converted to TFRecord format.
Training datasets incur a one-time charge when you set up a
Vertex AI Model Monitoring job.
Prediction Datasets consist of logs collected from the Online
Prediction service. As prediction requests arrive during different time windows,
the data for each time window is collected and the sum of the
data analyzed for each prediction window is used to calculate the charge.
Example:
A data scientist runs model monitoring on the prediction traffic belonging to
their model.
The model is trained from a BigQuery dataset. The data size after converting to
TFRecord is 1.5 GB.
Prediction data logged between 1:00 - 2:00 p.m. is 0.1 GB,
between 3:00 - 4:00 p.m. is 0.2 GB.
The total price for setting up the model monitoring job is:
(1.5 GB + 0.1 GB + 0.2 GB) * $3.50 per GB = $6.30
In addition to the costs mentioned previously,
you also pay for any Google Cloud resources that you use.
For example:
Data analysis services: You incur BigQuery costs when you
issue SQL queries within a notebook (see
BigQuery pricing).
Customer-managed encryption keys: You incur costs when you use
customer-managed encryption keys. Each time
your managed notebooks or
user-managed notebooks instance
uses a Cloud Key Management Service key, that operation is billed at the rate
of Cloud KMS key operations
(see Cloud Key Management Service pricing).
Deep Learning Containers, Deep Learning VM, and AI Platform Pipelines
For Deep Learning Containers, Deep Learning VM Images,
and AI Platform Pipelines,
pricing is calculated based on the compute and storage resources that you use.
These resources are charged at the same rate you currently
pay for Compute Engine and
Cloud Storage.
In addition to the compute and storage costs,
you also pay for any Google Cloud resources that you use.
For example:
Data analysis services: You incur BigQuery costs when you
issue SQL queries within a notebook (see
BigQuery pricing).
Customer-managed encryption keys: You incur costs when you use
customer-managed encryption keys. Each time
your managed notebooks or
user-managed notebooks instance
uses a Cloud Key Management Service key, that operation is billed at the rate
of Cloud KMS key operations
(see Cloud Key Management Service pricing).
Data labeling
Vertex AI enables you to request human labeling for a collection
of data that you plan to use to train a custom machine learning model.
Prices for the service are computed based on the type of labeling task.
For regular labeling tasks, the prices are determined by the number of
annotation units, as illustrated in the sketch that follows this list.
For an image classification task, units are determined by the number of
images and the number of human labelers. For example, an image with
3 human labelers counts for 1 * 3 = 3 units. The price for single-label
and multi-label classification is the same.
For an image bounding box task, units are determined by the number of
bounding boxes identified in the images and the number of human labelers.
For example, an image with 2 bounding boxes and 3 human labelers
counts for 2 * 3 = 6 units. Images without bounding boxes are not
charged.
For an image segmentation/rotated box/polyline/polygon task, units are
determined in the same way as an image bounding box task.
For a video classification task, units are determined by the video length
(every 5 seconds is a price unit) and the number of human labelers. For
example, a 25-second video with 3 human labelers counts for 25 / 5 * 3 =
15 units. The price for single-label and multi-label classification is
the same.
For a video object tracking task, units are determined by the number of
objects identified in the video and the number of human labelers. For
example, a video with 2 objects and 3 human labelers counts for
2 * 3 = 6 units. Videos without objects are not charged.
For a video action recognition task, units are determined in the same way as a video
object tracking task.
For a text classification task, units are determined by text length
(every 50 words is a price unit) and the number of human labelers. For
example, one piece of text with 100 words and 3 human labelers counts for
100 / 50 * 3 = 6 units. The price for single-label and multi-label
classification is the same.
For a text sentiment task, units are determined in the same way as a text
classification task.
For a text entity extraction task, units are determined by text length
(every 50 words is a price unit), the number of entities identified, and
the number of human labelers. For example, a piece of text with 100 words,
2 entities identified, and 3 human labelers counts for 100 / 50 * 2 * 3 =
12 units. Text without entities is not charged.
For image/video/text classification and text sentiment tasks, human labelers
may lose track of classes if the label set is too large. As a result, we
send at most 20 classes to the human labelers at a time. For example, if
the label set size of a labeling task is 40, each data item is sent for
human review 40 / 20 = 2 times, and we charge 2 times the price
calculated above accordingly.
For a labeling task that enables the custom labeler feature, each data item is
counted as 1 custom labeler unit.
For an active learning labeling task for data items with annotations that
are generated by models (without a human labeler's help), each data item is
counted as 1 active learning unit.
For an active learning labeling task for data items with annotations that are
generated by human labelers, each data item is counted as a regular labeling
task as described above.
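Here is the sketch referenced above, translating the unit rules into Python; the helper names are illustrative, and each rule is taken directly from the examples in the list.

```python
# A minimal sketch of the annotation-unit arithmetic described above.

def image_classification_units(num_images: int, num_labelers: int,
                               label_set_size: int = 20) -> int:
    # Each data item is sent for review once per group of 20 classes.
    review_rounds = -(-label_set_size // 20)  # ceiling division
    return num_images * num_labelers * review_rounds

def bounding_box_units(num_boxes: int, num_labelers: int) -> int:
    return num_boxes * num_labelers

def video_classification_units(video_seconds: int, num_labelers: int) -> int:
    return (video_seconds // 5) * num_labelers  # every 5 seconds is a unit

def text_entity_units(num_words: int, num_entities: int,
                      num_labelers: int) -> int:
    return (num_words // 50) * num_entities * num_labelers  # 50-word units

print(image_classification_units(1, 3))      # 1 * 3 = 3 units
print(image_classification_units(1, 3, 40))  # 2 review rounds -> 6 units
print(bounding_box_units(2, 3))              # 2 * 3 = 6 units
print(video_classification_units(25, 3))     # 25 / 5 * 3 = 15 units
print(text_entity_units(100, 2, 3))          # 100 / 50 * 2 * 3 = 12 units
```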
The table below provides the price per 1,000 units per human labeler, based on
the unit listed for each objective. Tier 1 pricing applies to the first 50,000
units per month in each Google Cloud project; Tier 2 pricing applies to the next
950,000 units per month in the project, up to 1,000,000 units.
Contact us for pricing above 1,000,000
units per month.
Data type         Objective            Unit                    Tier 1   Tier 2
Image             Classification       Image                   $35      $25
Image             Bounding box         Bounding box            $63      $49
Image             Segmentation         Segment                 $870     $850
Image             Rotated box          Bounding box            $86      $60
Image             Polygon/polyline     Polygon/Polyline        $257     $180
Video             Classification       5sec video              $86      $60
Video             Object tracking      Bounding box            $86      $60
Video             Action recognition   Event in 30sec video    $214     $150
Text              Classification       50 words                $129     $90
Text              Sentiment            50 words                $200     $140
Text              Entity extraction    Entity                  $86      $60
Active Learning   All                  Data item               $80      $56
Custom Labeler    All                  Data item               $80      $56
Required use of Cloud Storage
In addition to the costs described in this document, you are required to store
data and program files in Cloud Storage buckets during the
Vertex AI lifecycle. This storage is subject to the
Cloud Storage pricing policy.
Required use of Cloud Storage includes:
Staging your training application package for custom-trained models.
Storing your training input data.
Storing the output of your training jobs.
Vertex AI does not require long-term storage of these items.
You can remove the files as soon as the operation is complete.
Free operations for managing your resources
The resource management operations provided by AI Platform are
available free of charge. The AI Platform quota policy does limit
some of these operations.
Resource     Free operations
models       create, get, list, delete
versions     create, get, list, delete, setDefault
jobs         get, list, cancel
operations   get, list, cancel, delete
Google Cloud costs
If you store images to be analyzed in Cloud Storage or use other
Google Cloud resources in tandem with Vertex AI, then
you will also be billed for the use of those services.
With Google Cloud's pay-as-you-go pricing, you only pay for the services you
use. Connect with our sales team to get a custom quote for your organization.
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],[],[],[]]