Vertex AI pricing

Prices are listed in US Dollars (USD). If you pay in a currency other than USD, the prices listed in your currency on Cloud Platform SKUs apply.

Vertex AI pricing compared to legacy product pricing

The costs for Vertex AI remain the same as they are for the legacy AI Platform and AutoML products that Vertex AI supersedes, with the following exceptions:

  • Legacy AI Platform Prediction and AutoML Tables predictions supported lower-cost, lower-performance machine types that aren't supported for Vertex AI Inference and AutoML tabular.
  • Legacy AI Platform Prediction supported scale-to-zero, which isn't supported for Vertex AI Inference.

Vertex AI also offers more ways to optimize costs.

Pricing for Generative AI on Vertex AI

For Generative AI on Vertex AI pricing information, see Pricing for Generative AI on Vertex AI.

Pricing for AutoML models

For Vertex AI AutoML models, you pay for three main activities:

  • Training the model
  • Deploying the model to an endpoint
  • Using the model to make predictions

Vertex AI uses predefined machine configurations for Vertex AutoML models, and the hourly rate for these activities reflects the resource usage.

The time required to train your model depends on the size and complexity of your training data. Models must be deployed before they can provide online predictions or online explanations.

You pay for each model deployed to an endpoint, even if no prediction is made. You must undeploy your model to stop incurring further charges. Models that are not deployed or have failed to deploy are not charged.

You pay only for compute hours used; if training fails for any reason other than a user-initiated cancellation, you are not billed for the time. You are charged for training time if you cancel the operation.

Select a model type below for pricing information.

Image data

| Operation | Price (classification) (USD) | Price (object detection) (USD) |
| --- | --- | --- |
| Training | $3.465 / 1 hour | $3.465 / 1 hour |
| Training (Edge on-device model) | $18.00 / 1 hour | $18.00 / 1 hour |
| Deployment and online prediction | $1.375 / 1 hour | $2.002 / 1 hour |
| Batch prediction | $2.222 / 1 hour | $2.222 / 1 hour |
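To make the hourly billing concrete, here is a minimal sketch that multiplies hours by the image data rates listed above. The function name, the dictionary layout, and the example workload are hypothetical, and the sketch assumes a single deployed node billed for every hour it stays deployed, whether or not it serves predictions.

```python
# Hourly rates (USD) copied from the image data pricing table above.
IMAGE_RATES = {
    "classification": {"training": 3.465, "deployment": 1.375, "batch": 2.222},
    "object_detection": {"training": 3.465, "deployment": 2.002, "batch": 2.222},
}


def automl_image_cost(objective, training_hours=0.0, deployed_hours=0.0, batch_hours=0.0):
    """Estimate AutoML image charges as hours multiplied by the hourly rate."""
    rates = IMAGE_RATES[objective]
    return (
        training_hours * rates["training"]
        + deployed_hours * rates["deployment"]
        + batch_hours * rates["batch"]
    )


# Hypothetical workload: 8 training hours, then one day deployed to an endpoint.
estimate = automl_image_cost("classification", training_hours=8, deployed_hours=24)
print(f"${estimate:.2f}")  # prints $60.72
```

Because a deployed model is billed even when idle, the deployment term usually dominates for models that stay on an endpoint for long periods.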

Tabular data

| Operation | Price per node hour for classification/regression | Price for forecasting |
| --- | --- | --- |
| Training | $21.252 / 1 hour | Refer to Vertex AI Forecast |
| Prediction | Same price as predictions for custom-trained models. Vertex AI performs batch prediction using 40 n1-highmem-8 machines. | Refer to Vertex AI Forecast |

Prediction charges for Vertex Explainable AI

Compute associated with Vertex Explainable AI is charged at the same rate as prediction. However, explanations take longer to process than normal predictions, so heavy use of Vertex Explainable AI together with autoscaling can start more nodes, which increases prediction charges.

Vertex AI Forecast

AutoML

| Stage | Pricing |
| --- | --- |
| Prediction* | 0 to 1,000,000 data points: $0.20 / 1,000 data points, per month per account |
| | 1,000,000 to 50,000,000 data points: $0.10 / 1,000 data points, per month per account |
| | 50,000,000 data points and above: $0.02 / 1,000 data points, per month per account |
| Training | $21.252 / 1 hour |
| Explainable AI | Explainability using Shapley values. Refer to the Vertex AI Inference and Explanation pricing page. |

* A prediction data point is one time point in the forecast horizon. For example, with daily granularity, a 7-day horizon is 7 points for each time series.

  • Up to 5 prediction quantiles can be included at no additional cost.
  • The number of data points consumed per tier is refreshed monthly.
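The tiered prediction rates can be sketched as follows. This is an illustration, not the billing implementation: it assumes the tiers are graduated (each rate applies only to the data points that fall inside its tier), which the page does not state explicitly, and the function names are hypothetical.

```python
# (upper bound of the tier in data points, USD per 1,000 data points),
# taken from the AutoML forecasting prediction tiers above.
TIERS = [
    (1_000_000, 0.20),
    (50_000_000, 0.10),
    (float("inf"), 0.02),
]


def forecast_data_points(num_series, horizon_points):
    """One data point per time point in the horizon, for each time series."""
    return num_series * horizon_points


def forecast_prediction_cost(data_points):
    """Monthly cost, ASSUMING graduated tiers (an assumption of this sketch)."""
    cost = 0.0
    lower = 0
    for upper, rate_per_1000 in TIERS:
        if data_points <= lower:
            break
        in_tier = min(data_points, upper) - lower
        cost += in_tier / 1000 * rate_per_1000
        lower = upper
    return cost


# Hypothetical month: 200,000 daily series, each forecast over a 7-day horizon.
points = forecast_data_points(200_000, 7)
print(points, round(forecast_prediction_cost(points), 2))
```

If the tiers were instead applied to the whole monthly volume at a single rate, the loop would be replaced by a lookup of the tier that contains the total.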

ARIMA+

| Stage | Pricing |
| --- | --- |
| Prediction | $5.00 / 1,000 count |
| Training | $250.00 per TB x Number of Candidate Models x Number of Backtesting Windows* |
| Explainable AI | Explainability with time series decomposition does not add any additional cost. Explainability using Shapley values is not supported. |

Refer to the BigQuery ML pricing page for additional details. Each training and prediction job incurs the cost of 1 managed pipeline run, as described in Vertex AI pricing.

* A backtesting window is created for each period in the test set. The AUTO_ARIMA_MAX_ORDER value determines the number of candidate models, which ranges from 6 to 42 for models with multiple time series.
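The ARIMA+ training formula above is a straight product, sketched below. The function name and the example inputs are hypothetical; the sketch does not include the managed pipeline run that each job also incurs.

```python
def arima_plus_training_cost(data_tb, candidate_models, backtesting_windows,
                             price_per_tb=250.00):
    """$250.00 per TB x number of candidate models x number of backtesting windows."""
    return price_per_tb * data_tb * candidate_models * backtesting_windows


# Hypothetical job: 0.01 TB of data, 6 candidate models, 5 backtesting windows.
print(round(arima_plus_training_cost(0.01, 6, 5), 2))
```

Since the cost scales with both the candidate model count and the number of backtesting windows, halving either one halves the training charge.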

Custom-trained models

Training

The tables below provide the approximate price per hour of various training configurations. You can choose a custom configuration of selected machine types. To calculate pricing, sum the costs of the virtual machines you use.

If you use Compute Engine machine types and attach accelerators, the cost of the accelerators is separate. To calculate this cost, multiply the prices in the table of accelerators below by how many machine hours of each type of accelerator you use.
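As a rough illustration of the calculation described above, the sketch below sums machine and accelerator prices and multiplies by the job duration. The two dictionaries hold a few prices copied from the machine type and accelerator tables in this section; the function name, the example workload, and the choice to ignore disks and management fees are assumptions made for the example.

```python
# Hourly prices (USD) copied from the custom training tables in this section.
MACHINE_PRICES = {"n1-standard-8": 0.4369977, "n1-highmem-16": 1.0883876}
ACCELERATOR_PRICES = {"NVIDIA_TESLA_T4": 0.4025, "NVIDIA_TESLA_V100": 2.852}


def training_job_cost(hours, machines, accelerators=None):
    """Estimate a job's cost.

    machines: {machine_type: count}; accelerators: {accelerator_type: count}.
    Disks and Vertex management fees are NOT included in this sketch.
    """
    hourly = sum(MACHINE_PRICES[name] * count for name, count in machines.items())
    hourly += sum(
        ACCELERATOR_PRICES[name] * count
        for name, count in (accelerators or {}).items()
    )
    return hourly * hours


# Hypothetical job: two n1-standard-8 workers, one T4 each, running for 3 hours.
cost = training_job_cost(3, {"n1-standard-8": 2}, {"NVIDIA_TESLA_T4": 2})
print(round(cost, 4))
```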

Machine types

You can use Spot VMs with Vertex AI custom training. Spot VMs are billed according to Compute Engine Spot VMs pricing. There are Vertex AI custom training management fees in addition to your infrastructure usage, captured in the following tables.

You can use Compute Engine reservations with Vertex AI custom training. When using Compute Engine reservations, you're billed according to Compute Engine Pricing, including any applicable committed use discounts (CUDs). There are Vertex AI custom training management fees in addition to your infrastructure usage, captured in the following tables.

  • Johannesburg (africa-south1)
  • Taiwan (asia-east1)
  • Hong Kong (asia-east2)
  • Tokyo (asia-northeast1)
  • Osaka (asia-northeast2)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Delhi (asia-south2)
  • Singapore (asia-southeast1)
  • Jakarta (asia-southeast2)
  • Sydney (australia-southeast1)
  • Melbourne (australia-southeast2)
  • Warsaw (europe-central2)
  • Finland (europe-north1)
  • Madrid (europe-southwest1)
  • Belgium (europe-west1)
  • Turin (europe-west12)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Milan (europe-west8)
  • Paris (europe-west9)
  • Doha (me-central1)
  • Dammam (me-central2)
  • Tel Aviv (me-west1)
  • Montreal (northamerica-northeast1)
  • Toronto (northamerica-northeast2)
  • Sao Paulo (southamerica-east1)
  • Santiago (southamerica-west1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Dallas (us-south1)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Salt Lake City (us-west3)
  • Las Vegas (us-west4)

| Machine type | Price (USD) |
| --- | --- |
| n1-standard-4 | $0.21849885 / 1 hour |
| n1-standard-8 | $0.4369977 / 1 hour |
| n1-standard-16 | $0.8739954 / 1 hour |
| n1-standard-32 | $1.7479908 / 1 hour |
| n1-standard-64 | $3.4959816 / 1 hour |
| n1-standard-96 | $5.2439724 / 1 hour |
| n1-highmem-2 | $0.13604845 / 1 hour |
| n1-highmem-4 | $0.2720969 / 1 hour |
| n1-highmem-8 | $0.5441938 / 1 hour |
| n1-highmem-16 | $1.0883876 / 1 hour |
| n1-highmem-32 | $2.1767752 / 1 hour |
| n1-highmem-64 | $4.3535504 / 1 hour |
| n1-highmem-96 | $6.5303256 / 1 hour |
| n1-highcpu-16 | $0.65180712 / 1 hour |
| n1-highcpu-32 | $1.30361424 / 1 hour |
| n1-highcpu-64 | $2.60722848 / 1 hour |
| n1-highcpu-96 | $3.91084272 / 1 hour |
| a2-highgpu-1g* | $4.425248914 / 1 hour |
| a2-highgpu-2g* | $8.850497829 / 1 hour |
| a2-highgpu-4g* | $17.700995658 / 1 hour |
| a2-highgpu-8g* | $35.401991315 / 1 hour |
| a2-megagpu-16g* | $65.707278915 / 1 hour |
| a3-highgpu-8g* | $101.007352 / 1 hour |
| a3-megagpu-8g* | $106.0464232 / 1 hour |
| a3-ultragpu-8g* | $99.7739296 / 1 hour |
| a4-highgpu-8g* | - |
| e2-standard-4 | $0.154126276 / 1 hour |
| e2-standard-8 | $0.308252552 / 1 hour |
| e2-standard-16 | $0.616505104 / 1 hour |
| e2-standard-32 | $1.233010208 / 1 hour |
| e2-highmem-2 | $0.103959618 / 1 hour |
| e2-highmem-4 | $0.207919236 / 1 hour |
| e2-highmem-8 | $0.415838472 / 1 hour |
| e2-highmem-16 | $0.831676944 / 1 hour |
| e2-highcpu-16 | $0.455126224 / 1 hour |
| e2-highcpu-32 | $0.910252448 / 1 hour |
| n2-standard-4 | $0.2233714 / 1 hour |
| n2-standard-8 | $0.4467428 / 1 hour |
| n2-standard-16 | $0.8934856 / 1 hour |
| n2-standard-32 | $1.7869712 / 1 hour |
| n2-standard-48 | $2.6804568 / 1 hour |
| n2-standard-64 | $3.5739424 / 1 hour |
| n2-standard-80 | $4.467428 / 1 hour |
| n2-highmem-2 | $0.1506661 / 1 hour |
| n2-highmem-4 | $0.3013322 / 1 hour |
| n2-highmem-8 | $0.6026644 / 1 hour |
| n2-highmem-16 | $1.2053288 / 1 hour |
| n2-highmem-32 | $2.4106576 / 1 hour |
| n2-highmem-48 | $3.6159864 / 1 hour |
| n2-highmem-64 | $4.8213152 / 1 hour |
| n2-highmem-80 | $6.026644 / 1 hour |
| n2-highcpu-16 | $0.6596032 / 1 hour |
| n2-highcpu-32 | $1.3192064 / 1 hour |
| n2-highcpu-48 | $1.9788096 / 1 hour |
| n2-highcpu-64 | $2.6384128 / 1 hour |
| n2-highcpu-80 | $3.298016 / 1 hour |
| c2-standard-4 | $0.2401292 / 1 hour |
| c2-standard-8 | $0.4802584 / 1 hour |
| c2-standard-16 | $0.9605168 / 1 hour |
| c2-standard-30 | $1.800969 / 1 hour |
| c2-standard-60 | $3.601938 / 1 hour |
| m1-ultramem-40 | $7.237065 / 1 hour |
| m1-ultramem-80 | $14.47413 / 1 hour |
| m1-ultramem-160 | $28.94826 / 1 hour |
| m1-megamem-96 | $12.249984 / 1 hour |
| cloud-tpu | Pricing is determined by the accelerator type. See 'Accelerators'. |

*This amount includes GPU price, since this instance type always requires a fixed number of GPU accelerators.

Accelerators

  • Johannesburg (africa-south1)
  • Taiwan (asia-east1)
  • Hong Kong (asia-east2)
  • Tokyo (asia-northeast1)
  • Osaka (asia-northeast2)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Delhi (asia-south2)
  • Singapore (asia-southeast1)
  • Jakarta (asia-southeast2)
  • Sydney (australia-southeast1)
  • Melbourne (australia-southeast2)
  • Warsaw (europe-central2)
  • Finland (europe-north1)
  • Madrid (europe-southwest1)
  • Belgium (europe-west1)
  • Berlin (europe-west10)
  • Turin (europe-west12)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Milan (europe-west8)
  • Paris (europe-west9)
  • Doha (me-central1)
  • Dammam (me-central2)
  • Tel Aviv (me-west1)
  • Montreal (northamerica-northeast1)
  • Toronto (northamerica-northeast2)
  • Sao Paulo (southamerica-east1)
  • Santiago (southamerica-west1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Dallas (us-south1)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Salt Lake City (us-west3)
  • Las Vegas (us-west4)
  • Phoenix (us-west8)

| Accelerator type | Price (USD) | Vertex Management Fee |
| --- | --- | --- |
| NVIDIA_TESLA_A100 | $2.933908 / 1 hour | $0.4400862 / 1 hour |
| NVIDIA_TESLA_A100_80GB | $3.92808 / 1 hour | $0.589212 / 1 hour |
| NVIDIA_H100_80GB | $9.79655057 / 1 hour | $1.4694826 / 1 hour |
| NVIDIA_H200_141GB | $10.708501 / 1 hour | Unavailable |
| NVIDIA_H100_MEGA_80GB | $11.8959171 / 1 hour | Unavailable |
| NVIDIA_TESLA_L4 | $0.644046276 / 1 hour | Unavailable |
| NVIDIA_TESLA_P4 | $0.69 / 1 hour | Unavailable |
| NVIDIA_TESLA_P100 | $1.679 / 1 hour | Unavailable |
| NVIDIA_TESLA_T4 | $0.4025 / 1 hour | Unavailable |
| NVIDIA_TESLA_V100 | $2.852 / 1 hour | Unavailable |
| TPU_V2 Single (8 cores) | $5.175 / 1 hour | Unavailable |
| TPU_V2 Pod (32 cores)* | $27.60 / 1 hour | Unavailable |
| TPU_V3 Single (8 cores) | $9.20 / 1 hour | Unavailable |
| TPU_V3 Pod (32 cores)* | $36.80 / 1 hour | Unavailable |

* The price for training using a Cloud TPU Pod is based on the number of cores in the Pod. The number of cores in a pod is always a multiple of 32. To determine the price of training on a Pod that has more than 32 cores, take the price for a 32-core Pod, and multiply it by the number of cores, divided by 32. For example, for a 128-core Pod, the price is (32-core Pod price) * (128/32). For information about which Cloud TPU Pods are available for a specific region, see System Architecture in the Cloud TPU documentation.
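The Pod scaling rule in the note above can be written as a small helper. This is a sketch: the function name is hypothetical, and the 32-core prices are the custom training values listed in this section.

```python
# 32-core Pod prices (USD per hour) from the custom training accelerator table.
POD_32_CORE_PRICE = {"TPU_V2": 27.60, "TPU_V3": 36.80}


def tpu_pod_hourly_price(tpu_version, cores):
    """Scale the 32-core Pod price by cores / 32, as described in the note."""
    if cores < 32 or cores % 32 != 0:
        raise ValueError("Pod core count must be a positive multiple of 32")
    return POD_32_CORE_PRICE[tpu_version] * (cores / 32)


# A 128-core Pod costs (32-core Pod price) * (128 / 32).
print(round(tpu_pod_hourly_price("TPU_V2", 128), 2))
```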

Disks

  • Johannesburg (africa-south1)
  • Taiwan (asia-east1)
  • Hong Kong (asia-east2)
  • Tokyo (asia-northeast1)
  • Osaka (asia-northeast2)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Delhi (asia-south2)
  • Singapore (asia-southeast1)
  • Jakarta (asia-southeast2)
  • Sydney (australia-southeast1)
  • Melbourne (australia-southeast2)
  • Warsaw (europe-central2)
  • Finland (europe-north1)
  • Madrid (europe-southwest1)
  • Belgium (europe-west1)
  • Turin (europe-west12)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Milan (europe-west8)
  • Paris (europe-west9)
  • Doha (me-central1)
  • Dammam (me-central2)
  • Montreal (northamerica-northeast1)
  • Toronto (northamerica-northeast2)
  • Sao Paulo (southamerica-east1)
  • Santiago (southamerica-west1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Dallas (us-south1)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Salt Lake City (us-west3)
  • Las Vegas (us-west4)

| Disk type | Price (USD) |
| --- | --- |
| pd-standard | $0.000063014 / 1 gibibyte hour |
| pd-ssd | $0.000267808 / 1 gibibyte hour |

You are charged for training your models from the moment when resources are provisioned for a job until the job finishes.

Warning: Your training jobs are limited by the Vertex AI quota policy. If you choose a very powerful processing cluster for your first training jobs, it's likely you will exceed your quota.

Scale tiers for predefined configurations (AI Platform Training)

You can control the type of processing cluster to use when training your model. The simplest way is to choose from one of the predefined configurations called scale tiers. Read more about scale tiers.

Machine types for custom configurations

If you use Vertex AI or select CUSTOM as your scale tier for AI Platform Training, you have control over the number and type of virtual machines to use for the cluster's master, worker and parameter servers. Read more about machine types for Vertex AI and machine types for AI Platform Training.

The cost of training with a custom processing cluster is the sum of all the machines you specify. You are charged for the total time of the job, not for the active processing time of individual machines.

Gen AI Evaluation Service

For model-based metrics, charges are applied only for the prediction costs associated with the underlying autorater model. They are billed based on the input tokens that you provide in your evaluation dataset and the autorater output.

Gen AI Evaluation Service is generally available (GA). The pricing change took effect on April 14, 2025.

| Metric | Pricing |
| --- | --- |
| Pointwise | Default autorater model: Gemini 2.0 Flash |
| Pairwise | Default autorater model: Gemini 2.0 Flash |

Computation-based metrics are charged at $0.00003 per 1,000 characters of input and $0.00009 per 1,000 characters of output. In SKU listings, they appear as Automatic Metric.
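A minimal sketch of the computation-based metric charge follows. The function name and the example volumes are hypothetical, and the sketch assumes that partial thousands of characters are billed proportionally rather than rounded up, which the page does not specify.

```python
# Rates (USD per 1,000 characters) for computation-based metrics.
INPUT_RATE_PER_1K_CHARS = 0.00003
OUTPUT_RATE_PER_1K_CHARS = 0.00009


def computation_metric_cost(input_chars, output_chars):
    """Charge = input thousands x input rate + output thousands x output rate.

    ASSUMPTION: characters are billed proportionally (no rounding up to 1,000).
    """
    return (
        input_chars / 1000 * INPUT_RATE_PER_1K_CHARS
        + output_chars / 1000 * OUTPUT_RATE_PER_1K_CHARS
    )


# Hypothetical evaluation run: 2,000,000 input and 500,000 output characters.
print(round(computation_metric_cost(2_000_000, 500_000), 3))
```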

| Metric Name | Type |
| --- | --- |
| Exact Match | Computation-based |
| Bleu | Computation-based |
| Rouge | Computation-based |
| Tool Call Valid | Computation-based |
| Tool Name Match | Computation-based |
| Tool Parameter Key Match | Computation-based |
| Tool Parameter KV Match | Computation-based |

Legacy model-based metrics are charged at $0.005 per 1,000 characters of input and $0.015 per 1,000 characters of output.

| Metric Name | Type |
| --- | --- |
| Coherence | Pointwise |
| Fluency | Pointwise |
| Fulfillment | Pointwise |
| Safety | Pointwise |
| Groundedness | Pointwise |
| Summarization Quality | Pointwise |
| Summarization Helpfulness | Pointwise |
| Summarization Verbosity | Pointwise |
| Question Answering Quality | Pointwise |
| Question Answering Relevance | Pointwise |
| Question Answering Helpfulness | Pointwise |
| Question Answering Correctness | Pointwise |
| Pairwise Summarization Quality | Pairwise |
| Pairwise Question Answering Quality | Pairwise |

Vertex AI Agent Engine

Pricing is based on compute (vCPU hours) and memory (GiB hours) resources used by the agents that are deployed to the Agent Engine managed runtime.

| Resource | Price (USD) |
| --- | --- |
| vCPU | $0.0994 / 1 hour |
| RAM | $0.0105 / 1 gibibyte hour |
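The Agent Engine charge combines the two rates above. The sketch below assumes a constant allocation of vCPUs and memory over the billed period; the function name and the example sizing are hypothetical.

```python
# Rates (USD) from the Agent Engine table above.
VCPU_PRICE_PER_HOUR = 0.0994
RAM_PRICE_PER_GIB_HOUR = 0.0105


def agent_engine_cost(vcpus, memory_gib, hours):
    """vCPU hours x vCPU rate + GiB hours x RAM rate, for a constant allocation."""
    return hours * (vcpus * VCPU_PRICE_PER_HOUR + memory_gib * RAM_PRICE_PER_GIB_HOUR)


# Hypothetical agent: 1 vCPU and 2 GiB of memory in use for 100 hours.
print(round(agent_engine_cost(1, 2, 100), 2))
```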

Ray on Vertex AI

Training

The tables below provide the approximate price per hour of various training configurations. You can choose a custom configuration of selected machine types. To calculate pricing, sum the costs of the virtual machines you use.

If you use Compute Engine machine types and attach accelerators, the cost of the accelerators is separate. To calculate this cost, multiply the prices in the table of accelerators below by how many machine hours of each type of accelerator you use.

Machine types

  • Johannesburg (africa-south1)
  • Taiwan (asia-east1)
  • Hong Kong (asia-east2)
  • Tokyo (asia-northeast1)
  • Osaka (asia-northeast2)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Jakarta (asia-southeast2)
  • Sydney (australia-southeast1)
  • Melbourne (australia-southeast2)
  • Warsaw (europe-central2)
  • Finland (europe-north1)
  • Madrid (europe-southwest1)
  • Belgium (europe-west1)
  • Turin (europe-west12)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Milan (europe-west8)
  • Paris (europe-west9)
  • Doha (me-central1)
  • Dammam (me-central2)
  • Tel Aviv (me-west1)
  • Montreal (northamerica-northeast1)
  • Toronto (northamerica-northeast2)
  • Sao Paulo (southamerica-east1)
  • Santiago (southamerica-west1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Dallas (us-south1)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Salt Lake City (us-west3)
  • Las Vegas (us-west4)

| Machine type | Price (USD) |
| --- | --- |
| n1-standard-4 | $0.2279988 / 1 hour |
| n1-standard-8 | $0.4559976 / 1 hour |
| n1-standard-16 | $0.9119952 / 1 hour |
| n1-standard-32 | $1.8239904 / 1 hour |
| n1-standard-64 | $3.6479808 / 1 hour |
| n1-standard-96 | $5.4719712 / 1 hour |
| n1-highmem-2 | $0.1419636 / 1 hour |
| n1-highmem-4 | $0.2839272 / 1 hour |
| n1-highmem-8 | $0.5678544 / 1 hour |
| n1-highmem-16 | $1.1357088 / 1 hour |
| n1-highmem-32 | $2.2714176 / 1 hour |
| n1-highmem-64 | $4.5428352 / 1 hour |
| n1-highmem-96 | $6.8142528 / 1 hour |
| n1-highcpu-16 | $0.68014656 / 1 hour |
| n1-highcpu-32 | $1.36029312 / 1 hour |
| n1-highcpu-64 | $2.72058624 / 1 hour |
| n1-highcpu-96 | $4.08087936 / 1 hour |
| a2-highgpu-1g* | $4.408062 / 1 hour |
| a2-highgpu-2g* | $8.816124 / 1 hour |
| a2-highgpu-4g* | $17.632248 / 1 hour |
| a2-highgpu-8g* | $35.264496 / 1 hour |
| a2-megagpu-16g* | $70.528992 / 1 hour |
| a3-highgpu-8g* | $105.39898088 / 1 hour |
| a3-megagpu-8g* | $110.65714224 / 1 hour |
| a4-highgpu-8g* | $148.212 / 1 hour |
| e2-standard-4 | $0.16082748 / 1 hour |
| e2-standard-8 | $0.32165496 / 1 hour |
| e2-standard-16 | $0.64330992 / 1 hour |
| e2-standard-32 | $1.28661984 / 1 hour |
| e2-highmem-2 | $0.10847966 / 1 hour |
| e2-highmem-4 | $0.21695932 / 1 hour |
| e2-highmem-8 | $0.43391864 / 1 hour |
| e2-highmem-16 | $0.86783728 / 1 hour |
| e2-highcpu-16 | $0.4749144 / 1 hour |
| e2-highcpu-32 | $0.9498288 / 1 hour |
| n2-standard-4 | $0.2330832 / 1 hour |
| n2-standard-8 | $0.4661664 / 1 hour |
| n2-standard-16 | $0.9323328 / 1 hour |
| n2-standard-32 | $1.8646656 / 1 hour |
| n2-standard-48 | $2.7969984 / 1 hour |
| n2-standard-64 | $3.7293312 / 1 hour |
| n2-standard-80 | $4.661664 / 1 hour |
| n2-highmem-2 | $0.1572168 / 1 hour |
| n2-highmem-4 | $0.3144336 / 1 hour |
| n2-highmem-8 | $0.6288672 / 1 hour |
| n2-highmem-16 | $1.2577344 / 1 hour |
| n2-highmem-32 | $2.5154688 / 1 hour |
| n2-highmem-48 | $3.7732032 / 1 hour |
| n2-highmem-64 | $5.0309376 / 1 hour |
| n2-highmem-80 | $6.288672 / 1 hour |
| n2-highcpu-16 | $0.6882816 / 1 hour |
| n2-highcpu-32 | $1.3765632 / 1 hour |
| n2-highcpu-48 | $2.0648448 / 1 hour |
| n2-highcpu-64 | $2.7531264 / 1 hour |
| n2-highcpu-80 | $3.441408 / 1 hour |
| c2-standard-4 | $0.2505696 / 1 hour |
| c2-standard-8 | $0.5011392 / 1 hour |
| c2-standard-16 | $1.0022784 / 1 hour |
| c2-standard-30 | $1.879272 / 1 hour |
| c2-standard-60 | $3.758544 / 1 hour |
| m1-ultramem-40 | $7.55172 / 1 hour |
| m1-ultramem-80 | $15.10344 / 1 hour |
| m1-ultramem-160 | $30.20688 / 1 hour |
| m1-megamem-96 | $12.782592 / 1 hour |
| cloud-tpu | Pricing is determined by the accelerator type. See 'Accelerators'. |

*This amount includes GPU price, since this instance type always requires a fixed number of GPU accelerators.

Accelerators

  • Taiwan (asia-east1)
  • Hong Kong (asia-east2)
  • Tokyo (asia-northeast1)
  • Osaka (asia-northeast2)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Jakarta (asia-southeast2)
  • Sydney (australia-southeast1)
  • Finland (europe-north1)
  • Belgium (europe-west1)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Montreal (northamerica-northeast1)
  • Sao Paulo (southamerica-east1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Salt Lake City (us-west3)
  • Las Vegas (us-west4)

| Accelerator type | Price (USD) |
| --- | --- |
| NVIDIA_TESLA_A100 | $3.5206896 / 1 hour |
| NVIDIA_TESLA_A100_80GB | $4.517292 / 1 hour |
| NVIDIA_H100_80GB | $11.75586073 / 1 hour |
| NVIDIA_TESLA_P4 | $0.72 / 1 hour |
| NVIDIA_TESLA_P100 | $1.752 / 1 hour |
| NVIDIA_TESLA_T4 | $0.42 / 1 hour |
| NVIDIA_TESLA_V100 | $2.976 / 1 hour |
| TPU_V2 Single (8 cores) | $5.40 / 1 hour |
| TPU_V2 Pod (32 cores)* | $28.80 / 1 hour |
| TPU_V3 Single (8 cores) | $9.60 / 1 hour |
| TPU_V3 Pod (32 cores)* | $38.40 / 1 hour |

* The price for training using a Cloud TPU Pod is based on the number of cores in the Pod. The number of cores in a pod is always a multiple of 32. To determine the price of training on a Pod that has more than 32 cores, take the price for a 32-core Pod, and multiply it by the number of cores, divided by 32. For example, for a 128-core Pod, the price is (32-core Pod price) * (128/32). For information about which Cloud TPU Pods are available for a specific region, see System Architecture in the Cloud TPU documentation.

Disks

  • Johannesburg (africa-south1)
  • Taiwan (asia-east1)
  • Hong Kong (asia-east2)
  • Tokyo (asia-northeast1)
  • Osaka (asia-northeast2)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Jakarta (asia-southeast2)
  • Sydney (australia-southeast1)
  • Melbourne (australia-southeast2)
  • Warsaw (europe-central2)
  • Finland (europe-north1)
  • Madrid (europe-southwest1)
  • Belgium (europe-west1)
  • Turin (europe-west12)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Milan (europe-west8)
  • Paris (europe-west9)
  • Doha (me-central1)
  • Dammam (me-central2)
  • Montreal (northamerica-northeast1)
  • Toronto (northamerica-northeast2)
  • Sao Paulo (southamerica-east1)
  • Santiago (southamerica-west1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Dallas (us-south1)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Salt Lake City (us-west3)
  • Las Vegas (us-west4)

| Disk type | Price (USD) |
| --- | --- |
| pd-standard | $0.000065753 / 1 gibibyte hour |
| pd-ssd | $0.000279452 / 1 gibibyte hour |


You are charged for training your models from the moment when resources are provisioned for a job until the job finishes.

Warning: Your training jobs are limited by the Vertex AI quota policy. If you choose a very powerful processing cluster for your first training jobs, it's likely you will exceed your quota.

Prediction and explanation

The following tables provide the prices of batch prediction, online prediction, and online explanation per node hour. A node hour represents the time a virtual machine spends running your prediction job or waiting in an active state (an endpoint with one or more models deployed) to handle prediction or explanation requests.
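The node hour definition above can be illustrated with a short sketch. The prices are copied from the E2 and N1 series tables in this section; the function names are hypothetical, and the sketch assumes a fixed node count with no autoscaling, Spot VMs, or reservations.

```python
# Node hour prices (USD) copied from the prediction tables in this section.
NODE_HOUR_PRICE = {"e2-standard-4": 0.1541128, "n1-standard-4": 0.219}


def node_hours(node_count, hours_deployed):
    """Every deployed node accrues node hours, whether or not it serves requests."""
    return node_count * hours_deployed


def endpoint_cost(machine_type, node_count, hours_deployed):
    """Node hours multiplied by the per-node-hour price (fixed node count assumed)."""
    return NODE_HOUR_PRICE[machine_type] * node_hours(node_count, hours_deployed)


# Hypothetical endpoint: 2 e2-standard-4 nodes kept deployed for 720 hours.
print(round(endpoint_cost("e2-standard-4", 2, 720), 2))
```

With autoscaling, the node count varies over time, so the estimate would sum node hours across each interval at its own node count.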

You can use Spot VMs with Vertex AI Inference. Spot VMs are billed according to Compute Engine Spot VMs pricing. There are Vertex AI Inference management fees in addition to your infrastructure usage, captured in the following tables.

You can use Compute Engine reservations with Vertex AI Inference. When using Compute Engine reservations, you're billed according to Compute Engine Pricing, including any applicable committed use discounts (CUDs). There are Vertex AI Inference management fees in addition to your infrastructure usage, captured in the following tables.

E2 Series

  • Johannesburg (africa-south1)
  • Taiwan (asia-east1)
  • Hong Kong (asia-east2)
  • Tokyo (asia-northeast1)
  • Osaka (asia-northeast2)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Jakarta (asia-southeast2)
  • Sydney (australia-southeast1)
  • Melbourne (australia-southeast2)
  • Warsaw (europe-central2)
  • Finland (europe-north1)
  • Madrid (europe-southwest1)
  • Belgium (europe-west1)
  • Turin (europe-west12)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Milan (europe-west8)
  • Paris (europe-west9)
  • Doha (me-central1)
  • Dammam (me-central2)
  • Tel Aviv (me-west1)
  • Montreal (northamerica-northeast1)
  • Toronto (northamerica-northeast2)
  • Sao Paulo (southamerica-east1)
  • Santiago (southamerica-west1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Dallas (us-south1)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Salt Lake City (us-west3)
  • Las Vegas (us-west4)

| Machine type | Price (USD) |
| --- | --- |
| e2-standard-2 | $0.0770564 / 1 hour |
| e2-standard-4 | $0.1541128 / 1 hour |
| e2-standard-8 | $0.3082256 / 1 hour |
| e2-standard-16 | $0.6164512 / 1 hour |
| e2-standard-32 | $1.2329024 / 1 hour |
| e2-highmem-2 | $0.1039476 / 1 hour |
| e2-highmem-4 | $0.2078952 / 1 hour |
| e2-highmem-8 | $0.4157904 / 1 hour |
| e2-highmem-16 | $0.8315808 / 1 hour |
| e2-highcpu-2 | $0.056888 / 1 hour |
| e2-highcpu-4 | $0.113776 / 1 hour |
| e2-highcpu-8 | $0.227552 / 1 hour |
| e2-highcpu-16 | $0.455104 / 1 hour |
| e2-highcpu-32 | $0.910208 / 1 hour |

N1 Series

  • Taiwan (asia-east1)
  • Hong Kong (asia-east2)
  • Tokyo (asia-northeast1)
  • Osaka (asia-northeast2)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Jakarta (asia-southeast2)
  • Sydney (australia-southeast1)
  • Melbourne (australia-southeast2)
  • Warsaw (europe-central2)
  • Madrid (europe-southwest1)
  • Belgium (europe-west1)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Milan (europe-west8)
  • Paris (europe-west9)
  • Tel Aviv (me-west1)
  • Montreal (northamerica-northeast1)
  • Toronto (northamerica-northeast2)
  • Sao Paulo (southamerica-east1)
  • Santiago (southamerica-west1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Salt Lake City (us-west3)
  • Las Vegas (us-west4)

| Machine type | Price (USD) |
| --- | --- |
| n1-standard-2 | $0.1095 / 1 hour |
| n1-standard-4 | $0.219 / 1 hour |
| n1-standard-8 | $0.438 / 1 hour |
| n1-standard-16 | $0.876 / 1 hour |
| n1-standard-32 | $1.752 / 1 hour |
| n1-highmem-2 | $0.137 / 1 hour |
| n1-highmem-4 | $0.274 / 1 hour |
| n1-highmem-8 | $0.548 / 1 hour |
| n1-highmem-16 | $1.096 / 1 hour |
| n1-highcpu-2 | $0.081 / 1 hour |
| n1-highcpu-4 | $0.162 / 1 hour |
| n1-highcpu-8 | $0.324 / 1 hour |
| n1-highcpu-16 | $0.648 / 1 hour |
| n1-highcpu-32 | $1.296 / 1 hour |

N2 Series

  • Johannesburg (africa-south1)
  • Taiwan (asia-east1)
  • Hong Kong (asia-east2)
  • Tokyo (asia-northeast1)
  • Osaka (asia-northeast2)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Jakarta (asia-southeast2)
  • Sydney (australia-southeast1)
  • Melbourne (australia-southeast2)
  • Warsaw (europe-central2)
  • Finland (europe-north1)
  • Madrid (europe-southwest1)
  • Belgium (europe-west1)
  • Turin (europe-west12)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Milan (europe-west8)
  • Paris (europe-west9)
  • Doha (me-central1)
  • Dammam (me-central2)
  • Tel Aviv (me-west1)
  • Montreal (northamerica-northeast1)
  • Toronto (northamerica-northeast2)
  • Sao Paulo (southamerica-east1)
  • Santiago (southamerica-west1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Dallas (us-south1)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Salt Lake City (us-west3)
  • Las Vegas (us-west4)

| Machine type | Price (USD) |
| --- | --- |
| n2-standard-2 | $0.1116854 / 1 hour |
| n2-standard-4 | $0.2233708 / 1 hour |
| n2-standard-8 | $0.4467416 / 1 hour |
| n2-standard-16 | $0.8934832 / 1 hour |
| n2-standard-32 | $1.7869664 / 1 hour |
| n2-highmem-2 | $0.1506654 / 1 hour |
| n2-highmem-4 | $0.3013308 / 1 hour |
| n2-highmem-8 | $0.6026616 / 1 hour |
| n2-highmem-16 | $1.2053232 / 1 hour |
| n2-highcpu-2 | $0.0824504 / 1 hour |
| n2-highcpu-4 | $0.1649008 / 1 hour |
| n2-highcpu-8 | $0.3298016 / 1 hour |
| n2-highcpu-16 | $0.6596032 / 1 hour |
| n2-highcpu-32 | $1.3192064 / 1 hour |

N2D Series

  • Taiwan (asia-east1)
  • Hong Kong (asia-east2)
  • Tokyo (asia-northeast1)
  • Osaka (asia-northeast2)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Jakarta (asia-southeast2)
  • Sydney (australia-southeast1)
  • Melbourne (australia-southeast2)
  • Warsaw (europe-central2)
  • Finland (europe-north1)
  • Madrid (europe-southwest1)
  • Belgium (europe-west1)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Milan (europe-west8)
  • Paris (europe-west9)
  • Tel Aviv (me-west1)
  • Montreal (northamerica-northeast1)
  • Sao Paulo (southamerica-east1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Salt Lake City (us-west3)
  • Las Vegas (us-west4)

| Machine type | Price (USD) |
| --- | --- |
| n2d-standard-2 | $0.0971658 / 1 hour |
| n2d-standard-4 | $0.1943316 / 1 hour |
| n2d-standard-8 | $0.3886632 / 1 hour |
| n2d-standard-16 | $0.7773264 / 1 hour |
| n2d-standard-32 | $1.5546528 / 1 hour |
| n2d-highmem-2 | $0.131077 / 1 hour |
| n2d-highmem-4 | $0.262154 / 1 hour |
| n2d-highmem-8 | $0.524308 / 1 hour |
| n2d-highmem-16 | $1.048616 / 1 hour |
| n2d-highcpu-2 | $0.0717324 / 1 hour |
| n2d-highcpu-4 | $0.1434648 / 1 hour |
| n2d-highcpu-8 | $0.2869296 / 1 hour |
| n2d-highcpu-16 | $0.5738592 / 1 hour |
| n2d-highcpu-32 | $1.1477184 / 1 hour |

C2 Series

  • Taiwan (asia-east1)
  • Hong Kong (asia-east2)
  • Tokyo (asia-northeast1)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Sydney (australia-southeast1)
  • Finland (europe-north1)
  • Belgium (europe-west1)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Montreal (northamerica-northeast1)
  • Sao Paulo (southamerica-east1)
  • Santiago (southamerica-west1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Salt Lake City (us-west3)
  • Las Vegas (us-west4)

| Machine type | Price (USD) |
| --- | --- |
| c2-standard-4 | $0.240028 / 1 hour |
| c2-standard-8 | $0.480056 / 1 hour |
| c2-standard-16 | $0.960112 / 1 hour |
| c2-standard-30 | $1.80021 / 1 hour |
| c2-standard-60 | $3.60042 / 1 hour |

C2D Series

  • Taiwan (asia-east1)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Belgium (europe-west1)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Oregon (us-west1)
  • Las Vegas (us-west4)

Machine type

Price (USD)

c2d-standard-2

$0.1044172 / 1 hour

c2d-standard-4

$0.2088344 / 1 hour

c2d-standard-8

$0.4176688 / 1 hour

c2d-standard-16

$0.8353376 / 1 hour

c2d-standard-32

$1.6706752 / 1 hour

c2d-standard-56

$2.9236816 / 1 hour

c2d-standard-112

$5.8473632 / 1 hour

c2d-highmem-2

$0.1408396 / 1 hour

c2d-highmem-4

$0.2816792 / 1 hour

c2d-highmem-8

$0.5633584 / 1 hour

c2d-highmem-16

$1.1267168 / 1 hour

c2d-highmem-32

$2.2534336 / 1 hour

c2d-highmem-56

$3.9435088 / 1 hour

c2d-highmem-112

$7.8870176 / 1 hour

c2d-highcpu-2

$0.086206 / 1 hour

c2d-highcpu-4

$0.172412 / 1 hour

c2d-highcpu-8

$0.344824 / 1 hour

c2d-highcpu-16

$0.689648 / 1 hour

c2d-highcpu-32

$1.379296 / 1 hour

c2d-highcpu-56

$2.413768 / 1 hour

c2d-highcpu-112

$4.827536 / 1 hour

C3 Series

  • Singapore (asia-southeast1)
  • Belgium (europe-west1)
  • Netherlands (europe-west4)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Dallas (us-south1)

Machine type

Price (USD)

c3-highcpu-4

$0.19824 / 1 hour

c3-highcpu-8

$0.39648 / 1 hour

c3-highcpu-22

$1.09032 / 1 hour

c3-highcpu-44

$2.18064 / 1 hour

c3-highcpu-88

$4.36128 / 1 hour

c3-highcpu-176

$8.72256 / 1 hour

A2 Series

  • Tokyo (asia-northeast1)
  • Seoul (asia-northeast3)
  • Singapore (asia-southeast1)
  • Netherlands (europe-west4)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Oregon (us-west1)
  • Salt Lake City (us-west3)

Machine type

Price (USD)

a2-highgpu-1g

$4.2244949 / 1 hour

a2-highgpu-2g

$8.4489898 / 1 hour

a2-highgpu-4g

$16.8979796 / 1 hour

a2-highgpu-8g

$33.7959592 / 1 hour

a2-megagpu-16g

$64.1020592 / 1 hour

a2-ultragpu-1g

$5.7818474 / 1 hour

a2-ultragpu-2g

$11.5636948 / 1 hour

a2-ultragpu-4g

$23.1273896 / 1 hour

a2-ultragpu-8g

$46.2547792 / 1 hour

When consuming from a reservation or spot capacity, billing is spread across two SKUs: the GCE SKU with the label 'vertex-ai-online-prediction' and the Vertex AI Management Fee SKU. This enables you to use your Committed Use Discounts (CUDs) in Vertex AI.

A3 Series

  • Tokyo (asia-northeast1)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Delhi (asia-south2)
  • Singapore (asia-southeast1)
  • Sydney (australia-southeast1)
  • Belgium (europe-west1)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Toronto (northamerica-northeast2)
  • Iowa (us-central1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Dallas (us-south1)
  • Oregon (us-west1)
  • Las Vegas (us-west4)

Machine type

Price (USD)

a3-ultragpu-8g

$96.015616 / 1 hour

a3-megagpu-8g

$106.65474 / 1 hour

When consuming from a reservation or spot capacity, billing is spread across two SKUs: the GCE SKU with the label 'vertex-ai-online-prediction' and the Vertex AI Management Fee SKU. This enables you to use your Committed Use Discounts (CUDs) in Vertex AI.

A4 Series

  • Singapore (asia-southeast1)
  • Iowa (us-central1)
  • Los Angeles (us-west2)

Machine type

Price (USD)

a4-highgpu-8g

$148.212 / 1 hour

When consuming from a reservation or spot capacity, billing is spread across two SKUs: the GCE SKU with the label 'vertex-ai-online-prediction' and the Vertex AI Management Fee SKU. This enables you to use your Committed Use Discounts (CUDs) in Vertex AI.

A4X Series

  • Netherlands (europe-west4)
  • Iowa (us-central1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)

Machine type

Price (USD)

a4x-highgpu-4g

$74.75 / 1 hour

When consuming from a reservation or spot capacity, billing is spread across two SKUs: the GCE SKU with the label 'vertex-ai-online-prediction' and the Vertex AI Management Fee SKU. This enables you to use your Committed Use Discounts (CUDs) in Vertex AI.

a4x-highgpu-4g requires at least 18 VMs.

G2 Series

  • Taiwan (asia-east1)
  • Tokyo (asia-northeast1)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Belgium (europe-west1)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Dammam (me-central2)
  • Toronto (northamerica-northeast2)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Oregon (us-west1)
  • Las Vegas (us-west4)

Machine type

Price (USD)

g2-standard-4

$0.81293 / 1 hour

g2-standard-8

$0.98181 / 1 hour

g2-standard-12

$1.15069 / 1 hour

g2-standard-16

$1.31957 / 1 hour

g2-standard-24

$2.30138 / 1 hour

g2-standard-32

$1.99509 / 1 hour

g2-standard-48

$4.60276 / 1 hour

g2-standard-96

$9.20552 / 1 hour

When consuming from a reservation or spot capacity, billing is spread across two SKUs: the GCE SKU with the label 'vertex-ai-online-prediction' and the Vertex AI Management Fee SKU. This enables you to use your Committed Use Discounts (CUDs) in Vertex AI.

TPU v5e pricing

  • Singapore (asia-southeast1)
  • Netherlands (europe-west4)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Oregon (us-west1)
  • Las Vegas (us-west4)

Machine type

Price (USD)

ct5lp-hightpu-1t

$1.38 / 1 hour

ct5lp-hightpu-4t

$5.52 / 1 hour

ct5lp-hightpu-8t

$5.52 / 1 hour

Each machine type is charged as the following SKUs on your Google Cloud bill:

  • vCPU cost: measured in vCPU hours
  • RAM cost: measured in GB hours
  • GPU cost: if either built into the machine or optionally configured, measured in GPU hours

The prices for machine types are used to approximate the total hourly cost for each prediction node of a model version using that machine type.

For example, a machine type of n1-highcpu-32 includes 32 vCPUs and 32 GB of RAM. Therefore, the hourly pricing equals 32 vCPU hours + 32 GB hours.
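
As a rough sketch of this calculation, the snippet below sums the per-item SKUs for two machine types, using the per-item prices listed in the N2D Series and A2 Series item tables of this section. The vCPU, RAM, and GPU counts for each machine type are assumptions taken from the Compute Engine machine-type specifications, not from this page, and `node_hour_price` is a hypothetical helper name.

```python
def node_hour_price(vcpus, ram_gb, vcpu_price, ram_price, gpus=0, gpu_price=0.0):
    """Approximate hourly price (USD) of one prediction node from its component SKUs."""
    return vcpus * vcpu_price + ram_gb * ram_price + gpus * gpu_price


# n2d-standard-2: assumed shape of 2 vCPUs and 8 GB of RAM.
n2d_standard_2 = node_hour_price(2, 8, vcpu_price=0.0316273, ram_price=0.0042389)

# a2-highgpu-1g: assumed shape of 12 vCPUs, 85 GB of RAM, and one A100 40 GB GPU.
a2_highgpu_1g = node_hour_price(
    12, 85, vcpu_price=0.0363527, ram_price=0.0048725, gpus=1, gpu_price=3.3741
)

print(f"n2d-standard-2: ${n2d_standard_2:.7f} / hour")
print(f"a2-highgpu-1g:  ${a2_highgpu_1g:.7f} / hour")
```

Under these assumed shapes, the sums agree with the per-machine-type prices listed for n2d-standard-2 ($0.0971658) and a2-highgpu-1g ($4.2244949).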

E2 Series

  • Johannesburg (africa-south1)
  • Taiwan (asia-east1)
  • Hong Kong (asia-east2)
  • Tokyo (asia-northeast1)
  • Osaka (asia-northeast2)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Jakarta (asia-southeast2)
  • Sydney (australia-southeast1)
  • Melbourne (australia-southeast2)
  • Warsaw (europe-central2)
  • Finland (europe-north1)
  • Madrid (europe-southwest1)
  • Belgium (europe-west1)
  • Turin (europe-west12)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Milan (europe-west8)
  • Paris (europe-west9)
  • Doha (me-central1)
  • Dammam (me-central2)
  • Tel Aviv (me-west1)
  • Montreal (northamerica-northeast1)
  • Toronto (northamerica-northeast2)
  • Sao Paulo (southamerica-east1)
  • Santiago (southamerica-west1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Dallas (us-south1)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Salt Lake City (us-west3)
  • Las Vegas (us-west4)

Item

Price (USD)

vCPU

$0.0250826 / 1 hour

RAM

$0.0033614 / 1 gibibyte hour

N1 Series

  • Taiwan (asia-east1)
  • Hong Kong (asia-east2)
  • Tokyo (asia-northeast1)
  • Osaka (asia-northeast2)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Jakarta (asia-southeast2)
  • Sydney (australia-southeast1)
  • Melbourne (australia-southeast2)
  • Warsaw (europe-central2)
  • Finland (europe-north1)
  • Madrid (europe-southwest1)
  • Belgium (europe-west1)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Milan (europe-west8)
  • Paris (europe-west9)
  • Tel Aviv (me-west1)
  • Montreal (northamerica-northeast1)
  • Toronto (northamerica-northeast2)
  • Sao Paulo (southamerica-east1)
  • Santiago (southamerica-west1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Salt Lake City (us-west3)
  • Las Vegas (us-west4)

Item

Price (USD)

vCPU

$0.036 / 1 hour

RAM

$0.005 / 1 gibibyte hour

N2 Series

  • Johannesburg (africa-south1)
  • Taiwan (asia-east1)
  • Hong Kong (asia-east2)
  • Tokyo (asia-northeast1)
  • Osaka (asia-northeast2)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Jakarta (asia-southeast2)
  • Sydney (australia-southeast1)
  • Melbourne (australia-southeast2)
  • Warsaw (europe-central2)
  • Finland (europe-north1)
  • Madrid (europe-southwest1)
  • Belgium (europe-west1)
  • Turin (europe-west12)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Milan (europe-west8)
  • Paris (europe-west9)
  • Doha (me-central1)
  • Dammam (me-central2)
  • Tel Aviv (me-west1)
  • Montreal (northamerica-northeast1)
  • Toronto (northamerica-northeast2)
  • Sao Paulo (southamerica-east1)
  • Santiago (southamerica-west1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Dallas (us-south1)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Salt Lake City (us-west3)
  • Las Vegas (us-west4)

Item

Price (USD)

vCPU

$0.0363527 / 1 hour

RAM

$0.0048725 / 1 gibibyte hour

N2D Series

  • Taiwan (asia-east1)
  • Hong Kong (asia-east2)
  • Tokyo (asia-northeast1)
  • Osaka (asia-northeast2)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Jakarta (asia-southeast2)
  • Sydney (australia-southeast1)
  • Melbourne (australia-southeast2)
  • Warsaw (europe-central2)
  • Finland (europe-north1)
  • Madrid (europe-southwest1)
  • Belgium (europe-west1)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Milan (europe-west8)
  • Paris (europe-west9)
  • Tel Aviv (me-west1)
  • Montreal (northamerica-northeast1)
  • Sao Paulo (southamerica-east1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Salt Lake City (us-west3)
  • Las Vegas (us-west4)

Item

Price (USD)

vCPU

$0.0316273 / 1 hour

RAM

$0.0042389 / 1 gibibyte hour

C2 Series

  • Taiwan (asia-east1)
  • Hong Kong (asia-east2)
  • Tokyo (asia-northeast1)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Sydney (australia-southeast1)
  • Finland (europe-north1)
  • Belgium (europe-west1)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Montreal (northamerica-northeast1)
  • Sao Paulo (southamerica-east1)
  • Santiago (southamerica-west1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Salt Lake City (us-west3)
  • Las Vegas (us-west4)

Item

Price (USD)

vCPU

$0.039077 / 1 hour

RAM

$0.0052325 / 1 gibibyte hour

C2D Series

  • Taiwan (asia-east1)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Belgium (europe-west1)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Oregon (us-west1)
  • Las Vegas (us-west4)

Item

Price (USD)

vCPU

$0.0339974 / 1 hour

RAM

$0.0045528 / 1 gibibyte hour

C3 Series

  • Singapore (asia-southeast1)
  • Belgium (europe-west1)
  • Netherlands (europe-west4)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Dallas (us-south1)

Item

Price (USD)

vCPU

$0.03908 / 1 hour

RAM

$0.00524 / 1 gibibyte hour

A2 Series

  • Tokyo (asia-northeast1)
  • Seoul (asia-northeast3)
  • Singapore (asia-southeast1)
  • Netherlands (europe-west4)
  • Tel Aviv (me-west1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Oregon (us-west1)
  • Salt Lake City (us-west3)

Item

Price (USD)

vCPU

$0.0363527 / 1 hour

RAM

$0.0048725 / 1 gibibyte hour

GPU (A100 40 GB)

$3.3741 / 1 hour

GPU (A100 80 GB)

$4.51729 / 1 hour

A3 Series

  • Tokyo (asia-northeast1)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Delhi (asia-south2)
  • Singapore (asia-southeast1)
  • Belgium (europe-west1)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Toronto (northamerica-northeast2)
  • Iowa (us-central1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Dallas (us-south1)
  • Oregon (us-west1)
  • Las Vegas (us-west4)

Item

Price (USD)

vCPU

$0.0293227 / 1 hour

RAM

$0.0025534 / 1 gibibyte hour

GPU (H100 80 GB)

$11.2660332 / 1 hour

GPU (H200)

$10.708501 / 1 hour

G2 Series

  • Taiwan (asia-east1)
  • Tokyo (asia-northeast1)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Belgium (europe-west1)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Dammam (me-central2)
  • Toronto (northamerica-northeast2)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Oregon (us-west1)
  • Las Vegas (us-west4)

Item

Price (USD)

vCPU

$0.02874 / 1 hour

RAM

$0.00337 / 1 gibibyte hour

GPU (L4)

$0.64405 / 1 hour

Some machine types allow you to add optional GPU accelerators for prediction. Optional GPUs incur an additional charge, separate from those described in the previous table. See the following pricing table, which lists the price for each type of optional GPU.

Accelerators - price per hour

  • Taiwan (asia-east1)
  • Hong Kong (asia-east2)
  • Tokyo (asia-northeast1)
  • Osaka (asia-northeast2)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Jakarta (asia-southeast2)
  • Sydney (australia-southeast1)
  • Warsaw (europe-central2)
  • Finland (europe-north1)
  • Belgium (europe-west1)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Tel Aviv (me-west1)
  • Montreal (northamerica-northeast1)
  • Sao Paulo (southamerica-east1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Salt Lake City (us-west3)
  • Las Vegas (us-west4)

GPU type

Price (USD)

NVIDIA_TESLA_P4

$0.69 / 1 hour

NVIDIA_TESLA_P100

$1.679 / 1 hour

NVIDIA_TESLA_T4

$0.402 / 1 hour

NVIDIA_TESLA_V100

$2.852 / 1 hour

Pricing is per GPU. If you use multiple GPUs per prediction node (or if your version scales to use multiple nodes), the costs scale accordingly.
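
To illustrate how per-GPU pricing scales, here is a minimal sketch that multiplies the NVIDIA_TESLA_T4 price from the table above by the number of GPUs per node, the number of nodes, and the running time. The deployment shape (2 GPUs per node, 3 nodes, 10 hours) and the function name `accelerator_cost` are hypothetical.

```python
T4_PRICE_PER_HOUR = 0.402  # NVIDIA_TESLA_T4, USD per GPU per hour (from the table above)


def accelerator_cost(gpu_price, gpus_per_node, nodes, hours):
    """Accelerator charge only; vCPU and RAM charges are billed separately."""
    return gpu_price * gpus_per_node * nodes * hours


# Hypothetical deployment: 3 nodes, each with 2 T4 GPUs, running for 10 hours.
cost = accelerator_cost(T4_PRICE_PER_HOUR, gpus_per_node=2, nodes=3, hours=10)
print(f"${cost:.2f}")
```

This covers the accelerator SKU alone; the node's vCPU and RAM charges are added on top.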

AI Platform Prediction serves predictions from your model by running a number of virtual machines ("nodes"). By default, Vertex AI automatically scales the number of nodes running at any time. For online prediction, the number of nodes scales to meet demand. Each node can respond to multiple prediction requests. For batch prediction, the number of nodes scales to reduce the total time it takes to run a job. You can customize how prediction nodes scale.

You are charged for the time that each node runs for your model, including:

  • When the node is processing a batch prediction job.
  • When the node is processing an online prediction request.
  • When the node is in a ready state for serving online predictions.

The cost of one node running for one hour is a node hour. The table of prediction prices describes the price of a node hour, which varies across regions and between online prediction and batch prediction.

You can consume node hours in fractional increments. For example, one node running for 30 minutes costs 0.5 node hours.

Cost calculations for Compute Engine (N1) machine types

  • The running time of a node is billed in 30-second increments. This means that every 30 seconds, your project is billed for 30 seconds worth of whatever vCPU, RAM, and GPU resources that your node is using at that moment.

More about automatic scaling of prediction nodes

Online prediction

The priority of the scaling is to reduce the latency of individual requests. The service keeps your model in a ready state for a few idle minutes after servicing a request.

Scaling affects your total charges each month: the more numerous and frequent your requests, the more nodes will be used.

Batch prediction

The priority of the scaling is to reduce the total elapsed time of the job.

Scaling should have little effect on the price of your job, though there is some overhead involved in bringing up a new node.

You can choose to let the service scale in response to traffic (automatic scaling) or you can specify a number of nodes to run constantly to avoid latency (manual scaling).

  • If you choose automatic scaling, the number of nodes scales automatically. For AI Platform Prediction legacy (MLS1) machine type deployments, the number of nodes can scale down to zero for no-traffic durations. Vertex AI deployments and other types of AI Platform Prediction deployments cannot scale down to zero nodes.
  • If you choose manual scaling, you specify a number of nodes to keep running all the time. You are charged for all of the time that these nodes are running, starting at the time of deployment and persisting until you delete the model version.

You can affect scaling by setting a maximum number of nodes to use for a batch prediction job, and by setting the number of nodes to keep running for a model when you deploy it.

Batch prediction jobs are charged after job completion

Batch prediction jobs are charged after job completion, not incrementally during the job. Any Cloud Billing budget alerts that you have configured aren't triggered while a job is running. Before starting a large job, consider running some cost benchmark jobs with small input data first.

Example of a prediction calculation

A real-estate company in an Americas region runs a weekly prediction of housing values in areas it serves. In one month, it runs predictions for four weeks in batches of 3920, 4277, 3849, and 3961. Jobs are limited to one node and each instance takes an average of 0.72 seconds of processing.

First calculate the length of time that each job ran:

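  • Job 1: 3920 instances * (0.72 seconds / 1 instance) * (1 minute / 60 seconds) = 47.04 minutes
  • Job 2: 4277 instances * (0.72 seconds / 1 instance) * (1 minute / 60 seconds) = 51.324 minutes
  • Job 3: 3849 instances * (0.72 seconds / 1 instance) * (1 minute / 60 seconds) = 46.188 minutes
  • Job 4: 3961 instances * (0.72 seconds / 1 instance) * (1 minute / 60 seconds) = 47.532 minutes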

Each job ran for more than ten minutes, so it is charged for each minute of processing:

The total charge for the month is $0.26.

This example assumed jobs ran on a single node and took a consistent amount of time per input instance. In real usage, make sure to account for multiple nodes and use the actual amount of time each node spends running for your calculations.
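
The duration step of the example above can be sketched as follows. The batch sizes and the 0.72 seconds per instance come from the example; rounding partial minutes up is an assumption about how "charged for each minute" is applied. The node-hour rate is not stated in this section, so `monthly_charge` (a hypothetical helper) takes it as a parameter.

```python
import math

SECONDS_PER_INSTANCE = 0.72
batches = [3920, 4277, 3849, 3961]  # instances per weekly job


def job_minutes(instances, seconds_per_instance=SECONDS_PER_INSTANCE):
    """Running time of a single-node job, in minutes."""
    return instances * seconds_per_instance / 60


durations = [job_minutes(n) for n in batches]

# Assumption: a partially used minute is billed as a whole minute.
billed_minutes = [math.ceil(d) for d in durations]
total_node_hours = sum(billed_minutes) / 60


def monthly_charge(node_hour_rate):
    """Total charge for the month, given a batch prediction rate per node hour."""
    return total_node_hours * node_hour_rate


for n, d, b in zip(batches, durations, billed_minutes):
    print(f"{n} instances: {d:.3f} min, billed as {b} min")
print(f"total: {total_node_hours} node hours")
```

With multiple nodes, the same calculation would be applied to the running time of each node and summed.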

Charges for Vertex Explainable AI

Feature-based explanations

Feature-based explanations come at no extra charge to prediction prices. However, explanations take longer to process than normal predictions, so heavy usage of Vertex Explainable AI along with auto-scaling could result in more nodes being started, which would increase prediction charges.

Example-based explanations

Pricing for example-based explanations consists of the following:

  • When you upload a model or update a model's dataset, you are billed:
    • Per node hour for the batch prediction job that is used to generate the latent space representations of examples. This is billed at the same rate as prediction.
    • A cost for building or updating indexes. This cost is the same as the indexing costs for Vector Search, which is number of examples * number of dimensions * 4 bytes per float * $3.00 per GB. For example, if you have 1 million examples and a 1,000-dimension latent space, the cost is $12 (1,000,000 * 1,000 * 4 * 3.00 / 1,000,000,000).
  • When you deploy to an endpoint, you are billed per node hour for each node in your endpoint. All compute associated with the endpoint is charged at the same rate as prediction. However, because example-based explanations require additional compute resources to serve the Vector Search index, more nodes are started, which increases prediction charges.
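
The index-building formula above can be written as a small function. This is a minimal sketch that follows the arithmetic in the text (decimal gigabytes, 4 bytes per float, $3.00 per GB); the function name `index_build_cost` is hypothetical.

```python
BYTES_PER_FLOAT = 4
PRICE_PER_GB = 3.00          # USD per GB of index data
BYTES_PER_GB = 1_000_000_000  # decimal GB, as in the worked example


def index_build_cost(num_examples, num_dimensions):
    """Cost (USD) of building or updating an index for example-based explanations."""
    size_gb = num_examples * num_dimensions * BYTES_PER_FLOAT / BYTES_PER_GB
    return size_gb * PRICE_PER_GB


# 1 million examples in a 1,000-dimension latent space.
print(index_build_cost(1_000_000, 1_000))  # 12.0
```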

Vertex AI Pipelines

Vertex AI Pipelines charges a run execution fee of $0.03 per Pipeline Run. You are not charged the execution fee during the Preview release. You also pay for Google Cloud resources you use with Vertex AI Pipelines, such as Compute Engine resources consumed by pipeline components (charged at the same rate as for Vertex AI training). Finally, you are responsible for the cost of any services (such as Dataflow) called by your pipeline.

Vertex AI Feature Store

Vertex AI Feature Store has been Generally Available (GA) since November 2023. For information on the previous version of the product, go to Vertex AI Feature Store (Legacy).

New Vertex AI Feature Store

The new Vertex AI Feature Store supports functionality across 2 types of operations:

  • Offline operations are operations to transfer, store, retrieve and transform data in the offline store (BigQuery)
  • Online operations are operations to transfer data into the online store(s) and operations on data while it is in the online store(s).

Offline Operations Pricing

Since BigQuery is used for offline operations, please refer to BigQuery pricing for functionality such as ingestion to the offline store, querying the offline store, and offline storage.

Online Operations Pricing

For online operations, Vertex AI Feature Store charges for any GA features to transfer data into the online store, serve data or store data. A node-hour represents the time a virtual machine spends to complete an operation, charged to the minute.

  • Johannesburg (africa-south1)
  • Tokyo (asia-northeast1)
  • Osaka (asia-northeast2)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Warsaw (europe-central2)
  • Madrid (europe-southwest1)
  • Belgium (europe-west1)
  • Turin (europe-west12)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Milan (europe-west8)
  • Doha (me-central1)
  • Dammam (me-central2)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Dallas (us-south1)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Salt Lake City (us-west3)

Operation

Price (USD)

Data processing node

Data processing (e.g. ingesting into any online store, monitoring, etc.)

$0.08 / 1 hour

Optimized online serving node

Low latency serving and embeddings serving

Each node includes 200GB of storage

$0.30 / 1 hour

Bigtable online serving node

Serving with Cloud Bigtable

$0.94 / 1 hour

Bigtable online serving storage

Storage for serving with Cloud Bigtable

$0.000342466 / 1 gibibyte hour

Optimized online serving and Bigtable online serving use different architectures, so their nodes are not comparable.

If you pay in a currency other than USD, the prices listed in your currency on Cloud Platform SKUs apply.

Online Operations Workload Estimates

Consider the following guidelines when estimating your workloads. The number of nodes required for a given workload may differ across each serving approach.

  • Data processing:
    • Ingestion: One node can ingest a minimum of approximately 100 MiB of data per hour into a Bigtable Online Store or an Optimized Online Store if no analytical functions are used.
  • Bigtable online serving: Each node can support approximately 15,000 QPS and up to 5 TB of storage.
  • Optimized online serving: Performance is based on the machine type and replicas, which are automatically configured to minimize costs subject to the workload. Each node can have a minimum of 2 and a maximum of 6 replicas for high availability and autoscaling. You're charged for the number of replicas accordingly. For more details, see the example monthly scenarios.
    • For non-embeddings-related workloads, each node can support approximately 500 QPS and up to 200 GB of storage.
    • For embeddings-related workloads, each node can support approximately 500 QPS and up to 4 GB of storage of 512-dimensional data.

You can view the number of nodes (with replicas) in the Metric Explorer:

Metric Explorer to figure out the number of nodes being used.

Example Monthly Scenarios (assuming us-central1)

Data streaming workload - Bigtable online serving with 2.5 TB of data (1 GB refreshed daily) and 1200 QPS

Operations

Monthly Usage

Monthly Cost

Data processing node

(1 GB/day) * (30 days/month) * (1,000 MB/GB) * (1 node-hr / 100 MB) = 300 node-hr

300 node-hr * ($0.08 per node-hr) = $24

Optimized online serving node

N/A

N/A

Bigtable online serving node

(1 node) * (24 hr/day) * (30 days/month) = 720 node-hr

720 node-hr * ($0.94 per node-hr) = $677

Bigtable online serving storage

(2.5 TB-month) * (1000 GB/TB) = 2500 GB-month

2500 GB-month * ($0.25 per GB-month) = $625

Total

$1,326

High QPS workload - Optimized online serving with 10GB of non-embedding data (5GB refreshed daily) and 2000QPS

Operations

Monthly Usage

Monthly Cost

Data processing node

(5 GB/day) * (30 days/month) * (1,000 MB/GB) * (1 node-hr / 100MB) = 1500 node-hr

1500 node-hr * ($0.08 per node-hr) = $120

Optimized online serving node

Roundup(10 GB * (1 node / 200 GB)) = 1 node; 1 node * max(2 default replicas, 2000 QPS * (1 replica / 500 QPS)) = 4 total nodes; 4 nodes * (24 hr/day) * (30 days/month) = 2880 node-hr

2880 node-hr * ($0.30 per node-hr) = $864

Bigtable online serving node

N/A

N/A

Bigtable online serving storage

N/A

N/A

Total

$984

Embeddings serving workload - Optimized online serving with 20GB of embeddings data (2GB refreshed daily) and 800QPS

Operations

Monthly Usage

Monthly Cost

Data processing node

(2 GB/day) * (30 days/month) * (1,000 MB/GB) * (1 node-hr / 100MB) = 600 node-hr

600 node-hr * ($0.08 per node-hr) = $48

Optimized online serving node

Roundup(20 GB * (1 node / 4 GB)) = 5 nodes; 5 nodes * max(2 default replicas, 800 QPS * (1 replica / 500 QPS)) = 10 total nodes; 10 nodes * (24 hr/day) * (30 days/month) = 7200 node-hr

7200 node-hr * ($0.30 per node-hr) = $2,160

Bigtable online serving node

N/A

N/A

Bigtable online serving storage

N/A

N/A

Total

$2,208
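
The three monthly scenarios above can be reproduced with a short script. This is a sketch under the same assumptions the scenarios use (30-day month, 100 MB ingested per data processing node-hour, 500 QPS per replica, 2 default replicas, $0.25 per GB-month for Bigtable storage). Rounding the replica count up to a whole number is an additional assumption, and all function names are hypothetical.

```python
import math

HOURS_PER_MONTH = 24 * 30
DATA_PROCESSING_RATE = 0.08   # USD per node-hr
OPTIMIZED_NODE_RATE = 0.30    # USD per node-hr
BIGTABLE_NODE_RATE = 0.94     # USD per node-hr
BIGTABLE_STORAGE_RATE = 0.25  # USD per GB-month, as used in the scenario


def data_processing_cost(gb_refreshed_per_day):
    """Ingestion cost: (GB/day) * 30 days * 1,000 MB/GB * (1 node-hr / 100 MB)."""
    node_hours = gb_refreshed_per_day * 30 * 1000 / 100
    return node_hours * DATA_PROCESSING_RATE


def optimized_serving_cost(storage_gb, qps, gb_per_node,
                           qps_per_replica=500, min_replicas=2):
    """Optimized online serving: storage-driven nodes times QPS-driven replicas."""
    nodes = math.ceil(storage_gb / gb_per_node)
    # Assumption: the replica count is rounded up to a whole replica.
    replicas = max(min_replicas, math.ceil(qps / qps_per_replica))
    return nodes * replicas * HOURS_PER_MONTH * OPTIMIZED_NODE_RATE


def bigtable_serving_cost(nodes, storage_gb):
    """Bigtable online serving: node-hours plus storage."""
    return nodes * HOURS_PER_MONTH * BIGTABLE_NODE_RATE + storage_gb * BIGTABLE_STORAGE_RATE


streaming = data_processing_cost(1) + bigtable_serving_cost(1, 2500)
high_qps = data_processing_cost(5) + optimized_serving_cost(10, 2000, gb_per_node=200)
embeddings = data_processing_cost(2) + optimized_serving_cost(20, 800, gb_per_node=4)

for name, total in [("Data streaming", streaming),
                    ("High QPS", high_qps),
                    ("Embeddings serving", embeddings)]:
    print(f"{name}: ${total:,.0f}")
```

The data streaming total comes to $1,325.80 before rounding; the table rounds the Bigtable node cost to $677, which gives $1,326.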

Vertex AI Feature Store (Legacy)

Prices for Vertex AI Feature Store (Legacy) are based on the amount of feature data in online and offline storage as well as the availability of online serving. A node per hour represents the time a virtual machine spends serving feature data or waiting in a ready state to handle feature data requests.

Operation

Price (USD)

Online storage

$0.25 per GB-month

Offline storage

$0.023 per GB-month

Online serving

$0.94 per node per hour

Batch export

$0.005 per GB

Streaming ingestion

$0.10 per GB of ingestion

If you pay in a currency other than USD, the prices listed in your currency on Cloud Platform SKUs apply.

When you enable feature value monitoring, billing includes applicable charges above in addition to applicable charges that follow:

  • $3.50 per GB for all data analyzed. With snapshot analysis enabled, snapshots taken for data in Vertex AI Feature Store (Legacy) are included. With import feature analysis enabled, batches of ingested data are included.
  • Additional charges for other Vertex AI Feature Store (Legacy) operations used with feature value monitoring include the following:
  • The snapshot analysis feature periodically takes a snapshot of the feature values based on your configuration for the monitoring interval.
  • The charge for a snapshot export is the same as a regular batch export operation.

Snapshot Analysis Example

A data scientist enables feature value monitoring for their Vertex AI Feature Store (Legacy) and turns on monitoring for a daily snapshot analysis. A pipeline runs daily for the entity types monitoring. The pipeline scans 2GB of data in Vertex AI Feature Store (Legacy) and exports a snapshot containing 0.1GB of data. The total charge for one day's analysis is:

(0.1 GB * $3.50) + (2 GB * $0.005) = $0.36

Ingestion Analysis Example

A data scientist enables feature value monitoring for their Vertex AI Feature Store (Legacy) and turns on monitoring for ingestion operations. An ingestion operation imports 1GB of data into Vertex AI Feature Store (Legacy). The total charge for feature value monitoring is:

(1 GB * $3.50) = $3.50
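
Both monitoring examples follow the same pattern: one quantity is billed at the $3.50 per GB analysis rate and, for snapshots, another at the $0.005 per GB batch export rate. The sketch below mirrors the formulas exactly as written above; the function and parameter names are hypothetical.

```python
ANALYSIS_RATE = 3.50       # USD per GB of data analyzed
BATCH_EXPORT_RATE = 0.005  # USD per GB, same as a regular batch export


def monitoring_charge(gb_at_analysis_rate, gb_at_export_rate=0.0):
    """Feature value monitoring charge, following the formulas in the examples."""
    return gb_at_analysis_rate * ANALYSIS_RATE + gb_at_export_rate * BATCH_EXPORT_RATE


snapshot = monitoring_charge(gb_at_analysis_rate=0.1, gb_at_export_rate=2)
ingestion = monitoring_charge(gb_at_analysis_rate=1)

print(f"snapshot analysis:  ${snapshot:.2f}")
print(f"ingestion analysis: ${ingestion:.2f}")
```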

Vertex ML Metadata

Metadata storage is measured in binary gigabytes (GiB), where 1 GiB is 1,073,741,824 bytes. This unit of measurement is also known as a gibibyte.

Vertex ML Metadata charges $10 per gibibyte (GiB) per month for metadata storage. Prices are pro-rated per megabyte (MB). For example, if you store 10 MB of metadata, you are charged $0.10 per month for that 10 MB of metadata.

Prices are the same in all regions where Vertex ML Metadata is supported.
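
The pro-rating described above can be sketched as follows. The conversion of 1,024 MB per GiB is an assumption (the text does not say whether binary or decimal megabytes are meant), and `metadata_monthly_cost` is a hypothetical name.

```python
PRICE_PER_GIB_MONTH = 10.0  # USD per GiB per month
MB_PER_GIB = 1024           # assumption: binary megabytes


def metadata_monthly_cost(stored_mb):
    """Monthly Vertex ML Metadata storage charge, pro-rated per megabyte."""
    return stored_mb / MB_PER_GIB * PRICE_PER_GIB_MONTH


# 10 MB of metadata comes to roughly $0.10 per month.
print(round(metadata_monthly_cost(10), 2))  # 0.1
```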

Vertex AI TensorBoard

To use Vertex AI TensorBoard, request that the IAM administrator of the project assign you to the role "Vertex AI TensorBoard Web App User". The Vertex AI Administrator role also has access.

Beginning in August 2023, Vertex AI TensorBoard pricing changed from a per-user monthly license of $300/month to $10 per GiB per month for data storage of logs and metrics. This means there are no more subscription fees; you pay only for the storage you've used. See the Vertex AI TensorBoard: Delete Outdated TensorBoard Experiments tutorial for how to manage storage.

Vertex AI Vizier

Vertex AI Vizier is a black-box optimization service inside Vertex AI. The Vertex AI Vizier pricing model consists of the following:

  • There is no charge for trials that use RANDOM_SEARCH and GRID_SEARCH. Learn more about the search algorithms.
  • The first 100 Vertex AI Vizier trials per calendar month are available at no charge (trials using RANDOM_SEARCH and GRID_SEARCH do not count against this total).
  • After 100 Vertex AI Vizier trials, subsequent trials during the same calendar month are charged at $1 per trial (trials that use RANDOM_SEARCH or GRID_SEARCH incur no charges).
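
As a sketch of how these rules combine, the function below excludes RANDOM_SEARCH and GRID_SEARCH trials, applies the 100-trial monthly allowance to the rest, and charges $1 per remaining trial. The function name and the `"DEFAULT_ALGORITHM"` key are hypothetical placeholders for any algorithm other than the two no-charge ones.

```python
FREE_TRIALS_PER_MONTH = 100
PRICE_PER_TRIAL = 1.00  # USD
NO_CHARGE_ALGORITHMS = {"RANDOM_SEARCH", "GRID_SEARCH"}


def vizier_monthly_cost(trials_by_algorithm):
    """Monthly Vertex AI Vizier charge from a mapping of algorithm -> trial count."""
    billable = sum(
        count
        for algorithm, count in trials_by_algorithm.items()
        if algorithm not in NO_CHARGE_ALGORITHMS
    )
    return max(0, billable - FREE_TRIALS_PER_MONTH) * PRICE_PER_TRIAL


# Hypothetical month: 250 trials with a chargeable algorithm, 400 random-search trials.
print(vizier_monthly_cost({"DEFAULT_ALGORITHM": 250, "RANDOM_SEARCH": 400}))  # 150.0
```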

Vertex AI Model Registry

The Vertex AI Model Registry is a central repository which tracks and lists your models and model versions. You can import models into Vertex AI and they appear in the Vertex AI Model Registry. There is no cost associated with having your models in the Model Registry. Cost is only incurred when you deploy the model to an endpoint or perform a batch prediction on the model. This cost is determined by the type of model you are deploying.

To learn more about pricing for deploying custom models from the Vertex AI Model Registry, see Custom-trained models. To learn more about pricing for deploying AutoML models, see Pricing for AutoML models.

Vertex AI Model Monitoring

Vertex AI enables you to monitor the continued effectiveness of your model after you deploy it to production. For more information, see Introduction to Vertex AI Model Monitoring.

When you use Vertex AI Model Monitoring, you are billed for the following:

  • $3.50 per GB for all data analyzed, including the training data provided and prediction data logged in a BigQuery table.
  • Charges for other Google Cloud products that you use with Model Monitoring, such as BigQuery storage or Batch Explain when attribution monitoring is enabled.

Vertex AI Model Monitoring is supported in the following regions: us-central1, europe-west4, asia-east1, and asia-southeast1. Prices are the same for all regions.

Data sizes are measured after they are converted to TFRecord format.

Training datasets incur a one-time charge when you set up a Vertex AI Model Monitoring job.

Prediction Datasets consist of logs collected from the Online Prediction service. As prediction requests arrive during different time windows, the data for each time window is collected and the sum of the data analyzed for each prediction window is used to calculate the charge.

Example: A data scientist runs model monitoring on the prediction traffic belonging to their model.

  • The model is trained from a BigQuery dataset. The data size after converting to TFRecord is 1.5 GB.
  • Prediction data logged between 1:00 - 2:00 p.m. is 0.1 GB, between 3:00 - 4:00 p.m. is 0.2 GB.
  • The total price for setting up the model monitoring job is:
  • (1.5 GB * $3.50) + ((0.1 GB + 0.2 GB) * $3.50) = $6.30
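The arithmetic in this example can be reproduced with a short Python sketch. The function name `monitoring_cost` and its parameters are illustrative only and are not part of any Vertex AI API. The sketch assumes the flat $3.50 per GB rate listed above and ignores charges from other Google Cloud products.

```python
PRICE_PER_GB_USD = 3.50  # rate listed above for all data analyzed


def monitoring_cost(training_gb, prediction_window_gb):
    """Estimate the Model Monitoring charge in USD.

    training_gb: size of the training data after conversion to TFRecord.
    prediction_window_gb: sizes of the prediction data analyzed in each
        time window, also measured after conversion.
    """
    analyzed_gb = training_gb + sum(prediction_window_gb)
    return round(analyzed_gb * PRICE_PER_GB_USD, 2)


# The example above: 1.5 GB of training data plus two prediction windows.
print(monitoring_cost(1.5, [0.1, 0.2]))  # 6.3
```

Because the training dataset is charged once at setup, later estimates for the same job would pass 0 for `training_gb` and include only the new prediction windows.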

Vertex AI Workbench

Select instances, managed notebooks, or user-managed notebooks for pricing information.

Instances

The tables below provide the approximate price per hour of various VM configurations. You can choose a custom configuration of selected machine types. To calculate pricing, sum the costs of the virtual machines you use.

If you use Compute Engine machine types and attach accelerators, the cost of the accelerators is separate. To calculate this cost, multiply the prices in the accelerators table below by the number of hours that you use each type of accelerator.

CPUs

  • Taiwan (asia-east1)
  • Hong Kong (asia-east2)
  • Tokyo (asia-northeast1)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Jakarta (asia-southeast2)
  • Sydney (australia-southeast1)
  • Warsaw (europe-central2)
  • Belgium (europe-west1)
  • Turin (europe-west12)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Doha (me-central1)
  • Dammam (me-central2)
  • Tel Aviv (me-west1)
  • Montreal (northamerica-northeast1)
  • Toronto (northamerica-northeast2)
  • Sao Paulo (southamerica-east1)
  • Santiago (southamerica-west1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Dallas (us-south1)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Salt Lake City (us-west3)
  • Las Vegas (us-west4)

Machine type | Price per vCPU (USD)
N1 | $0.0379332 / 1 hour
N2 | $0.0379332 / 1 hour
E2 | $0.026173908 / 1 hour
A2 | $0.0379332 / 1 hour

Memory

  • Johannesburg (africa-south1)
  • Taiwan (asia-east1)
  • Hong Kong (asia-east2)
  • Tokyo (asia-northeast1)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Jakarta (asia-southeast2)
  • Sydney (australia-southeast1)
  • Warsaw (europe-central2)
  • Finland (europe-north1)
  • Madrid (europe-southwest1)
  • Belgium (europe-west1)
  • Turin (europe-west12)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Milan (europe-west8)
  • Doha (me-central1)
  • Dammam (me-central2)
  • Tel Aviv (me-west1)
  • Montreal (northamerica-northeast1)
  • Sao Paulo (southamerica-east1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Dallas (us-south1)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Las Vegas (us-west4)

Machine type | Price (USD)
N1 | $0.0050844 / 1 gibibyte hour
N2 | $0.0050844 / 1 gibibyte hour
E2 | $0.003508236 / 1 gibibyte hour
A2 | $0.0050844 / 1 gibibyte hour

Accelerators

  • Taiwan (asia-east1)
  • Hong Kong (asia-east2)
  • Tokyo (asia-northeast1)
  • Osaka (asia-northeast2)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Jakarta (asia-southeast2)
  • Sydney (australia-southeast1)
  • Finland (europe-north1)
  • Belgium (europe-west1)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Tel Aviv (me-west1)
  • Montreal (northamerica-northeast1)
  • Sao Paulo (southamerica-east1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Salt Lake City (us-west3)
  • Las Vegas (us-west4)

Accelerator type | Price (USD)
Nvidia Tesla A100 | $4.400862 / 1 hour
Nvidia Tesla A100 80GB | $4.51729 / 1 hour
Nvidia Tesla T4 | $0.525 / 1 hour
Nvidia Tesla V100 | $3.72 / 1 hour
Nvidia Tesla P100 | $2.19 / 1 hour

Disks

  • Taiwan (asia-east1)
  • Hong Kong (asia-east2)
  • Tokyo (asia-northeast1)
  • Seoul (asia-northeast3)
  • Mumbai (asia-south1)
  • Singapore (asia-southeast1)
  • Jakarta (asia-southeast2)
  • Sydney (australia-southeast1)
  • Warsaw (europe-central2)
  • Belgium (europe-west1)
  • Turin (europe-west12)
  • London (europe-west2)
  • Frankfurt (europe-west3)
  • Netherlands (europe-west4)
  • Zurich (europe-west6)
  • Doha (me-central1)
  • Dammam (me-central2)
  • Tel Aviv (me-west1)
  • Montreal (northamerica-northeast1)
  • Toronto (northamerica-northeast2)
  • Sao Paulo (southamerica-east1)
  • Santiago (southamerica-west1)
  • Iowa (us-central1)
  • South Carolina (us-east1)
  • Northern Virginia (us-east4)
  • Columbus (us-east5)
  • Dallas (us-south1)
  • Oregon (us-west1)
  • Los Angeles (us-west2)
  • Salt Lake City (us-west3)
  • Las Vegas (us-west4)

Disk type | Price (USD)
Hyperdisk Extreme provisioned space | $0.000205479 / 1 gibibyte hour
Balanced provisioned space | $0.000164384 / 1 gibibyte hour
Extreme provisioned space | $0.000205479 / 1 gibibyte hour
SSD provisioned space | $0.000279452 / 1 gibibyte hour
Standard provisioned space | $0.000065753 / 1 gibibyte hour
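As a rough illustration of summing these rates, the following Python sketch estimates the hourly cost of one instance. The rates are copied from the CPU, memory, accelerator, and disk tables in this section and may differ by region. The function `instance_hourly_cost`, the dictionary keys, and the sample configuration (4 vCPUs, 15 GiB of memory, one T4, and a 100 GiB standard disk) are assumptions made for the example, not an official calculator.

```python
# Hourly rates in USD, copied from the tables in this section.
VCPU_PER_HOUR = {"N1": 0.0379332, "N2": 0.0379332, "E2": 0.026173908, "A2": 0.0379332}
MEMORY_PER_GIB_HOUR = {"N1": 0.0050844, "N2": 0.0050844, "E2": 0.003508236, "A2": 0.0050844}
ACCELERATOR_PER_HOUR = {
    "A100": 4.400862,
    "A100_80GB": 4.51729,
    "T4": 0.525,
    "V100": 3.72,
    "P100": 2.19,
}
DISK_PER_GIB_HOUR = {
    "hyperdisk_extreme": 0.000205479,
    "balanced": 0.000164384,
    "extreme": 0.000205479,
    "ssd": 0.000279452,
    "standard": 0.000065753,
}


def instance_hourly_cost(family, vcpus, memory_gib, accelerators=None, disks=None):
    """Sum the vCPU, memory, accelerator, and disk rates for one instance.

    accelerators maps an accelerator name to a count.
    disks maps a disk type to its provisioned size in GiB.
    """
    cost = vcpus * VCPU_PER_HOUR[family] + memory_gib * MEMORY_PER_GIB_HOUR[family]
    for name, count in (accelerators or {}).items():
        cost += count * ACCELERATOR_PER_HOUR[name]
    for kind, size_gib in (disks or {}).items():
        cost += size_gib * DISK_PER_GIB_HOUR[kind]
    return cost


# Hypothetical configuration: N1 with 4 vCPUs, 15 GiB memory, one T4, 100 GiB standard disk.
hourly = instance_hourly_cost(
    "N1", vcpus=4, memory_gib=15, accelerators={"T4": 1}, disks={"standard": 100}
)
print(f"${hourly:.4f} per hour")
```

The result covers only the instance itself. Charges for other Google Cloud resources are billed separately.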

Your Vertex AI Workbench instance incurs charges as follows:

  • For CPU and accelerator usage, you're charged when the instance is in the following states:
      • STARTING
      • PROVISIONING
      • ACTIVE
      • UPGRADING
      • ROLLBACKING
      • RESTORING
      • STOPPING
      • SUSPENDING
  • For disk storage, you're charged when the instance is in the following states:
      • STARTING
      • PROVISIONING
      • ACTIVE
      • UPGRADING
      • ROLLBACKING
      • RESTORING
      • STOPPING
      • STOPPED
      • SUSPENDING
      • SUSPENDED
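The state lists above can be expressed as a small lookup, sketched here in Python. The state names come from the lists; the function `billed_components` and the component labels are hypothetical names chosen for the example.

```python
# States in which CPU and accelerator usage is charged, per the list above.
COMPUTE_BILLED_STATES = frozenset({
    "STARTING", "PROVISIONING", "ACTIVE", "UPGRADING",
    "ROLLBACKING", "RESTORING", "STOPPING", "SUSPENDING",
})

# Disk storage is charged in the same states, plus STOPPED and SUSPENDED.
DISK_BILLED_STATES = COMPUTE_BILLED_STATES | {"STOPPED", "SUSPENDED"}


def billed_components(state):
    """Return the set of charge components that apply in the given state."""
    components = set()
    if state in COMPUTE_BILLED_STATES:
        components.add("cpu_and_accelerators")
    if state in DISK_BILLED_STATES:
        components.add("disk")
    return components


print(billed_components("ACTIVE"))   # both components apply
print(billed_components("STOPPED"))  # only disk storage applies
```

In other words, stopping or suspending an instance ends the CPU and accelerator charges, but disk storage continues to be charged.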

Managed notebooks

Pricing is composed of the compute and storage resources that you use, management fees for your Vertex AI Workbench instances, and any additional cloud resources that you use. See the following sections for more details.

Compute and storage resources

Compute and storage resources are charged at the same rate you currently pay for Compute Engine and Cloud Storage.

Management fees

There are Vertex AI Workbench management fees in addition to your infrastructure usage, captured in the tables below.

SKU | Price (USD)
vCPU | $0.05 per vCPU
T4 and P4 (Standard GPU) | $0.35 per GPU
P100, V100, L4, and A100 GPU (Premium GPU) | $2.48 per GPU

User-managed notebooks

Pricing is composed of the compute and storage resources that you use, management fees for your Vertex AI Workbench instances, and any additional cloud resources that you use. See the following sections for more details.

Compute and storage resources

Compute and storage resources are charged at the same rate you currently pay for Compute Engine and Cloud Storage.

Management fees

There are Vertex AI Workbench management fees in addition to your infrastructure usage, captured in the tables below.

SKU | Price (USD)
vCPU | $0.005 per vCPU
T4 and P4 (Standard GPU) | $0.035 per GPU
P100, V100, and A100 GPU (Premium GPU) | $0.25 per GPU

Additional Google Cloud resources

In addition to the costs mentioned previously, you also pay for any Google Cloud resources that you use. For example:

  • Data analysis services: You incur BigQuery costs when you issue SQL queries within a notebook (see BigQuery pricing).
  • Customer-managed encryption keys: You incur costs when you use customer-managed encryption keys. Each time your managed notebooks or user-managed notebooks instance uses a Cloud Key Management Service key, that operation is billed at the rate of Cloud KMS key operations (see Cloud Key Management Service pricing).

Colab Enterprise

For Colab Enterprise pricing information, see Colab Enterprise pricing.

Deep Learning Containers, Deep Learning VM, and AI Platform Pipelines

For Deep Learning Containers, Deep Learning VM Images, and AI Platform Pipelines, pricing is calculated based on the compute and storage resources that you use. These resources are charged at the same rate you currently pay for Compute Engine and Cloud Storage.

In addition to the compute and storage costs, you also pay for any Google Cloud resources that you use. For example:

  • Data analysis services: You incur BigQuery costs when you issue SQL queries within a notebook (see BigQuery pricing).
  • Customer-managed encryption keys: You incur costs when you use customer-managed encryption keys. Each time your managed notebooks or user-managed notebooks instance uses a Cloud Key Management Service key, that operation is billed at the rate of Cloud KMS key operations (see Cloud Key Management Service pricing).

Data labeling

Vertex AI enables you to request human labeling for a collection of data that you plan to use to train a custom machine learning model. Prices for the service are computed based on the type of labeling task.

  • For regular labeling tasks, the prices are determined by the number of annotation units.
  • For an image classification task, units are determined by the number of images and the number of human labelers. For example, an image with 3 human labelers counts for 1 * 3 = 3 units. The price for single-label and multi-label classification is the same.
  • For an image bounding box task, units are determined by the number of bounding boxes identified in the images and the number of human labelers. For example, an image with 2 bounding boxes and 3 human labelers counts for 2 * 3 = 6 units. Images without bounding boxes are not charged.
  • For an image segmentation/rotated box/polyline/polygon task, units are determined in the same way as an image bounding box task.
  • For a video classification task, units are determined by the video length (every 5 seconds is a price unit) and the number of human labelers. For example, a 25-second video with 3 human labelers counts for 25 / 5 * 3 = 15 units. The price for single-label and multi-label classification is the same.
  • For a video object tracking task, units are determined by the number of objects identified in the video and the number of human labelers. For example, a video with 2 objects and 3 human labelers counts for 2 * 3 = 6 units. Videos without objects are not charged.
  • For a video action recognition task, units are determined in the same way as a video object tracking task.
  • For a text classification task, units are determined by text length (every 50 words is a price unit) and the number of human labelers. For example, one piece of text with 100 words and 3 human labelers counts for 100 / 50 * 3 = 6 units. The price for single-label and multi-label classification is the same.
  • For a text sentiment task, units are determined in the same way as a text classification task.
  • For a text entity extraction task, units are determined by text length (every 50 words is a price unit), the number of entities identified, and the number of human labelers. For example, a piece of text with 100 words, 2 entities identified, and 3 human labelers counts for 100 / 50 * 2 * 3 = 12 units. Text without entities will not be charged.
  • For image/video/text classification and text sentiment tasks, human labelers may lose track of classes if the label set is too large. As a result, at most 20 classes are sent to the human labelers at a time. For example, if the label set size of a labeling task is 40, each data item is sent for human review 40 / 20 = 2 times, and the charge is 2 times the price calculated above.
  • For a labeling task that enables the custom labeler feature, each data item is counted as 1 custom labeler unit.
  • For an active learning labeling task for data items with annotations that are generated by models (without a human labeler's help), each data item is counted as 1 active learning unit.
  • For an active learning labeling task for data items with annotations that are generated by human labelers, each data item is counted as a regular labeling task as described above.
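The unit rules above can be sketched in Python as follows. The function names are illustrative. The source does not say how lengths that are not exact multiples of 5 seconds or 50 words are handled, or how label sets that are not multiples of 20 are split, so rounding up with `math.ceil` is an assumption of this sketch.

```python
import math


def image_classification_units(images, labelers):
    # One unit per image per human labeler.
    return images * labelers


def bounding_box_units(boxes, labelers):
    # One unit per bounding box per human labeler. Object tracking uses the
    # same rule with identified objects in place of boxes.
    return boxes * labelers


def video_classification_units(seconds, labelers):
    # Every 5 seconds of video is one price unit (rounding up is assumed).
    return math.ceil(seconds / 5) * labelers


def text_classification_units(words, labelers):
    # Every 50 words is one price unit (rounding up is assumed).
    return math.ceil(words / 50) * labelers


def entity_extraction_units(words, entities, labelers):
    # 50-word units multiplied by entities identified and by labelers.
    return math.ceil(words / 50) * entities * labelers


def review_passes(label_set_size, max_classes_per_pass=20):
    # At most 20 classes are sent to labelers at a time, so larger label
    # sets multiply the price by the number of passes.
    return math.ceil(label_set_size / max_classes_per_pass)


print(video_classification_units(25, 3))    # 15
print(entity_extraction_units(100, 2, 3))   # 12
print(review_passes(40))                    # 2
```

The printed values match the worked examples in the list above.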

The table below provides the price per 1,000 units per human labeler, based on the unit listed for each objective. Tier 1 pricing applies to the first 50,000 units per month in each Google Cloud project; Tier 2 pricing applies to the next 950,000 units per month in the project, up to 1,000,000 units. Contact us for pricing above 1,000,000 units per month.

Data type | Objective | Unit | Tier 1 price (USD) | Tier 2 price (USD)
Image | Classification | Image | $35 | $25
Image | Bounding box | Bounding box | $63 | $49
Image | Segmentation | Segment | $870 | $850
Image | Rotated box | Bounding box | $86 | $60
Image | Polygon/polyline | Polygon/Polyline | $257 | $180
Video | Classification | 5 sec video | $86 | $60
Video | Object tracking | Bounding box | $86 | $60
Video | Action recognition | Event in 30 sec video | $214 | $150
Text | Classification | 50 words | $129 | $90
Text | Sentiment | 50 words | $200 | $140
Text | Entity extraction | Entity | $86 | $60
Active Learning | All | Data item | $80 | $56
Custom Labeler | All | Data item | $80 | $56
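The tier boundaries described before the table can be applied as in the following Python sketch. The function `labeling_cost` is a hypothetical helper. It assumes that units are billed pro rata against the per-1,000-unit prices and that the unit count already reflects the number of human labelers. Neither detail is confirmed by the text above.

```python
TIER1_LIMIT = 50_000       # first 50,000 units per month use Tier 1 pricing
MONTHLY_LIMIT = 1_000_000  # above this, pricing is by request


def labeling_cost(units, tier1_price, tier2_price):
    """Estimate the monthly labeling cost in USD for one objective.

    tier1_price and tier2_price are the prices per 1,000 units
    from the table above.
    """
    if units < 0:
        raise ValueError("units must be non-negative")
    if units > MONTHLY_LIMIT:
        raise ValueError("no listed price above 1,000,000 units per month")
    tier1_units = min(units, TIER1_LIMIT)
    tier2_units = units - tier1_units
    return (tier1_units * tier1_price + tier2_units * tier2_price) / 1000


# Hypothetical month: 60,000 image classification units ($35 Tier 1, $25 Tier 2).
print(labeling_cost(60_000, 35, 25))  # 2000.0
```

In this example, the first 50,000 units cost $1,750 and the remaining 10,000 units cost $250.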

Required use of Cloud Storage

In addition to the costs described in this document, you are required to store data and program files in Cloud Storage buckets during the Vertex AI lifecycle. This storage is subject to the Cloud Storage pricing policy.

Required use of Cloud Storage includes:

  • Staging your training application package for custom-trained models.
  • Storing your training input data.
  • Storing the output of your training jobs. Vertex AI does not require long-term storage of these items. You can remove the files as soon as the operation is complete.

Free operations for managing your resources

The resource management operations provided by AI Platform are available free of charge. The AI Platform quota policy does limit some of these operations.

Resource | Free operations
models | create, get, list, delete
versions | create, get, list, delete, setDefault
jobs | get, list, cancel
operations | get, list, cancel, delete

Google Cloud costs

If you store images to be analyzed in Cloud Storage or use other Google Cloud resources in tandem with Vertex AI, then you will also be billed for the use of those services.

To view your current billing status in the Google Cloud console, including usage and your current bill, see the Billing page. For more details about managing your account, see the Cloud Billing Documentation or Billing and Payments Support.

Request a custom quote

With Google Cloud's pay-as-you-go pricing, you only pay for the services you use. Connect with our sales team to get a custom quote for your organization.