Quotas and limits


This page lists the default quotas that apply to Vertex AI. Your quotas might differ from the defaults if you have previously requested higher quotas.

For more information about quotas in Google Cloud, including how to request an increase, see the quotas documentation.

Request quotas

The following quotas apply to Vertex AI requests for a given project and region. For example, in a single project, you can have up to 30,000 online prediction requests per minute in one region and another 30,000 online prediction requests per minute in another region.

Request quota Value
Resource management requests* per minute 600
Job or long-running operation requests per minute 60
Online prediction requests per minute+ 30,000
Online prediction request throughput per minute 1.5 GB
Online explanation requests per minute 600
Vertex AI Vizier requests per minute 6,000
Vertex AI Feature Store online serving requests per minute 300,000
Vertex ML Metadata requests per minute 12,000

* Resource management requests include any request that is not a job, long-running operation, online prediction request, or Vertex AI Vizier request.

+ This quota applies to public endpoints only. Private endpoints have unlimited requests per minute.
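
Because the request quotas are counted per project and per region, a client that sends traffic to several regions can track each region separately. The following is a minimal client-side sketch, assuming a sliding one-minute window; the class `PerRegionRateLimiter` and its method names are hypothetical and are not part of any Vertex AI library. The service enforces the real quota, so this only helps a client avoid sending requests that would likely be rejected.

```python
import time
from collections import defaultdict, deque

# Default from the table above: online prediction requests per minute,
# per project and region (public endpoints only).
ONLINE_PREDICTION_RPM = 30_000


class PerRegionRateLimiter:
    """Hypothetical helper: sliding-window request counter kept per region."""

    def __init__(self, limit_per_minute=ONLINE_PREDICTION_RPM, window_seconds=60.0):
        self.limit = limit_per_minute
        self.window = window_seconds
        self._sent = defaultdict(deque)  # region -> timestamps of recent requests

    def try_acquire(self, region, now=None):
        """Return True and record the request if the region is under its limit."""
        now = time.monotonic() if now is None else now
        timestamps = self._sent[region]
        # Drop timestamps that have fallen out of the window.
        while timestamps and now - timestamps[0] >= self.window:
            timestamps.popleft()
        if len(timestamps) >= self.limit:
            return False
        timestamps.append(now)
        return True
```

Each region has its own counter, so reaching the limit in one region does not block requests to another region.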

Jobs or long-running operations include the following requests:

  • Creating or deleting a dataset
  • Importing or exporting data to or from a dataset
  • Creating an endpoint
  • Creating or deleting a custom job
  • Creating or deleting a data labeling job
  • Creating or deleting a hyperparameter tuning job
  • Creating or deleting a batch prediction job
  • Creating or deleting a model
  • Uploading, deleting, or exporting a model

AutoML model quotas

The following quotas apply to each data type and objective for a given project and region. For example, in a particular project and region, you can deploy 10 AutoML image classification models and 10 AutoML image object detection models for a total of 20 deployed models.
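
The arithmetic in the example above can be sketched as a small lookup. The values are the default "Number of deployed models" quotas listed in the tables of this section; the dictionary layout and the function name `total_deployed_models` are illustrative assumptions, not an API.

```python
# Default "Number of deployed models" quotas from this section,
# keyed by (data type, objective). The tabular table has no objective split.
DEPLOYED_MODEL_QUOTA = {
    ("image", "classification"): 10,
    ("image", "object_detection"): 10,
    ("tabular", None): 30,
    ("text", "classification"): 10,
    ("text", "entity_extraction"): 10,
    ("text", "sentiment_analysis"): 10,
}


def total_deployed_models(data_type):
    """Sum the per-objective quotas for one data type in one project and region."""
    return sum(
        quota
        for (kind, _objective), quota in DEPLOYED_MODEL_QUOTA.items()
        if kind == data_type
    )
```

For the image data type this gives 10 + 10 = 20 deployed models, matching the example in the paragraph above.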

Image

Classification

Quota Value
Concurrent training jobs 5
Concurrent training jobs with explainable AI 2
Concurrent batch prediction jobs 5
Concurrent model deployment jobs 5
Concurrent model undeployment jobs 5
Number of deployed models 10

Object detection

Quota Value
Concurrent training jobs 5
Concurrent batch prediction jobs 5
Number of deployed models 10

Tabular

Quota Value
Concurrent training jobs 5
Concurrent batch prediction jobs 5
Number of deployed models 30

Text

Classification

Quota Value
Concurrent training jobs 5
Concurrent batch prediction jobs 5
Number of deployed models 10

Entity extraction

Quota Value
Concurrent training jobs 5
Concurrent batch prediction jobs 5
Number of deployed models 10

Sentiment analysis

Quota Value
Concurrent training jobs 5
Concurrent batch prediction jobs 5
Number of deployed models 10

Video

Action recognition

Quota Value
Concurrent training jobs 5
Concurrent batch prediction jobs 5

Classification

Quota Value
Concurrent training jobs 5
Concurrent batch prediction jobs 5

Object tracking

Quota Value
Concurrent training jobs 5
Concurrent batch prediction jobs 5

AutoML Video Intelligence specific limits

The following describes current restrictions on the use of AutoML Video Intelligence.

Limit Value
Maximum video length 3 hours
Maximum video file size 50 GB
Minimum labels per dataset 2
Minimum videos per label 10 (1,000 is recommended)
Batch input CSV file size Maximum: 100 MB
Number of video segments in batch input Maximum: 1,000
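
The dataset limits above can be checked before uploading data. The following is a rough sketch only: the `Video` record and the `find_violations` function are hypothetical, and the byte value for 50 GB assumes decimal gigabytes, which this page does not specify.

```python
from dataclasses import dataclass

MAX_VIDEO_SECONDS = 3 * 60 * 60   # 3 hours
MAX_VIDEO_BYTES = 50 * 10**9      # 50 GB, ASSUMING decimal gigabytes
MIN_LABELS_PER_DATASET = 2
MIN_VIDEOS_PER_LABEL = 10


@dataclass
class Video:
    """Hypothetical description of one labeled training video."""
    name: str
    duration_seconds: float
    size_bytes: int
    label: str


def find_violations(videos):
    """Return human-readable messages for every limit the dataset breaks."""
    problems = []
    videos_per_label = {}
    for video in videos:
        if video.duration_seconds > MAX_VIDEO_SECONDS:
            problems.append(f"{video.name}: longer than 3 hours")
        if video.size_bytes > MAX_VIDEO_BYTES:
            problems.append(f"{video.name}: larger than 50 GB")
        videos_per_label[video.label] = videos_per_label.get(video.label, 0) + 1
    if len(videos_per_label) < MIN_LABELS_PER_DATASET:
        problems.append("dataset has fewer than 2 labels")
    for label, count in sorted(videos_per_label.items()):
        if count < MIN_VIDEOS_PER_LABEL:
            problems.append(f"label {label!r}: only {count} videos, need at least 10")
    return problems
```

An empty list means the dataset satisfies the length, size, and label-count limits checked here; the batch input limits are not covered by this sketch.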

Custom-trained model quotas

The following quotas apply to Vertex AI custom-trained models for a given project and region.

Training

Number of concurrent N1 + E2 CPUs for training, per region
Region Value
us-west1 2,200
us-west2 20
us-west3 2,200
us-west4 20
us-central1 2,200
us-east1 2,200
us-east4 20
us-south1 450
northamerica-northeast1 2,200
northamerica-northeast2 20
southamerica-east1 20
europe-west2 2,200
europe-west1 2,200
europe-west4 2,200
europe-west6 20
europe-west3 2,200
europe-central2 450
europe-west9 450
asia-south1 2,200
asia-southeast1 2,200
asia-southeast2 2,200
asia-east2 2,200
asia-east1 2,200
asia-northeast1 2,200
australia-southeast1 2,200
asia-northeast3 2,200
me-west1 450
Number of concurrent A2 CPUs for training, per region
Region Value
us-west1 Not available
us-west2 Not available
us-west3 Not available
us-west4 Not available
us-central1 96
us-east1 Not available
us-east4 Not available
us-south1 Not available
northamerica-northeast1 Not available
northamerica-northeast2 Not available
southamerica-east1 Not available
europe-west2 Not available
europe-west1 Not available
europe-west4 96
europe-west6 Not available
europe-west3 Not available
europe-central2 Not available
europe-west9 Not available
asia-south1 Not available
asia-southeast1 96
asia-southeast2 Not available
asia-east2 Not available
asia-east1 Not available
asia-northeast1 Not available
australia-southeast1 Not available
asia-northeast3 Not available
me-west1 Not available
Number of concurrent N2 CPUs for training, per region
Region Value
us-west1 20
us-west2 20
us-west3 20
us-west4 20
us-central1 450
us-east1 20
us-east4 20
us-south1 20
northamerica-northeast1 20
northamerica-northeast2 20
southamerica-east1 20
europe-west2 20
europe-west1 20
europe-west4 450
europe-west6 20
europe-west3 20
europe-central2 20
europe-west9 450
asia-south1 20
asia-southeast1 20
asia-southeast2 20
asia-east2 20
asia-east1 450
asia-northeast1 20
australia-southeast1 20
asia-northeast3 20
me-west1 20
Number of concurrent C2 CPUs for training, per region
Region Value
us-west1 20
us-west2 20
us-west3 20
us-west4 20
us-central1 450
us-east1 20
us-east4 20
us-south1 20
northamerica-northeast1 20
northamerica-northeast2 20
southamerica-east1 20
europe-west2 20
europe-west1 20
europe-west4 450
europe-west6 20
europe-west3 20
europe-central2 20
europe-west9 450
asia-south1 20
asia-southeast1 20
asia-southeast2 20
asia-east2 20
asia-east1 450
asia-northeast1 20
australia-southeast1 20
asia-northeast3 20
me-west1 20
Number of concurrent A100 GPUs for training, per region
Region Value
us-west1 Not available
us-west2 Not available
us-west3 Not available
us-west4 Not available
us-central1 8
us-east1 Not available
us-east4 Not available
us-south1 Not available
northamerica-northeast1 Not available
northamerica-northeast2 Not available
southamerica-east1 Not available
europe-west2 Not available
europe-west1 Not available
europe-west4 8
europe-west6 Not available
europe-west3 Not available
europe-central2 Not available
europe-west9 Not available
asia-south1 Not available
asia-southeast1 8
asia-southeast2 Not available
asia-east2 Not available
asia-east1 Not available
asia-northeast1 Not available
australia-southeast1 Not available
asia-northeast3 Not available
me-west1 Not available
Number of concurrent A100 80GB GPUs for training, per region
Region Value
us-west1 Not available
us-west2 Not available
us-west3 Not available
us-west4 Not available
us-central1 0
us-east1 Not available
us-east4 0
us-south1 Not available
northamerica-northeast1 Not available
northamerica-northeast2 Not available
southamerica-east1 Not available
europe-west2 Not available
europe-west1 Not available
europe-west4 Not available
europe-west6 Not available
europe-west3 Not available
europe-central2 Not available
europe-west9 Not available
asia-south1 Not available
asia-southeast1 Not available
asia-southeast2 Not available
asia-east2 Not available
asia-east1 Not available
asia-northeast1 Not available
australia-southeast1 Not available
asia-northeast3 Not available
me-west1 Not available

If you are interested in A100 80GB GPUs, see the quotas documentation.

Number of concurrent K80 GPUs for training, per region
Region Value
us-west1 30
us-west2 Not available
us-west3 Not available
us-west4 Not available
us-central1 56
us-east1 30
us-east4 Not available
us-south1 Not available
northamerica-northeast1 Not available
northamerica-northeast2 Not available
southamerica-east1 Not available
europe-west2 Not available
europe-west1 30
europe-west4 Not available
europe-west6 Not available
europe-west3 Not available
europe-central2 Not available
europe-west9 Not available
asia-south1 Not available
asia-southeast1 Not available
asia-southeast2 Not available
asia-east2 Not available
asia-east1 56
asia-northeast1 Not available
australia-southeast1 Not available
asia-northeast3 Not available
me-west1 Not available
Number of concurrent P100 GPUs for training, per region
Region Value
us-west1 30
us-west2 Not available
us-west3 Not available
us-west4 Not available
us-central1 56
us-east1 30
us-east4 Not available
us-south1 Not available
northamerica-northeast1 Not available
northamerica-northeast2 Not available
southamerica-east1 Not available
europe-west2 Not available
europe-west1 30
europe-west4 Not available
europe-west6 Not available
europe-west3 Not available
europe-central2 Not available
europe-west9 Not available
asia-south1 Not available
asia-southeast1 Not available
asia-southeast2 Not available
asia-east2 Not available
asia-east1 30
asia-northeast1 Not available
australia-southeast1 6
asia-northeast3 Not available
me-west1 Not available
Number of concurrent P4 GPUs for training, per region
Region Value
us-west1 Not available
us-west2 6
us-west3 Not available
us-west4 Not available
us-central1 6
us-east1 Not available
us-east4 1
us-south1 Not available
northamerica-northeast1 6
northamerica-northeast2 Not available
southamerica-east1 Not available
europe-west2 Not available
europe-west1 Not available
europe-west4 6
europe-west6 Not available
europe-west3 Not available
europe-central2 Not available
europe-west9 Not available
asia-south1 Not available
asia-southeast1 6
asia-southeast2 Not available
asia-east2 Not available
asia-east1 Not available
asia-northeast1 Not available
australia-southeast1 6
asia-northeast3 Not available
me-west1 Not available
Number of concurrent T4 GPUs for training, per region
Region Value
us-west1 2
us-west2 Not available
us-west3 Not available
us-west4 Not available
us-central1 12
us-east1 2
us-east4 Not available
us-south1 Not available
northamerica-northeast1 Not available
northamerica-northeast2 Not available
southamerica-east1 Not available
europe-west2 6
europe-west1 Not available
europe-west4 2
europe-west6 Not available
europe-west3 0
europe-central2 Not available
europe-west9 Not available
asia-south1 6
asia-southeast1 1
asia-southeast2 Not available
asia-east2 Not available
asia-east1 Not available
asia-northeast1 6
australia-southeast1 Not available
asia-northeast3 1
me-west1 Not available
Number of concurrent V100 GPUs for training, per region
Region Value
us-west1 6
us-west2 Not available
us-west3 Not available
us-west4 Not available
us-central1 6
us-east1 Not available
us-east4 Not available
us-south1 Not available
northamerica-northeast1 Not available
northamerica-northeast2 Not available
southamerica-east1 Not available
europe-west2 Not available
europe-west1 Not available
europe-west4 6
europe-west6 Not available
europe-west3 Not available
europe-central2 Not available
europe-west9 Not available
asia-south1 Not available
asia-southeast1 Not available
asia-southeast2 Not available
asia-east2 Not available
asia-east1 6
asia-northeast1 Not available
australia-southeast1 Not available
asia-northeast3 Not available
me-west1 Not available
Number of concurrent TPU V2 cores for training, per region
Region Value
us-west1 Not available
us-west2 Not available
us-west3 Not available
us-west4 Not available
us-central1 8
us-east1 Not available
us-east4 Not available
us-south1 Not available
northamerica-northeast1 Not available
northamerica-northeast2 Not available
southamerica-east1 Not available
europe-west2 Not available
europe-west1 Not available
europe-west4 8
europe-west6 Not available
europe-west3 Not available
europe-central2 Not available
europe-west9 Not available
asia-south1 Not available
asia-southeast1 Not available
asia-southeast2 Not available
asia-east2 Not available
asia-east1 8
asia-northeast1 Not available
australia-southeast1 Not available
asia-northeast3 Not available
me-west1 Not available
Number of concurrent TPU V2 pod cores for training, per region
Region Value
us-west1 Not available
us-west2 Not available
us-west3 Not available
us-west4 Not available
us-central1 Not available
us-east1 Not available
us-east4 Not available
us-south1 Not available
northamerica-northeast1 Not available
northamerica-northeast2 Not available
southamerica-east1 Not available
europe-west2 Not available
europe-west1 Not available
europe-west4 Not available
europe-west6 Not available
europe-west3 Not available
europe-central2 Not available
europe-west9 Not available
asia-south1 Not available
asia-southeast1 Not available
asia-southeast2 Not available
asia-east2 Not available
asia-east1 Not available
asia-northeast1 Not available
australia-southeast1 Not available
asia-northeast3 Not available
me-west1 Not available
Number of concurrent TPU V3 cores for training, per region
Region Value
us-west1 Not available
us-west2 Not available
us-west3 Not available
us-west4 Not available
us-central1 8
us-east1 Not available
us-east4 Not available
us-south1 Not available
northamerica-northeast1 Not available
northamerica-northeast2 Not available
southamerica-east1 Not available
europe-west2 Not available
europe-west1 Not available
europe-west4 8
europe-west6 Not available
europe-west3 Not available
europe-central2 Not available
europe-west9 Not available
asia-south1 Not available
asia-southeast1 Not available
asia-southeast2 Not available
asia-east2 Not available
asia-east1 8
asia-northeast1 Not available
australia-southeast1 Not available
asia-northeast3 Not available
me-west1 Not available
Number of concurrent TPU V3 pod cores for training, per region
Region Value
us-west1 Not available
us-west2 Not available
us-west3 Not available
us-west4 Not available
us-central1 Not available
us-east1 Not available
us-east4 Not available
us-south1 Not available
northamerica-northeast1 Not available
northamerica-northeast2 Not available
southamerica-east1 Not available
europe-west2 Not available
europe-west1 Not available
europe-west4 Not available
europe-west6 Not available
europe-west3 Not available
europe-central2 Not available
europe-west9 Not available
asia-south1 Not available
asia-southeast1 Not available
asia-southeast2 Not available
asia-east2 Not available
asia-east1 Not available
asia-northeast1 Not available
australia-southeast1 Not available
asia-northeast3 Not available
me-west1 Not available
HDD usage (GB) during training, per region
Region Value
us-west1 180,000
us-west2 3,600
us-west3 180,000
us-west4 3,600
us-central1 180,000
us-east1 180,000
us-east4 3,600
us-south1 180,000
northamerica-northeast1 180,000
northamerica-northeast2 3,600
southamerica-east1 3,600
europe-west2 180,000
europe-west1 180,000
europe-west4 180,000
europe-west6 3,600
europe-west3 180,000
europe-central2 180,000
europe-west9 180,000
asia-south1 180,000
asia-southeast1 180,000
asia-southeast2 180,000
asia-east2 180,000
asia-east1 180,000
asia-northeast1 180,000
australia-southeast1 180,000
asia-northeast3 180,000
me-west1 180,000
SSD usage (GB) during training, per region
Region Value
us-west1 75,000
us-west2 450
us-west3 75,000
us-west4 450
us-central1 75,000
us-east1 75,000
us-east4 450
us-south1 75,000
northamerica-northeast1 75,000
northamerica-northeast2 450
southamerica-east1 450
europe-west2 75,000
europe-west1 75,000
europe-west4 75,000
europe-west6 450
europe-west3 75,000
europe-central2 75,000
europe-west9 75,000
asia-south1 75,000
asia-southeast1 75,000
asia-southeast2 75,000
asia-east2 75,000
asia-east1 75,000
asia-northeast1 75,000
australia-southeast1 75,000
asia-northeast3 75,000
me-west1 75,000
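
The per-region training tables above distinguish three cases: a numeric quota, a quota of 0, and "Not available". The following sketch shows one way to model that distinction when planning a job. It copies only the A100 GPU rows that have a numeric value; the function `check_request` and its return strings are hypothetical and do not correspond to any Vertex AI API.

```python
# Default concurrent A100 GPU training quotas from the table above.
# Regions listed as "Not available" are left out of the dictionary.
A100_TRAINING_GPUS = {
    "us-central1": 8,
    "europe-west4": 8,
    "asia-southeast1": 8,
}


def check_request(quota_table, region, requested, in_use=0):
    """Classify a request against the default quota for one region."""
    quota = quota_table.get(region)
    if quota is None:
        return "not available"          # accelerator is not offered in this region
    if in_use + requested > quota:
        return "exceeds default quota"  # a quota increase would be needed
    return "ok"
```

A region with a quota of 0, as in the A100 80GB table, would return "exceeds default quota" for any positive request, which is different from a region where the accelerator is not available at all.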

Serving

Quota Value
Number of deployed custom models 100
Number of concurrent CPUs for serving, per region
Region Value
us-west1 2,200
us-west2 2,200
us-west3 2,200
us-west4 16
us-central1 2,200
us-east1 2,200
us-east4 2,200
us-south1 450
northamerica-northeast1 2,200
northamerica-northeast2 450
southamerica-east1 16
europe-west2 2,200
europe-west1 2,200
europe-west4 2,200
europe-west6 2,200
europe-west3 2,200
europe-central2 450
europe-west9 16
asia-south1 2,200
asia-southeast1 2,200
asia-southeast2 2,200
asia-east2 2,200
asia-east1 2,200
asia-northeast1 2,200
australia-southeast1 2,200
asia-northeast3 2,200
me-west1 450
Number of concurrent K80 GPUs for serving, per region
Region Value
us-west1 Not available
us-west2 Not available
us-west3 Not available
us-west4 Not available
us-central1 56
us-east1 30
us-east4 Not available
us-south1 Not available
northamerica-northeast1 Not available
northamerica-northeast2 Not available
southamerica-east1 Not available
europe-west2 Not available
europe-west1 30
europe-west4 Not available
europe-west6 Not available
europe-west3 Not available
europe-central2 Not available
europe-west9 Not available
asia-south1 Not available
asia-southeast1 Not available
asia-southeast2 Not available
asia-east2 Not available
asia-east1 56
asia-northeast1 Not available
australia-southeast1 Not available
asia-northeast3 Not available
me-west1 Not available
Number of concurrent P100 GPUs for serving, per region
Region Value
us-west1 30
us-west2 Not available
us-west3 Not available
us-west4 Not available
us-central1 56
us-east1 30
us-east4 Not available
us-south1 Not available
northamerica-northeast1 Not available
northamerica-northeast2 Not available
southamerica-east1 Not available
europe-west2 Not available
europe-west1 30
europe-west4 Not available
europe-west6 Not available
europe-west3 Not available
europe-central2 Not available
europe-west9 Not available
asia-south1 Not available
asia-southeast1 Not available
asia-southeast2 Not available
asia-east2 Not available
asia-east1 30
asia-northeast1 Not available
australia-southeast1 Not available
asia-northeast3 Not available
me-west1 Not available
Number of concurrent P4 GPUs for serving, per region
Region Value
us-west1 Not available
us-west2 6
us-west3 Not available
us-west4 Not available
us-central1 6
us-east1 Not available
us-east4 6
us-south1 Not available
northamerica-northeast1 6
northamerica-northeast2 Not available
southamerica-east1 Not available
europe-west2 Not available
europe-west1 Not available
europe-west4 6
europe-west6 Not available
europe-west3 Not available
europe-central2 Not available
europe-west9 Not available
asia-south1 Not available
asia-southeast1 6
asia-southeast2 Not available
asia-east2 Not available
asia-east1 Not available
asia-northeast1 Not available
australia-southeast1 6
asia-northeast3 Not available
me-west1 Not available
Number of concurrent T4 GPUs for serving, per region
Region Value
us-west1 12
us-west2 Not available
us-west3 Not available
us-west4 Not available
us-central1 12
us-east1 12
us-east4 Not available
us-south1 Not available
northamerica-northeast1 Not available
northamerica-northeast2 Not available
southamerica-east1 Not available
europe-west2 12
europe-west1 Not available
europe-west4 12
europe-west6 Not available
europe-west3 0
europe-central2 Not available
europe-west9 Not available
asia-south1 6
asia-southeast1 6
asia-southeast2 Not available
asia-east2 Not available
asia-east1 6
asia-northeast1 6
australia-southeast1 Not available
asia-northeast3 6
me-west1 Not available
Number of concurrent V100 GPUs for serving, per region
Region Value
us-west1 6
us-west2 Not available
us-west3 Not available
us-west4 Not available
us-central1 6
us-east1 Not available
us-east4 Not available
us-south1 Not available
northamerica-northeast1 Not available
northamerica-northeast2 Not available
southamerica-east1 Not available
europe-west2 Not available
europe-west1 Not available
europe-west4 6
europe-west6 Not available
europe-west3 Not available
europe-central2 Not available
europe-west9 Not available
asia-south1 Not available
asia-southeast1 Not available
asia-southeast2 Not available
asia-east2 Not available
asia-east1 Not available
asia-northeast1 Not available
australia-southeast1 Not available
asia-northeast3 Not available
me-west1 Not available

Vertex AI Feature Store

The following quotas apply to a given project and region. For example, in a single project, you can have 75 concurrent batch jobs in us-central1 and another 75 jobs in europe-west4.

Quota Value
Online serving requests per minute 300,000
Streaming ingestion requests per minute 60,000
Streaming ingestion write throughput per minute 1.2 GB
Feature creation requests per minute 100
Online serving nodes across all featurestores 30
Concurrent batch jobs (ingestion, serving, and delete feature values combined) 75
Concurrent requests to delete feature values 1
Entity types across all featurestores 75

Vertex AI Feature Store also has the following limits. Note that, unlike quotas, you cannot request an increase to a limit.

Limit Value
Storage limit for an online serving node 5 TB
Total data in the offline store Unlimited
Features per entity type 5,000
Number of create, update, and delete featurestore requests per day per project per region 500
For streaming ingestion, the size per request 1 MB
For streaming read, the number of entities that can be included per request 100
For batch import, the number of files that can be included per request 5,000 for Avro or 500 for CSV
For batch serving and exports, the number of features you can request 5,000
Data retention (oldest feature value timestamp after which the values are deleted) 4,000 days from the current date
For batch ingestion and streaming ingestion, the oldest timestamp for which feature data can be ingested 4,000 days from the current date
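
Some of the Feature Store limits above can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, 1 MB is taken as 10**6 bytes (the page does not say whether the unit is decimal or binary), and a timestamp exactly 4,000 days old is treated as still allowed.

```python
from datetime import datetime, timedelta, timezone

MAX_STREAMING_REQUEST_BYTES = 10**6       # 1 MB, ASSUMING decimal megabytes
MAX_ENTITIES_PER_STREAMING_READ = 100
MAX_FEATURE_AGE = timedelta(days=4000)


def request_size_ok(payload: bytes) -> bool:
    """Check a serialized streaming ingestion payload against the 1 MB limit."""
    return len(payload) <= MAX_STREAMING_REQUEST_BYTES


def ingestion_timestamp_ok(feature_time, now=None):
    """Check that a feature timestamp is no older than 4,000 days."""
    now = datetime.now(timezone.utc) if now is None else now
    return now - feature_time <= MAX_FEATURE_AGE


def chunk_entity_ids(entity_ids, size=MAX_ENTITIES_PER_STREAMING_READ):
    """Split entity IDs into groups that fit in one streaming read request."""
    return [entity_ids[i:i + size] for i in range(0, len(entity_ids), size)]
```

For example, reading 250 entities would take three streaming read requests of 100, 100, and 50 entities.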

Vertex AI Matching Engine

The following quotas apply to Vertex AI Matching Engine for a given project in each region.

Quota Value
Concurrent index creation operations 5
Concurrent index update operations 5
Number of deployed index nodes 50
Number of deployed index N2D nodes 5
Number of indexes 100
Streaming update requests per minute 6,000
Streaming update throughput (in KB) per minute 120,000
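
The two streaming update quotas interact: whichever is reached first caps the effective request rate. The following sketch works through that arithmetic, assuming the throughput quota is counted against the total size of the requests sent in a minute; the function name is hypothetical.

```python
STREAMING_UPDATE_REQUESTS_PER_MINUTE = 6_000
STREAMING_UPDATE_KB_PER_MINUTE = 120_000


def max_requests_per_minute(avg_request_kb):
    """Effective request rate allowed by both quotas for a given average size."""
    if avg_request_kb <= 0:
        raise ValueError("avg_request_kb must be positive")
    allowed_by_throughput = int(STREAMING_UPDATE_KB_PER_MINUTE / avg_request_kb)
    return min(STREAMING_UPDATE_REQUESTS_PER_MINUTE, allowed_by_throughput)
```

At an average of 20 KB per request both quotas are reached together (120,000 / 20 = 6,000). Larger requests are capped by throughput, and smaller ones by the request count.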

Vertex AI Pipelines

The following quotas and limits apply to Vertex AI Pipelines for a given project in each region.

Quota Value
Running pipeline tasks in parallel 600
Concurrent pipeline jobs* 300

* Pipeline runs beyond this quota are queued until resources are available.

Vertex AI Pipelines also has the following limits. Note that, unlike quotas, you cannot request an increase to a limit.

Limit Value
Number of pipeline tasks per job 10,000
Input and output artifacts per pipeline task 100
Input and output artifacts per pipeline job 10,000
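
A pipeline definition can be checked against these limits before it is submitted. The following is a sketch only: `check_pipeline` is a hypothetical helper, and it assumes that each limit counts input and output artifacts together, which the table does not state explicitly.

```python
MAX_TASKS_PER_JOB = 10_000
MAX_ARTIFACTS_PER_TASK = 100
MAX_ARTIFACTS_PER_JOB = 10_000


def check_pipeline(artifacts_per_task):
    """Return limit violations for a job.

    artifacts_per_task holds one integer per task: the number of input plus
    output artifacts for that task (ASSUMPTION: the limits are combined counts).
    """
    problems = []
    if len(artifacts_per_task) > MAX_TASKS_PER_JOB:
        problems.append("too many tasks in job")
    for index, count in enumerate(artifacts_per_task):
        if count > MAX_ARTIFACTS_PER_TASK:
            problems.append(f"task {index}: too many artifacts")
    if sum(artifacts_per_task) > MAX_ARTIFACTS_PER_JOB:
        problems.append("too many artifacts in job")
    return problems
```

Because these are limits rather than quotas, a job that fails this check would need to be restructured, for example by splitting it into several smaller pipeline jobs.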

Quota increases

If you want to increase any of your quotas for Vertex AI, you can use the Google Cloud console to request a quota increase.

For more information about submitting a quota increase request, see the following sections of Working with quotas: