Google Cloud uses regions, subdivided into zones, to define the geographic location of physical computing resources. When you run a job on AI Platform Prediction, you specify the region that you want it to run in.
You should typically use the region closest to your physical location or the physical location of your intended users, but note the available regions for each service as listed below.
Available regions
AI Platform Prediction is available in the following regions:
Americas
Region | Oregon us-west1 | Los Angeles us-west2 | Salt Lake City us-west3 | Iowa us-central1 | South Carolina us-east1 | N. Virginia us-east4 | Montréal northamerica-northeast1 | São Paulo southamerica-east1 |
---|---|---|---|---|---|---|---|---|
Online prediction (legacy MLS1 machine types) | ||||||||
Online prediction (N1 machine types) | ||||||||
Batch prediction | * | * | * | * | * |
Europe
Region | London europe-west2 | Belgium europe-west1 | Netherlands europe-west4 | Zurich europe-west6 | Frankfurt europe-west3 | Finland europe-north1 |
---|---|---|---|---|---|---|
Online prediction (legacy MLS1 machine types) | ||||||
Online prediction (N1 machine types) | ||||||
Batch prediction | * | * | * | * | * |
Asia Pacific
Region | Mumbai asia-south1 | Singapore asia-southeast1 | Hong Kong asia-east2 | Taiwan asia-east1 | Tokyo asia-northeast1 | Osaka asia-northeast2 | Sydney australia-southeast1 | Seoul asia-northeast3 |
---|---|---|---|---|---|---|---|---|
Online prediction (legacy MLS1 machine types) | ||||||||
Online prediction (N1 machine types) | ||||||||
Batch prediction | * | * | * | * | * | * | * |
Google Cloud also provides additional regions for products other than AI Platform Prediction.
Region considerations
Insufficient resources
Demand is high for GPUs and for compute resources in the us-central1 region.
You may get an error message in your job logs that says: Resources are insufficient in region: <region>. Please try a different region.
To resolve this, try using a different region or try again later.
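The "try a different region" advice can be automated with a small fallback loop. The following is a minimal sketch, not an official client pattern: `submit` is a hypothetical callable that starts the job in a given region and raises `RuntimeError` with the service's message on failure, and the ordering of `regions` is up to the caller.

```python
from typing import Callable, Sequence

# Error text quoted in this section; matched as a substring.
INSUFFICIENT_MARKER = "Resources are insufficient in region"


def run_with_region_fallback(
    submit: Callable[[str], str],
    regions: Sequence[str],
) -> str:
    """Try each region in order and return the first successful result.

    `submit` is a hypothetical callable (an assumption for this sketch),
    not part of any Google Cloud client library.
    """
    last_error = None
    for region in regions:
        try:
            return submit(region)
        except RuntimeError as err:
            if INSUFFICIENT_MARKER not in str(err):
                raise  # unrelated failure: do not mask it
            last_error = err  # capacity problem: move on to the next region
    raise RuntimeError(f"No capacity in any of {list(regions)}") from last_error
```

Only the capacity error triggers a fallback; any other failure is re-raised immediately so that it is not hidden by the retry logic.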
Cloud Storage
You should run your AI Platform Prediction job in the same region as the Cloud Storage bucket that you're using to read and write data for the job.
You should use the Standard Storage class for any Cloud Storage buckets that you're using to read and write data for your AI Platform Prediction job.
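A pre-flight check can catch a region mismatch before a job starts. This is a minimal sketch under the assumption that you already have the bucket's location as a string (its letter case may differ from the region ID you pass to the job); `same_region` is a hypothetical helper, not a library function.

```python
def same_region(job_region: str, bucket_location: str) -> bool:
    """Return True when the job region and bucket location name the same region.

    The comparison ignores letter case and surrounding whitespace. A
    multi-region bucket location such as "US" never matches a single region.
    """
    return job_region.strip().lower() == bucket_location.strip().lower()
```

For example, `same_region("us-central1", "US-CENTRAL1")` is true, while `same_region("us-central1", "US")` is false.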
Online prediction
When you deploy a model for online prediction, you specify the region that you want prediction to run in. Whether you interact with online prediction through the global endpoint (ml.googleapis.com) or a regional endpoint (REGION-ml.googleapis.com), online predictions are always served from the default region specified for the model. Using a regional endpoint for online prediction provides additional protection for your model against outages in other regions, because it isolates your model and version resources from other regions. Learn more about the differences between using a regional endpoint and using the global endpoint.
Compute Engine (N1) machine types for online prediction are available only on regional endpoints, not on the global endpoint.
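The two hostname forms above differ only by a region prefix. As a small illustration, the hypothetical helper below builds the hostname from an optional region ID; it only formats a string and does not check that the region is one where the service is available.

```python
from typing import Optional

GLOBAL_HOST = "ml.googleapis.com"


def endpoint_host(region: Optional[str] = None) -> str:
    """Return the global hostname, or REGION-ml.googleapis.com for a region."""
    if region is None:
        return GLOBAL_HOST
    return f"{region}-{GLOBAL_HOST}"
```

Calling `endpoint_host()` gives the global endpoint hostname, and `endpoint_host("europe-west4")` gives the regional one.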
Using GPUs for online prediction
Using GPUs for online prediction is only available in specific regions, on regional endpoints. You cannot use GPUs on the global endpoint. The following table lists all the available accelerators for online prediction, for each regional endpoint:
Americas
Region | Oregon us-west1 | Iowa us-central1 | South Carolina us-east1 | N. Virginia us-east4 | Montréal northamerica-northeast1 |
---|---|---|---|---|---|
NVIDIA Tesla P4 | |||||
NVIDIA Tesla P100 | |||||
NVIDIA Tesla T4 | |||||
NVIDIA Tesla V100 |
Europe
Region | London europe-west2 | Belgium europe-west1 | Netherlands europe-west4 | Frankfurt europe-west3 |
---|---|---|---|---|
NVIDIA Tesla P4 | ||||
NVIDIA Tesla P100 | ||||
NVIDIA Tesla T4 | ||||
NVIDIA Tesla V100 |
Asia Pacific
Region | Singapore asia-southeast1 | Taiwan asia-east1 | Tokyo asia-northeast1 | Sydney australia-southeast1 |
---|---|---|---|---|
NVIDIA Tesla P4 | ||||
NVIDIA Tesla P100 | ||||
NVIDIA Tesla T4 | ||||
NVIDIA Tesla V100 |
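The endpoint rules for online prediction (N1 machine types and GPUs both require a regional endpoint) can be expressed as a simple validation step. This is an illustrative sketch only: the function name and the `"mls1"` / `"n1"` labels are hypothetical and are not API values, and the check does not verify which accelerator is offered in which region.

```python
from typing import List, Optional


def endpoint_problems(
    region: Optional[str],
    machine_family: str,
    accelerator: Optional[str] = None,
) -> List[str]:
    """Return reasons a deployment cannot use the chosen endpoint.

    `region=None` stands for the global endpoint. `machine_family` uses the
    hypothetical labels "mls1" and "n1" (assumptions for this sketch).
    """
    problems = []
    if region is None and machine_family == "n1":
        problems.append("N1 machine types require a regional endpoint")
    if region is None and accelerator is not None:
        problems.append("GPUs require a regional endpoint")
    return problems
```

An empty list means the configuration does not violate either rule; per-region accelerator availability still has to be checked against the tables above.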
Batch prediction
To perform batch prediction, you must use the global API endpoint, not a regional endpoint.
You can only deploy models and model versions for batch prediction in the following regions:
us-central1
us-east1
us-east4
europe-west1
asia-northeast1
To perform batch prediction in other available regions, which are marked with asterisks in the Available regions table, you must use a TensorFlow SavedModel stored in Cloud Storage.
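The deployment rule above reduces to a set-membership test. The sketch below hard-codes the five regions listed in this section; the constant and function names are hypothetical, and the function does not check whether batch prediction is offered in the region at all.

```python
# Regions listed above where models and model versions can be deployed
# for batch prediction.
BATCH_MODEL_DEPLOY_REGIONS = frozenset({
    "us-central1",
    "us-east1",
    "us-east4",
    "europe-west1",
    "asia-northeast1",
})


def can_deploy_model_for_batch(region: str) -> bool:
    """Return True if a deployed model can back batch prediction in `region`.

    When this returns False for an otherwise available region, the job must
    use a TensorFlow SavedModel stored in Cloud Storage instead.
    """
    return region in BATCH_MODEL_DEPLOY_REGIONS
```

For example, `can_deploy_model_for_batch("europe-west1")` is true, while `can_deploy_model_for_batch("europe-west4")` is false, so a job there would need a SavedModel in Cloud Storage.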
For best performance in batch prediction, you should run your prediction job and store your input and output data in the same region, especially for very large datasets.
When you deploy a model for batch prediction, you specify the default region that you want prediction to run in. When you start a batch prediction job, you can specify a region to run the job in, overriding the default region.
Restricting resource locations
Organization policy administrators can restrict the regions available for models and batch prediction jobs by creating a resource locations constraint. Read about how a resource locations constraint applies to AI Platform Prediction.