This legacy version of AI Platform Prediction is deprecated and will no longer be available on Google Cloud after January 31, 2025. All models, associated metadata, and deployments will be deleted after January 31, 2025. Migrate your resources to Vertex AI to get new machine learning features that are unavailable in AI Platform.

Regions

Google Cloud uses regions, subdivided into zones, to define the geographic location of physical computing resources. When you run a job on AI Platform Prediction, you specify the region that you want it to run in.

You should typically use the region closest to your physical location or the physical location of your intended users, but note the available regions for each service as listed below.

Available regions

AI Platform Prediction is available in the following regions:

Americas

Region	Oregon us-west1	Los Angeles us-west2	Salt Lake City us-west3	Montréal northamerica-northeast1	São Paulo southamerica-east1
Online prediction (legacy MLS1 machine types)
Online prediction (N1 machine types)
Batch prediction	*	*	*	*	*

Europe

Region	London europe-west2	Netherlands europe-west4	Zurich europe-west6	Frankfurt europe-west3	Finland europe-north1
Online prediction (legacy MLS1 machine types)
Online prediction (N1 machine types)
Batch prediction	*	*	*	*	*

Asia Pacific

Region	Mumbai asia-south1	Singapore asia-southeast1	Hong Kong asia-east2	Taiwan asia-east1	Osaka asia-northeast2	Sydney australia-southeast1	Seoul asia-northeast3
Online prediction (legacy MLS1 machine types)
Online prediction (N1 machine types)
Batch prediction	*	*	*	*	*	*	*

Google Cloud also provides additional regions for products other than AI Platform Prediction.

Region considerations

Insufficient resources

Demand is high for GPUs and for compute resources in the us-central1 region. You may get an error message in your job logs that says: "Resources are insufficient in region: <region>. Please try a different region."

To resolve this, try using a different region or try again later.
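The fallback described above can be sketched as a small retry loop. This is only an illustration: `submit` is a hypothetical callable supplied by you (it is not part of any Google Cloud library), and the sketch assumes it raises a `RuntimeError` whose text contains the log message quoted above when a region has no capacity.

```python
from typing import Callable, Optional, Sequence

# Prefix of the log message quoted in this section.
INSUFFICIENT_MARKER = "Resources are insufficient in region"


def run_with_region_fallback(
    submit: Callable[[str], object], regions: Sequence[str]
) -> str:
    """Try each region in order and return the first one where submit() succeeds.

    `submit` is a hypothetical callable that starts the job in the given region
    and raises RuntimeError with the insufficient-resources message on failure.
    """
    last_error: Optional[Exception] = None
    for region in regions:
        try:
            submit(region)
            return region
        except RuntimeError as err:
            # Only fall through to the next region for the capacity error;
            # any other failure is re-raised unchanged.
            if INSUFFICIENT_MARKER not in str(err):
                raise
            last_error = err
    raise RuntimeError(f"No capacity in any of: {list(regions)}") from last_error
```

Ordering `regions` by proximity to your users keeps the fallback consistent with the general guidance on choosing a region.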

Cloud Storage

You should run your AI Platform Prediction job in the same region as the Cloud Storage bucket that you're using to read and write data for the job.
You should use the Standard Storage class for any Cloud Storage buckets that you're using to read and write data for your AI Platform Prediction job.
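As a rough pre-flight check, the two recommendations above can be expressed as a pure function. The function name and its arguments are hypothetical. The sketch assumes you have already looked up the bucket's location and storage class yourself, and it compares them case-insensitively because those values may be reported in upper case.

```python
from typing import List


def bucket_issues(job_region: str, bucket_location: str, storage_class: str) -> List[str]:
    """Return human-readable warnings; an empty list means both checks passed."""
    issues: List[str] = []
    # Recommendation 1: the job and the bucket should be in the same region.
    if bucket_location.lower() != job_region.lower():
        issues.append(
            f"bucket is in {bucket_location}, but the job runs in {job_region}"
        )
    # Recommendation 2: the bucket should use the Standard Storage class.
    if storage_class.upper() != "STANDARD":
        issues.append(f"bucket storage class is {storage_class}, expected STANDARD")
    return issues
```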

Online prediction

When you deploy a model for online prediction, you specify the region that you want prediction to run in. Whether you interact with online prediction through the global endpoint (ml.googleapis.com) or a regional endpoint (REGION-ml.googleapis.com), online predictions are always served from the default region specified for the model. Using a regional endpoint for online prediction provides additional protection for your model against outages in other regions, because it isolates your model and version resources from other regions. Learn more about the differences between using a regional endpoint and using the global endpoint.
Compute Engine (N1) machine types for online prediction are only available on regional endpoints. Compute Engine (N1) machine types are not available when you use the global endpoint.
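The two hostname patterns above can be captured in a small helper. This is a minimal sketch: the function name is hypothetical, and it only builds the hostname string without contacting the service or checking that the region is supported.

```python
from typing import Optional

GLOBAL_ENDPOINT = "ml.googleapis.com"


def prediction_endpoint(region: Optional[str] = None) -> str:
    """Return the global endpoint, or the regional endpoint for `region`."""
    if region is None:
        return GLOBAL_ENDPOINT
    # Regional endpoints follow the REGION-ml.googleapis.com pattern.
    return f"{region}-ml.googleapis.com"
```

For example, `prediction_endpoint("europe-west4")` yields `europe-west4-ml.googleapis.com`, while calling it with no argument yields the global endpoint.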

Using GPUs for online prediction

Using GPUs for online prediction is only available in specific regions, on regional endpoints. You cannot use GPUs on the global endpoint. The following table lists all the available accelerators for online prediction, for each regional endpoint:

Americas

Region	Oregon us-west1	Iowa us-central1	South Carolina us-east1	N. Virginia us-east4	Montréal northamerica-northeast1
NVIDIA Tesla K80
NVIDIA Tesla P4
NVIDIA Tesla P100
NVIDIA Tesla T4
NVIDIA Tesla V100

Europe

Region	London europe-west2	Belgium europe-west1	Netherlands europe-west4	Frankfurt europe-west3
NVIDIA Tesla K80
NVIDIA Tesla P4
NVIDIA Tesla P100
NVIDIA Tesla T4
NVIDIA Tesla V100

Asia Pacific

Region	Singapore asia-southeast1	Taiwan asia-east1	Tokyo asia-northeast1	Sydney australia-southeast1
NVIDIA Tesla K80
NVIDIA Tesla P4
NVIDIA Tesla P100
NVIDIA Tesla T4
NVIDIA Tesla V100
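The endpoint rules for Compute Engine (N1) machine types and GPUs can be checked before deployment with a sketch like the following. The function and parameter names are hypothetical, and the check deliberately does not encode which accelerator is available in which region; it only enforces that N1 machine types and GPUs are used with a regional endpoint and that the accelerator name is one of those listed above.

```python
from typing import List, Optional

# Accelerator names as listed in the tables above.
KNOWN_ACCELERATORS = {
    "NVIDIA Tesla K80",
    "NVIDIA Tesla P4",
    "NVIDIA Tesla P100",
    "NVIDIA Tesla T4",
    "NVIDIA Tesla V100",
}


def online_deployment_problems(
    regional_endpoint: bool,
    n1_machine_type: bool = False,
    accelerator: Optional[str] = None,
) -> List[str]:
    """Return a list of rule violations; an empty list means none were found."""
    problems: List[str] = []
    if accelerator is not None and accelerator not in KNOWN_ACCELERATORS:
        problems.append(f"unknown accelerator: {accelerator}")
    if not regional_endpoint:
        # Neither N1 machine types nor GPUs are available on the global endpoint.
        if n1_machine_type:
            problems.append("Compute Engine (N1) machine types require a regional endpoint")
        if accelerator is not None:
            problems.append("GPUs require a regional endpoint")
    return problems
```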

Batch prediction

To perform batch prediction, you must use the global API endpoint, not a regional endpoint.
You can only deploy models and model versions for batch prediction in the following regions:
- us-central1
- us-east1
- us-east4
- europe-west1
- asia-northeast1
To perform batch prediction in other available regions, which are marked with asterisks in the Available regions table, you must use a TensorFlow SavedModel stored in Cloud Storage.
For best performance in batch prediction, you should run your prediction job and store your input and output data in the same region, especially for very large datasets.
When you deploy a model for batch prediction, you specify the default region that you want prediction to run in. When you start a batch prediction job, you can specify a region to run the job in, overriding the default region.
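The region rules for batch prediction can be summarized in a short sketch. The function names are hypothetical. The region set is copied from the list above, and the code does not verify that a region supports batch prediction at all.

```python
from typing import Optional

# Regions where models and model versions can be deployed for batch prediction.
MODEL_DEPLOYMENT_REGIONS = frozenset(
    {"us-central1", "us-east1", "us-east4", "europe-west1", "asia-northeast1"}
)


def requires_saved_model(region: str) -> bool:
    """True if batch prediction in `region` must use a TensorFlow SavedModel
    stored in Cloud Storage rather than a deployed model version."""
    return region not in MODEL_DEPLOYMENT_REGIONS


def effective_region(model_default: str, job_override: Optional[str] = None) -> str:
    """A region given when starting the job overrides the model's default region."""
    return job_override if job_override is not None else model_default
```

For instance, `requires_saved_model("us-west1")` is true because us-west1 is not in the deployment list, and `effective_region("us-central1", "europe-west1")` resolves to `europe-west1`.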

Restricting resource locations

Organization policy administrators can restrict the regions available for models and batch prediction jobs by creating a resource locations constraint. Read about how a resource locations constraint applies to AI Platform Prediction.