Quota policy

AI Platform Training limits resource allocation and use, and enforces appropriate quotas on a per-project basis. Specific policies vary depending on resource availability, user profile, service usage history, and other factors, and are subject to change without notice.

The sections below outline the current quota limits of the system.

Limits on service requests

You can make only a limited number of individual API requests in each 60-second interval. Each limit applies to a particular API or group of APIs, as described in the following sections.

You can see your project's request quotas in the API Manager for AI Platform Training in the Google Cloud Console. You can apply for a quota increase by clicking the edit icon next to the quota limit and then clicking Apply for higher quota.

Job requests

The following limits apply to projects.jobs.create requests (training and batch prediction jobs combined):

Period        Limit
------        -----
60 seconds    60
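Because job creation is capped at 60 requests per 60 seconds, a client that submits many jobs should expect occasional quota rejections and retry them. The sketch below shows a generic exponential-backoff wrapper; `QuotaExceededError` is a hypothetical exception standing in for however your client surfaces an HTTP 429 response, and is not part of any Google library.

```python
import time


class QuotaExceededError(Exception):
    """Hypothetical wrapper for an HTTP 429 (quota exceeded) response."""


def call_with_backoff(request_fn, max_retries=5, base_delay=1.0):
    """Retry request_fn with exponential backoff when quota is exhausted.

    request_fn is any zero-argument callable (e.g. one that issues a
    projects.jobs.create request) that raises QuotaExceededError on a
    429 response.
    """
    for attempt in range(max_retries):
        try:
            return request_fn()
        except QuotaExceededError:
            if attempt == max_retries - 1:
                raise
            # Wait 1s, 2s, 4s, ... before the next attempt.
            time.sleep(base_delay * (2 ** attempt))
```

Backoff smooths out bursts but does not raise your quota; sustained submission above 60 jobs per minute still requires a quota increase.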

Online prediction requests

The following limits apply to projects.predict requests:

Period        Limit
------        -----
60 seconds    6000
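At 6,000 predict requests per 60 seconds, a high-volume caller may want to throttle itself rather than rely on server-side rejections. One common approach, sketched below under the assumption that server-side enforcement remains authoritative, is a sliding-window limiter that tracks recent call timestamps.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side limiter: allow at most `limit` calls per `period`
    seconds (e.g. 6000 projects.predict requests per 60 s).

    This is a sketch only; the service still enforces the real quota.
    """

    def __init__(self, limit=6000, period=60.0):
        self.limit = limit
        self.period = period
        self.calls = deque()  # timestamps of calls inside the window

    def acquire(self, now=None):
        """Return True if a call is allowed now, recording it if so."""
        now = time.monotonic() if now is None else now
        # Drop timestamps that have fallen out of the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.limit:
            self.calls.append(now)
            return True
        return False
```

A caller would check `acquire()` before each predict request and wait briefly when it returns False.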

Resource management requests

The following limits apply to the combined total of all supported requests in this list:

Period        Limit
------        -----
60 seconds    300

In addition, all of the delete requests listed above and all version create requests are limited to a combined total of 10 concurrent requests.
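The 10-request concurrency cap is separate from the per-minute rate limit, so a client issuing many delete or version-create calls in parallel should bound them explicitly. A minimal sketch using a semaphore (the `request_fn` callable is a hypothetical stand-in for your actual API call):

```python
import threading

# AI Platform Training allows at most 10 delete/version-create calls
# in flight at once; cap them client-side with a bounded semaphore.
MAX_CONCURRENT = 10
_slots = threading.BoundedSemaphore(MAX_CONCURRENT)


def run_management_request(request_fn):
    """Run request_fn (e.g. a delete or version-create call) while
    holding one of the 10 concurrency slots; blocks if all are taken."""
    with _slots:
        return request_fn()
```

Worker threads call `run_management_request` instead of the raw API call, so at most 10 requests are ever in flight together.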

Limits on concurrent usage of virtual machines

Your project's usage of Google Cloud processing resources is measured by the number of virtual machines that it uses. This section describes the limits to concurrent usage of these resources across your project.

Limits on concurrent CPU usage for training

The number of concurrent virtual CPUs for a typical project scales based on the usage history of your project.

  • Total concurrent number of CPUs: Starting from 20 CPUs, scaling to a typical value of 450 CPUs. These limits represent the combined maximum number of CPUs in concurrent use, including all machine types.

The CPUs that you use when training a model are not counted as CPUs for Compute Engine, and the quota for AI Platform Training does not give you access to any Compute Engine VMs for other computing requirements. If you want to spin up a Compute Engine VM, you must request Compute Engine quota, as described in the Compute Engine documentation.
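Because the CPU limit counts the combined vCPUs of every machine in concurrent use, it can help to tally a project's in-flight jobs before submitting another. A sketch, using illustrative vCPU counts for a few Compute Engine machine types (verify the exact values against the current machine-type documentation):

```python
# Illustrative vCPU counts per machine type; check the Compute Engine
# machine-type documentation for authoritative values.
VCPUS = {
    "n1-standard-4": 4,
    "n1-standard-8": 8,
    "n1-highmem-16": 16,
}


def concurrent_cpus(running_jobs, quota=450):
    """Tally vCPUs across jobs currently in flight.

    running_jobs: list of (machine_type, machine_count) pairs.
    Returns (total_vcpus, within_quota), where quota defaults to the
    typical scaled limit of 450 concurrent CPUs.
    """
    total = sum(VCPUS[mtype] * count for mtype, count in running_jobs)
    return total, total <= quota
```

This is bookkeeping only; the service enforces the real limit, and a project's actual quota may differ from 450.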

Limits on concurrent GPU usage for training

A typical project, when first using AI Platform Training, is limited to the following number of concurrent GPUs used in training ML models:

  • Total concurrent number of GPUs: This is the maximum number of GPUs in concurrent use, split per type as follows:

    • Concurrent number of Tesla K80 GPUs: 30.
    • Concurrent number of Tesla P4 GPUs: 8.
    • Concurrent number of Tesla P100 GPUs: 30.
    • Concurrent number of Tesla V100 GPUs: 8.
    • Concurrent number of Tesla T4 GPUs: 6.

Note that a project's quotas depend on multiple factors, so the quotas for a specific project may be lower than the numbers listed above. The GPUs that you use when training a model are not counted as GPUs for Compute Engine, and the quota for AI Platform Training does not give you access to any Compute Engine VMs using GPUs. If you want to spin up a Compute Engine VM using a GPU, you must request Compute Engine GPU quota, as described in the Compute Engine documentation.
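Since the GPU limits above are split per type, a request must fit within the quota for each accelerator type it uses. A sketch of that check, keyed by accelerator-type names in the style of the AI Platform API (a project's real quotas may be lower than these defaults):

```python
# Default per-type concurrent GPU quotas for a typical new project,
# taken from the list above; actual project quotas may be lower.
DEFAULT_GPU_QUOTA = {
    "NVIDIA_TESLA_K80": 30,
    "NVIDIA_TESLA_P4": 8,
    "NVIDIA_TESLA_P100": 30,
    "NVIDIA_TESLA_V100": 8,
    "NVIDIA_TESLA_T4": 6,
}


def fits_gpu_quota(in_use, requested):
    """Check a request against per-type GPU quotas.

    in_use and requested both map accelerator type -> GPU count.
    Returns True only if every requested type stays within quota.
    """
    return all(
        in_use.get(gpu_type, 0) + count <= DEFAULT_GPU_QUOTA.get(gpu_type, 0)
        for gpu_type, count in requested.items()
    )
```
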

For more information about GPUs, see how to use GPUs to train models in the cloud.

Limits on concurrent TPU usage for training

As with GPUs, the TPU quota for AI Platform Training is separate from your Cloud TPU quota, which you can use directly with Compute Engine VMs. The TPUs that you use when training a model are not counted as TPUs for Compute Engine, and the quota for AI Platform Training does not give you access to any Compute Engine VMs using TPUs.

The Cloud Console displays only your Cloud TPU quota for use with Compute Engine. To request Cloud TPU quota for use with Compute Engine, submit a request to the Cloud TPU team.

All Google Cloud projects are allocated a default AI Platform Training quota for at least one Cloud TPU. Quota is allocated in units of 8 TPU cores per Cloud TPU. This quota is not displayed on Cloud Console.
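Because quota is allocated in units of 8 TPU cores per Cloud TPU, a core quota translates directly into a number of whole devices:

```python
TPU_CORES_PER_DEVICE = 8  # quota is allocated in units of 8 cores


def tpu_devices_for(cores_quota):
    """Number of whole Cloud TPU devices available from a core quota."""
    return cores_quota // TPU_CORES_PER_DEVICE
```
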

Requesting a quota increase

The quotas listed on this page are allocated per project, and may increase over time with use. If you need more processing capability, you can apply for a quota increase in one of the following ways:

  • Use the Google Cloud Console to request increases for quotas that are listed in the API Manager for AI Platform Training:

    1. Find the section of the quota that you want to increase.

    2. Click the pencil icon next to the quota value at the bottom of the usage chart for that quota.

    3. Enter your requested increase:

      • If your desired quota value is within the range displayed on the quota limit dialog, enter your new value and click Save.

      • If you want to increase the quota beyond the maximum displayed, click Apply for higher quota and follow the instructions for the second way to request an increase.

  • If you want to increase a quota not listed in the Cloud Console, then do one of the following:

What's next