Error code 429

This guide shows you how to troubleshoot 429 resource exhausted errors for different quota frameworks in Vertex AI. A 429 error indicates that the number of your requests exceeds the capacity allocated to process them.

The following table shows the error message for each quota framework:

Quota framework          Message
Pay-as-you-go            Resource exhausted, please try again later.
Provisioned Throughput   Too many requests. Exceeded the Provisioned Throughput.

Troubleshoot pay-as-you-go errors

In the pay-as-you-go model, you use a shared pool of resources. If resources aren't available when you make a request, Vertex AI returns a 429 error. This error doesn't count against your error rate as described in your service level agreement (SLA).

To resolve 429 errors, consider the following options:
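One widely used option for transient 429 responses from a shared pool is to retry with truncated exponential backoff and jitter. The following is a minimal sketch of that technique, not an official client implementation. ResourceExhaustedError, backoff_delay, and call_with_backoff are hypothetical names chosen for illustration, and the default delays are assumptions you would tune for your workload.

```python
import random
import time


class ResourceExhaustedError(Exception):
    """Hypothetical stand-in for whatever your client raises on HTTP 429."""


def backoff_delay(attempt, base_delay=1.0, max_delay=32.0, jitter=random.random):
    """Return the delay in seconds before retrying after a zero-based attempt.

    The exponential term is truncated at max_delay, then scaled by a random
    factor in [0, 1) so that many clients don't retry in lockstep.
    """
    ceiling = min(max_delay, base_delay * (2 ** attempt))
    return ceiling * jitter()


def call_with_backoff(send_request, max_attempts=5, sleep=time.sleep, **delay_kwargs):
    """Call send_request(), retrying only on ResourceExhaustedError."""
    for attempt in range(max_attempts):
        try:
            return send_request()
        except ResourceExhaustedError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the 429 to the caller
            sleep(backoff_delay(attempt, **delay_kwargs))
```

To use the sketch, wrap the code that sends a request in a zero-argument function, raise ResourceExhaustedError when the response is a 429, and pass that function to call_with_backoff. Capping the number of attempts keeps a sustained capacity shortage from turning into an unbounded retry loop.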

Troubleshoot Provisioned Throughput errors

If you have a Provisioned Throughput subscription, you receive a 429 error when your requests exceed your reserved throughput and you have configured your endpoint to reject overages.

To resolve 429 errors, you can do one of the following:

  • Configure your endpoint to process overages on demand, which is the default behavior. With this setting, overages are billed as pay-as-you-go instead of being rejected.
  • Increase the number of GSUs in your Provisioned Throughput subscription.

Provisioned Throughput behavior

When you subscribe to Provisioned Throughput, Vertex AI reserves the purchased amount of throughput for your project. How Vertex AI handles requests varies depending on whether you use more or less than your purchased throughput:

  • Under-utilization: If you use less than your purchased throughput, the handling of capacity-related errors depends on the type of Provisioned Throughput subscription:
    • Standard: Capacity-related errors that would otherwise be 429 are returned as 5XX and count toward the SLA error rate.
    • Single Zone: Capacity-related 429 errors are treated as 5XX but don't count toward the SLA error rate.
  • Over-utilization: By default, when you exceed your purchased throughput, additional requests are processed on-demand and billed as pay-as-you-go.
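The cases above can be summarized as a small decision function. This is only an illustrative model of the documented behavior, not a Vertex AI API. The function name, its parameters, and the "standard" and "single_zone" labels are hypothetical, and the second element of the result is None wherever this guide doesn't state whether an outcome counts toward the SLA error rate.

```python
from typing import Optional, Tuple


def provisioned_throughput_outcome(
    subscription: str,
    over_purchased: bool,
    capacity_error: bool,
    reject_overages: bool = False,
) -> Tuple[str, Optional[bool]]:
    """Return (result, counts_toward_sla) for a single request.

    counts_toward_sla is None where this guide does not specify it.
    """
    if over_purchased:
        if reject_overages:
            # "Too many requests. Exceeded the Provisioned Throughput."
            return ("429", None)
        if capacity_error:
            # Inference from this guide: the overage is handled as
            # pay-as-you-go, whose 429 errors don't count toward the SLA.
            return ("429", False)
        return ("processed on-demand, billed as pay-as-you-go", None)

    # Under-utilization: usage is within the purchased throughput.
    if not capacity_error:
        return ("processed from reserved throughput", None)
    if subscription == "standard":
        return ("5XX", True)
    if subscription == "single_zone":
        return ("5XX", False)
    raise ValueError(f"unknown subscription type: {subscription!r}")
```

For example, a Standard subscription that hits a capacity error while under its purchased throughput maps to a 5XX response that counts toward the SLA error rate, while the same situation on a Single Zone subscription maps to a 5XX response that doesn't.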

What's next