Starting April 29, 2025, Gemini 1.5 Pro and Gemini 1.5 Flash models are not available in projects that have no prior usage of these models, including new projects. For details, see Model versions and lifecycle.
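This guide lists errors that you might encounter when using the [Model API reference for Generative AI](/vertex-ai/generative-ai/docs/model-reference/overview). The errors follow the [error model](/apis/design/errors) of the Google Cloud API, which recommends providing guidance on the causes and solutions specific to the generative AI models.

API errors

This table provides API error codes and descriptions.

| HTTP error code | Canonical error code | Cause | Example | Solution |
|---|---|---|---|---|
| 400 | `INVALID_ARGUMENT / FAILED_PRECONDITION` | The request fails API validation, or you tried to access a model that requires allowlisting or is disallowed by the organization's policy. | The request exceeds the model's input token limit. | Refer to the [Model API reference for Generative AI](/vertex-ai/generative-ai/docs/model-reference/overview) for request parameters, token count, and other parameters. |
| 403 | `PERMISSION_DENIED` | The client doesn't have sufficient permission to call the API. | A service account doesn't have permission to access the Cloud Storage bucket that hosts image or video resources. | 1. Verify that all necessary APIs are enabled and the service account has the correct [permission](/vertex-ai/generative-ai/docs/access-control) to access the selected Vertex AI service. 2. Ensure the Vertex AI per-product, per-project service account (P4SA) is granted the necessary permission to access resources referenced in the input. |
| 404 | `NOT_FOUND` | A valid object was not found at the specified URL. | An image file isn't found in the storage URL. | Check and fix the file location. |
| 429 | `RESOURCE_EXHAUSTED` | Depending on the error message, the cause could be one of the following: 1. API quota is over the limit. 2. The server is overloaded due to shared server capacity. 3. You've reached the daily limit for requests using `logprobs`. | The Gemini API exceeds the request per minute limit. | 1. Check [Vertex AI Generative AI quota limits](/vertex-ai/generative-ai/docs/quotas). If needed, apply for a higher quota. 2. Retry after a few seconds. If the error persists for a prolonged period (hours), contact [Vertex AI support](/vertex-ai/docs/support/getting-support). 3. Consider purchasing [Provisioned Throughput](/vertex-ai/generative-ai/docs/provisioned-throughput/error-code-429). |
| 499 | `CANCELLED` | The request is cancelled by the client. | | |
| 500 | `UNKNOWN / INTERNAL` | A server error occurred due to overload or dependency failure. | The request is throttled because the service is temporarily overloaded. | Retry after a few seconds. If the error persists for a prolonged period (hours), contact [Vertex AI support](/vertex-ai/docs/support/getting-support). |
| 503 | `UNAVAILABLE` | The service is temporarily unavailable. | The server isn't responding to incoming requests. | The unavailable status might be temporary. If the error persists, contact [Vertex AI support](/vertex-ai/docs/support/getting-support). |
| 504 | `DEADLINE_EXCEEDED` | The client set a deadline that is shorter than the server's default deadline (10 minutes), and the request didn't finish within the client-provided deadline. | | Consider increasing the client-provided deadline. |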
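The error codes on this page can be turned into a small lookup that client code consults before deciding what to do with a failed request. The following is a minimal sketch, not part of any Vertex AI SDK: the names `ErrorInfo`, `ERROR_TABLE`, and `is_retryable` are hypothetical, and the `retryable` flag is one reading of the documented solutions (retry for 429, 500, and 503; fix the request or configuration for the rest).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorInfo:
    canonical: str   # canonical error code(s) as documented on this page
    retryable: bool  # assumption: True when the documented solution is to retry later
    action: str      # short summary of the documented solution


# Hypothetical lookup keyed by HTTP status code, transcribed from the error codes above.
ERROR_TABLE = {
    400: ErrorInfo("INVALID_ARGUMENT / FAILED_PRECONDITION", False,
                   "Check request parameters and token count."),
    403: ErrorInfo("PERMISSION_DENIED", False,
                   "Check enabled APIs and service account permissions."),
    404: ErrorInfo("NOT_FOUND", False, "Check and fix the file location."),
    429: ErrorInfo("RESOURCE_EXHAUSTED", True,
                   "Check quota, then retry after a few seconds."),
    499: ErrorInfo("CANCELLED", False, "The client cancelled the request."),
    500: ErrorInfo("UNKNOWN / INTERNAL", True, "Retry after a few seconds."),
    503: ErrorInfo("UNAVAILABLE", True,
                   "The status might be temporary; retry later."),
    504: ErrorInfo("DEADLINE_EXCEEDED", False,
                   "Increase the client-provided deadline."),
}


def is_retryable(http_status: int) -> bool:
    """Return True only for status codes this sketch treats as transient."""
    info = ERROR_TABLE.get(http_status)
    return info is not None and info.retryable
```

Treating 504 as non-retryable is a design choice in this sketch: the documented solution is to raise the client deadline, so repeating the same request with the same deadline is unlikely to help.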
Handle errors
To manage API requests and avoid errors, follow these best practices:
Avoid traffic spikes: Sudden, large increases in requests within a short period can cause quota enforcement issues and server overloads. To avoid this, distribute your requests more evenly over time.
Implement retry logic carefully: When you retry a failed request, limit the number of retries to a maximum of two. Use exponential backoff, and start with a minimum delay of one second between retries.
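The retry guidance above can be sketched as a small wrapper. This is an illustrative sketch under stated assumptions, not an official client feature: `TransientError` and `call_with_retries` are hypothetical names, and `request` stands in for whatever function issues the API call and raises `TransientError` on a retryable failure such as 429, 500, or 503.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure (for example 429, 500, or 503)."""


def call_with_retries(
    request: Callable[[], T],
    max_retries: int = 2,          # at most two retries
    initial_delay: float = 1.0,    # minimum delay of one second
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(), retrying transient failures with exponential backoff."""
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    delay = initial_delay
    for attempt in range(max_retries + 1):
        try:
            return request()
        except TransientError:
            if attempt == max_retries:
                raise  # retries exhausted; surface the last error
            sleep(delay)
            delay *= 2  # back off exponentially: 1s, then 2s
    raise AssertionError("unreachable")
```

With the defaults, a request that keeps failing is attempted three times in total (one initial call plus two retries), waiting one second and then two seconds between attempts. The `sleep` parameter exists only so the delay can be replaced in tests.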
What's next
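- Generative AI on Vertex AI has some limitations. To learn more, see [PaLM API limitations](/vertex-ai/generative-ai/docs/learn/responsible-ai#limitations).
- Try a quickstart tutorial using [Vertex AI Studio](/vertex-ai/generative-ai/docs/start/quickstarts/quickstart) or the [Vertex AI API](/vertex-ai/generative-ai/docs/start/quickstarts/quickstart-multimodal).
- Explore pretrained models in [Model Garden](/vertex-ai/generative-ai/docs/model-garden/explore-models).
- Learn about [quotas and limits](/vertex-ai/docs/quotas).
- Learn about [pricing](/vertex-ai/pricing#generative_ai_models).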