Vertex AI release notes

This page documents production updates to Vertex AI. Check this page for announcements about new or updated features, bug fixes, known issues, and deprecated functionality.

You can see the latest product updates for all of Google Cloud on the Google Cloud page, browse and filter all release notes in the Google Cloud console, or programmatically access release notes in BigQuery.

To get the latest product updates delivered to you, add the URL of this page to your feed reader, or add the feed URL directly.

October 09, 2025

Imagen

Imagen's virtual try-on model, virtual-try-on-preview-08-04 was updated on September 30, 2025, to more accurately preserve the person's body shape and preserve the garment's identity.

October 07, 2025

The following Qwen models are available in Model Garden:

Qwen-Image
Qwen-Image-Edit
Qwen-Image-Edit-2509

Save and share prompts in Vertex AI Studio: You can now save and share prompts in Vertex AI Studio. Sharing prompts lets you collaborate with team members, ensure consistency, and build a library of effective prompts for various tasks. For more information, see Save and share prompts.

October 06, 2025

Updated pricing for Vertex AI Agent Engine: Starting on November 6, 2025, Vertex AI Agent Engine Runtime will start charging for runtime usage for the following regions:

asia-southeast1 (Singapore)
australia-southeast2 (Melbourne)
europe-west2 (London)
europe-west3 (Frankfurt)
europe-west4 (Netherlands)

For more details, see Pricing for Vertex AI Agent Engine.

Access Transparency for Vertex AI Agent Engine: Access Transparency is now available for Vertex AI Agent Engine. For more information, see the overview for Enterprise security.

October 03, 2025

Prompt management

Vertex AI offers tooling to help manage prompts and prompt versions. In addition to the prompt management capabilities in Vertex AI Studio, prompts can be stored and versioned using the Vertex AI SDK.

For more information, see the Prompt management API reference.

October 02, 2025

Gemini 2.5 Flash Image (gemini-2.5-flash-image) is now generally available. This GA release adds support for aspect ratio controls, image-only response modality, regional endpoints, support for batch predictions, image generation from multiple reference images, and improved multi-turn image editing.

See Gemini 2.5 Flash Image for more information.

Google Gen AI SDK in C# Preview

Preview: The Google Gen AI SDK is available in C#. See googleapis/dotnet-genai.

This release includes support for GenerateContentAsync, GenerateContentStreamAsync, GenerateImagesAsync, and three Live APIs, which includes SendClientContentAsync, SendRealtimeInputAsync, and SendToolResponseAsync.

September 30, 2025

DeepSeek-V3.2-Exp is available through Model Garden.

September 25, 2025

New preview models for Gemini 2.5 Flash and 2.5 Flash-Lite are now available. These models are available at the following versioned endpoints:

gemini-2.5-flash-preview-09-2025
gemini-2.5-flash-lite-preview-09-2025

September 24, 2025

Access to Gemini's 1.5 models has been discontinued. For more information, see our Model versions page.

September 23, 2025

Gemini 2.5 Flash with Live API Native Audio Preview

Gemini 2.5 Flash with Live API Native Audio (gemini-live-2.5-flash-preview-native-audio-09-2025) is available in Preview. A single, unified model processes audio input and generates audio output directly, eliminating separate text-to-speech/speech-to-text conversions. This results in-low latency, high-quality, and incredibly human-like conversations. New features and capabilities include:

Improved Barge-in: Interrupt Gemini more naturally and reliably, even in loud and noisy environments.
Robust Function Calling: We've improved the triggering rate, allowing Gemini to successfully execute the functions you define with greater precision.
Accurate Transcription: The accuracy of audio-to-text transcription has been significantly enhanced.
Seamless Multilingual Support: Speak to Gemini in multiple languages, and it will effortlessly switch between them without any pre-configuration. Language is no longer a barrier!
Enhanced Audio Quality: Experience a dramatically improved audio quality that truly feels like speaking with a person.
Proactive Audio: Define Gemini's expertise and set conditions for when it should respond. Gemini can act as a "silent listener," only chiming in when the conversation touches upon its designated area of expertise.
Affective Dialog: Gemini can adapt and adjust its generated voice to match the emotional tone of the speaker, creating more empathetic and natural interactions.

Watch our comprehensive demo to see these features in action, including seamless language switching, expert mode, emotionally aware responses, memory recall, and interactive screen sharing for engineering tasks – all demonstrated directly within Vertex AI Studio without writing a single line of code!

September 22, 2025

DeepSeek-V3.1-Terminus is available through Model Garden.

September 18, 2025

Grounding with Google Maps

Grounding with Google Maps has implemented the following changes:

Removed the following fields from the API response:
- grounding_chunk.maps.text
- grounding_chunk.maps.place_answer_sources.review_snippets.author_attribution
- grounding_chunk.maps.place_answer_sources.flag_content_uri
- grounding_chunk.maps.place_answer_sources.review_snippets.flag_content_uri
The widget context token is only returned when the optional widget_token_enable input flag is set.

To learn more, see Grounding with Google Maps.

September 15, 2025

Imagen

We improved Imagen's virtual try-on model, virtual-try-on-preview-08-04, so that it is better at preserving the person's body shape and preserving the garment product's identity.

September 10, 2025

Vertex AI Agent Engine

Agent Engine now supports the following features:

Agent Engine Code Execution, now in Preview, lets your agent run code in an isolated sandbox environment. For more information, see Code Execution.
You can now develop, deploy, and use agents that support the Agent-to-Agent (A2A) protocol on Agent Engine. For more information, see Develop an Agent2Agent agent.
Agent Engine now supports bidirectional streaming. For more information, see Bidirectional streaming.
The Agent Engine page in the Cloud Console UI now has a new Memory Bank tab for displaying and managing memories.

Vertex AI Agent Engine

In version v1.112.0 of the Vertex AI SDK for Python, the agent_engines module has been refactored to a client-based design. For information about updating your existing code to the new design, see the Migration guide.

September 09, 2025

AI Singapore's SEA-LION V4 models are available through Model Garden. They are open models for Southeast Asian languages, built by leveraging Vertex Model Development Service for enhanced training efficiency and model accuracy.

EmbeddingGemma and DeepSeek-V3.1 models are available through Model Garden.

September 08, 2025

Veo video generation

Veo 3 support for short-duration videos is generally available. You can use Veo 3 to create 4, 6, or 8 second videos. For more information, see the following:

September 03, 2025

Vertex AI RAG Engine: Managed Database (Spanner)

Customers will be charged for the use of a Google-managed Spanner instance that's provisioned in a Google tenant project, using standard Spanner SKUs.

For more information, see Vertex AI RAG Engine billing.

August 26, 2025

Gemini 2.5 Flash Image Preview

Gemini 2.5 Flash Image (gemini-2.5-flash-image-preview) is available in Preview. Gemini 2.5 Flash Image Preview supports additional image generation and editing features such as image generation from multiple reference images and improved multi-turn image editing.

Vertex AI model tuning and Gen AI evaluation service

Vertex AI model tuning now supports integration with the Gen AI evaluation service in Preview. You can automatically run evaluations on your tuned models and intermediate checkpoints. For more information, see Create a tuning job.

August 21, 2025

Vertex AI Agent Engine

Agent Engine now supports the following enterprise security features:

You can now deploy your agents in a private VPC environment, configuring a Private Service Connect interface, to ensure data privacy and meet security and compliance requirements. For more information, see Configure Private Service Connect interface.
You can now use your own customer-managed encryption keys (CMEK) to protect data at rest.
You can now specify customized resource controls, such as the minimum and maximum number of application instances, resource limits for each container, and concurrency for each container.
As a part of Vertex AI Platform, Vertex AI Agent Engine now supports HIPAA workloads.

For more information, see Agent Engine overview.

August 14, 2025

Imagen

Imagen 4 is Generally Available.

Imagen 4 introduces the following models:

For more information, see Generate images using text prompts and Image generation API.

Gemma 3 270M, Wan 2.2 and Wan 2.1 models are available through Model Garden.

August 13, 2025

OpenAI's gpt-oss-120b and gpt-oss-20b are available as Model as a Service (MaaS) models in Model Garden.

Qwen3 Coder and Qwen3 235B are available as Model as a Service (MaaS) models in Model Garden.

August 08, 2025

Gemini 2.5 Flash-Lite and Gemini 2.5 Pro now support supervised fine-tuning. For more information, see About supervised fine-tuning for Gemini models.

August 07, 2025

Vertex AI prompt optimizer

The Vertex AI prompt optimizer is now generally available. For more information, see Optimize prompts.

We now offer a zero-shot prompt optimizer.

Vertex AI Agent Engine

You can use your own custom service account for agent identity to manage permissions and access according to your organization's security policies.

Model tuning

You can now perform supervised fine-tuning on open models such as Llama 3.1. For more information, see Tune an open model.

August 06, 2025

OpenAI's gpt-oss models are available through Model Garden.

Imagen

Virtual try-on lets you generate virtual try-on images from an image of a person and product photos that you provide, and is available in Preview. For more information, see Generate Virtual Try-On Images and Virtual Try-On API.

This release note is incorrect; see entry for October 9, 2025.

July 29, 2025

Veo video generation Veo 3 and Veo 3 Fast are now generally available. For more information, see Generate videos using text prompts.

July 23, 2025

Grounding with Google Maps is available in all regions (except for the EEA) as a Preview (Pre-GA) feature.

July 22, 2025

Gemini 2.5 Flash-Lite is now generally available and accessible using the API and Vertex AI Studio. This GA release includes support for explicit caching and batch prediction, as well as expanded region support.

See Gemini 2.5 Flash-Lite for more information.

July 17, 2025

Veo 3 preview models now support upscaling for 1080p resolution using the new resolution parameter. For more information, see Veo on Vertex AI.

July 16, 2025

Added Gemma 3 fine-tuning notebook using Axolotl docker with support for 1b, 4b, 12b, and 27b variants.

July 14, 2025

Multimodal MedGemma 27B IT, MedSigLIP, and T5Gemma models are available through Model Garden.

July 08, 2025

Vertex AI Agent Engine

Vertex AI Agent Engine Memory Bank is now available in Preview. Memory Bank lets you dynamically generate long-term memories based on users' conversations with your agent.

July 03, 2025

Vertex AI Agent Garden

Vertex AI Agent Garden now supports filtering by tags.

June 27, 2025

Gemma 3n models are now available through Model Garden.

Multimodal datasets are now available in preview. For more information, see Multimodal datasets.

June 24, 2025

Starting on June 24, 2025, Imagen versions 1 and 2, image captioning, and visual question answering are deprecated.

On September 24, 2025, the following features and models will be removed:

image captioning
visual question answering
Imagen 1 model imagegeneration@002
Imagen 2 models imagegeneration@005 and imagegeneration@006

For more information, see Migrate to Imagen 3.

June 23, 2025

Veo 2 support for advanced video controls is Generally Available. In addition to a providing a first frame of a video, you can specify the last frame of a video or a video to extend in length. For more information, see Veo on Vertex AI API.

June 17, 2025

Provisioned Throughput (PT): Once a model is GA, all new PT purchases will be for GA endpoints only. If you've purchased PT for a specific preview version, it will still work for that specific preview. However, you must migrate the existing PT to the GA endpoint or purchase new PT for the GA endpoint by July 15, 2025.

Gemini 2.5 Flash and Gemini 2.5 Pro are now generally available and accessible using the API and Vertex AI Studio.

See Gemini 2.5 Flash and Gemini 2.5 Pro for more information.

Gemini 2.5 Flash-Lite is now available as a preview offering in both the API and Vertex AI Studio.

See Gemini 2.5 Flash-Lite for more information.

Live API is now available as a private general availability offering in the API and Vertex AI Studio. Reach out to your Google account team representative to request access.

See Live API for more information.

Preview endpoint availability and removal: All existing Gemini 2.5 Flash and Pro preview endpoints (listed below) will continue to be available with their current preview pricing until July 15, 2025. After this date, these preview endpoints will be shut down.

gemini-2.5-flash-preview-04-17
gemini-2.5-flash-preview-05-20
gemini-2.5-pro-preview-03-25
gemini-2.5-pro-preview-05-06
gemini-2.5-pro-preview-06-05

Updated pricing for Gemini 2.5 Flash GA: The price for Gemini 2.5 Flash in GA will be adjusted to reflect its quality and unified output token pricing. This includes lower prices for thinking output, higher prices for non-thinking output. These pricing changes will take effect on the new GA endpoint as shared above. Preview pricing will only continue on existing preview endpoints for 30 days post-GA on July 15, 2025.

Updated preview endpoints: Effective June 19, 2025, gemini-2.5-flash-preview-04-17 endpoint will serve the Gemini 2.5 Flash model version released on 05-20, which has been promoted to GA. Similarly, the gemini-2.5-pro-preview-05-06 and 03-25 endpoints will serve the Gemini 2.5 Pro model version released on 06-05, also promoted to GA. This update ensures continuity during your transition.

June 16, 2025

The DeepSeek API service on Vertex AI is in Preview. For more information, see the DeepSeek model card in Model Garden.

June 11, 2025

Imagen 4's public preview models are updated to the following:

imagen-4.0-generate-preview-06-06
imagen-4.0-fast-generate-preview-06-06
imagen-4.0-ultra-generate-preview-06-06

For more information about each model, see Preview Imagen models.

To avoid service interruption, migrate from imagen-4.0-ultra-generate-exp-05-20 and imagen-4.0-generate-preview-05-20 before 2025-07-07.

June 09, 2025

Gemini API

The logprobs and response_logprobs parameters for the Gemini API are now generally available. For more information, see Generate content with Gemini API.

June 05, 2025

Gemini 2.5 Pro's public preview version has been updated to gemini-2.5-pro-preview-06-05 and includes expanded support for thinking. This model version is available in the API and Vertex AI Studio.

See Gemini 2.5 Pro for model details.

June 03, 2025

Model Garden now includes DeepSeek-R1-0528 variants.

In Model Garden, the following fine tuning features have been added:

Gemma 3 UI fine-tuning using PEFT docker.
Qwen 2.5 fine-tuning notebook using PEFT docker.
Qwen 3 fine-tuning notebook using Axolotl docker.
lm-evaluation-harness as an evaluation service in the Llama 3.3, Llama 3.1, Gemma 3 and Gemma 2 fine-tuning notebooks.

May 23, 2025

Mistral OCR is an Optical Character Recognition API for document understanding. It is GA on Vertex AI. For more information, see the Mistral OCR model card in Model Garden.

May 22, 2025

Anthropic's Claude Opus 4 and Claude Sonnet 4 are GA on Vertex AI and support Provision Throughput. For more information, see the Claude Opus 4 or Claude Sonnet 4 model card in Model Garden.

May 20, 2025

Vertex AI Agent Engine

The following features are now available in Preview:

Gemini 2.5 Flash's public preview version has been updated to gemini-2.5-flash-preview-5-20.

See Gemini 2.5 Flash for model details.