This guide shows you how to use the thinking feature in Gemini models to improve their reasoning capabilities. This page covers the following topics:
- Supported models: Learn which models support the thinking feature.
- Use a thinking model: Enable and use a model with thinking capabilities.
- View thought summaries: See the abbreviated output of the model's thinking process.
- Control the thinking budget: Manage the token limit for the model's thought process.
- Prompting techniques: Discover best practices for prompting thinking models to get better responses.
Thinking models are trained to generate the "thinking process" that the model goes through as part of its response. This process gives thinking models stronger reasoning capabilities than equivalent base models.
The thinking process is enabled by default. When you use Vertex AI Studio, you can view the full thinking process along with the model's generated response.
Supported models
The thinking feature is supported in the following models:
- Gemini 2.5 Flash
- Gemini 2.5 Pro
- Gemini 2.5 Flash-Lite
Use a thinking model
To use the thinking feature with a supported model, follow these steps:
Console
- Go to Vertex AI Studio > Create prompt.
- In the Model panel, click Switch model and select a supported model from the menu.
- Optional: In the System instructions field, provide detailed instructions on how the model should format its responses.
- In the Write your prompt field, enter a prompt.
- Click Run.
When you select the Gemini 2.5 Flash model, the Thinking budget is set to Auto by default. To turn off thinking for this model, set Thinking budget to Off.
Gemini returns a response after it finishes generating. Depending on the complexity of the prompt, this can take several seconds.
Python
Install
pip install --upgrade google-genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
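With the environment variables set, a request to a thinking model might look like the following minimal sketch. It assumes the `google-genai` package installed above and uses `gemini-2.5-flash` as an assumed model ID; the helper names `generate_text` and `make_vertex_client` are hypothetical. Thinking is enabled by default, so no extra configuration is passed.

```python
def generate_text(client, prompt, model="gemini-2.5-flash"):
    """Send one prompt to a thinking model and return the response text.

    `client` is expected to behave like a `google.genai.Client`. The model ID
    is an assumption; substitute any supported model.
    """
    response = client.models.generate_content(model=model, contents=prompt)
    return response.text


def make_vertex_client():
    """Create a client that picks up the environment variables set above."""
    # Imported lazily so this module loads even where the SDK is not installed.
    from google import genai

    return genai.Client()
```

To try it, call `generate_text(make_vertex_client(), "How many prime numbers are there below 50?")`. Passing the client as an argument keeps the helper easy to exercise with a stand-in object.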
View thought summaries
A thought summary is an abbreviated version of the model's thinking process. You can view thought summaries for both Gemini 2.5 Flash and Gemini 2.5 Pro.
Console
Thought summaries are enabled by default in Vertex AI Studio. To see the model's summarized thought process, expand the Thoughts panel.
Python
Install
pip install --upgrade google-genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
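As a rough sketch of how thought summaries could be read through the SDK: the request sets `include_thoughts` in the thinking configuration, and parts of the response flagged as thoughts are separated from the answer. The helper names `split_thoughts` and `generate_with_thoughts` are hypothetical, and `gemini-2.5-flash` is an assumed model ID.

```python
def split_thoughts(response):
    """Separate thought-summary text from answer text in a response.

    Parts whose `thought` attribute is truthy carry the summary; the
    remaining text parts carry the answer.
    """
    thoughts, answer = [], []
    for part in response.candidates[0].content.parts:
        text = getattr(part, "text", None)
        if not text:
            continue  # skip parts that carry no text
        if getattr(part, "thought", False):
            thoughts.append(text)
        else:
            answer.append(text)
    return "\n".join(thoughts), "\n".join(answer)


def generate_with_thoughts(client, prompt, model="gemini-2.5-flash"):
    """Request thought summaries alongside the answer.

    `client` is expected to behave like a `google.genai.Client`.
    """
    response = client.models.generate_content(
        model=model,
        contents=prompt,
        config={"thinking_config": {"include_thoughts": True}},
    )
    return split_thoughts(response)
```

The function returns a `(thought_summary, answer)` pair, so the summary can be logged or shown separately from the response text.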
Control the thinking budget
You can control the extent of the model's thinking process for a response. This upper limit is called the thinking budget and applies to the model's full thought process. By default, the model automatically controls its thinking process up to a maximum of 8,192 tokens.
You can manually set this token limit. For example, you might set a lower limit for less complex tasks or a higher limit for more complex ones.
The following table shows the minimum and maximum token budget amounts for each supported model:
Model | Minimum token amount | Maximum token amount |
---|---|---|
Gemini 2.5 Flash | 1 | 24,576 |
Gemini 2.5 Pro | 128 | 32,768 |
Gemini 2.5 Flash-Lite | 512 | 24,576 |
If you set the thinking budget to `0` for Gemini 2.5 Flash or Gemini 2.5 Flash-Lite, thinking is turned off. Thinking can't be turned off for Gemini 2.5 Pro.
If you want the model to control the thinking budget when using the API, set the thinking budget to `-1`.
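The rules above can be captured in a small client-side check, sketched here under the assumption that the limits in the table are current. The model IDs used as dictionary keys and the function name `check_thinking_budget` are illustrative, not part of any SDK.

```python
# (minimum, maximum, can_be_turned_off) per model, copied from the table above.
# The keys are assumed model IDs.
BUDGET_LIMITS = {
    "gemini-2.5-flash": (1, 24576, True),
    "gemini-2.5-pro": (128, 32768, False),
    "gemini-2.5-flash-lite": (512, 24576, True),
}

DYNAMIC = -1  # let the model control the budget
OFF = 0       # turn thinking off (not available for every model)


def check_thinking_budget(model, budget):
    """Return `budget` if it is valid for `model`, else raise ValueError."""
    minimum, maximum, can_turn_off = BUDGET_LIMITS[model]
    if budget == DYNAMIC:
        return budget
    if budget == OFF:
        if not can_turn_off:
            raise ValueError(f"thinking can't be turned off for {model}")
        return budget
    if not minimum <= budget <= maximum:
        raise ValueError(
            f"budget for {model} must be between {minimum} and {maximum}"
        )
    return budget
```

Checking the value before sending a request surfaces an out-of-range budget locally rather than as an API error.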
Console
- Go to Vertex AI Studio > Create prompt.
- In the Model panel, click Switch model and select a supported model from the menu.
- From the Thinking budget drop-down menu, select Manual, and then use the slider to adjust the token limit.
Python
Install
pip install --upgrade google-genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
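A minimal sketch of setting the thinking budget through the SDK follows. It assumes the `thinking_budget` field of the thinking configuration, passed here as a plain dictionary; the helper name `generate_with_budget` is hypothetical and `gemini-2.5-flash` is an assumed model ID.

```python
def generate_with_budget(client, prompt, thinking_budget, model="gemini-2.5-flash"):
    """Send a prompt with an explicit thinking budget (in tokens).

    Use 0 to turn thinking off where the model allows it, or -1 to let the
    model control the budget. `client` is expected to behave like a
    `google.genai.Client`.
    """
    config = {"thinking_config": {"thinking_budget": thinking_budget}}
    response = client.models.generate_content(
        model=model,
        contents=prompt,
        config=config,
    )
    return response.text
```

For example, `generate_with_budget(client, "Solve x^2 - 4x + 4 = 0", 1024)` caps the thought process at 1,024 tokens.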
Prompting techniques
Effective prompting helps you get the most out of Gemini thinking models. This section outlines several best practices.
Technique | Description | Best for... |
---|---|---|
Provide step-by-step instructions | Guide the model through a problem-solving process with explicit steps. | Complex tasks where the model might miss a step or needs a more focused approach. |
Multishot prompting | Provide several input-output examples to demonstrate the desired behavior and format. | Improving accuracy and ensuring the model understands the expected output format. |
System instructions | Set a persistent role, persona, or formatting rules for the model to follow across multiple prompts. | Building applications where consistent model behavior and output style are required. |
Verification and reflection | Ask the model to check its sources, review its reasoning, and verify its final answer. | Factual queries or tasks where accuracy is critical and you want to reduce hallucinations. |
Debugging and steering | Analyze the model's thought process to identify misunderstandings and provide corrective feedback. | Iteratively refining prompts and improving model performance when initial results are not as expected. |
Start with a simple prompt that describes your desired outcome and observe the model's initial thoughts.
Prompt:
Describe the most unusual or striking architectural feature you've ever seen in a building or structure.
Response:
<...>
The most unusual architectural feature that has truly stayed with me is the "floating roof" of the Heydar Aliyev Center in Baku, Azerbaijan, designed by Zaha Hadid Architects.
Visually, it's breathtaking. Imagine a vast, undulating landscape of pure white, seemingly draped over the building. It's not just a roof in the traditional sense; it's an extension of the ground, flowing seamlessly upwards and over the building's various spaces before curving back down to meet the earth again. There are no sharp angles, no rigid lines. Instead, everything is fluid, organic, and continuous. It resembles a giant, pristine white wave frozen in motion, or perhaps a sand dune sculpted by an unseen wind.
<...>
Provide step-by-step instructions
If the initial response isn't what you want, refine your prompt with explicit, step-by-step instructions. Explicit steps steer the model's understanding of the task and guide its problem-solving process toward a more focused approach.
Prompt:
Now imagine that architectural feature is not just for looks, but has a specific function or purpose related to the building's environment or the people who use it. Explain what that function could be and how it enhances the building's overall design and usability.
Response:
<...>
Let's imagine that the "floating roof" of the Heydar Aliyev Center is not purely aesthetic but serves a crucial environmental and user-centric function: it's designed as a sophisticated climate moderator and natural ventilation system for the Baku environment.
Here's how this function would work and enhance the building:
Function: Passive Climate Control and Natural Ventilation System
Shading and Solar Heat Mitigation: The undulating curves of the white GFRC roof are strategically shaped and oriented to provide optimal shading throughout the day, year-round. Baku experiences intense sun, especially during the summer. The overhanging curves, acting like giant brise-soleils, would block direct sunlight from hitting the building's glass facade, significantly reducing solar heat gain. The varying heights and depths of the curves would create dynamic shade patterns, ensuring that different parts of the building are shaded at different times of the day, optimizing comfort and reducing the need for excessive air conditioning. The white color of the GFRC further enhances this by reflecting a large portion of the solar radiation.
<...>
Multishot prompting with thinking
Using multishot prompting with the thinking feature can improve Gemini's understanding and the accuracy of its responses. In your prompt, provide several input-output examples to demonstrate the desired behavior and format.
Prompt:
Example 1:
User: What is the tallest mountain in the world?
Assistant: Mount Everest
Example 2:
User: What is the largest ocean?
Assistant: Pacific Ocean
User: What is the longest river in the world?
Assistant:
Response:
Amazon River
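A multishot prompt like the one above can be assembled from a list of input-output pairs. The following is a small sketch; the function name `build_multishot_prompt` and the exact layout of the labels are illustrative choices rather than a required format.

```python
def build_multishot_prompt(examples, question):
    """Build a multishot prompt from (user, assistant) example pairs.

    The final line is left as an open "Assistant:" turn for the model
    to complete.
    """
    lines = []
    for number, (user, assistant) in enumerate(examples, start=1):
        lines.append(f"Example {number}:")
        lines.append(f"User: {user}")
        lines.append(f"Assistant: {assistant}")
        lines.append("")  # blank line between examples
    lines.append(f"User: {question}")
    lines.append("Assistant:")
    return "\n".join(lines)


examples = [
    ("What is the tallest mountain in the world?", "Mount Everest"),
    ("What is the largest ocean?", "Pacific Ocean"),
]
prompt = build_multishot_prompt(examples, "What is the longest river in the world?")
```

Keeping the examples in a list makes it straightforward to add, remove, or reorder them while you test which set gives the most consistent output format.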
Use system instructions
If you are building an application where users interact directly with the model, you can use system instructions to guide the format and tone of Gemini's responses. A system instruction is a set of instructions that the model processes before it processes a prompt. These instructions are recalled each time the model is prompted and tell it how to behave and respond. For example, you can include the role or persona, contextual information, and formatting instructions.
System Instruction:
You are a helpful and concise coding tutor. Always provide code examples and explain them briefly.
User Prompt:
Explain what is a closure in Javascript and give an example.
Response:
function outerFunction(outerVar) {
  function innerFunction(innerVar) {
    console.log(outerVar); // Accessing outerVar from outerFunction's scope
    console.log(innerVar);
  }
  return innerFunction;
}
const closureExample = outerFunction('I am from outer');
closureExample('I am from inner');
Explanation:
A closure is created when a function (innerFunction in this example) is defined inside another function (outerFunction) and the inner function remembers and has access to the variables from the outer function's scope, even after the outer function has finished executing.
In the example:
- outerFunction is called with 'I am from outer' and returns innerFunction.
- closureExample now holds a reference to innerFunction.
- When closureExample('I am from inner') is called, innerFunction executes.
- Even though outerFunction has already finished, innerFunction can still access and use outerVar (which is 'I am from outer') because of the closure.
In essence, the innerFunction "closes over" the variables of its outer function's scope, creating a closure.
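In the Gen AI SDK, a system instruction can be supplied through the request configuration. The sketch below assumes the `system_instruction` configuration field, passed as a plain dictionary; the helper name `generate_with_system_instruction` is hypothetical and `gemini-2.5-flash` is an assumed model ID.

```python
def generate_with_system_instruction(
    client, system_instruction, prompt, model="gemini-2.5-flash"
):
    """Send a prompt together with a system instruction.

    The system instruction is passed in the request config so that it
    applies to the prompt without being part of the user's text.
    `client` is expected to behave like a `google.genai.Client`.
    """
    response = client.models.generate_content(
        model=model,
        contents=prompt,
        config={"system_instruction": system_instruction},
    )
    return response.text


TUTOR_INSTRUCTION = (
    "You are a helpful and concise coding tutor. "
    "Always provide code examples and explain them briefly."
)
```

Calling `generate_with_system_instruction(client, TUTOR_INSTRUCTION, "Explain what is a closure in Javascript and give an example.")` reproduces the setup of the example above.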
Use verification and reflection
You can include specific verification steps in your system instructions or directly in your prompts. For example, you can ask Gemini to verify its sources, review its reasoning, identify potential errors, and check its final answer.
Prompt:
Verify the following statement and provide your sources: The capital of Brazil is Rio de Janeiro.
Response:
The statement "The capital of Brazil is Rio de Janeiro" is incorrect.
The current capital of Brazil is Brasília.
Rio de Janeiro was the capital of Brazil for a significant period, from 1763 to 1960. However, the capital was moved to Brasília in 1960 as part of a plan to develop the country's interior and create a more centrally located capital.
Verification and Sources:
You can easily verify this information from numerous reliable sources:
Encyclopedias:
- Britannica: Searching for "Brazil" on Britannica will clearly state Brasília as the capital. https://www.britannica.com/place/Brazil (Look for the "Capital" information in the introductory summary or "Geography" section)
- Wikipedia: The Wikipedia article for "Brazil" also states Brasília as the capital. https://en.wikipedia.org/wiki/Brazil (Look in the infobox or introductory section)
Government and Official Sources:
- CIA World Factbook: This is a highly reliable source for country information. Searching for "Brazil" will list Brasília as the capital. https://www.cia.gov/the-world-factbook/countries/brazil/ (Look under "Government" > "Capital")
- Official Website of the Brazilian Government (if accessible and in English): While navigating official government websites can sometimes be complex, they would definitively list Brasília as the capital.
Geographic Resources:
- National Geographic: Their website and publications will confirm Brasília as the capital of Brazil.
- Atlases and Geography Textbooks: Any modern atlas or geography textbook will list Brasília as the capital.
Why the confusion?
The confusion might arise because:
- Rio de Janeiro is a very famous Brazilian city: It's a major tourist destination, known for its beaches, Carnival, and iconic landmarks like Christ the Redeemer. Its past status as capital adds to its historical significance and public recognition.
- Historical context: Rio de Janeiro was the capital for a long time. People who learned about Brazil in the past might remember it as the capital.
In conclusion, the statement is false. The capital of Brazil is Brasília, not Rio de Janeiro. Rio de Janeiro was the former capital but is no longer the current capital. You can confidently use the sources listed above to verify this information.
Debug and steer the model
Vertex AI Studio gives you access to the model's thoughts in the UI, which is helpful for debugging your prompts.
Analyze the thought process
Review the model's thought process to identify where it might have misinterpreted your instructions. This analysis can help you refine your prompts to get your expected response more consistently.
Example Scenario: If the model's thought process shows it misunderstood a step in a complex instruction, you can refine the prompt to clarify that specific step.
Prompt:
Step 1: Summarize the following article about quantum physics.
Step 2: Translate the summary into Spanish.
Step 3: Extract the key entities from the Spanish summary.
Article: ... [Article Text] ...
Thoughts:
The user wants me to process an article about Google's new quantum chip called Willow in three steps:
Step 1: Summarize the article.
Step 2: Translate the summary into Spanish.
Step 3: Extract the key entities from the Spanish summary.
Let's break down each step.
Step 1: Summarize the article.
I need to read the article and identify the main points. The article is about Google's new quantum chip, Willow, and its achievements. The key achievements are:
- Exponential error reduction as qubits scale up.
- Outperforming supercomputers in a benchmark computation (RCS).
- Willow is a step towards commercially relevant quantum computing.
- It's the first system below threshold for quantum error correction.
- It achieved a benchmark computation in under 5 minutes that would take a supercomputer 10 septillion years.
- It was fabricated in a new facility.
- It has state-of-the-art performance and high quality qubits.
- Google is looking towards real-world applications with Willow.
- Quantum computing will benefit AI.
I will synthesize these points into a concise summary.
Step 2: Translate the summary into Spanish.
Once I have the summary in English, I will translate it into Spanish. I need to ensure the translation is accurate and natural-sounding in Spanish.
Step 3: Extract the key entities from the Spanish summary.
After translating the summary, I will identify and extract the key entities. Key entities are typically nouns or noun phrases that represent important objects, concepts, or actors. In this context, they will likely include:
- Names of chips (Willow, Sycamore)
- Companies/Organizations (Google Quantum AI, Google Research)
- Technologies (quantum chip, quantum computer, qubits, quantum error correction, superconducting quantum system, classical computers, supercomputers, AI)
- Benchmarks (random circuit sampling, RCS)
- People (Julian Kelly, Peter Shor, David Deutsch, Sergio Boixo, Hartmut Neven, John Preskill)
- Concepts (exponential error reduction, below threshold, beyond breakeven, state-of-the-art performance, commercially relevant applications, quantum mechanics, multiverse, T1 times, system benchmarks, algorithmic benchmarks)
I will need to carefully review the Spanish summary to identify these entities and list them out.
Provide corrective feedback
If you find errors or inconsistencies in the model's reasoning, provide corrective feedback in your prompt or system instructions. This helps you guide the model toward your desired outcome.
Prompt:
You are an expert in botany. When asked to identify a plant, focus on leaf shape, flower structure, and stem characteristics.
Identify this plant based on these characteristics: ... [Plant Characteristics Description] ...
Previous incorrect response focused on habitat only. Please disregard habitat and focus on morphology.
Response:
Tabebuia vellosoi
Work with long outputs and long-form thinking
The thinking feature generates "thoughts" as part of the token output. These thoughts show the model's detailed reasoning. To make the most of these outputs, consider the following:
Review the reasoning
In the Vertex AI Studio UI, review the model's reasoning process to see how it broke down the task and the choices it made. This can help you understand how it arrived at its conclusion.
Provide guidance in reasoning
For lengthy outputs, consider providing more guidance in your prompt. This can constrain the amount of thinking required to complete the task and reserve more of the token output for the response.
Example: For a detailed report, you might specify sections and subsections in your prompt to guide the model's thinking and structure the output efficiently.
Prompt:
Generate a detailed report on the impact of climate change on coastal ecosystems.
Report Structure:
1. Introduction
2. Rising Sea Levels
   2.1. Impact on Mangroves
   2.2. Impact on Coral Reefs
3. Ocean Acidification
   3.1. Impact on Shellfish
   3.2. Impact on Plankton
4. Extreme Weather Events
   4.1. Increased Storm Intensity
   4.2. Coastal Erosion
5. Conclusion
Provide detailed analysis within each section.
Response:
<...>
Climate Change Impact on Coastal Ecosystems: A Detailed Report
1. Introduction
Coastal ecosystems, the dynamic interface between land and sea, are among the most productive and biodiverse environments on Earth. They provide crucial ecosystem services, including coastal protection, fisheries, carbon sequestration, and support for a vast array of species. However, these vital ecosystems are increasingly threatened by the multifaceted impacts of climate change. Driven by anthropogenic greenhouse gas emissions, climate change is manifesting in several interconnected ways that are profoundly altering the physical and chemical conditions of coastal environments. This report will delve into the key impacts of climate change on coastal ecosystems, focusing on rising sea levels, ocean acidification, and extreme weather events, and their specific consequences for critical habitats like mangroves and coral reefs, and key organisms like shellfish and plankton. Understanding these impacts is crucial for developing effective conservation and management strategies to mitigate the damage and enhance the resilience of these invaluable ecosystems.
2. Rising Sea Levels
Rising sea levels are a direct and significant consequence of global warming, driven by thermal expansion of seawater and the melting of land ice (glaciers and ice sheets). This phenomenon poses a direct threat to low-lying coastal ecosystems by inundating habitats, altering salinity regimes, and increasing erosion rates. The rate of sea level rise is accelerating, and projections for the 21st century indicate a continued upward trend, with potentially devastating consequences for coastal environments.
2.1. Impact on Mangroves
Mangrove forests are unique coastal ecosystems found in tropical and subtropical intertidal zones. They provide numerous benefits, including coastal protection against storms, nursery grounds for fish and invertebrates, and significant carbon sequestration...
<...>
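When several reports share the same shape, the structured prompt in the example above can be generated from a nested outline instead of being typed by hand. This is a small sketch; the function names `render_outline` and `build_report_prompt` and the numbering layout are illustrative.

```python
def render_outline(sections):
    """Render (title, [subsection titles]) pairs as a numbered outline."""
    lines = []
    for i, (title, subsections) in enumerate(sections, start=1):
        lines.append(f"{i}. {title}")
        for j, subtitle in enumerate(subsections, start=1):
            lines.append(f"   {i}.{j}. {subtitle}")
    return "\n".join(lines)


def build_report_prompt(topic, sections):
    """Combine the task, the outline, and a closing instruction into a prompt."""
    return (
        f"Generate a detailed report on {topic}.\n"
        "Report Structure:\n"
        f"{render_outline(sections)}\n"
        "Provide detailed analysis within each section."
    )


sections = [
    ("Introduction", []),
    ("Rising Sea Levels", ["Impact on Mangroves", "Impact on Coral Reefs"]),
    ("Ocean Acidification", ["Impact on Shellfish", "Impact on Plankton"]),
    ("Extreme Weather Events", ["Increased Storm Intensity", "Coastal Erosion"]),
    ("Conclusion", []),
]
report_prompt = build_report_prompt(
    "the impact of climate change on coastal ecosystems", sections
)
```

Supplying the outline up front gives the model less structure to work out in its thoughts, which leaves more of the output tokens for the report itself.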
What's next?
Try using a thinking model in the Colab notebook, or open the Vertex AI console to prompt the model yourself.