Gemini 2.0 Flash Thinking is an experimental model that's trained to generate the "thinking process" the model goes through as part of its response. As a result, Gemini 2.0 Flash Thinking demonstrates stronger reasoning in its responses than the base Gemini 2.0 Flash model.
Use Flash Thinking
Flash Thinking models are available as an experimental model in Vertex AI. To use the latest Flash Thinking model, select the gemini-2.0-flash-thinking-exp-01-21 model in the Model drop-down menu.
Gen AI SDK for Python
Learn how to install or update the Google Gen AI SDK for Python. For more information, see the Gen AI SDK for Python API reference documentation or the python-genai GitHub repository.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True
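With those environment variables set, the model can be called from Python. A minimal sketch using the Gen AI SDK (the helper name and prompt handling are illustrative; `genai.Client()` reads the variables set above):

```python
def ask_flash_thinking(prompt: str) -> str:
    """Send a prompt to Flash Thinking on Vertex AI and return the text reply."""
    # Imported inside the function so the sketch stays self-contained;
    # requires `pip install google-genai`.
    from google import genai

    # Client() picks up GOOGLE_CLOUD_PROJECT, GOOGLE_CLOUD_LOCATION, and
    # GOOGLE_GENAI_USE_VERTEXAI from the environment.
    client = genai.Client()
    response = client.models.generate_content(
        model="gemini-2.0-flash-thinking-exp-01-21",
        contents=prompt,
    )
    return response.text
```

Calling the helper returns the model's final text response for the given prompt.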
Limitations
Flash Thinking is an experimental model and has the following limitations:
- 1M-token input limit
- Text, image, audio, and video input
- 64K-token output limit
- Text-only output
- No built-in tool use, such as Search or code execution
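Given the 64K-token output cap above, it can be useful to set an explicit output budget on each request. A minimal sketch, assuming the google-genai package and treating 64K as 64 × 1024 tokens (the helper name is illustrative, not part of the SDK):

```python
def flash_thinking_config(max_tokens: int = 64 * 1024):
    """Build a request config that stays within Flash Thinking's output limit."""
    # Imported inside the function so the sketch stays self-contained;
    # requires `pip install google-genai`.
    from google.genai import types

    # Clamp the requested budget to the documented 64K-token output limit.
    capped = min(max_tokens, 64 * 1024)
    return types.GenerateContentConfig(max_output_tokens=capped)
```

The resulting config object can be passed as the `config` argument to `generate_content`.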
What's next?
Try Flash Thinking in our Colab notebook, or open the Vertex AI console and prompt the model yourself.