Gemini 2.5 Flash-Lite is our most balanced Gemini model, optimized for low latency use cases. It comes with the same capabilities that make other Gemini 2.5 models helpful, such as the ability to turn thinking on at different budgets, connecting to tools like Grounding with Google Search and code execution, multimodal input, and a 1 million-token context length.
For even more detailed technical information on Gemini 2.5 Flash-Lite (such as performance benchmarks, information on our training datasets, efforts on sustainability, intended usage and limitations, and our approach to ethics and safety), see our technical report on our Gemini 2.5 models.
2.5 Flash-Lite
Try in Vertex AI (Preview) Deploy example app
Model ID | gemini-2.5-flash-lite |
|
---|---|---|
Supported inputs & outputs |
|
|
Token limits |
|
|
Capabilities |
|
|
Usage types |
|
|
Input size limit | 500 MB | |
Technical specifications | ||
Images |
|
|
Documents |
|
|
Video |
|
|
Audio |
|
|
Parameter defaults |
|
|
Supported regions | ||
Model availability (Includes dynamic shared quota & Provisioned Throughput) |
|
|
ML processing |
|
|
See Data residency for more information. | ||
Knowledge cutoff date | January 2025 | |
Versions |
|
|
Security controls | ||
See Security controls for more information. | ||
Supported languages | See Supported languages. | |
Pricing | See Pricing. |
2.5 Flash-Lite
Try in Vertex AI (Preview) Deploy example app
Model ID | gemini-2.5-flash-lite-preview-09-2025 |
|
---|---|---|
Supported inputs & outputs |
|
|
Token limits |
|
|
Capabilities |
|
|
Usage types |
|
|
Technical specifications | ||
Images |
|
|
Documents |
|
|
Video |
|
|
Audio |
|
|
Parameter defaults |
|
|
Supported regions | ||
Model availability (Includes dynamic shared quota & Provisioned Throughput) |
|
|
See Data residency for more information. | ||
Knowledge cutoff date | January 2025 | |
Versions |
|
|
Security controls | ||
See Security controls for more information. | ||
Supported languages | See Supported languages. | |
Pricing | See Pricing. |