This guide shows you how to fine-tune RAG transformations by adjusting chunking parameters for your documents.

Understanding chunking and its impact

When you ingest documents, Vertex AI RAG Engine splits them into smaller pieces called chunks before it creates embeddings. The size and overlap of these chunks can significantly affect the quality of your RAG system's responses. The default chunking settings are optimized for a wide range of use cases, but you might want to adjust them based on the nature of your documents and your specific application. You control chunking behavior through parameters that you set during data ingestion. The following table provides guidance on how to choose a chunk size.
| Chunk size | Pros | Cons | Best for |
|---|---|---|---|
| Smaller | Each embedding represents a narrower span of text, which can improve retrieval precision for specific facts. | Each chunk carries less surrounding context, so ideas that span multiple chunks can be fragmented. | Question-answering on documents where answers are concise and factual, such as FAQs. |
| Larger | Each chunk retains more context, which helps with broad or thematic queries. | Embeddings cover more mixed content, which can reduce retrieval precision for specific facts. | Summarization tasks or querying for broader themes and concepts within a document. |
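To build intuition for this tradeoff, the following minimal sketch splits the same document at two chunk sizes. It uses whitespace-separated words in place of the service's real tokenizer, so the counts are illustrative only, not what RAG Engine would produce:

```python
def chunk(words, chunk_size):
    """Split a list of words into consecutive fixed-size chunks."""
    return [words[i:i + chunk_size] for i in range(0, len(words), chunk_size)]

doc = ("Vertex AI RAG Engine splits documents into chunks before embedding. "
       "Smaller chunks yield focused embeddings; larger chunks keep context.").split()

small = chunk(doc, 4)   # many small, narrowly focused chunks
large = chunk(doc, 16)  # fewer chunks, each with more context
print(len(small), len(large))  # → 5 2
```

A query about one specific fact matches one of the five small chunks cleanly, whereas a thematic query benefits from the broader coverage of the two large chunks.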
Available chunking parameters

| Parameter | Description |
|---|---|
| chunk_size | The size of the chunk in tokens. The default is 1,024. To help you decide on a size, see the guidance in the preceding table. |
| chunk_overlap | The number of tokens that overlap between adjacent chunks. Overlap helps maintain context between chunks. A larger overlap can improve retrieval quality but also increases processing and storage costs. The default is 256. |
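The two parameters combine as a sliding window: each chunk starts chunk_size − chunk_overlap tokens after the previous one, so adjacent chunks share chunk_overlap tokens. The sketch below illustrates this mechanic; it is an assumption about the general technique, not the service's exact implementation, and it uses placeholder string tokens rather than real tokenizer output:

```python
def chunk_with_overlap(tokens, chunk_size=1024, chunk_overlap=256):
    """Slide a window of chunk_size tokens, stepping by chunk_size - chunk_overlap
    so that each chunk shares chunk_overlap tokens with its neighbor."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the final window already reaches the end of the document
    return chunks

tokens = [f"t{i}" for i in range(20)]
chunks = chunk_with_overlap(tokens, chunk_size=8, chunk_overlap=2)
# Windows start at tokens 0, 6, and 12; adjacent chunks share 2 tokens.
print([c[0] for c in chunks])  # → ['t0', 't6', 't12']
```

Because the shared tokens appear in both neighboring chunks, a sentence that straddles a chunk boundary is still embedded intact in at least one chunk, which is why overlap helps preserve context at the cost of some duplicated processing and storage.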
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-08-18 UTC.