RAG Engine API

The Vertex AI RAG Engine is a component of the Vertex AI platform for Retrieval-Augmented Generation (RAG). With RAG Engine, Large Language Models (LLMs) can access and use data from external knowledge sources, such as documents and databases, to generate more accurate and informative responses.

This document provides information and examples for using the RAG Engine API and covers the following topics:

  • Corpus management: Describes the API parameters for managing corpora and provides examples for creating, updating, listing, getting, and deleting a RAG corpus.
  • File management: Describes the API parameters for managing files and provides examples for uploading, importing, and managing files within a RAG corpus.
  • Retrieval and generation: Describes the API parameters for retrieving contexts and generating grounded responses, with examples.
  • Project management: Explains how to configure project-level settings for the RAG engine, with examples.

Together, these topics reflect the overall workflow for using the RAG Engine API.

Corpus management parameters

This section describes the parameters for managing a RAG corpus. For more information, see Corpus management.

Create a RAG corpus

The following tables describe the parameters used to create a RAG corpus.

Vector database options

You can choose one of the following vector database options for your RAG corpus.

  • rag_managed_db: A fully managed, serverless vector database provided by Vertex AI. This is the default option. Use case: A simple, integrated solution without managing your own vector database infrastructure.
  • pinecone: Integration with a self-managed Pinecone vector database. Requires your Pinecone index name and API key. Use case: You already have an existing Pinecone setup or prefer its specific features.
  • vertex_vector_search: Integration with Vector Search. Requires the index and index endpoint resource names. Use case: You need a high-performance, scalable vector search solution within the Google Cloud ecosystem.

Request body

Parameters
display_name

Required: string

The display name of the RAG corpus.

description

Optional: string

The description of the RAG corpus.

encryption_spec

Optional: Immutable: string

The CMEK key name used to encrypt at-rest data related to the RAG corpus. This key is only applicable to the RagManaged option for the vector database. This field can only be set during corpus creation.

Format: projects/{project}/locations/{location}/keyRings/{key_ring}/cryptoKeys/{key_name}

vector_db_config

Optional: Immutable: vectorDbConfig

The configuration for the Vector DB. This field is a oneof object. Choose one of the following:

  • rag_managed_db: The default, fully managed vector database.
  • pinecone: Specifies your Pinecone instance.
    • index_name (string): The name of the Pinecone index. This can be set later with an UpdateRagCorpus call.
    • api_auth.api_key_config.api_key_secret_version (string): The full resource name of the secret in Secret Manager that contains your Pinecone API key. Format: projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}. This can be set later.
  • vertex_vector_search: Specifies your Vector Search instance.
    • index (string): The resource name of the Vector Search index. Format: projects/{project}/locations/{location}/indexes/{index}. This can be set later.
    • index_endpoint (string): The resource name of the Vector Search index endpoint. Format: projects/{project}/locations/{location}/indexEndpoints/{index_endpoint}. This can be set later.

vertex_ai_search_config.serving_config

Optional: string

The configuration for Vertex AI Search.

Format: projects/{project}/locations/{location}/collections/{collection}/engines/{engine}/servingConfigs/{serving_config} or projects/{project}/locations/{location}/collections/{collection}/dataStores/{data_store}/servingConfigs/{serving_config}

rag_embedding_model_config.vertex_prediction_endpoint.endpoint

Optional: Immutable: string

The embedding model to use for the RAG corpus. This value can't be changed after it's set. If you leave it empty, text-embedding-005 is used as the default embedding model.
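
To see how these fields fit together, the following is a minimal sketch that creates a RAG corpus backed by Vector Search by sending the request body with the Python requests library. The field names mirror the parameter table above; the project, location, index, and endpoint values are placeholders, and the access token comes from the gcloud CLI as in the curl examples later in this document.

# Minimal sketch: create a RAG corpus with a Vector Search backend.
# Field names follow the vector_db_config parameters above; values are placeholders.
import subprocess

import requests

PROJECT_ID = "your-project-id"   # placeholder
LOCATION = "us-central1"         # placeholder
INDEX = f"projects/{PROJECT_ID}/locations/{LOCATION}/indexes/your-index-id"
INDEX_ENDPOINT = (
    f"projects/{PROJECT_ID}/locations/{LOCATION}/indexEndpoints/your-endpoint-id"
)

# Access token from the gcloud CLI, as in the curl examples in this document.
token = subprocess.check_output(
    ["gcloud", "auth", "print-access-token"], text=True
).strip()

body = {
    "display_name": "my-corpus",
    "description": "Corpus backed by Vector Search",
    "vector_db_config": {
        "vertex_vector_search": {
            "index": INDEX,
            "index_endpoint": INDEX_ENDPOINT,
        }
    },
}

response = requests.post(
    f"https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}"
    f"/locations/{LOCATION}/ragCorpora",
    headers={"Authorization": f"Bearer {token}"},
    json=body,
)
response.raise_for_status()
print(response.json())  # Long-running operation metadata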

Update a RAG corpus

This table lists the parameters used to update a RAG corpus.

Request body

Parameters
display_name

Optional: string

The new display name of the RAG corpus.

description

Optional: string

The new description of the RAG corpus.

rag_vector_db.pinecone.index_name

string

The name of the Pinecone index. You can set this field if your RagCorpus was created with a Pinecone configuration and the index name has not been set before.

rag_vector_db.vertex_vector_search.index

string

The resource name of the Vector Search index. You can set this field if your RagCorpus was created with a Vector Search configuration and the index has not been set before.

Format: projects/{project}/locations/{location}/indexes/{index}

rag_vector_db.vertex_vector_search.index_endpoint

string

The resource name of the Vector Search index endpoint. You can set this field if your RagCorpus was created with a Vector Search configuration and the index endpoint has not been set before.

Format: projects/{project}/locations/{location}/indexEndpoints/{index_endpoint}

rag_vector_db.api_auth.api_key_config.api_key_secret_version

string

The full resource name of the secret in Secret Manager that contains your Pinecone API key.

Format: projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}

List RAG corpora

This table lists the parameters used to list RAG corpora.

Parameters

page_size

Optional: int

The maximum number of corpora to return per page.

page_token

Optional: string

The standard list page token. Typically obtained from ListRagCorporaResponse.next_page_token of the previous VertexRagDataService.ListRagCorpora call.
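
When you use the Vertex AI SDK, the pager returned by rag.list_corpora() handles page_size and page_token for you. The following is a minimal sketch that iterates every corpus across pages; whether list_corpora accepts a page_size keyword is an assumption to verify against the SDK reference.

# Minimal sketch: iterate all corpora; the pager fetches subsequent pages
# by passing page_token internally.
from vertexai import rag
import vertexai

vertexai.init(project="your-project-id", location="us-central1")

# page_size is assumed to be a supported keyword; omit it to use the default.
for corpus in rag.list_corpora(page_size=50):
    print(corpus.name, corpus.display_name)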

Get a RAG corpus

This table lists the parameter used to get a RAG corpus.

Parameters
name

Required: string

The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}

Delete a RAG corpus

This table lists the parameter used to delete a RAG corpus.

Parameters
name

Required: string

The name of the RagCorpus resource to delete. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}

File management parameters

This section describes the parameters for managing files in a RAG corpus. For more information, see File management.

Upload a RAG file

This table lists the parameters used to upload a RAG file.

Request body

Parameters
parent

Required: string

The name of the RagCorpus resource to upload the file to. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}

rag_file

Required: RagFile

The file to upload. Contains the following fields:

  • display_name (Required, string): The display name of the RAG file.
  • description (Optional, string): The description of the RAG file.

upload_rag_file_config

Required: UploadRagFileConfig

The configuration for the RagFile. Contains the following fields:

  • rag_file_transformation_config.rag_file_chunking_config.fixed_length_chunking.chunk_size (int32): The number of tokens in each chunk.
  • rag_file_transformation_config.rag_file_chunking_config.fixed_length_chunking.chunk_overlap (int32): The token overlap between chunks.
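
In the Vertex AI SDK, these chunking fields map to rag.TransformationConfig and rag.ChunkingConfig, which the import example later in this document uses. The following is a minimal sketch of an upload with an explicit chunking configuration; whether rag.upload_file accepts a transformation_config argument is an assumption to verify against the SDK reference.

# Minimal sketch, assuming rag.upload_file supports transformation_config.
from vertexai import rag
import vertexai

vertexai.init(project="your-project-id", location="us-central1")

corpus_name = "projects/your-project-id/locations/us-central1/ragCorpora/your-corpus-id"

rag_file = rag.upload_file(
    corpus_name=corpus_name,
    path="path/to/local/file.txt",
    display_name="file_display_name",
    description="file description",
    # chunk_size and chunk_overlap correspond to the fixed_length_chunking fields above.
    transformation_config=rag.TransformationConfig(
        rag.ChunkingConfig(chunk_size=1024, chunk_overlap=256)
    ),
)
print(rag_file.name)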

Import RAG files

This table lists parameters used to import a RAG file.

Parameters

parent

Required: string

The name of the RagCorpus resource.

Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}

gcs_source

oneof import_source: GcsSource

Cloud Storage location.

Supports importing individual files as well as entire Cloud Storage directories.

gcs_source.uris

list of string

The Cloud Storage URIs of the files or directories to import.

google_drive_source

oneof import_source: GoogleDriveSource

Google Drive location.

Supports importing individual files as well as Google Drive folders.

slack_source

oneof import_source: SlackSource

The Slack channels to import data from.

jira_source

oneof import_source: JiraSource

The Jira queries to import data from.

share_point_sources

oneof import_source: SharePointSources

The SharePoint sources to import data from.

rag_file_transformation_config.rag_file_chunking_config.fixed_length_chunking.chunk_size

int32

The number of tokens in each chunk.

rag_file_transformation_config.rag_file_chunking_config.fixed_length_chunking.chunk_overlap

int32

The overlap between chunks.

rag_file_parsing_config

Optional: RagFileParsingConfig

Specifies the parsing configuration for RagFiles.

If this field isn't set, RAG uses the default parser.

max_embedding_requests_per_min

Optional: int32

The maximum number of queries per minute that this job is allowed to make to the embedding model specified on the corpus. This value is specific to this job and not shared across other import jobs. Consult the Quotas page on the project to set an appropriate value.

If unspecified, a default value of 1,000 QPM is used.

GoogleDriveSource

resource_ids.resource_id

Required: string

The ID of the Google Drive resource.

resource_ids.resource_type

Required: string

The type of the Google Drive resource.

SlackSource

channels.channels

Repeated: SlackSource.SlackChannels.SlackChannel

Slack channel information, including the channel ID and the time range to import.

channels.channels.channel_id

Required: string

The Slack channel ID.

channels.channels.start_time

Optional: google.protobuf.Timestamp

The starting timestamp for messages to import.

channels.channels.end_time

Optional: google.protobuf.Timestamp

The ending timestamp for messages to import.

channels.api_key_config.api_key_secret_version

Required: string

The full resource name of the secret that is stored in Secret Manager, which contains a Slack access token with access to the specified Slack channel IDs.
See https://api.slack.com/tutorials/tracks/getting-a-token.

Format: projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}
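
As an illustration, the following is a minimal sketch of an ImportRagFiles request for a Slack source, sent with the Python requests library. The nesting follows the SlackSource parameters above; the channel ID, secret name, and corpus values are placeholders.

# Minimal sketch: import Slack messages by calling ragFiles:import directly.
# Field nesting follows the SlackSource parameter table above; values are placeholders.
import subprocess

import requests

PROJECT_ID = "your-project-id"
LOCATION = "us-central1"
RAG_CORPUS_ID = "your-corpus-id"

token = subprocess.check_output(
    ["gcloud", "auth", "print-access-token"], text=True
).strip()

body = {
    "import_rag_files_config": {
        "slack_source": {
            "channels": [
                {
                    "channels": [{"channel_id": "C0123456789"}],
                    "api_key_config": {
                        # Replace PROJECT_NUMBER with your project number.
                        "api_key_secret_version": "projects/PROJECT_NUMBER/secrets/slack-token/versions/1"
                    },
                }
            ]
        }
    }
}

response = requests.post(
    f"https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}"
    f"/locations/{LOCATION}/ragCorpora/{RAG_CORPUS_ID}/ragFiles:import",
    headers={"Authorization": f"Bearer {token}"},
    json=body,
)
response.raise_for_status()
print(response.json())  # ImportRagFilesOperationMetadata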

JiraSource

jira_queries.projects

Repeated: string

A list of Jira projects to import in their entirety.

jira_queries.custom_queries

Repeated: string

A list of custom Jira queries to import. For information about JQL (Jira Query Language), see
Jira Support

jira_queries.email

Required: string

The Jira email address.

jira_queries.server_uri

Required: string

The Jira server URI.

jira_queries.api_key_config.api_key_secret_version

Required: string

The full resource name of the secret that is stored in Secret Manager, which contains a Jira API key with access to the Jira projects and queries to import.
See https://support.atlassian.com/atlassian-account/docs/manage-api-tokens-for-your-atlassian-account/

Format: projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}
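
A Jira import is assembled the same way as the Slack request above, with jira_source in place of slack_source. The following dict is a minimal sketch of the import_rag_files_config body; the nesting follows the JiraSource parameters above, and all values are placeholders.

# Minimal sketch of import_rag_files_config for a Jira source; values are placeholders.
import_rag_files_config = {
    "jira_source": {
        "jira_queries": [
            {
                "projects": ["YOUR_JIRA_PROJECT"],
                "custom_queries": ["project = YOUR_JIRA_PROJECT AND created >= -30d"],
                "email": "you@example.com",
                "server_uri": "your-domain.atlassian.net",
                "api_key_config": {
                    # Replace PROJECT_NUMBER with your project number.
                    "api_key_secret_version": "projects/PROJECT_NUMBER/secrets/jira-api-key/versions/1"
                },
            }
        ]
    }
}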

SharePointSources

share_point_sources.sharepoint_folder_path

oneof in folder_source: string

The path of the SharePoint folder to download from.

share_point_sources.sharepoint_folder_id

oneof in folder_source: string

The ID of the SharePoint folder to download from.

share_point_sources.drive_name

oneof in drive_source: string

The name of the drive to download from.

share_point_sources.drive_id

oneof in drive_source: string

The ID of the drive to download from.

share_point_sources.client_id

string

The Application ID for the app registered in Microsoft Azure Portal.
The application must also be configured with the MS Graph permissions Files.Read.All, Sites.Read.All, and BrowserSiteLists.Read.All.

share_point_sources.client_secret.api_key_secret_version

Required: string

The full resource name of the secret that is stored in Secret Manager, which contains the application secret for the app registered in Azure.

Format: projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}

share_point_sources.tenant_id

string

Unique identifier of the Azure Active Directory Instance.

share_point_sources.sharepoint_site_name

string

The name of the SharePoint site to download from. This can be the site name or the site ID.
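
A SharePoint import follows the same pattern, with share_point_sources as the import source. The following dict is a minimal sketch of the import_rag_files_config body; the nesting follows the SharePointSources parameters above, and all values are placeholders.

# Minimal sketch of import_rag_files_config for a SharePoint source; values are placeholders.
import_rag_files_config = {
    "share_point_sources": {
        "share_point_sources": [
            {
                "sharepoint_folder_path": "Shared Documents/rag-files",
                "drive_name": "Documents",
                "client_id": "YOUR_AZURE_APPLICATION_ID",
                "client_secret": {
                    # Replace PROJECT_NUMBER with your project number.
                    "api_key_secret_version": "projects/PROJECT_NUMBER/secrets/sharepoint-secret/versions/1"
                },
                "tenant_id": "YOUR_TENANT_ID",
                "sharepoint_site_name": "YOUR_SITE_NAME",
            }
        ]
    }
}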

RagFileParsingConfig

layout_parser

oneof parser: RagFileParsingConfig.LayoutParser

The Layout Parser to use for RagFiles.

layout_parser.processor_name

string

The full resource name of a Document AI processor or processor version.

Format:
projects/{project_id}/locations/{location}/processors/{processor_id}
projects/{project_id}/locations/{location}/processors/{processor_id}/processorVersions/{processor_version_id}

layout_parser.max_parsing_requests_per_min

string

The maximum number of requests the job is allowed to make to the Document AI processor per minute.

Consult https://cloud.google.com/document-ai/quotas and the Quota page for your project to set an appropriate value here. If unspecified, a default value of 120 QPM is used.

llm_parser

oneof parser: RagFileParsingConfig.LlmParser

The LLM parser to use for RagFiles.

llm_parser.model_name

string

The resource name of an LLM model.

Format:
projects/{project_id}/locations/{location}/publishers/{publisher}/models/{model}

llm_parser.max_parsing_requests_per_min

string

The maximum number of requests the job is allowed to make to the LLM model per minute.

To set an appropriate value, consult the model's quota section and the Quota page for your project. If unspecified, a default value of 5,000 QPM is used.
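
The LLM parser is demonstrated in the import example later in this document. For the layout parser, the following dict is a minimal sketch of an import_rag_files_config that routes parsing through a Document AI processor; the processor name and bucket are placeholders, and the field names follow the RagFileParsingConfig parameters above.

# Minimal sketch: rag_file_parsing_config with the Document AI layout parser.
import_rag_files_config = {
    "gcs_source": {"uris": ["gs://your-bucket/your-file.pdf"]},
    "rag_file_parsing_config": {
        "layout_parser": {
            "processor_name": "projects/PROJECT_ID/locations/us/processors/PROCESSOR_ID",
            "max_parsing_requests_per_min": 120,  # optional; 120 QPM is the default
        }
    },
}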

Get a RAG file

This table lists the parameter used to get a RAG file.

Parameters
name

Required: string

The name of the RagFile resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}/ragFiles/{rag_file_id}

Delete a RAG file

This table lists the parameter used to delete a RAG file.

Parameters
name

Required: string

The name of the RagFile resource to delete. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}/ragFiles/{rag_file_id}

Retrieval and prediction parameters

This section lists the retrieval and prediction parameters.

Retrieval parameters

This table lists parameters for retrieveContexts API.

Parameters

parent

Required: string

The resource name of the Location from which to retrieve RagContexts.
The user must have permission to make this call in the project.

Format: projects/{project}/locations/{location}

vertex_rag_store

VertexRagStore

The data source for Vertex RagStore.

query

Required: RagQuery

Single RAG retrieve query.

VertexRagStore

rag_resources

list: RagResource

The representation of the RAG source. You can use it to specify a corpus only, or specific RagFiles within a corpus. Only one corpus, or multiple files from a single corpus, is supported.

rag_resources.rag_corpus

Optional: string

The RagCorpus resource name.

Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}

rag_resources.rag_file_ids

list: string

A list of RagFile resources.

Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file}

RagQuery

text

string

The query in text format to get relevant contexts.

rag_retrieval_config

Optional: RagRetrievalConfig

The retrieval configuration for the query.

RagRetrievalConfig

top_k

Optional: int32

The number of contexts to retrieve.

filter.vector_distance_threshold

oneof vector_db_threshold: double

Only returns contexts with a vector distance smaller than the threshold.

filter.vector_similarity_threshold

oneof vector_db_threshold: double

Only returns contexts with vector similarity larger than the threshold.

ranking.rank_service.model_name

Optional: string

The model name of the rank service.

Example: semantic-ranker-512@latest

ranking.llm_ranker.model_name

Optional: string

The model name used for ranking.

Example: gemini-2.5-flash
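
The following is a minimal sketch of a retrieval configuration that combines a similarity filter with a re-ranker. The top_k and filter usage mirrors the SDK examples later in this document; rag.Ranking and rag.RankService are assumed class names, and vector_similarity_threshold is assumed to be accepted by the Filter helper, so verify both against the SDK reference.

# Minimal sketch: RagRetrievalConfig with a filter and a rank service.
# rag.Ranking, rag.RankService, and the vector_similarity_threshold argument
# are assumptions; check the Vertex AI SDK reference.
from vertexai import rag

retrieval_config = rag.RagRetrievalConfig(
    top_k=10,
    filter=rag.utils.resources.Filter(vector_similarity_threshold=0.75),
    ranking=rag.Ranking(
        rank_service=rag.RankService(model_name="semantic-ranker-512@latest")
    ),
)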

Prediction parameters

This table lists prediction parameters.

GenerateContentRequest

tools.retrieval.vertex_rag_store

VertexRagStore

Set to use a data source powered by Vertex AI RAG store.

See VertexRagStore for details.

Project management parameters

This table lists the project-level tier configurations for the RAG engine's managed database.

  • RagManagedDbConfig.scaled: A production-scale tier that offers high performance and auto-scaling capabilities for your managed vector database. Use case: Production applications with high query loads or large data volumes.
  • RagManagedDbConfig.basic: A cost-effective, low-compute tier for the managed vector database. Use case: Development, testing, or small-scale applications with low traffic.
  • RagManagedDbConfig.unprovisioned: Deletes the managed vector database and its underlying resources, which effectively disables the managed database for the project. Use case: Tearing down the managed database infrastructure when it's no longer needed, to help manage costs.

Corpus management examples

This section provides examples of how to use the API to manage your RAG corpus.

Create a RAG corpus example

These code samples demonstrate how to create a RAG corpus.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • CORPUS_DISPLAY_NAME: The display name of the RAG corpus.
  • CORPUS_DESCRIPTION: The description of the RAG corpus.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora

Request JSON body:

{
  "display_name": "CORPUS_DISPLAY_NAME",
  "description": "CORPUS_DESCRIPTION"
}

To send your request, choose one of these options:

Save the request body in a file named request.json, and run the following command:

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d @request.json \
    "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora"

Save the request body in a file named request.json, and run the following command:

  $cred = gcloud auth print-access-token
  $headers = @{ "Authorization" = "Bearer $cred" }

  Invoke-WebRequest `
      -Method POST `
      -Headers $headers `
      -ContentType: "application/json; charset=utf-8" `
      -InFile request.json `
      -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora" | Select-Object -Expand Content

You should receive a successful status code (2xx).

The following example demonstrates how to create a RAG corpus by using the REST API.

  // CreateRagCorpus
  // Input: LOCATION, PROJECT_ID, CORPUS_DISPLAY_NAME
  // Output: CreateRagCorpusOperationMetadata
  curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora \
  -d '{
        "display_name" : "CORPUS_DISPLAY_NAME"
    }'

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# display_name = "test_corpus"
# description = "Corpus Description"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

# Configure backend_config
backend_config = rag.RagVectorDbConfig(
    rag_embedding_model_config=rag.RagEmbeddingModelConfig(
        vertex_prediction_endpoint=rag.VertexPredictionEndpoint(
            publisher_model="publishers/google/models/text-embedding-005"
        )
    )
)

corpus = rag.create_corpus(
    display_name=display_name,
    description=description,
    backend_config=backend_config,
)
print(corpus)
# Example response:
# RagCorpus(name='projects/1234567890/locations/us-central1/ragCorpora/1234567890',
# display_name='test_corpus', description='Corpus Description', embedding_model_config=...
# ...

Update a RAG corpus example

You can update a RAG corpus's display name, description, and vector database configuration. However, you can't change the following immutable parameters in your RAG corpus:

  • The vector database type. For example, you can't change the vector database from Pinecone to Vector Search.
  • If you're using the managed database option, you can't update the vector database configuration.

These examples demonstrate how to update a RAG corpus.

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • CORPUS_ID: The corpus ID of your RAG corpus.
  • CORPUS_DISPLAY_NAME: The display name of the RAG corpus.
  • CORPUS_DESCRIPTION: The description of the RAG corpus.
  • INDEX_NAME: The resource name of the Vector Search Index. Format: projects/{project}/locations/{location}/indexes/{index}.
  • INDEX_ENDPOINT_NAME: The resource name of the Vector Search index endpoint. Format: projects/{project}/locations/{location}/indexEndpoints/{index_endpoint}.

HTTP method and URL:

PATCH https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/CORPUS_ID

Request JSON body:

{
  "display_name": "CORPUS_DISPLAY_NAME",
  "description": "CORPUS_DESCRIPTION",
  "vector_db_config": {
    "vertex_vector_search": {
      "index": "INDEX_NAME",
      "index_endpoint": "INDEX_ENDPOINT_NAME"
    }
  }
}

To send your request, choose one of these options:

Save the request body in a file named request.json, and run the following command:

curl -X PATCH \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d @request.json \
    "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/CORPUS_ID"

Save the request body in a file named request.json, and run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method PATCH `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/CORPUS_ID" | Select-Object -Expand Content

A successful request returns a 2xx status code.
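
The Vertex AI SDK also exposes an update helper. The following is a minimal sketch that assumes rag.update_corpus accepts corpus_name, display_name, and description keyword arguments; verify the exact signature in the SDK reference.

# Minimal sketch, assuming rag.update_corpus with these keyword arguments.
from vertexai import rag
import vertexai

vertexai.init(project="your-project-id", location="us-central1")

corpus_name = "projects/your-project-id/locations/us-central1/ragCorpora/your-corpus-id"

updated_corpus = rag.update_corpus(
    corpus_name=corpus_name,
    display_name="new display name",
    description="new description",
)
print(updated_corpus)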

List RAG corpora example

These code samples demonstrate how to list all of your RAG corpora.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • PAGE_SIZE: The maximum number of RAG corpora to return per page.
  • PAGE_TOKEN: A page token from a previous ListRagCorpora response to retrieve the next page of results.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN

To send your request, choose one of these options:

Run the following command:

curl -X GET \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN"

Run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN" | Select-Object -Expand Content

A successful request returns a 2xx status code and a list of RAG corpora for the specified project.

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

corpora = rag.list_corpora()
print(corpora)
# Example response:
# ListRagCorporaPager<rag_corpora {
#   name: "projects/[PROJECT_ID]/locations/us-central1/ragCorpora/2305843009213693952"
#   display_name: "test_corpus"
#   create_time {
# ...

Get a RAG corpus example

These code samples demonstrate how to get a RAG corpus.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RAG corpus resource.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID

To send your request, choose one of these options:

Run the following command:

curl -X GET \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID"

Run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID" | Select-Object -Expand Content

A successful response returns the RagCorpus resource.

The get and list examples that follow demonstrate how a RagCorpus uses the rag_embedding_model_config field within vector_db_config, which points to the embedding model that you have chosen.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The corpus ID of your RAG corpus.

  // GetRagCorpus
  // Input: LOCATION, PROJECT_ID, RAG_CORPUS_ID
  // Output: RagCorpus
  curl -X GET \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID

  // ListRagCorpora
  curl -sS -X GET \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

corpus = rag.get_corpus(name=corpus_name)
print(corpus)
# Example response:
# RagCorpus(name='projects/[PROJECT_ID]/locations/us-central1/ragCorpora/1234567890',
# display_name='test_corpus', description='Corpus Description',
# ...

Delete a RAG corpus example

These code samples demonstrate how to delete a RAG corpus.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.

HTTP method and URL:

DELETE https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID

To send your request, choose one of these options:

Run the following command:

curl -X DELETE \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID"

Run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method DELETE `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID" | Select-Object -Expand Content

A successful response returns the DeleteOperationMetadata.

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag.delete_corpus(name=corpus_name)
print(f"Corpus {corpus_name} deleted.")
# Example response:
# Successfully deleted the RagCorpus.
# Corpus projects/[PROJECT_ID]/locations/us-central1/ragCorpora/123456789012345 deleted.

File management examples

This section provides examples of how to use the API to manage RAG files.

Upload a RAG file example

These code samples demonstrate how to upload a RAG file.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The corpus ID of your RAG corpus.
  • LOCAL_FILE_PATH: The local path to the file to be uploaded.
  • DISPLAY_NAME: The display name of the RAG file.
  • DESCRIPTION: The description of the RAG file.

To send your request, use the following command:

curl -X POST \
-H "X-Goog-Upload-Protocol: multipart" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-F metadata="{'rag_file': {'display_name': 'DISPLAY_NAME', 'description': 'DESCRIPTION'}}" \
-F file=@LOCAL_FILE_PATH \
"https://LOCATION-aiplatform.googleapis.com/upload/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:upload"

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"
# path = "path/to/local/file.txt"
# display_name = "file_display_name"
# description = "file description"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag_file = rag.upload_file(
    corpus_name=corpus_name,
    path=path,
    display_name=display_name,
    description=description,
)
print(rag_file)
# RagFile(name='projects/[PROJECT_ID]/locations/us-central1/ragCorpora/1234567890/ragFiles/09876543',
#  display_name='file_display_name', description='file description')

Import RAG files example

You can import files and folders from Google Drive or Cloud Storage. You can use response.metadata to view partial failures, request time, and response time in the SDK's response object.

The response.skipped_rag_files_count field contains the number of files that were skipped during import. A file is skipped when all of the following conditions are met:

  1. The file has already been imported.
  2. The file hasn't changed.
  3. The chunking configuration for the file hasn't changed.

from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"
# paths = ["https://drive.google.com/file/123", "gs://my_bucket/my_files_dir"]  # Supports Cloud Storage and Google Drive Links

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

response = rag.import_files(
    corpus_name=corpus_name,
    paths=paths,
    transformation_config=rag.TransformationConfig(
        rag.ChunkingConfig(chunk_size=1024, chunk_overlap=256)
    ),
    import_result_sink="gs://sample-existing-folder/sample_import_result_unique.ndjson",  # Optional: This must be an existing Cloud Storage bucket folder, and the filename must be unique (non-existent).
    llm_parser=rag.LlmParserConfig(
      model_name="gemini-2.5-pro-preview-05-06",
      max_parsing_requests_per_min=100,
    ),  # Optional
    max_embedding_requests_per_min=900,  # Optional
)
print(f"Imported {response.imported_rag_files_count} files.")

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The corpus ID of your RAG corpus.
  • FOLDER_RESOURCE_ID: The resource ID of your Drive folder.
  • GCS_URIS: A list of Cloud Storage locations. Example: gs://my-bucket1.
  • CHUNK_SIZE: Number of tokens each chunk should have.
  • CHUNK_OVERLAP: Number of tokens overlap between chunks.
  • EMBEDDING_MODEL_QPM_RATE: The QPM rate to limit RAG's access to your embedding model. Example: 1,000.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import

Request JSON body:

{
  "import_rag_files_config": {
    "gcs_source": {
      "uris": "GCS_URIS"
    },
    "rag_file_chunking_config": {
      "chunk_size": "CHUNK_SIZE",
      "chunk_overlap": "CHUNK_OVERLAP"
    }
  }
}

To send your request, choose one of these options:

Save the request body in a file named request.json, and run the following command:

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d @request.json \
    "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import"

Save the request body in a file named request.json, and run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import" | Select-Object -Expand Content

A successful response returns the ImportRagFilesOperationMetadata resource.

The following sample demonstrates how to import a file from Cloud Storage. Use the max_embedding_requests_per_min control field to limit the rate at which RAG Engine calls the embedding model during the ImportRagFiles indexing process. The field has a default value of 1000 calls per minute.

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The corpus ID of your RAG corpus.
  • GCS_URIS: A list of Cloud Storage locations. Example: gs://my-bucket1.
  • CHUNK_SIZE: Number of tokens each chunk should have.
  • CHUNK_OVERLAP: Number of tokens overlap between chunks.
  • EMBEDDING_MODEL_QPM_RATE: The QPM rate to limit RAG's access to your embedding model. Example: 1,000.

// ImportRagFiles
// Import a single Cloud Storage file or all files in a Cloud Storage bucket.
// Input: LOCATION, PROJECT_ID, RAG_CORPUS_ID, GCS_URIS
// Output: ImportRagFilesOperationMetadataNumber
// Use ListRagFiles to find the server-generated rag_file_id.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import \
-d '{
  "import_rag_files_config": {
    "gcs_source": {
      "uris": "GCS_URIS"
    },
    "rag_file_chunking_config": {
      "chunk_size": CHUNK_SIZE,
      "chunk_overlap": CHUNK_OVERLAP
    },
    "max_embedding_requests_per_min": EMBEDDING_MODEL_QPM_RATE
  }
}'

The following sample demonstrates how to import a file from Drive. Use the max_embedding_requests_per_min control field to limit the rate at which RAG Engine calls the embedding model during the ImportRagFiles indexing process. The field has a default value of 1000 calls per minute.

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The corpus ID of your RAG corpus.
  • FOLDER_RESOURCE_ID: The resource ID of your Drive folder.
  • CHUNK_SIZE: Number of tokens each chunk should have.
  • CHUNK_OVERLAP: Number of tokens overlap between chunks.
  • EMBEDDING_MODEL_QPM_RATE: The QPM rate to limit RAG's access to your embedding model. Example: 1,000.

// ImportRagFiles
// Import all files in a Google Drive folder.
// Input: LOCATION, PROJECT_ID, RAG_CORPUS_ID, FOLDER_RESOURCE_ID
// Output: ImportRagFilesOperationMetadataNumber
// Use ListRagFiles to find the server-generated rag_file_id.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import \
-d '{
  "import_rag_files_config": {
    "google_drive_source": {
      "resource_ids": {
        "resource_id": "FOLDER_RESOURCE_ID",
        "resource_type": "RESOURCE_TYPE_FOLDER"
      }
    },
    "max_embedding_requests_per_min": EMBEDDING_MODEL_QPM_RATE
  }
}'

List RAG files example

These code samples demonstrate how to list RAG files.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.
  • PAGE_SIZE: The maximum number of RagFiles to return per page.
  • PAGE_TOKEN: A page token from a previous ListRagFiles response to retrieve the next page of results.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN

To send your request, choose one of these options:

Run the following command:

curl -X GET \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN"

Run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN" | Select-Object -Expand Content

A successful request returns a 2xx status code and a list of RagFiles for the specified RAG_CORPUS_ID.

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

files = rag.list_files(corpus_name=corpus_name)
for file in files:
    print(file.display_name)
    print(file.name)
# Example response:
# g-drive_file.txt
# projects/1234567890/locations/us-central1/ragCorpora/111111111111/ragFiles/222222222222
# g_cloud_file.txt
# projects/1234567890/locations/us-central1/ragCorpora/111111111111/ragFiles/333333333333

Get a RAG file example

These code samples demonstrate how to get a RAG file.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.
  • RAG_FILE_ID: The ID of the RagFile resource.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID

To send your request, choose one of these options:

Run the following command:

curl -X GET \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID"

Run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID" | Select-Object -Expand Content

A successful response returns the RagFile resource.

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# file_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}/ragFiles/{rag_file_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag_file = rag.get_file(name=file_name)
print(rag_file)
# Example response:
# RagFile(name='projects/1234567890/locations/us-central1/ragCorpora/11111111111/ragFiles/22222222222',
# display_name='file_display_name', description='file description')

Delete a RAG file example

These code samples demonstrate how to delete a RAG file.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.
  • RAG_FILE_ID: The ID of the RagFile resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file_id}.

HTTP method and URL:

DELETE https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID

To send your request, choose one of these options:

Run the following command:

curl -X DELETE \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID"

Run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method DELETE `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID" | Select-Object -Expand Content

A successful response returns the DeleteOperationMetadata.

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# file_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}/ragFiles/{rag_file_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag.delete_file(name=file_name)
print(f"File {file_name} deleted.")
# Example response:
# Successfully deleted the RagFile.
# File projects/1234567890/locations/us-central1/ragCorpora/1111111111/ragFiles/2222222222 deleted.

Retrieval query example

When you provide a query, the retrieval component in RAG searches its knowledge base to find relevant information.

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/[PROJECT_ID]/locations/us-central1/ragCorpora/[rag_corpus_id]"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

response = rag.retrieval_query(
    rag_resources=[
        rag.RagResource(
            rag_corpus=corpus_name,
            # Optional: supply IDs from `rag.list_files()`.
            # rag_file_ids=["rag-file-1", "rag-file-2", ...],
        )
    ],
    text="Hello World!",
    rag_retrieval_config=rag.RagRetrievalConfig(
        top_k=10,
        filter=rag.utils.resources.Filter(vector_distance_threshold=0.5),
    ),
)
print(response)
# Example response:
# contexts {
#   contexts {
#     source_uri: "gs://your-bucket-name/file.txt"
#     text: "....
#   ....

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_RESOURCE: The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}.
  • VECTOR_DISTANCE_THRESHOLD: Only contexts with a vector distance smaller than the threshold are returned.
  • TEXT: The query text to get relevant contexts.
  • SIMILARITY_TOP_K: The number of top contexts to retrieve.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts

Request JSON body:

{
  "vertex_rag_store": {
    "rag_resources": {
      "rag_corpus": "RAG_CORPUS_RESOURCE"
    },
    "vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD
  },
  "query": {
    "text": "TEXT",
    "similarity_top_k": SIMILARITY_TOP_K
  }
}

Save the request body in a file named request.json, and run the following command:

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d @request.json \
    "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts"

Save the request body in a file named request.json, and run the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts" | Select-Object -Expand Content

A successful request returns a 2xx status code and a list of related contexts.

Generation example

The LLM generates a grounded response using the retrieved contexts.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • MODEL_ID: LLM model for content generation. Example: gemini-2.5-flash.
  • GENERATION_METHOD: LLM method for content generation. Options: generateContent, streamGenerateContent.
  • INPUT_PROMPT: The text sent to the LLM for content generation. Try to use a prompt relevant to the uploaded RAG files.
  • RAG_CORPUS_RESOURCE: The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}.
  • SIMILARITY_TOP_K: Optional: The number of top contexts to retrieve.
  • VECTOR_DISTANCE_THRESHOLD: Optional: Contexts with a vector distance smaller than the threshold are returned.
  • USER: Your username.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD

Request JSON body:

{
  "contents": {
    "role": "USER",
    "parts": {
      "text": "INPUT_PROMPT"
    }
  },
  "tools": {
    "retrieval": {
      "disable_attribution": false,
      "vertex_rag_store": {
        "rag_resources": {
          "rag_corpus": "RAG_CORPUS_RESOURCE"
        },
        "similarity_top_k": "SIMILARITY_TOP_K",
        "vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD
      }
    }
  }
}

To send your request, choose one of these options:

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d @request.json \
    "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD"

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD" | Select-Object -Expand Content

A successful response returns the generated content with citations.

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai import rag
from vertexai.generative_models import GenerativeModel, Tool
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[
                rag.RagResource(
                    rag_corpus=corpus_name,
                    # Optional: supply IDs from `rag.list_files()`.
                    # rag_file_ids=["rag-file-1", "rag-file-2", ...],
                )
            ],
            rag_retrieval_config=rag.RagRetrievalConfig(
                top_k=10,
                filter=rag.utils.resources.Filter(vector_distance_threshold=0.5),
            ),
        ),
    )
)

rag_model = GenerativeModel(
    model_name="gemini-2.0-flash-001", tools=[rag_retrieval_tool]
)
response = rag_model.generate_content("Why is the sky blue?")
print(response.text)
# Example response:
#   The sky appears blue due to a phenomenon called Rayleigh scattering.
#   Sunlight, which contains all colors of the rainbow, is scattered
#   by the tiny particles in the Earth's atmosphere....
#   ...

Project management examples

The tier is a project-level setting in the RagEngineConfig resource that affects RAG corpora that use RagManagedDb. To get the tier configuration, use GetRagEngineConfig. To update the tier configuration, use UpdateRagEngineConfig.

For more information on managing your tier configuration, see Manage tiers.

Get project configuration

The following code samples demonstrate how to read your RagEngineConfig:

  1. In the Google Cloud console, go to the RAG Engine page.
  2. Select the region in which your RAG Engine is running. Your list of RAG corpora is updated.
  3. Click Configure RAG Engine. The Configure RAG Engine pane appears. You can see the tier that's selected for your RAG Engine.
  4. Click Cancel.

from vertexai import rag
import vertexai

PROJECT_ID = YOUR_PROJECT_ID
LOCATION = YOUR_RAG_ENGINE_LOCATION

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

rag_engine_config = rag.rag_data.get_rag_engine_config(
    name=f"projects/{PROJECT_ID}/locations/{LOCATION}/ragEngineConfig"
)

print(rag_engine_config)

curl -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/ragEngineConfig

Update project configuration

This section provides code samples to demonstrate how to change your configuration to a Scaled, Basic, or Unprovisioned tier.

Update your RagEngineConfig to the Scaled tier

The following code samples demonstrate how to set the RagEngineConfig to the Scaled tier:

  1. In the Google Cloud console, go to the RAG Engine page.
  2. Select the region in which your RAG Engine is running. Your list of RAG corpora is updated.
  3. Click Configure RAG Engine. The Configure RAG Engine pane appears.
  4. Select the tier that you want to run your RAG Engine.
  5. Click Save.

from vertexai import rag
import vertexai

PROJECT_ID = YOUR_PROJECT_ID
LOCATION = YOUR_RAG_ENGINE_LOCATION

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

rag_engine_config_name = f"projects/{PROJECT_ID}/locations/{LOCATION}/ragEngineConfig"

new_rag_engine_config = rag.RagEngineConfig(
    name=rag_engine_config_name,
    rag_managed_db_config=rag.RagManagedDbConfig(tier=rag.Scaled()),
)

updated_rag_engine_config = rag.rag_data.update_rag_engine_config(
    rag_engine_config=new_rag_engine_config
)

print(updated_rag_engine_config)

curl -X PATCH \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/ragEngineConfig -d "{'ragManagedDbConfig': {'scaled': {}}}"

Update your RagEngineConfig to the Basic tier

The following code samples demonstrate how to set the RagEngineConfig to the Basic tier:

  1. In the Google Cloud console, go to the RAG Engine page.
  2. Select the region in which your RAG Engine is running. Your list of RAG corpora is updated.
  3. Click Configure RAG Engine. The Configure RAG Engine pane appears.
  4. Select the tier that you want to run your RAG Engine.
  5. Click Save.

from vertexai import rag
import vertexai

PROJECT_ID = YOUR_PROJECT_ID
LOCATION = YOUR_RAG_ENGINE_LOCATION

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

rag_engine_config_name = f"projects/{PROJECT_ID}/locations/{LOCATION}/ragEngineConfig"

new_rag_engine_config = rag.RagEngineConfig(
    name=rag_engine_config_name,
    rag_managed_db_config=rag.RagManagedDbConfig(tier=rag.Basic()),
)

updated_rag_engine_config = rag.rag_data.update_rag_engine_config(
    rag_engine_config=new_rag_engine_config
)

print(updated_rag_engine_config)

curl -X PATCH \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/ragEngineConfig -d "{'ragManagedDbConfig': {'basic': {}}}"

Update your RagEngineConfig to the Unprovisioned tier

The following code samples demonstrate how to set the RagEngineConfig to the Unprovisioned tier:

  1. In the Google Cloud console, go to the RAG Engine page.
  2. Select the region in which your RAG Engine is running. Your list of RAG corpora is updated.
  3. Click Configure RAG Engine. The Configure RAG Engine pane appears.
  4. Click Delete RAG Engine. A confirmation dialog appears.
  5. Verify that you're about to delete your data in RAG Engine by typing delete, then click Confirm.
  6. Click Save.

from vertexai import rag
import vertexai

PROJECT_ID = YOUR_PROJECT_ID
LOCATION = YOUR_RAG_ENGINE_LOCATION

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

rag_engine_config_name=f"projects/{PROJECT_ID}/locations/{LOCATION}/ragEngineConfig"

new_rag_engine_config = rag.RagEngineConfig(
  name=rag_engine_config_name,
  rag_managed_db_config=rag.RagManagedDbConfig(tier=rag.Unprovisioned()),
)

updated_rag_engine_config = rag.rag_data.update_rag_engine_config(
  rag_engine_config=new_rag_engine_config
)

print(updated_rag_engine_config)

curl -X PATCH \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/ragEngineConfig -d "{'ragManagedDbConfig': {'unprovisioned': {}}}"

What's next