RAG Engine API

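This guide provides an API reference for RAG Engine. It lists the parameters for corpus management, file management, retrieval and prediction, and project management, followed by REST examples.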

The Vertex AI RAG Engine is a component of the Vertex AI platform that facilitates Retrieval-Augmented Generation (RAG). RAG Engine enables Large Language Models (LLMs) to access and incorporate data from external knowledge sources, such as documents and databases. By using RAG, LLMs can generate more accurate and informative responses.

Parameters list

Corpus management parameters

For information about a RAG corpus, see Corpus management.

Create a RAG corpus

This table lists the parameters used to create a RAG corpus.

Body Request

Parameter Description

corpus_type_config

Optional, Immutable. RagCorpus.CorpusTypeConfig

The configuration to specify the corpus type.

display_name

Required. string

The display name of the RAG corpus.

description

Optional. string

The description of the RAG corpus.

encryption_spec

Optional, Immutable. string

The CMEK key name used to encrypt at-rest data related to the RAG corpus. This key name is only applicable to the RagManaged option for the vector database. You can set this field when you create the corpus, but you can't update or delete it later.

Format: projects/{project}/locations/{location}/keyRings/{key_ring}/cryptoKeys/{key_name}

vector_db_config

Optional, Immutable. RagVectorDbConfig

The configuration for the vector databases.

vertex_ai_search_config.serving_config

Optional. string

The configuration for Vertex AI Search.

Format: projects/{project}/locations/{location}/collections/{collection}/engines/{engine}/servingConfigs/{serving_config} or projects/{project}/locations/{location}/collections/{collection}/dataStores/{data_store}/servingConfigs/{serving_config}

CorpusTypeConfig

Parameter Description

document_corpus

oneof RagCorpus.CorpusTypeConfig.DocumentCorpus

The default value of corpus_type_config, which represents a conventional document-based RAG corpus.

memory_corpus

oneof RagCorpus.CorpusTypeConfig.MemoryCorpus

If you set this type, the RAG corpus is a MemoryCorpus that can be used with the Gemini Live API as a memory store.

For more information, see Use Vertex AI RAG Engine as the memory store.

memory_corpus.llm_parser

oneof RagFileParsingConfig.LlmParser

The LLM parser that's used to parse and store session contexts from the Gemini Live API. You can build memories for indexing.

RagVectorDbConfig

When you create a RAG corpus, you can choose from several vector database options. The following table provides a comparison to help you select the best option for your use case.

Vector database option | Description | Use case
rag_managed_db | A fully managed vector database provided by Google. | For users who want an integrated solution without managing their own database infrastructure.
weaviate | Connects to a self-managed Weaviate instance. | For users who already have or prefer to use a Weaviate vector database.
pinecone | Connects to a self-managed Pinecone instance. | For users who already have or prefer to use a Pinecone vector database.
vertex_feature_store | Uses an existing Vertex AI Feature Store instance. | For integrating RAG with existing machine learning features and data in Vertex AI Feature Store.
vertex_vector_search | Uses an existing Vector Search index. | For leveraging the advanced scalability and performance of Vector Search for large-scale applications.

If you choose the rag_managed_db option, you must also select a retrieval strategy. The following table compares the available strategies.

Retrieval strategy | Description | Pros | Cons
KNN (k-Nearest Neighbors) | Finds the exact nearest neighbors by performing an exhaustive search across all data points. | Provides the most accurate and relevant results. | Can be slower and more computationally expensive, especially with large datasets.
ANN (Approximate Nearest Neighbor) | Finds neighbors that are likely to be the closest, trading some accuracy for significant speed improvements. | Much faster query performance and lower latency. | Results are approximate and might not be the absolute most relevant.

Parameter Description

rag_managed_db

oneof vector_db: RagVectorDbConfig.RagManagedDb

If no vector database is specified, rag_managed_db is the default vector database.

rag_managed_db.knn

oneof retrieval_strategy: KNN

This is the default retrieval strategy.

Finds the exact nearest neighbors by comparing all data points in your RAG corpus.

If you don't specify a strategy when you create your RAG corpus, KNN is used.
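
The exhaustive comparison that KNN performs can be illustrated with a minimal sketch. This is not RAG Engine's implementation: the function name, the cosine-similarity metric, and the toy vectors are assumptions chosen for illustration.

```python
import math


def knn_exact(query, vectors, k):
    """Return the indices of the k vectors most similar to the query.

    Illustrative brute-force search: every vector is compared with the
    query, which is why exact KNN is accurate but costly on large corpora.
    Cosine similarity is an assumed metric, not confirmed by this guide.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    scored = [(cosine(query, v), i) for i, v in enumerate(vectors)]
    # Highest similarity first; ties break on the lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:k]]


corpus_vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
nearest = knn_exact([1.0, 0.0], corpus_vectors, k=2)
```

The cost grows linearly with the number of stored vectors, which is the trade-off that ANN avoids by searching approximately.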

rag_managed_db.ann

oneof retrieval_strategy: ANN

tree_depth

Determines the number of layers or levels in the tree.

  • If you have on the order of 10,000 RAG files in the RAG corpus, set this value to 2.
  • If more layers or levels are required, set this value to 3.
  • If you don't specify the number of layers or levels, Vertex AI RAG Engine assigns a default value of 2.

leaf_count

Determines the number of leaf nodes in the tree-based structure.

  • The recommended value is 10 * sqrt(num of RAG files in your RAG corpus).
  • If not specified, Vertex AI RAG Engine assigns a default value of 500.
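
The leaf_count guideline above is simple arithmetic, sketched here. The helper name is hypothetical, and rounding to the nearest integer is an assumption, since the guideline doesn't say how to handle fractional results.

```python
import math


def recommended_leaf_count(num_rag_files):
    """Apply the guideline: 10 * sqrt(number of RAG files in the corpus)."""
    if num_rag_files < 0:
        raise ValueError("num_rag_files must be non-negative")
    # Rounding is an assumption; the guideline only gives the formula.
    return round(10 * math.sqrt(num_rag_files))


leaf_count = recommended_leaf_count(10_000)
```

For a corpus of 2,500 RAG files the formula gives 500, which matches the default value.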

rebuild_ann_index

  • Vertex AI RAG Engine rebuilds your ANN index.
  • To rebuild the index, set this to true in your ImportRagFiles API request.
  • Before you query the RAG corpus, you must rebuild the ANN index once.
  • Only one concurrent index rebuild is supported on a project in each location.

weaviate

oneof vector_db: RagVectorDbConfig.Weaviate

Specifies your Weaviate instance.

weaviate.http_endpoint

string

The Weaviate instance's HTTP endpoint.

This value can't be changed after it's set. You can leave this field empty in the CreateRagCorpus request and set it later in an UpdateRagCorpus request.

weaviate.collection_name

string

The Weaviate collection that the RAG corpus maps to.

This value can't be changed after it's set. You can leave this field empty in the CreateRagCorpus request and set it later in an UpdateRagCorpus request.

pinecone

oneof vector_db: RagVectorDbConfig.Pinecone

Specifies your Pinecone instance.

pinecone.index_name

string

The name used to create the Pinecone index that's used with the RAG corpus.

This value can't be changed after it's set. You can leave this field empty in the CreateRagCorpus request and set it later in an UpdateRagCorpus request.

vertex_feature_store

oneof vector_db: RagVectorDbConfig.VertexFeatureStore

Specifies your Vertex AI Feature Store instance.

vertex_feature_store.feature_view_resource_name

string

The Vertex AI Feature Store FeatureView that the RAG corpus maps to.

Format: projects/{project}/locations/{location}/featureOnlineStores/{feature_online_store}/featureViews/{feature_view}

This value can't be changed after it's set. You can leave this field empty in the CreateRagCorpus request and set it later in an UpdateRagCorpus request.

vertex_vector_search

oneof vector_db: RagVectorDbConfig.VertexVectorSearch

Specifies your Vector Search instance.

vertex_vector_search.index

string

The resource name of the Vector Search index that's used with the RAG corpus.

Format: projects/{project}/locations/{location}/indexes/{index}

This value can't be changed after it's set. You can leave this field empty in the CreateRagCorpus request and set it later in an UpdateRagCorpus request.

vertex_vector_search.index_endpoint

string

The resource name of the Vector Search index endpoint that's used with the RAG corpus.

Format: projects/{project}/locations/{location}/indexEndpoints/{index_endpoint}

This value can't be changed after it's set. You can leave this field empty in the CreateRagCorpus request and set it later in an UpdateRagCorpus request.

api_auth.api_key_config.api_key_secret_version

string

This is the full resource name of the secret stored in Secret Manager, which contains the API key for your Weaviate or Pinecone vector database.

Format: projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}

You can leave this field empty in the CreateRagCorpus request and set it later in an UpdateRagCorpus request.

rag_embedding_model_config.vertex_prediction_endpoint.endpoint

Optional, Immutable. string

The embedding model to use for the RAG corpus. This value can't be changed after it's set. If you leave it empty, RAG Engine uses text-embedding-005 as the embedding model.
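
As a rough sketch of how the parameters above fit together, the following helper assembles a create request body as a plain dictionary. The helper name and sample values are hypothetical, and the nesting of ann under vector_db_config and rag_managed_db is inferred from the dotted parameter names in this table rather than confirmed.

```python
import json


def build_create_corpus_body(display_name, description=None, ann=None):
    """Build a CreateRagCorpus request body as a plain dict.

    `ann` is an optional dict such as {"tree_depth": 2, "leaf_count": 500}.
    When it is omitted, no vector database is specified, so the documented
    default (rag_managed_db with KNN) applies.
    """
    if not display_name:
        raise ValueError("display_name is required")
    body = {"display_name": display_name}
    if description is not None:
        body["description"] = description
    if ann is not None:
        # Assumed nesting: vector_db_config -> rag_managed_db -> ann.
        body["vector_db_config"] = {"rag_managed_db": {"ann": dict(ann)}}
    return body


body = build_create_corpus_body(
    "my-corpus",
    description="Product manuals",
    ann={"tree_depth": 2, "leaf_count": 500},
)
request_json = json.dumps(body, indent=2)
```

Serializing with json.dumps avoids hand-written JSON mistakes such as trailing commas.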

Update a RAG corpus

This table lists the parameters used to update a RAG corpus.

Body Request

Parameter Description

display_name

Optional. string

The display name of the RAG corpus.

description

Optional. string

The description of the RAG corpus.

rag_vector_db.weaviate.http_endpoint

string

The Weaviate instance's HTTP endpoint.

If you created the RagCorpus with a Weaviate configuration and haven't set this field, you can update the Weaviate instance's HTTP endpoint.

rag_vector_db.weaviate.collection_name

string

The Weaviate collection that the RAG corpus maps to.

If you created the RagCorpus with a Weaviate configuration and haven't set this field, you can update the Weaviate instance's collection name.

rag_vector_db.pinecone.index_name

string

The name used to create the Pinecone index that's used with the RAG corpus.

If you created the RagCorpus with a Pinecone configuration and haven't set this field, you can update the Pinecone instance's index name.

rag_vector_db.vertex_feature_store.feature_view_resource_name

string

The Vertex AI Feature Store FeatureView that the RAG corpus maps to.

Format: projects/{project}/locations/{location}/featureOnlineStores/{feature_online_store}/featureViews/{feature_view}

If you created the RagCorpus with a Vertex AI Feature Store configuration and haven't set this field, you can update it.

rag_vector_db.vertex_vector_search.index

string

The resource name of the Vector Search index that's used with the RAG corpus.

Format: projects/{project}/locations/{location}/indexes/{index}

If you created the RagCorpus with a Vector Search configuration and haven't set this field, you can update it.

rag_vector_db.vertex_vector_search.index_endpoint

string

The resource name of the Vector Search index endpoint that's used with the RAG corpus.

Format: projects/{project}/locations/{location}/indexEndpoints/{index_endpoint}

If you created the RagCorpus with a Vector Search configuration and haven't set this field, you can update it.

rag_vector_db.api_auth.api_key_config.api_key_secret_version

string

The full resource name of the secret stored in Secret Manager, which contains the API key for your Weaviate or Pinecone vector database.

Format: projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}
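
Because several parameters expect a Secret Manager version name in this format, a quick shape check before sending a request can catch mistakes early. This is a minimal sketch: the helper name is hypothetical, and the pattern checks only the three-part structure, not whether the secret exists.

```python
import re

# Mirrors the documented format:
# projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}
_SECRET_VERSION = re.compile(r"projects/[^/]+/secrets/[^/]+/versions/[^/]+")


def is_secret_version_name(name):
    """Return True if `name` has the documented three-part shape."""
    return _SECRET_VERSION.fullmatch(name) is not None


ok = is_secret_version_name("projects/123456789/secrets/weaviate-key/versions/1")
bad = is_secret_version_name("secrets/weaviate-key")
```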

List RAG corpora

This table lists the parameters used to list RAG corpora.

Parameter Description

page_size

Optional. int

The standard list page size.

page_token

Optional. string

A page token that you can get from ListRagCorporaResponse.next_page_token in a previous VertexRagDataService.ListRagCorpora call.
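
The page_size and page_token parameters are typically used in a loop that follows next_page_token until it is empty. The sketch below shows that loop under stated assumptions: fetch_page is a hypothetical callable that performs one list call, and the rag_corpora response key is assumed rather than taken from this guide.

```python
def list_all_corpora(fetch_page, page_size=10):
    """Collect every corpus by following next_page_token until it is empty.

    `fetch_page(page_size, page_token)` is a hypothetical callable that
    performs one ListRagCorpora call and returns the parsed JSON response.
    """
    corpora, token = [], ""
    while True:
        response = fetch_page(page_size, token)
        # "rag_corpora" is an assumed response key.
        corpora.extend(response.get("rag_corpora", []))
        token = response.get("next_page_token", "")
        if not token:
            return corpora


# Stand-in for the real API so the sketch runs without network access.
_PAGES = {
    "": {"rag_corpora": [{"display_name": "a"}], "next_page_token": "t1"},
    "t1": {"rag_corpora": [{"display_name": "b"}]},
}


def fake_fetch(page_size, page_token):
    return _PAGES[page_token]


names = [c["display_name"] for c in list_all_corpora(fake_fetch)]
```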

Get a RAG corpus

This table lists the parameter used to get a RAG corpus.

Parameter Description

name

Required. string

The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}
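
The resource name format above can be built and parsed with a pair of small helpers. Both function names and the sample identifiers are hypothetical. The parser only checks the path structure.

```python
def corpus_name(project, location, rag_corpus_id):
    """Build a RagCorpus resource name in the documented format."""
    return f"projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}"


def parse_corpus_name(name):
    """Split a RagCorpus resource name into its three identifiers."""
    parts = name.split("/")
    if (
        len(parts) != 6
        or parts[0] != "projects"
        or parts[2] != "locations"
        or parts[4] != "ragCorpora"
        or not all(parts[1::2])  # each identifier must be non-empty
    ):
        raise ValueError(f"not a RagCorpus resource name: {name!r}")
    return {"project": parts[1], "location": parts[3], "rag_corpus_id": parts[5]}


name = corpus_name("my-project", "us-central1", "1234567890")
parsed = parse_corpus_name(name)
```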

Delete a RAG corpus

This table lists the parameter used to delete a RAG corpus.

Parameter Description

name

Required. string

The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}

File management parameters

For information about a RAG file, see File management.

Upload a RAG file

This table lists parameters used to upload a RAG file.

Body Request

Parameter Description

parent

Required. string

The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}

rag_file

Required. RagFile

The file to upload.

upload_rag_file_config

Required. UploadRagFileConfig

The configuration for the RagFile to be uploaded into the RagCorpus.

RagFile Description

display_name

Required. string

The display name of the RAG file.

description

Optional. string

The description of the RAG file.

UploadRagFileConfig Description

rag_file_transformation_config.rag_file_chunking_config.fixed_length_chunking.chunk_size

int32

The number of tokens in each chunk.

rag_file_transformation_config.rag_file_chunking_config.fixed_length_chunking.chunk_overlap

int32

The number of tokens to overlap between chunks.
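
To make the interaction between chunk_size and chunk_overlap concrete, here is a minimal sketch of fixed-length chunking over a list of tokens. It illustrates the general technique only: the function name is hypothetical, and RAG Engine's tokenizer and boundary handling aren't described in this guide.

```python
def fixed_length_chunks(tokens, chunk_size, chunk_overlap):
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end
    return chunks


tokens = ["t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9"]
chunks = fixed_length_chunks(tokens, chunk_size=4, chunk_overlap=1)
```

With these values each chunk starts three tokens after the previous one, so neighboring chunks share exactly one token.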

Import RAG files

This table lists parameters used to import a RAG file.

Parameter Description

parent

Required. string

The name of the RagCorpus resource.

Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}

gcs_source

oneof import_source: GcsSource

The Cloud Storage location. Supports importing individual files and entire Cloud Storage directories.

gcs_source.uris

list of string

The Cloud Storage URI that contains the upload file.

google_drive_source

oneof import_source: GoogleDriveSource

The Google Drive location. Supports importing individual files and Google Drive folders.

slack_source

oneof import_source: SlackSource

The Slack channel where the file is uploaded.

jira_source

oneof import_source: JiraSource

The Jira query where the file is uploaded.

share_point_sources

oneof import_source: SharePointSources

The SharePoint sources where the file is uploaded.

rag_file_transformation_config.rag_file_chunking_config.fixed_length_chunking.chunk_size

int32

The number of tokens in each chunk.

rag_file_transformation_config.rag_file_chunking_config.fixed_length_chunking.chunk_overlap

int32

The number of tokens to overlap between chunks.

rag_file_parsing_config

Optional. RagFileParsingConfig

Specifies the parsing configuration for RagFiles.

If you don't set this field, RAG uses the default parser.

max_embedding_requests_per_min

Optional. int32

The maximum number of queries per minute (QPM) that this job can make to the embedding model. This value is specific to this job and isn't shared with other import jobs. To set an appropriate value, see the Quotas page for your project. If you don't specify a value, a default of 1,000 QPM is used.
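
The import parameters above can be combined into one request body. The following sketch builds that body for a Cloud Storage source. The helper name and sample values are hypothetical, the nesting follows the dotted parameter paths in this table, and the import_rag_files_config wrapper follows the REST example later in this guide.

```python
def build_import_body(gcs_uris, chunk_size=None, chunk_overlap=None,
                      max_embedding_requests_per_min=None):
    """Build an ImportRagFiles request body for a Cloud Storage source."""
    if not gcs_uris:
        raise ValueError("at least one Cloud Storage URI is required")
    config = {"gcs_source": {"uris": list(gcs_uris)}}
    chunking = {}
    if chunk_size is not None:
        chunking["chunk_size"] = chunk_size
    if chunk_overlap is not None:
        chunking["chunk_overlap"] = chunk_overlap
    if chunking:
        # Nesting follows the dotted parameter path listed in the table.
        config["rag_file_transformation_config"] = {
            "rag_file_chunking_config": {"fixed_length_chunking": chunking}
        }
    if max_embedding_requests_per_min is not None:
        config["max_embedding_requests_per_min"] = max_embedding_requests_per_min
    return {"import_rag_files_config": config}


body = build_import_body(
    ["gs://my-bucket1"],
    chunk_size=512,
    chunk_overlap=100,
    max_embedding_requests_per_min=900,
)
```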

GoogleDriveSource Description

resource_ids.resource_id

Required. string

The ID of the Google Drive resource.

resource_ids.resource_type

Required. string

The type of the Google Drive resource.

SlackSource Description

channels.channels

Repeated. SlackSource.SlackChannels.SlackChannel

Slack channel information, including ID and time range to import.

channels.channels.channel_id

Required. string

The Slack channel ID.

channels.channels.start_time

Optional. google.protobuf.Timestamp

The starting timestamp for messages to import.

channels.channels.end_time

Optional. google.protobuf.Timestamp

The ending timestamp for messages to import.

channels.api_key_config.api_key_secret_version

Required. string

The full resource name of the secret stored in Secret Manager, which contains a Slack channel access token that has access to the specified Slack channel IDs.
See: Getting a token.

Format: projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}

JiraSource Description

jira_queries.projects

Repeated. string

A list of Jira projects to import in their entirety.

jira_queries.custom_queries

Repeated. string

A list of custom Jira queries to import. For more information about Jira Query Language (JQL), see Jira Support.

jira_queries.email

Required. string

The Jira email address.

jira_queries.server_uri

Required. string

The Jira server URI.

jira_queries.api_key_config.api_key_secret_version

Required. string

The full resource name of the secret stored in Secret Manager, which contains a Jira API key.
See: Manage API tokens for your Atlassian account.

Format: projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}

SharePointSources Description

share_point_sources.sharepoint_folder_path

oneof in folder_source: string

The path of the SharePoint folder to download from.

share_point_sources.sharepoint_folder_id

oneof in folder_source: string

The ID of the SharePoint folder to download from.

share_point_sources.drive_name

oneof in drive_source: string

The name of the drive to download from.

share_point_sources.drive_id

oneof in drive_source: string

The ID of the drive to download from.

share_point_sources.client_id

string

The Application (client) ID for the app registered in the Microsoft Azure Portal.
The application must also be configured with the following Microsoft Graph permissions: Files.Read.All, Sites.Read.All, and BrowserSiteLists.Read.All.

share_point_sources.client_secret.api_key_secret_version

Required. string

The full resource name of the secret stored in Secret Manager, which contains the application secret for the app registered in Azure.

Format: projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}

share_point_sources.tenant_id

string

Unique identifier of the Azure Active Directory Instance.

share_point_sources.sharepoint_site_name

string

The name of the SharePoint site to download from. This can be the site name or the site ID.

Parser option | Description | Use case
layout_parser | Uses Document AI to parse files, preserving the structure and layout of the document. | Best for structured or semi-structured documents like PDFs with tables, columns, and complex layouts.
llm_parser | Uses a Large Language Model (LLM) to parse files, focusing on semantic understanding of the content. | Ideal for unstructured text documents where extracting meaning and context is more important than preserving the original visual layout.

RagFileParsingConfig Description

layout_parser

oneof parser: RagFileParsingConfig.LayoutParser

The Layout Parser to use for RagFiles.

layout_parser.processor_name

string

The full resource name of a Document AI processor or processor version.

Format:
projects/{project_id}/locations/{location}/processors/{processor_id}
projects/{project_id}/locations/{location}/processors/{processor_id}/processorVersions/{processor_version_id}

layout_parser.max_parsing_requests_per_min

int32

The maximum number of requests the job is allowed to make to the Document AI processor per minute.

Consult https://cloud.google.com/document-ai/quotas and the Quota page for your project to set an appropriate value here. If unspecified, a default value of 120 QPM is used.

llm_parser

oneof parser: RagFileParsingConfig.LlmParser

The LLM parser to use for RagFiles.

llm_parser.model_name

string

The resource name of an LLM model.

Format:
projects/{project_id}/locations/{location}/publishers/{publisher}/models/{model}

llm_parser.max_parsing_requests_per_min

int32

The maximum number of requests the job is allowed to make to the LLM model per minute.

To set an appropriate value, see the model quota section and the Quota page for your project. If unspecified, a default value of 5000 QPM is used.
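
Because layout_parser and llm_parser belong to the same oneof, a request can set at most one of them. The sketch below enforces that rule when building the parsing configuration. The helper name, project ID, and model path are hypothetical sample values.

```python
def build_parsing_config(layout_parser=None, llm_parser=None):
    """Return a rag_file_parsing_config dict with at most one parser set."""
    if layout_parser is not None and llm_parser is not None:
        raise ValueError("layout_parser and llm_parser are mutually exclusive")
    if layout_parser is not None:
        return {"layout_parser": dict(layout_parser)}
    if llm_parser is not None:
        return {"llm_parser": dict(llm_parser)}
    return {}  # field left unset, so the default parser is used


parsing_config = build_parsing_config(
    llm_parser={
        # Hypothetical project and model, written in the documented format.
        "model_name": (
            "projects/my-project/locations/us-central1/"
            "publishers/google/models/gemini-2.5-flash"
        ),
        "max_parsing_requests_per_min": 120,
    }
)
```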

Get a RAG file

This table lists the parameter used to get a RAG file.

Parameter Description

name

Required. string

The name of the RagFile resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}/ragFiles/{rag_file_id}

Delete a RAG file

This table lists the parameter used to delete a RAG file.

Parameter Description

name

Required. string

The name of the RagFile resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}/ragFiles/{rag_file_id}

Retrieval and prediction parameters

This section lists the retrieval and prediction parameters.

Retrieval parameters

This table lists the parameters for the retrieveContexts method.

Parameter Description

parent

Required. string

The resource name of the location from which to retrieve RagContexts.
You must have permission to make calls in the project.

Format: projects/{project}/locations/{location}

vertex_rag_store

VertexRagStore

The data source for Vertex RagStore.

query

Required. RagQuery

A single RAG retrieval query.

VertexRagStore

VertexRagStore Description

rag_resources

list: RagResource

The RAG source. You can specify a single corpus or multiple RagFiles from one corpus.

rag_resources.rag_corpus

Optional. string

The RagCorpus resource name.

Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}

rag_resources.rag_file_ids

list: string

A list of RagFile resources.

Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file}

RagQuery Description

text

string

The query in text format to get relevant contexts.

rag_retrieval_config

Optional. RagRetrievalConfig

The retrieval configuration for the query.

RagRetrievalConfig Description

top_k

Optional. int32

The number of contexts to retrieve.

hybrid_search.alpha

Optional. float

Controls the weight between dense and sparse vector search results. The value must be between 0.0 and 1.0. A value of 0.0 means sparse vector search only, and 1.0 means dense vector search only. The default is 0.5. Hybrid search is only available for Weaviate.

filter.vector_distance_threshold

oneof vector_db_threshold: double

Returns only contexts with a vector distance smaller than this threshold.

filter.vector_similarity_threshold

oneof vector_db_threshold: double

Returns only contexts with a vector similarity larger than this threshold.

ranking.rank_service.model_name

Optional. string

The model name of the rank service.

Example: semantic-ranker-512@latest

ranking.llm_ranker.model_name

Optional. string

The model name used for ranking.

Example: gemini-2.5-flash
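
The retrieval parameters above can be assembled into a single retrieveContexts request body. The following sketch does so for one corpus and rejects requests that set both thresholds, since they share the vector_db_threshold oneof. The helper name and sample values are hypothetical, and the nesting is inferred from the dotted parameter names.

```python
def build_retrieve_contexts_body(rag_corpus, text, top_k=None,
                                 vector_distance_threshold=None,
                                 vector_similarity_threshold=None):
    """Build a retrieveContexts request body for a single corpus."""
    if (vector_distance_threshold is not None
            and vector_similarity_threshold is not None):
        # The two thresholds share the vector_db_threshold oneof.
        raise ValueError("set only one of the two thresholds")
    retrieval_config = {}
    if top_k is not None:
        retrieval_config["top_k"] = top_k
    if vector_distance_threshold is not None:
        retrieval_config["filter"] = {
            "vector_distance_threshold": vector_distance_threshold}
    elif vector_similarity_threshold is not None:
        retrieval_config["filter"] = {
            "vector_similarity_threshold": vector_similarity_threshold}
    query = {"text": text}
    if retrieval_config:
        query["rag_retrieval_config"] = retrieval_config
    return {
        "vertex_rag_store": {"rag_resources": [{"rag_corpus": rag_corpus}]},
        "query": query,
    }


body = build_retrieve_contexts_body(
    "projects/my-project/locations/us-central1/ragCorpora/1234567890",
    "How do I reset the device?",
    top_k=5,
    vector_distance_threshold=0.5,
)
```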

Prediction parameters

This table lists prediction parameters.

GenerateContentRequest Description

tools.retrieval.vertex_rag_store

VertexRagStore

Set to use a data source powered by the Vertex AI RAG store.

See VertexRagStore for details.

Project management parameters

This table lists project-level parameters.

When using the RagManagedDb, you can select a tier that best fits your performance and cost requirements. The following table compares the available tiers.

Tier | Description | Use case
scaled | A production-scale tier with autoscaling to handle high and variable query loads. | Recommended for production applications requiring high availability and performance.
basic | A cost-effective, lower-compute tier for smaller-scale needs. | Suitable for development, testing, or applications with low, predictable traffic.
unprovisioned | De-provisions the managed database and its underlying resources. | Used to disable the managed database and stop incurring associated costs.

RagEngineConfig

Parameter | Description
RagManagedDbConfig.scaled | This tier offers production-scale performance along with autoscaling functionality.
RagManagedDbConfig.basic | This tier offers a cost-effective and low-compute tier.
RagManagedDbConfig.unprovisioned | This tier de-provisions the RagManagedDb and its underlying Spanner instance.

Corpus management examples

This section provides examples of how to use the API to manage your RAG corpora.

Create a RAG corpus example

This example demonstrates how to create a RAG corpus.

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • CORPUS_DISPLAY_NAME: The display name of the RagCorpus.
  • CORPUS_DESCRIPTION: The description of the RagCorpus.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora

Request JSON body:

{
  "display_name": "CORPUS_DISPLAY_NAME",
  "description": "CORPUS_DESCRIPTION"
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora" | Select-Object -Expand Content

A successful request returns a 2xx status code.
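
For callers who prefer Python to curl or PowerShell, the same request can be prepared with the standard library. This is a minimal sketch: the helper name and sample values are hypothetical, the access token is assumed to come from gcloud auth print-access-token, and the request is constructed but not sent.

```python
import json
import urllib.request


def create_corpus_request(project_id, location, access_token, body):
    """Prepare (but do not send) the POST request from the REST example."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1beta1/"
        f"projects/{project_id}/locations/{location}/ragCorpora"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


request = create_corpus_request(
    "my-project", "us-central1", "ACCESS_TOKEN",
    {"display_name": "my-corpus", "description": "Product manuals"},
)
# To send it, pass the request to urllib.request.urlopen(request).
```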

Update a RAG corpus example

You can update a RAG corpus's display name, description, and some vector database configurations. However, you can't change the following immutable parameters:

  • The vector database type. For example, you can't change the vector database from Weaviate to Vertex AI Feature Store.
  • The vector database configuration, if you're using the managed database option.

This example demonstrates how to update a RAG corpus.

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • CORPUS_ID: The corpus ID of your RAG corpus.
  • CORPUS_DISPLAY_NAME: The display name of the RagCorpus.
  • CORPUS_DESCRIPTION: The description of the RagCorpus.
  • INDEX_NAME: The resource name of the Vector Search Index. Format: projects/{project}/locations/{location}/indexes/{index}
  • INDEX_ENDPOINT_NAME: The resource name of the Vector Search Index Endpoint. Format: projects/{project}/locations/{location}/indexEndpoints/{index_endpoint}

HTTP method and URL:

PATCH https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/CORPUS_ID

Request JSON body:

{
  "display_name": "CORPUS_DISPLAY_NAME",
  "description": "CORPUS_DESCRIPTION",
  "rag_vector_db_config": {
     "vertex_vector_search": {
         "index": "INDEX_NAME",
         "index_endpoint": "INDEX_ENDPOINT_NAME"
     }
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/CORPUS_ID"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method PATCH `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/CORPUS_ID" | Select-Object -Expand Content

A successful request returns a 2xx status code.

List RAG corpora example

This example shows how to list all RAG corpora in a project.

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • PAGE_SIZE: The standard list page size. You can adjust the number of RagCorpora to return per page by updating the page_size parameter.
  • PAGE_TOKEN: The standard list page token. Get this token from ListRagCorporaResponse.next_page_token in a previous VertexRagDataService.ListRagCorpora call.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN" | Select-Object -Expand Content

A successful request returns a 2xx status code and a list of RagCorpora for the specified project.

Get a RAG corpus example

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID" | Select-Object -Expand Content

A successful request returns the specified RagCorpus resource.

Delete a RAG corpus example

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.

HTTP method and URL:

DELETE https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X DELETE \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method DELETE `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID" | Select-Object -Expand Content

A successful request returns a DeleteOperationMetadata resource.

File management examples

This section provides examples of how to use the API to manage RAG files.

Upload a RAG file example

REST

Before running the command, replace the following variables:

  PROJECT_ID: Your project ID.
  LOCATION: The region to process the request.
  RAG_CORPUS_ID: The corpus ID of your RAG corpus.
  LOCAL_FILE_PATH: The local path to the file to be uploaded.
  DISPLAY_NAME: The display name of the RAG file.
  DESCRIPTION: The description of the RAG file.

To send your request, use the following command:

  curl -X POST \
    -H "X-Goog-Upload-Protocol: multipart" \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -F metadata="{'rag_file': {'display_name':'DISPLAY_NAME', 'description':'DESCRIPTION'}}" \
    -F file=@LOCAL_FILE_PATH \
    "https://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:upload"

Import RAG files example

You can import files and folders from Google Drive or Cloud Storage.

The response.skipped_rag_files_count field reports the number of files that were skipped during import. A file is skipped if all of the following conditions are met:

  1. The file has already been imported.
  2. The file hasn't changed.
  3. The chunking configuration for the file hasn't changed.
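
The skip rule can be expressed as a single predicate. This sketch reads the three conditions as jointly required, which is an interpretation of the list above. The function and argument names are hypothetical.

```python
def is_skipped(already_imported, file_changed, chunking_config_changed):
    """Return True when a file would be skipped during import.

    Assumes all three listed conditions must hold together: the file was
    imported before, and neither the file nor its chunking configuration
    has changed since.
    """
    return already_imported and not file_changed and not chunking_config_changed


unchanged_reimport = is_skipped(True, False, False)
edited_file = is_skipped(True, True, False)
```

Under this reading, changing only the chunking configuration is enough for a previously imported file to be processed again.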

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.
  • GCS_URIS: A list of Cloud Storage locations. Example: gs://my-bucket1, gs://my-bucket2.
  • CHUNK_SIZE: Optional: The number of tokens each chunk should have.
  • CHUNK_OVERLAP: Optional: The number of tokens to overlap between chunks.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import

Request JSON body:

{
  "import_rag_files_config": {
    "gcs_source": {
      "uris": "GCS_URIS"
    },
    "rag_file_chunking_config": {
      "chunk_size": CHUNK_SIZE,
      "chunk_overlap": CHUNK_OVERLAP
    }
  }
}
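
If you prefer to generate the request body instead of editing it by hand, a small helper can do so. The sketch below assumes that uris is sent as a JSON array of strings and that the optional chunking fields can be omitted when they are not set; the function name build_import_request and the sample values are illustrative.

```python
import json
from typing import Optional, Sequence


def build_import_request(
    gcs_uris: Sequence[str],
    chunk_size: Optional[int] = None,
    chunk_overlap: Optional[int] = None,
) -> dict:
    """Build an ImportRagFiles request body for Cloud Storage sources."""
    if not gcs_uris:
        raise ValueError("at least one Cloud Storage URI is required")
    config: dict = {"gcs_source": {"uris": list(gcs_uris)}}
    # Only include the chunking block when at least one value was provided.
    chunking = {}
    if chunk_size is not None:
        chunking["chunk_size"] = chunk_size
    if chunk_overlap is not None:
        chunking["chunk_overlap"] = chunk_overlap
    if chunking:
        config["rag_file_chunking_config"] = chunking
    return {"import_rag_files_config": config}


if __name__ == "__main__":
    # Illustrative values; save the printed JSON as request.json.
    body = build_import_request(["gs://my-bucket1", "gs://my-bucket2"], 512, 100)
    print(json.dumps(body, indent=2))
```
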

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import" | Select-Object -Expand Content

A successful request returns an ImportRagFilesOperationMetadata resource.

The following sample demonstrates how to import a file from Cloud Storage. Use the max_embedding_requests_per_min control field to limit the rate at which RAG Engine calls the embedding model during the ImportRagFiles indexing process. The field has a default value of 1000 calls per minute.

  PROJECT_ID: Your project ID.
  LOCATION: The region to process the request.
  RAG_CORPUS_ID: The corpus ID of your RAG corpus.
  GCS_URIS: A list of Cloud Storage locations. Example: gs://my-bucket1.
  CHUNK_SIZE: The number of tokens each chunk should have.
  CHUNK_OVERLAP: The number of tokens to overlap between chunks.
  EMBEDDING_MODEL_QPM_RATE: The queries-per-minute (QPM) rate that limits RAG Engine's access to your embedding model. Example: 1000.

# ImportRagFiles
# Import a single Cloud Storage file or all files in a Cloud Storage bucket.
# Input: LOCATION, PROJECT_ID, RAG_CORPUS_ID, GCS_URIS
# Output: ImportRagFilesOperationMetadata
# Use ListRagFiles to find the server-generated rag_file_id.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import \
-d '{
  "import_rag_files_config": {
    "gcs_source": {
      "uris": "GCS_URIS"
    },
    "rag_file_chunking_config": {
      "chunk_size": CHUNK_SIZE,
      "chunk_overlap": CHUNK_OVERLAP
    },
    "max_embedding_requests_per_min": EMBEDDING_MODEL_QPM_RATE
  }
}'

# Poll the operation status.
# The response contains the number of files imported.
# OPERATION_ID: The operation ID that you get from the response of the previous command.
poll_op_wait OPERATION_ID
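
Because max_embedding_requests_per_min caps the embedding call rate, it also puts a floor on how long the embedding phase of an import can take. The following back-of-the-envelope sketch makes that arithmetic explicit. It assumes you already have an estimate of the total number of embedding requests, which the API does not report up front, and the function name is hypothetical.

```python
import math


def min_embedding_minutes(num_requests: int, requests_per_min: int = 1000) -> int:
    """Lower bound, in whole minutes, for issuing `num_requests` embedding calls."""
    if num_requests < 0:
        raise ValueError("num_requests must be non-negative")
    if requests_per_min <= 0:
        raise ValueError("requests_per_min must be positive")
    return math.ceil(num_requests / requests_per_min)


if __name__ == "__main__":
    # 2,500 requests at the default of 1,000 per minute need at least 3 minutes.
    print(min_embedding_minutes(2500))
    # Halving the rate to 500 per minute raises the floor to 5 minutes.
    print(min_embedding_minutes(2500, 500))
```

This is only a lower bound: parsing, chunking, and indexing add time on top of the embedding calls.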

The following sample demonstrates how to import a file from Drive. Use the max_embedding_requests_per_min control field to limit the rate at which RAG Engine calls the embedding model during the ImportRagFiles indexing process. The field has a default value of 1000 calls per minute.

  PROJECT_ID: Your project ID.
  LOCATION: The region to process the request.
  RAG_CORPUS_ID: The corpus ID of your RAG corpus.
  FOLDER_RESOURCE_ID: The resource ID of your Google Drive folder.
  CHUNK_SIZE: The number of tokens each chunk should have.
  CHUNK_OVERLAP: The number of tokens to overlap between chunks.
  EMBEDDING_MODEL_QPM_RATE: The queries-per-minute (QPM) rate that limits RAG Engine's access to your embedding model. Example: 1000.

# ImportRagFiles
# Import all files in a Google Drive folder.
# Input: LOCATION, PROJECT_ID, RAG_CORPUS_ID, FOLDER_RESOURCE_ID
# Output: ImportRagFilesOperationMetadata
# Use ListRagFiles to find the server-generated rag_file_id.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import \
-d '{
  "import_rag_files_config": {
    "google_drive_source": {
      "resource_ids": {
        "resource_id": "FOLDER_RESOURCE_ID",
        "resource_type": "RESOURCE_TYPE_FOLDER"
      }
    },
    "max_embedding_requests_per_min": EMBEDDING_MODEL_QPM_RATE
  }
}'
# Poll the operation status.
# The response contains the number of files imported.
# OPERATION_ID: The operation ID that you get from the response of the previous command.
poll_op_wait OPERATION_ID
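
The poll_op_wait helper used in these samples is not defined on this page. As a rough sketch of what such a helper might do, the following Python function polls a long-running operation until its done field is true. The get_operation callable is a hypothetical stand-in for whatever code fetches the operation resource; the done and error fields follow the standard long-running operation shape.

```python
import time
from typing import Callable


def poll_until_done(
    get_operation: Callable[[], dict],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `get_operation` repeatedly until the operation reports done."""
    deadline = time.monotonic() + timeout_s
    while True:
        operation = get_operation()
        if operation.get("done"):
            # A finished operation carries either `error` or `response`.
            if "error" in operation:
                raise RuntimeError(f"operation failed: {operation['error']}")
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError("operation did not finish before the timeout")
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated operation that finishes on the second poll.
    states = iter([{"name": "op"}, {"name": "op", "done": True, "response": {}}])
    print(poll_until_done(lambda: next(states), interval_s=0.0))
```
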

List RAG files example

This example demonstrates how to list RAG files.

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.
  • PAGE_SIZE: The standard list page size. You can adjust the number of RagFiles to return per page by updating the page_size parameter.
  • PAGE_TOKEN: The standard list page token. Get this token from ListRagFilesResponse.next_page_token in a previous VertexRagDataService.ListRagFiles call.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN" | Select-Object -Expand Content

A successful request returns a 2xx status code and a list of RagFiles for the specified RAG_CORPUS_ID.
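
To read every RAG file in a corpus, repeat the request and pass each next_page_token as the next page_token until no token is returned. The following sketch shows that loop in isolation. The fetch_page callable is a hypothetical stand-in for the HTTP call, and the ragFiles and nextPageToken keys assume the usual camelCase JSON rendering of the rag_files and next_page_token response fields.

```python
from typing import Callable, Iterator, Optional


def iter_rag_files(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield RAG files from every page, following next-page tokens."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        yield from page.get("ragFiles", [])
        token = page.get("nextPageToken")
        if not token:
            return


if __name__ == "__main__":
    # Two simulated pages keyed by the page token that requests them.
    pages = {
        None: {"ragFiles": [{"name": "file-1"}], "nextPageToken": "t1"},
        "t1": {"ragFiles": [{"name": "file-2"}]},
    }
    for rag_file in iter_rag_files(lambda token: pages[token]):
        print(rag_file["name"])
```
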

Get a RAG file example

This example demonstrates how to get a RAG file.

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.
  • RAG_FILE_ID: The ID of the RagFile resource.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID" | Select-Object -Expand Content

A successful request returns the specified RagFile resource.

Delete a RAG file example

This example demonstrates how to delete a RAG file.

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.
  • RAG_FILE_ID: The ID of the RagFile resource. This is the last segment of the resource name, which has the format projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file_id}.
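
If you only have the full resource name of a RAG file, for example from a list response, you can split it into the placeholders used above. The following is a minimal sketch; the helper name parse_rag_file_name and the sample IDs are hypothetical.

```python
import re

# Matches projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file_id}
_RAG_FILE_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/ragCorpora/(?P<rag_corpus>[^/]+)"
    r"/ragFiles/(?P<rag_file_id>[^/]+)$"
)


def parse_rag_file_name(name: str) -> dict:
    """Split a RagFile resource name into its path components."""
    match = _RAG_FILE_NAME.match(name)
    if match is None:
        raise ValueError(f"not a RagFile resource name: {name!r}")
    return match.groupdict()


if __name__ == "__main__":
    parts = parse_rag_file_name(
        "projects/my-project/locations/us-central1/ragCorpora/123/ragFiles/456"
    )
    print(parts["rag_corpus"], parts["rag_file_id"])
```
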

HTTP method and URL:

DELETE https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X DELETE \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method DELETE `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID" | Select-Object -Expand Content

A successful request returns a DeleteOperationMetadata resource.

Retrieval and prediction examples

Retrieval query example

When you provide a query, the retrieval component in RAG searches its knowledge base to find relevant information.

REST

Before using any of the request data, make the following replacements:

  • LOCATION: The region to process the request.
  • PROJECT_ID: Your project ID.
  • RAG_CORPUS_RESOURCE: The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}.
  • VECTOR_DISTANCE_THRESHOLD: Only contexts with a vector distance smaller than the threshold are returned.
  • TEXT: The query text to get relevant contexts.
  • SIMILARITY_TOP_K: The number of top contexts to retrieve.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts

Request JSON body:

{
 "vertex_rag_store": {
    "rag_resources": {
      "rag_corpus": "RAG_CORPUS_RESOURCE"
    },
    "vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD
  },
  "query": {
   "text": "TEXT",
   "similarity_top_k": SIMILARITY_TOP_K
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts" | Select-Object -Expand Content

A successful request returns a 2xx status code and a list of related RagFiles.
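
The two retrieval controls in this request interact as follows: the distance threshold drops contexts that are too far from the query, and similarity_top_k limits how many of the closest contexts are kept. The sketch below illustrates those semantics on local data only. It is not how the service is implemented, and the function name, tuple layout, and sample distances are made up for illustration.

```python
from typing import List, Tuple


def select_contexts(
    candidates: List[Tuple[str, float]],
    vector_distance_threshold: float,
    similarity_top_k: int,
) -> List[Tuple[str, float]]:
    """Keep contexts below the distance threshold, closest first, at most top-k."""
    kept = [c for c in candidates if c[1] < vector_distance_threshold]
    kept.sort(key=lambda c: c[1])  # a smaller distance means a closer match
    return kept[:similarity_top_k]


if __name__ == "__main__":
    candidates = [("a", 0.2), ("b", 0.7), ("c", 0.4), ("d", 0.1)]
    # "b" is filtered out by the threshold; "c" falls outside the top 2.
    print(select_contexts(candidates, vector_distance_threshold=0.5, similarity_top_k=2))
```

A lower threshold makes retrieval stricter, so a request can return fewer than SIMILARITY_TOP_K contexts.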

Generation example

The LLM generates a grounded response using the retrieved contexts.

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • MODEL_ID: The LLM model for content generation. Example: gemini-2.5-flash.
  • GENERATION_METHOD: The LLM method for content generation. Options: generateContent, streamGenerateContent.
  • INPUT_PROMPT: The text sent to the LLM for content generation. Try to use a prompt relevant to the uploaded RAG files.
  • RAG_CORPUS_RESOURCE: The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}.
  • SIMILARITY_TOP_K: Optional: The number of top contexts to retrieve.
  • VECTOR_DISTANCE_THRESHOLD: Optional: Only contexts with a vector distance smaller than the threshold are returned.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD

Request JSON body:

{
 "contents": {
  "role": "user",
  "parts": {
    "text": "INPUT_PROMPT"
  }
 },
 "tools": {
  "retrieval": {
   "disable_attribution": false,
   "vertex_rag_store": {
    "rag_resources": {
      "rag_corpus": "RAG_CORPUS_RESOURCE"
    },
    "similarity_top_k": SIMILARITY_TOP_K,
    "vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD
   }
  }
 }
}
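
Because SIMILARITY_TOP_K and VECTOR_DISTANCE_THRESHOLD are optional, it can be convenient to generate the body and leave those keys out when they are not set. The following sketch mirrors the JSON structure above; the function name build_generation_request and the sample values are illustrative.

```python
import json
from typing import Optional


def build_generation_request(
    input_prompt: str,
    rag_corpus: str,
    similarity_top_k: Optional[int] = None,
    vector_distance_threshold: Optional[float] = None,
) -> dict:
    """Build a generation request body that uses a RAG corpus as a retrieval tool."""
    store: dict = {"rag_resources": {"rag_corpus": rag_corpus}}
    # Optional retrieval controls are omitted unless a value was provided.
    if similarity_top_k is not None:
        store["similarity_top_k"] = similarity_top_k
    if vector_distance_threshold is not None:
        store["vector_distance_threshold"] = vector_distance_threshold
    return {
        "contents": {"role": "user", "parts": {"text": input_prompt}},
        "tools": {
            "retrieval": {
                "disable_attribution": False,
                "vertex_rag_store": store,
            }
        },
    }


if __name__ == "__main__":
    body = build_generation_request(
        "INPUT_PROMPT",
        "projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID",
        similarity_top_k=5,
    )
    print(json.dumps(body, indent=1))
```
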

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD" | Select-Object -Expand Content

A successful request returns the generated content with citations.

Project management examples

The tier is a project-level setting available under the RagEngineConfig resource and affects RAG corpora that use RagManagedDb. To get the tier configuration, use GetRagEngineConfig. To update the tier configuration, use UpdateRagEngineConfig.

For more information on managing your tier configuration, see Manage your tier.

Get project configuration

The following example demonstrates how to read your RagEngineConfig:

Console

  1. In the Google Cloud console, go to the RAG Engine page.

    Go to RAG Engine

  2. Select the region in which your RAG Engine is running. Your list of RAG corpora is updated.
  3. Click Configure RAG Engine. The Configure RAG Engine pane appears. You can see the tier that's selected for your RAG Engine.
  4. Click Cancel.

Python

from vertexai import rag
import vertexai

# Replace the placeholder strings with your own values.
PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "YOUR_RAG_ENGINE_LOCATION"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

rag_engine_config = rag.rag_data.get_rag_engine_config(
    name=f"projects/{PROJECT_ID}/locations/{LOCATION}/ragEngineConfig"
)

print(rag_engine_config)

REST

curl -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragEngineConfig

Update project configuration

This section provides examples of how to change your tier.

Update your RagEngineConfig to the Scaled tier

The following examples demonstrate how to set the RagEngineConfig to the Scaled tier:

Console

  1. In the Google Cloud console, go to the RAG Engine page.

    Go to RAG Engine

  2. Select the region in which your RAG Engine is running. Your list of RAG corpora is updated.
  3. Click Configure RAG Engine. The Configure RAG Engine pane appears.
  4. Select the tier on which you want to run your RAG Engine.
  5. Click Save.

Python

from vertexai import rag
import vertexai

# Replace the placeholder strings with your own values.
PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "YOUR_RAG_ENGINE_LOCATION"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

rag_engine_config_name = f"projects/{PROJECT_ID}/locations/{LOCATION}/ragEngineConfig"

new_rag_engine_config = rag.RagEngineConfig(
    name=rag_engine_config_name,
    rag_managed_db_config=rag.RagManagedDbConfig(tier=rag.Scaled()),
)

updated_rag_engine_config = rag.rag_data.update_rag_engine_config(
    rag_engine_config=new_rag_engine_config
)

print(updated_rag_engine_config)

REST

curl -X PATCH \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragEngineConfig" \
-d '{"ragManagedDbConfig": {"scaled": {}}}'

Update your RagEngineConfig to the Basic tier

The following examples demonstrate how to set the RagEngineConfig to the Basic tier.

If you have a large amount of data in your RagManagedDb across your RAG corpora, downgrading to the Basic tier can fail due to insufficient compute and storage capacity.

Console

  1. In the Google Cloud console, go to the RAG Engine page.

    Go to RAG Engine

  2. Select the region in which your RAG Engine is running. Your list of RAG corpora is updated.
  3. Click Configure RAG Engine. The Configure RAG Engine pane appears.
  4. Select the tier on which you want to run your RAG Engine.
  5. Click Save.

Python

from vertexai import rag
import vertexai

# Replace the placeholder strings with your own values.
PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "YOUR_RAG_ENGINE_LOCATION"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

rag_engine_config_name = f"projects/{PROJECT_ID}/locations/{LOCATION}/ragEngineConfig"

new_rag_engine_config = rag.RagEngineConfig(
    name=rag_engine_config_name,
    rag_managed_db_config=rag.RagManagedDbConfig(tier=rag.Basic()),
)

updated_rag_engine_config = rag.rag_data.update_rag_engine_config(
    rag_engine_config=new_rag_engine_config
)

print(updated_rag_engine_config)

REST

curl -X PATCH \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragEngineConfig" \
-d '{"ragManagedDbConfig": {"basic": {}}}'

Update your RagEngineConfig to the Unprovisioned tier

The following examples demonstrate how to set the RagEngineConfig to the Unprovisioned tier:

Console

  1. In the Google Cloud console, go to the RAG Engine page.

    Go to RAG Engine

  2. Select the region in which your RAG Engine is running. Your list of RAG corpora is updated.
  3. Click Configure RAG Engine. The Configure RAG Engine pane appears.
  4. Click Delete RAG Engine. A confirmation dialog appears.
  5. Verify that you're about to delete your data in RAG Engine by typing delete, then click Confirm.
  6. Click Save.

Python

from vertexai import rag
import vertexai

# Replace the placeholder strings with your own values.
PROJECT_ID = "YOUR_PROJECT_ID"
LOCATION = "YOUR_RAG_ENGINE_LOCATION"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

rag_engine_config_name=f"projects/{PROJECT_ID}/locations/{LOCATION}/ragEngineConfig"

new_rag_engine_config = rag.RagEngineConfig(
  name=rag_engine_config_name,
  rag_managed_db_config=rag.RagManagedDbConfig(tier=rag.Unprovisioned()),
)

updated_rag_engine_config = rag.rag_data.update_rag_engine_config(
  rag_engine_config=new_rag_engine_config
)

print(updated_rag_engine_config)

REST

curl -X PATCH \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragEngineConfig" \
-d '{"ragManagedDbConfig": {"unprovisioned": {}}}'
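
The three REST examples in this section differ only in the tier key inside ragManagedDbConfig. As a small sketch, the following helper produces the PATCH body for a given tier name and rejects anything other than the three tiers described here. The function name build_tier_patch_body is hypothetical.

```python
import json

# Tier keys used by the REST examples in this section.
TIERS = ("scaled", "basic", "unprovisioned")


def build_tier_patch_body(tier: str) -> str:
    """Return the JSON body that selects `tier` in a RagEngineConfig PATCH request."""
    key = tier.strip().lower()
    if key not in TIERS:
        raise ValueError(f"unknown tier {tier!r}; expected one of {TIERS}")
    return json.dumps({"ragManagedDbConfig": {key: {}}})


if __name__ == "__main__":
    for name in ("Scaled", "Basic", "Unprovisioned"):
        print(build_tier_patch_body(name))
```

Generating the body with json.dumps guarantees strict JSON with double-quoted keys, which avoids shell quoting mistakes when the body is passed to curl.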

What's next