Starting April 29, 2025, Gemini 1.5 Pro and Gemini 1.5 Flash models are not available in projects that have no prior usage of these models, including new projects. For details, see Model versions and lifecycle.

RAG Engine API

The Vertex AI RAG Engine is a component of the Vertex AI platform, which facilitates Retrieval-Augmented Generation (RAG). RAG Engine enables Large Language Models (LLMs) to access and incorporate data from external knowledge sources, such as documents and databases. By using RAG, LLMs can generate more accurate and informative LLM responses.

Parameters list

This section lists the following:

Parameters	Examples
See Corpus management parameters.	See Corpus management examples.
See File management parameters.	See File management examples.
See Project management parameters.	See Project management examples.

Corpus management parameters

For information about a RAG corpus, see Corpus management.

Create a RAG corpus

This table lists the parameters used to create a RAG corpus.

Body Request

Parameters
`corpus_type_config`	Optional: Immutable. `RagCorpus.CorpusTypeConfig` The configuration to specify the corpus type.
`display_name`	Required: `string` The display name of the RAG corpus.
`description`	Optional: `string` The description of the RAG corpus.
`encryption_spec`	Optional: Immutable: `string` The CMEK key name is used to encrypt at-rest data that's related to the RAG corpus. The key name is only applicable to the `RagManaged` option for the vector database. When the corpus is created, this field can be set and can't be updated or deleted. Format: `projects/{project}/locations/{location}/keyRings/{key_ring}/cryptoKeys/{key_name}`
`vector_db_config`	Optional: Immutable: `RagVectorDbConfig` The configuration for the vector databases.
`vertex_ai_search_config.serving_config`	Optional: `string` The configuration for the Vertex AI Search. Format: `projects/{project}/locations/{location}/collections/{collection}/engines/{engine}/servingConfigs/{serving_config}` or `projects/{project}/locations/{location}/collections/{collection}/dataStores/{data_store}/servingConfigs/{serving_config}`

`CorpusTypeConfig`

Parameters

Parameters
`document_corpus`	`oneof` `RagCorpus.CorpusTypeConfig.DocumentCorpus` The default value of `corpus_type_config`, which represents a conventional document-based RAG corpus.
`memory_corpus`	`oneof` `RagCorpus.CorpusTypeConfig.MemoryCorpus` If you set this type, the RAG corpus is a `MemoryCorpus` that can be used with the Gemini Live API as a memory store. For more information, see Use Vertex AI RAG Engine as the memory store.
`memory_corpus.llm_parser`	`oneof` `RagFileParsingConfig.LlmParser` The LLM parser that's used to parse and store session contexts from the Gemini Live API. You can build memories for indexing.

document_corpus

oneof RagCorpus.CorpusTypeConfig.DocumentCorpus

The default value of corpus_type_config, which represents a conventional document-based RAG corpus.

memory_corpus

oneof RagCorpus.CorpusTypeConfig.MemoryCorpus

If you set this type, the RAG corpus is a MemoryCorpus that can be used with the Gemini Live API as a memory store.

For more information, see Use Vertex AI RAG Engine as the memory store.

memory_corpus.llm_parser

oneof RagFileParsingConfig.LlmParser

The LLM parser that's used to parse and store session contexts from the Gemini Live API. You can build memories for indexing.

`RagVectorDbConfig`

Parameters
`rag_managed_db`	`oneof` `vector_db`: `RagVectorDbConfig.RagManagedDb` If no vector database is specified, `rag_managed_db` is the default vector database.
`rag_managed_db.knn`	`oneof` `retrieval_strategy`: `KNN` Default. Finds the exact nearest neighbors by comparing all data points in your RAG corpus. If you don't specify a strategy during the creation of your RAG corpus, KNN is the default retrieval strategy used.
`rag_managed_db.ann`	`oneof` `retrieval_strategy`: `ANN` `tree_depth` Determines the number of layers or levels in the tree. If you have `O(10K)` RAG files in the RAG corpus, set thi value to 2. If more layers or levels are required, set this value to 3. If the number of layers or levels isn't specified, Vertex AI RAG Engine assigns a default value of 2 for this parameter. `leaf_count` Determines the number of leaf nodes in the tree-based structure. The recommended value is `10 * sqrt(num of RAG files in your RAG corpus)`. If not specified, Vertex AI RAG Engine assigns a default value of 500 for this parameter. `rebuild_ann_index` Vertex AI RAG Engine rebuilds your ANN index. Set to `true` in your `ImportRagFiles` API request. Before you query the RAG corpus, it's required to rebuild the ANN index once. Only one concurrent index rebuild is supported on a project in each location.
`weaviate`	`oneof` `vector_db`: `RagVectorDbConfig.Weaviate` Specifies your Weaviate instance.
`weaviate.http_endpoint`	`string` The Weaviate instance's HTTP endpoint. This value can't be changed after it's set. You can leave it empty in the `CreateRagCorpus` API call, and set it with a non-empty value in a follow up `UpdateRagCorpus` API call.
`weaviate.collection_name`	`string` The Weaviate collection that the RAG corpus maps to. This value can't be changed after it's set. You can leave it empty in the `CreateRagCorpus` API call, and set it with a non-empty value in a follow up `UpdateRagCorpus` API call.
`pinecone`	`oneof` `vector_db`: `RagVectorDbConfig.Pinecone` Specifies your Pinecone instance.
`pinecone.index_name`	`string` This is the name used to create the Pinecone index that's used with the RAG corpus. This value can't be changed after it's set. You can leave it empty in the `CreateRagCorpus` API call, and set it with a non-empty value in a follow up `UpdateRagCorpus` API call.
`vertex_feature_store`	`oneof` `vector_db`: `RagVectorDbConfig.VertexFeatureStore` Specifies your Vertex AI Feature Store instance.
`vertex_feature_store.feature_view_resource_name`	`string` The Vertex AI Feature Store `FeatureView` that the RAG corpus maps to. Format: `projects/{project}/locations/{location}/featureOnlineStores/{feature_online_store}/featureViews/{feature_view}` This value can't be changed after it's set. You can leave it empty in the `CreateRagCorpus` API call, and set it with a non-empty value in a follow up `UpdateRagCorpus` API call.
`vertex_vector_search`	`oneof` `vector_db`: `RagVectorDbConfig.VertexVectorSearch` Specifies your Vertex Vector Search instance.
`vertex_vector_search.index`	`string` This is the resource name of the Vector Search index that's used with the RAG corpus. Format: `projects/{project}/locations/{location}/indexEndpoints/{index_endpoint}` This value can't be changed after it's set. You can leave it empty in the `CreateRagCorpus` API call, and set it with a non-empty value in a follow up `UpdateRagCorpus` API call.
`vertex_vector_search.index_endpoint`	`string` This is the resource name of the Vector Search index endpoint that's used with the RAG corpus. Format: `projects/{project}/locations/{location}/indexes/{index}` This value can't be changed after it's set. You can leave it empty in the `CreateRagCorpus` API call, and set it with a non-empty value in a follow up `UpdateRagCorpus` API call.
`api_auth.api_key_config.api_key_secret_version`	`string` This the full resource name of the secret that is stored in Secret Manager, which contains your Weaviate or Pinecone API key that depends on your choice of vector database. Format: `projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}` You can leave it empty in the `CreateRagCorpus` API call, and set it with a non-empty value in a follow up `UpdateRagCorpus` API call.
`rag_embedding_model_config.vertex_prediction_endpoint.endpoint`	Optional: Immutable: `string` The embedding model to use for the RAG corpus. This value can't be changed after it's set. If you leave it empty, we use text-embedding-005 as the embedding model.

Update a RAG corpus

This table lists the parameters used to update a RAG corpus.

Body Request

Parameters
`display_name`	Optional: `string` The display name of the RAG corpus.
`description`	Optional: `string` The description of the RAG corpus.
`rag_vector_db.weaviate.http_endpoint`	`string` The Weaviate instance's HTTP endpoint. If your `RagCorpus` was created with a `Weaviate` configuration, and this field has never been set before, then you can update the Weaviate instance's HTTP endpoint.
`rag_vector_db.weaviate.collection_name`	`string` The Weaviate collection that the RAG corpus maps to. If your `RagCorpus` was created with a `Weaviate` configuration, and this field has never been set before, then you can update the Weaviate instance's collection name.
`rag_vector_db.pinecone.index_name`	`string` This is the name used to create the Pinecone index that's used with the RAG corpus. If your `RagCorpus` was created with a `Pinecone` configuration, and this field has never been set before, then you can update the Pinecone instance's index name.
`rag_vector_db.vertex_feature_store.feature_view_resource_name`	`string` The Vertex AI Feature Store `FeatureView` that the RAG corpus maps to. Format: `projects/{project}/locations/{location}/featureOnlineStores/{feature_online_store}/featureViews/{feature_view}` If your `RagCorpus` was created with a `Vertex AI Feature Store` configuration, and this field has never been set before, then you can update it.
`rag_vector_db.vertex_vector_search.index`	`string` This is the resource name of the Vector Search index that's used with the RAG corpus. Format: `projects/{project}/locations/{location}/indexEndpoints/{index_endpoint}` If your `RagCorpus` was created with a `Vector Search` configuration, and this field has never been set before, then you can update it.
`rag_vector_db.vertex_vector_search.index_endpoint`	`string` This is the resource name of the Vector Search index endpoint that's used with the RAG corpus. Format: `projects/{project}/locations/{location}/indexes/{index}` If your `RagCorpus` was created with a `Vector Search` configuration, and this field has never been set before, then you can update it.
`rag_vector_db.api_auth.api_key_config.api_key_secret_version`	`string` The full resource name of the secret that is stored in Secret Manager, which contains your Weaviate or Pinecone API key depends on your choice of vector database. Format: `projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}`

List RAG corpora

This table lists the parameters used to list RAG corpora.

Parameters

Parameters
`page_size`	Optional: `int` The standard list page size.
`page_token`	Optional: `string` The standard list page token. Typically obtained from `[ListRagCorporaResponse.next_page_token][]` of the previous `[VertexRagDataService.ListRagCorpora][]` call.

page_size

Optional: int

The standard list page size.

page_token

Optional: string

The standard list page token. Typically obtained from [ListRagCorporaResponse.next_page_token][] of the previous [VertexRagDataService.ListRagCorpora][] call.

Get a RAG corpus

This table lists parameters used to get a RAG corpus.

Parameters

Parameters
`name`	`string` The name of the `RagCorpus` resource. Format: `projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}`

name

string

The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}

Delete a RAG corpus

This table lists parameters used to delete a RAG corpus.

Parameters

Parameters
`name`	`string` The name of the `RagCorpus` resource. Format: `projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}`

name

string

The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}

File management parameters

For information about a RAG file, see File management.

Upload a RAG file

This table lists parameters used to upload a RAG file.

Body Request

Parameters

Parameters
`parent`	`string` The name of the `RagCorpus` resource. Format: `projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}`
`rag_file`	Required: `RagFile` The file to upload.
`upload_rag_file_config`	Required: `UploadRagFileConfig` The configuration for the `RagFile` to be uploaded into the `RagCorpus`.

parent

string

The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}

rag_file

Required: RagFile

The file to upload.

upload_rag_file_config

Required: UploadRagFileConfig

The configuration for the RagFile to be uploaded into the RagCorpus.

RagFile

`RagFile`
`display_name`	Required: `string` The display name of the RAG file.
`description`	Optional: `string` The description of the RAG file.

display_name

Required: string

The display name of the RAG file.

description

Optional: string

The description of the RAG file.

UploadRagFileConfig

`UploadRagFileConfig`
`rag_file_transformation_config.rag_file_chunking_config.fixed_length_chunking.chunk_size`	`int32` Number of tokens each chunk has.
`rag_file_transformation_config.rag_file_chunking_config.fixed_length_chunking.chunk_overlap`	`int32` The overlap between chunks.

rag_file_transformation_config.rag_file_chunking_config.fixed_length_chunking.chunk_size

int32

Number of tokens each chunk has.

rag_file_transformation_config.rag_file_chunking_config.fixed_length_chunking.chunk_overlap

int32

The overlap between chunks.

Import RAG files

This table lists parameters used to import a RAG file.

Parameters
`parent`	Required: `string` The name of the `RagCorpus` resource. Format: `projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}`
`gcs_source`	`oneof` `import_source`: `GcsSource` Cloud Storage location. Supports importing individual files as well as entire Cloud Storage directories.
`gcs_source.uris`	`list` of `string` Cloud Storage URI that contains the upload file.
`google_drive_source`	`oneof` `import_source`: `GoogleDriveSource` Google Drive location. Supports importing individual files as well as Google Drive folders.
`slack_source`	`oneof` `import_source`: `SlackSource` The slack channel where the file is uploaded.
`jira_source`	`oneof` `import_source`: `JiraSource` The Jira query where the file is uploaded.
`share_point_sources`	`oneof` `import_source`: `SharePointSources` The SharePoint sources where the file is uploaded.
`rag_file_transformation_config.rag_file_chunking_config.fixed_length_chunking.chunk_size`	`int32` Number of tokens each chunk has.
`rag_file_transformation_config.rag_file_chunking_config.fixed_length_chunking.chunk_overlap`	`int32` The overlap between chunks.
`rag_file_parsing_config`	Optional: `RagFileParsingConfig` Specifies the parsing configuration for `RagFiles`. If this field isn't set, RAG uses the default parser.
`max_embedding_requests_per_min`	Optional: `int32` The maximum number of queries per minute that this job is allowed to make to the embedding model specified on the corpus. This value is specific to this job and not shared across other import jobs. Consult the Quotas page on the project to set an appropriate value. If unspecified, a default value of 1,000 QPM is used.

GoogleDriveSource

`GoogleDriveSource`
`resource_ids.resource_id`	Required: `string` The ID of the Google Drive resource.
`resource_ids.resource_type`	Required: `string` The type of the Google Drive resource.

resource_ids.resource_id

Required: string

The ID of the Google Drive resource.

resource_ids.resource_type

Required: string

The type of the Google Drive resource.

`SlackSource`
`channels.channels`	Repeated: `SlackSource.SlackChannels.SlackChannel` Slack channel information, include ID and time range to import.
`channels.channels.channel_id`	Required: `string` The Slack channel ID.
`channels.channels.start_time`	Optional: `google.protobuf.Timestamp` The starting timestamp for messages to import.
`channels.channels.end_time`	Optional: `google.protobuf.Timestamp` The ending timestamp for messages to import.
`channels.api_key_config.api_key_secret_version`	Required: `string` The full resource name of the secret that is stored in Secret Manager, which contains a Slack channel access token that has access to the slack channel IDs. See: https://api.slack.com/tutorials/tracks/getting-a-token. Format: `projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}`

`JiraSource`
`jira_queries.projects`	Repeated: `string` A list of Jira projects to import in their entirety.
`jira_queries.custom_queries`	Repeated: `string` A list of custom Jira queries to import. For information about JQL (Jira Query Language), see Jira Support
`jira_queries.email`	Required: `string` The Jira email address.
`jira_queries.server_uri`	Required: `string` The Jira server URI.
`jira_queries.api_key_config.api_key_secret_version`	Required: `string` The full resource name of the secret that is stored in Secret Manager, which contains Jira API key that has access to the slack channel IDs. See: https://support.atlassian.com/atlassian-account/docs/manage-api-tokens-for-your-atlassian-account/ Format: `projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}`

`SharePointSources`
`share_point_sources.sharepoint_folder_path`	`oneof` in `folder_source`: `string` The path of the SharePoint folder to download from.
`share_point_sources.sharepoint_folder_id`	`oneof` in `folder_source`: `string` The ID of the SharePoint folder to download from.
`share_point_sources.drive_name`	`oneof` in `drive_source`: `string` The name of the drive to download from.
`share_point_sources.drive_id`	`oneof` in `drive_source`: `string` The ID of the drive to download from.
`share_point_sources.client_id`	`string` The Application ID for the app registered in Microsoft Azure Portal. The application must also be configured with MS Graph permissions "Files.ReadAll", "Sites.ReadAll" and BrowserSiteLists.Read.All.
`share_point_sources.client_secret.api_key_secret_version`	Required: `string` The full resource name of the secret that is stored in Secret Manager, which contains the application secret for the app registered in Azure. Format: `projects/{PROJECT_NUMBER}/secrets/{SECRET_ID}/versions/{VERSION_ID}`
`share_point_sources.tenant_id`	`string` Unique identifier of the Azure Active Directory Instance.
`share_point_sources.sharepoint_site_name`	`string` The name of the SharePoint site to download from. This can be the site name or the site id.

`RagFileParsingConfig`
`layout_parser`	`oneof` `parser`: `RagFileParsingConfig.LayoutParser` The Layout Parser to use for `RagFile`s.
`layout_parser.processor_name`	`string` The full resource name of a Document AI processor or processor version. Format: `projects/{project_id}/locations/{location}/processors/{processor_id}` `projects/{project_id}/locations/{location}/processors/{processor_id}/processorVersions/{processor_version_id}`
`layout_parser.max_parsing_requests_per_min`	`string` The maximum number of requests the job is allowed to make to the Document AI processor per minute. Consult https://cloud.google.com/document-ai/quotas and the Quota page for your project to set an appropriate value here. If unspecified, a default value of 120 QPM is used.
`llm_parser`	`oneof` `parser`: `RagFileParsingConfig.LlmParser` The LLM parser to use for `RagFile`s.
`llm_parser.model_name`	`string` The resource name of an LLM model. Format: `projects/{project_id}/locations/{location}/publishers/{publisher}/models/{model}`
`llm_parser.max_parsing_requests_per_min`	`string` The maximum number of requests the job is allowed to make to the LLM model per minute. To set an appropriate value for your project, see model quota section and the Quota page for your project to set an appropriate value here. If unspecified, a default value of 5000 QPM is used.

Get a RAG file

This table lists parameters used to get a RAG file.

Parameters

Parameters
`name`	`string` The name of the `RagFile` resource. Format: `projects/{project}/locations/{location}/ragCorpora/{rag_file_id}`

name

string

The name of the RagFile resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_file_id}

Delete a RAG file

This table lists parameters used to delete a RAG file.

Parameters

Parameters
`name`	`string` The name of the `RagFile` resource. Format: `projects/{project}/locations/{location}/ragCorpora/{rag_file_id}`

name

string

The name of the RagFile resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_file_id}

Retrieval and prediction parameters

This section lists the retrieval and prediction parameters.

Retrieval parameters

This table lists parameters for retrieveContexts API.

Parameters

Parameters
`parent`	Required: `string` The resource name of the Location to retrieve `RagContexts`. The users must have permission to make a call in the project. Format: `projects/{project}/locations/{location}`
`vertex_rag_store`	`VertexRagStore` The data source for Vertex RagStore.
`query`	Required: `RagQuery` Single RAG retrieve query.

parent

Required: string

The resource name of the Location to retrieve RagContexts.
The users must have permission to make a call in the project.

Format: projects/{project}/locations/{location}

vertex_rag_store

VertexRagStore

The data source for Vertex RagStore.

query

Required: RagQuery

Single RAG retrieve query.

`VertexRagStore`

VertexRagStore

`VertexRagStore`
`rag_resources`	list: `RagResource` The representation of the RAG source. It can be used to specify the corpus only or `RagFile`s. Only support one corpus or multiple files from one corpus.
`rag_resources.rag_corpus`	Optional: `string` `RagCorpora` resource name. Format: `projects/{project}/locations/{location}/ragCorpora/{rag_corpus}`
`rag_resources.rag_file_ids`	list: `string` A list of `RagFile` resources. Format: `projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file}`

rag_resources

list: RagResource

The representation of the RAG source. It can be used to specify the corpus only or RagFiles. Only support one corpus or multiple files from one corpus.

rag_resources.rag_corpus

Optional: string

RagCorpora resource name.

Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}

rag_resources.rag_file_ids

list: string

A list of RagFile resources.

Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file}

`RagQuery`
`text`	`string` The query in text format to get relevant contexts.
`rag_retrieval_config`	Optional: `RagRetrievalConfig` The retrieval configuration for the query.

RagQuery

text

string

The query in text format to get relevant contexts.

rag_retrieval_config

Optional: RagRetrievalConfig

The retrieval configuration for the query.

`RagRetrievalConfig`
`top_k`	Optional: `int32` The number of contexts to retrieve.
`hybrid_search.alpha`	Optional: `float` Alpha value controls the weight between dense and sparse vector search results. The range is [0, 1], where 0 means sparse vector search only and 1 means dense vector search only. The default value is 0.5, which balances sparse and dense vector search equally. Hybrid Search is only available for Weaviate.
`filter.vector_distance_threshold`	`oneof vector_db_threshold`: `double` Only returns contexts with a vector distance smaller than the threshold.
`filter.vector_similarity_threshold`	`oneof vector_db_threshold`: `double` Only returns contexts with vector similarity larger than the threshold.
`ranking.rank_service.model_name`	Optional: `string` The model name of the rank service. Example: `semantic-ranker-512@latest`
`ranking.llm_ranker.model_name`	Optional: `string` The model name used for ranking. Example: `gemini-2.5-flash`

Prediction parameters

This table lists prediction parameters.

GenerateContentRequest

`GenerateContentRequest`
`tools.retrieval.vertex_rag_store`	`VertexRagStore` Set to use a data source powered by Vertex AI RAG store.

tools.retrieval.vertex_rag_store

VertexRagStore

Set to use a data source powered by Vertex AI RAG store.

See VertexRagStore for details.

Project management parameters

This table lists project-level parameters.

`RagEngineConfig`

Parameters
`RagManagedDbConfig.scaled`	This tier offers production-scale performance along with autoscaling functionality.
`RagManagedDbConfig.basic`	This tier offers a cost-effective and low-compute tier.
`RagManagedDbConfig.unprovisioned`	This tier deletes the `RagManagedDb` and its underlying Spanner instance.

Corpus management examples

This section provides examples of how to use the API to manage your RAG corpus.

Create a RAG corpus example

This code sample demonstrates how to create a RAG corpus.

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
CORPUS_DISPLAY_NAME: The display name of the RagCorpus.
CORPUS_DESCRIPTION: The description of the RagCorpus.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora

Request JSON body:

{
  "display_name" : "CORPUS_DISPLAY_NAME",
  "description": "CORPUS_DESCRIPTION",
}

To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login , or by using Cloud Shell, which automatically logs you into the gcloud CLI . You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora"

PowerShell

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login . You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora" | Select-Object -Expand Content

You should receive a successful status code (2xx).

The following example demonstrates how to create a RAG corpus by using the REST API.

  PROJECT_ID: Your project ID.
  LOCATION: The region to process the request.
  CORPUS_DISPLAY_NAME: The display name of the <code>RagCorpus</code>.

    // CreateRagCorpus
    // Input: LOCATION, PROJECT_ID, CORPUS_DISPLAY_NAME
    // Output: CreateRagCorpusOperationMetadata
    curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora \
    -d '{
          "display_name" : "CORPUS_DISPLAY_NAME"
      }'

Update a RAG corpus example

You can update your RAG corpus with a new display name, description, and vector database configuration. However, you can't change the following parameters in your RAG corpus:

The vector database type. For example, you can't change the vector database from Weaviate to Vertex AI Feature Store.
If you're using the managed database option, you can't update the vector database configuration.

These examples demonstrate how to update a RAG corpus.

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
CORPUS_ID: The corpus ID of your RAG corpus.
CORPUS_DISPLAY_NAME: The display name of the RagCorpus.
CORPUS_DESCRIPTION: The description of the RagCorpus.
INDEX_NAME: The resource name of the Vector Search Index. Format: projects/{project}/locations/{location}/indexes/{index}
INDEX_ENDPOINT_NAME: The resource name of the Vector Search Index Endpoint. Format: projects/{project}/locations/{location}/indexEndpoints/{index_endpoint}

HTTP method and URL:

PATCH https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/CORPUS_ID

Request JSON body:

{
  "display_name" : "CORPUS_DISPLAY_NAME",
  "description": "CORPUS_DESCRIPTION",
  "rag_vector_db_config": {
     "vertex_vector_search": {
         "index": "INDEX_NAME",
         "index_endpoint": "INDEX_ENDPOINT_NAME",
     }
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X PATCH \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/CORPUS_ID"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method PATCH `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/CORPUS_ID" | Select-Object -Expand Content

You should receive a successful status code (2xx).

List RAG corpora example

This code sample demonstrates how to list all of the RAG corpora.

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
PAGE_SIZE: The standard list page size. You may adjust the number of RagCorpora to return per page by updating the page_size parameter.
PAGE_TOKEN: The standard list page token. Obtained typically using ListRagCorporaResponse.next_page_token of the previous VertexRagDataService.ListRagCorpora call.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN" | Select-Object -Expand Content

You should receive a successful status code (`2xx`) and a list of RagCorpora under the given PROJECT_ID.

Get a RAG corpus example

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
RAG_CORPUS_ID: The ID of the RagCorpus resource.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID" | Select-Object -Expand Content

A successful response returns the RagCorpus resource.

The get and list commands are used in an example to demonstrate how RagCorpus uses the rag_embedding_model_config field with in the vector_db_config, which points to the embedding model you have chosen.

  PROJECT_ID: Your project ID.
  LOCATION: The region to process the request.
  RAG_CORPUS_ID: The corpus ID of your RAG corpus.

// GetRagCorpus
// Input: LOCATION, PROJECT_ID, RAG_CORPUS_ID
// Output: RagCorpus
curl -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID

// ListRagCorpora
curl -sS -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/

Delete a RAG corpus example

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
RAG_CORPUS_ID: The ID of the RagCorpus resource.

HTTP method and URL:

DELETE https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X DELETE \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method DELETE `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID" | Select-Object -Expand Content

A successful response returns the DeleteOperationMetadata.

File management examples

This section provides examples of how to use the API to manage RAG files.

Upload a RAG file example

REST

Before using any of the request data, make the following replacements:

  PROJECT_ID: Your project ID.
  LOCATION: The region to process the request.
  RAG_CORPUS_ID: The corpus ID of your RAG corpus.
  LOCAL_FILE_PATH: The local path to the file to be uploaded.
  DISPLAY_NAME: The display name of the RAG file.
  DESCRIPTION: The description of the RAG file.

To send your request, use the following command:

  curl -X POST \
    -H "X-Goog-Upload-Protocol: multipart" \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -F metadata="{'rag_file': {'display_name':' DISPLAY_NAME', 'description':'DESCRIPTION'}}" \
    -F file=@LOCAL_FILE_PATH \
    "https://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:upload"

Import RAG files example

Files and folders can be imported from Drive or Cloud Storage.

The response.skipped_rag_files_count refers to the number of files that were skipped during import. A file is skipped when the following conditions are met:

The file has already been imported.
The file hasn't changed.
The chunking configuration for the file hasn't changed.

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
RAG_CORPUS_ID: The ID of the RagCorpus resource.
GCS_URIS: A list of Cloud Storage locations. Example: gs://my-bucket1, gs://my-bucket2.
CHUNK_SIZE: Optional: Number of tokens each chunk should have.
CHUNK_OVERLAP: Optional: Number of tokens overlap between chunks.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import

Request JSON body:

{
  "import_rag_files_config": {
    "gcs_source": {
      "uris": "GCS_URIS"
    },
    "rag_file_chunking_config": {
      "chunk_size": CHUNK_SIZE,
      "chunk_overlap": CHUNK_OVERLAP
    }
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import" | Select-Object -Expand Content

A successful response returns the ImportRagFilesOperationMetadata resource.

The following sample demonstrates how to import a file from Cloud Storage. Use the max_embedding_requests_per_min control field to limit the rate at which RAG Engine calls the embedding model during the ImportRagFiles indexing process. The field has a default value of 1000 calls per minute.

  PROJECT_ID: Your project ID.
  LOCATION: The region to process the request.
  RAG_CORPUS_ID: The corpus ID of your RAG corpus.
  GCS_URIS: A list of Cloud Storage locations. Example: gs://my-bucket1.
  CHUNK_SIZE: Number of tokens each chunk should have.
  CHUNK_OVERLAP: Number of tokens overlap between chunks.
  EMBEDDING_MODEL_QPM_RATE: The QPM rate to limit RAGs access to your embedding model. Example: 1000.

// ImportRagFiles
// Import a single Cloud Storage file or all files in a Cloud Storage bucket.
// Input: LOCATION, PROJECT_ID, RAG_CORPUS_ID, GCS_URIS
// Output: ImportRagFilesOperationMetadataNumber
// Use ListRagFiles to find the server-generated rag_file_id.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import \
-d '{
  "import_rag_files_config": {
    "gcs_source": {
      "uris": "GCS_URIS"
    },
    "rag_file_chunking_config": {
      "chunk_size": CHUNK_SIZE,
      "chunk_overlap": CHUNK_OVERLAP
    },
    "max_embedding_requests_per_min": EMBEDDING_MODEL_QPM_RATE
  }
}'

// Poll the operation status.
// The response contains the number of files imported.
OPERATION_ID: The operation ID you get from the response of the previous command.
poll_op_wait OPERATION_ID

The following sample demonstrates how to import a file from Drive. Use the max_embedding_requests_per_min control field to limit the rate at which RAG Engine calls the embedding model during the ImportRagFiles indexing process. The field has a default value of 1000 calls per minute.

  PROJECT_ID: Your project ID.
  LOCATION: The region to process the request.
  RAG_CORPUS_ID: The corpus ID of your RAG corpus.
  FOLDER_RESOURCE_ID: The resource ID of your Google Drive folder.
  CHUNK_SIZE: Number of tokens each chunk should have.
  CHUNK_OVERLAP: Number of tokens overlap between chunks.
  EMBEDDING_MODEL_QPM_RATE: The QPM rate to limit RAGs access to your embedding model. Example: 1000.

// ImportRagFiles
// Import all files in a Google Drive folder.
// Input: LOCATION, PROJECT_ID, RAG_CORPUS_ID, FOLDER_RESOURCE_ID
// Output: ImportRagFilesOperationMetadataNumber
// Use ListRagFiles to find the server-generated rag_file_id.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import \
-d '{
  "import_rag_files_config": {
    "google_drive_source": {
      "resource_ids": {
        "resource_id": "FOLDER_RESOURCE_ID",
        "resource_type": "RESOURCE_TYPE_FOLDER"
      }
    },
    "max_embedding_requests_per_min": EMBEDDING_MODEL_QPM_RATE
  }
}'

// Poll the operation status.
// The response contains the number of files imported.
OPERATION_ID: The operation ID you get from the response of the previous command.
poll_op_wait OPERATION_ID

List RAG files example

This code sample demonstrates how to list RAG files.

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
RAG_CORPUS_ID: The ID of the RagCorpus resource.
PAGE_SIZE: The standard list page size. You may adjust the number of RagFiles to return per page by updating the page_size parameter.
PAGE_TOKEN: The standard list page token. Obtained typically using ListRagFilesResponse.next_page_token of the previous VertexRagDataService.ListRagFiles call.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN" | Select-Object -Expand Content

You should receive a successful status code (2xx) along with a list of RagFiles under the given RAG_CORPUS_ID.

Get a RAG file example

This code sample demonstrates how to get a RAG file.

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
RAG_CORPUS_ID: The ID of the RagCorpus resource.
RAG_FILE_ID: The ID of the RagFile resource.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID" | Select-Object -Expand Content

A successful response returns the RagFile resource.

Delete a RAG file example

This code sample demonstrates how to delete a RAG file.

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
RAG_CORPUS_ID: The ID of the RagCorpus resource.
RAG_FILE_ID: The ID of the RagFile resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file_id}.

HTTP method and URL:

DELETE https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X DELETE \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method DELETE `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID" | Select-Object -Expand Content

A successful response returns the DeleteOperationMetadata resource.

Retrieval query example

When a user asks a question or provides a prompt, the retrieval component in RAG searches through its knowledge base to find information that is relevant to the query.

REST

Before using any of the request data, make the following replacements:

LOCATION: The region to process the request.
PROJECT_ID: Your project ID.
RAG_CORPUS_RESOURCE: The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}.
VECTOR_DISTANCE_THRESHOLD: Only contexts with a vector distance smaller than the threshold are returned.
TEXT: The query text to get relevant contexts.
SIMILARITY_TOP_K: The number of top contexts to retrieve.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts

Request JSON body:

{
 "vertex_rag_store": {
    "rag_resources": {
      "rag_corpus": "RAG_CORPUS_RESOURCE"
    },
    "vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD
  },
  "query": {
   "text": "TEXT",
   "similarity_top_k": SIMILARITY_TOP_K
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts" | Select-Object -Expand Content

You should receive a successful status code (2xx) and a list of related RagFiles.

Generation example

The LLM generates a grounded response using the retrieved contexts.

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
MODEL_ID: LLM model for content generation. Example: gemini-2.5-flash
GENERATION_METHOD: LLM method for content generation. Options: generateContent, streamGenerateContent
INPUT_PROMPT: The text sent to the LLM for content generation. Try to use a prompt relevant to the uploaded rag Files.
RAG_CORPUS_RESOURCE: The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}.
SIMILARITY_TOP_K: Optional: The number of top contexts to retrieve.
VECTOR_DISTANCE_THRESHOLD: Optional: Contexts with a vector distance smaller than the threshold are returned.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD

Request JSON body:

{
 "contents": {
  "role": "user",
  "parts": {
    "text": "INPUT_PROMPT"
  }
 },
 "tools": {
  "retrieval": {
   "disable_attribution": false,
   "vertex_rag_store": {
    "rag_resources": {
      "rag_corpus": "RAG_CORPUS_RESOURCE"
    },
    "similarity_top_k": SIMILARITY_TOP_K,
    "vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD
   }
  }
 }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD" | Select-Object -Expand Content

A successful response returns the generated content with citations.

Project management examples

The tier is a project-level setting available under the RagEngineConfig resource and impacts RAG corpora using RagManagedDb. To get the tier configuration, use GetRagEngineConfig. To update the tier configuration, use UpdateRagEngineConfig.

For more information on managing your tier configuration, see Manage your tier.

Get project configuration

The following sample code demonstrates how to read your RagEngineConfig:

Console

In the Google Cloud console, go to the RAG Engine page.
Go to RAG Engine
Select the region in which your RAG Engine is running. Your list of RAG corpora is updated.
Click Configure RAG Engine. The Configure RAG Engine pane appears. You can see the tier that's selected for your RAG Engine.
Click Cancel.

Python

from vertexai import rag
import vertexai

PROJECT_ID = YOUR_PROJECT_ID
LOCATION = YOUR_RAG_ENGINE_LOCATION

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

rag_engine_config = rag.rag_data.get_rag_engine_config(
    name=f"projects/{PROJECT_ID}/locations/{LOCATION}/ragEngineConfig"
)

print(rag_engine_config)

REST

curl -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragEngineConfig

Update project configuration

This section provides code samples to demonstrate how to change your tier in the configuration.

Update your `RagEngineConfig` to the Scaled tier

The following code samples demonstrate how to set the RagEngineConfig to the Scaled tier:

Console

In the Google Cloud console, go to the RAG Engine page.
Go to RAG Engine
Select the region in which your RAG Engine is running. Your list of RAG corpora is updated.
Click Configure RAG Engine. The Configure RAG Engine pane appears.
Select the tier that you want to run your RAG Engine.
Click Save.

Python

from vertexai import rag
import vertexai

PROJECT_ID = YOUR_PROJECT_ID
LOCATION = YOUR_RAG_ENGINE_LOCATION

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

rag_engine_config_name=f"projects/{PROJECT_ID}/locations/{LOCATION}/ragEngineConfig"

new_rag_engine_config = rag.RagEngineConfig(
name=rag_engine_config_name,
rag_managed_db_config=rag.RagManagedDbConfig(tier=rag.Scaled()),
)

updated_rag_engine_config = rag.rag_data.update_rag_engine_config(
rag_engine_config=new_rag_engine_config
)

print(updated_rag_engine_config)

REST

curl -X PATCH \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragEngineConfig -d "{'ragManagedDbConfig': {'scaled': {}}}"

Update your `RagEngineConfig` to the Basic tier

The following code samples demonstrate how to set the RagEngineConfig to the Basic tier:

If you have a large amount of data in your RagManagedDb across your RAG corpora, downgrading to a Basic tier can fail due to insufficient compute and storage capacity.

Console

In the Google Cloud console, go to the RAG Engine page.
Go to RAG Engine
Select the region in which your RAG Engine is running. Your list of RAG corpora is updated.
Click Configure RAG Engine. The Configure RAG Engine pane appears.
Select the tier that you want to run your RAG Engine.
Click Save.

Python

from vertexai import rag
import vertexai

PROJECT_ID = YOUR_PROJECT_ID
LOCATION = YOUR_RAG_ENGINE_LOCATION

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

rag_engine_config_name=f"projects/{PROJECT_ID}/locations/{LOCATION}/ragEngineConfig"

new_rag_engine_config = rag.RagEngineConfig(
name=rag_engine_config_name,
rag_managed_db_config=rag.RagManagedDbConfig(tier=rag.Basic()),
)

updated_rag_engine_config = rag.rag_data.update_rag_engine_config(
rag_engine_config=new_rag_engine_config
)

print(updated_rag_engine_config)

REST

curl -X PATCH \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragEngineConfig -d "{'ragManagedDbConfig': {'basic': {}}}"

Update your `RagEngineConfig` to the Unprovisioned tier

The following code samples demonstrate how to set the RagEngineConfig to the Unprovisioned tier:

Console

In the Google Cloud console, go to the RAG Engine page.
Go to RAG Engine
Select the region in which your RAG Engine is running. Your list of RAG corpora is updated.
Click Configure RAG Engine. The Configure RAG Engine pane appears.
Click Delete RAG Engine. A confirmation dialog appears.
Verify that you're about to delete your data in RAG Engine by typing delete, then click Confirm.
Click Save.

Python

from vertexai import rag
import vertexai

PROJECT_ID = YOUR_PROJECT_ID
LOCATION = YOUR_RAG_ENGINE_LOCATION

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)

rag_engine_config_name=f"projects/{PROJECT_ID}/locations/{LOCATION}/ragEngineConfig"

new_rag_engine_config = rag.RagEngineConfig(
  name=rag_engine_config_name,
  rag_managed_db_config=rag.RagManagedDbConfig(tier=rag.Unprovisioned()),
)

updated_rag_engine_config = rag.rag_data.update_rag_engine_config(
  rag_engine_config=new_rag_engine_config
)

print(updated_rag_engine_config)

REST

curl -X PATCH \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragEngineConfig -d "{'ragManagedDbConfig': {'unprovisioned': {}}}"

What's next

To learn more about supported generation models, see Generative AI models that support RAG.
To learn more about supported embedding models, see Embedding models.
To learn more about open models, see Open models.
To learn more about RAG Engine, see RAG Engine overview.

RAG Engine API Stay organized with collections Save and categorize content based on your preferences.

Parameters list

Corpus management parameters

Create a RAG corpus

Body Request

CorpusTypeConfig

RagVectorDbConfig

Update a RAG corpus

Body Request

List RAG corpora

Get a RAG corpus

Delete a RAG corpus

File management parameters

Upload a RAG file

Body Request

Import RAG files

Get a RAG file

Delete a RAG file

Retrieval and prediction parameters

Retrieval parameters

VertexRagStore

Prediction parameters

Project management parameters

RagEngineConfig

Corpus management examples

Create a RAG corpus example

REST

curl

PowerShell

Update a RAG corpus example

REST

curl

PowerShell

List RAG corpora example

REST

curl

PowerShell

Get a RAG corpus example

REST

curl

PowerShell

Delete a RAG corpus example

REST

curl

PowerShell

File management examples

Upload a RAG file example

REST

Import RAG files example

REST

curl

PowerShell

List RAG files example

REST

curl

PowerShell

Get a RAG file example

REST

curl

PowerShell

Delete a RAG file example

REST

curl

PowerShell

Retrieval query example

REST

curl

PowerShell

Generation example

REST

curl

PowerShell

Project management examples

Get project configuration

Console

Python

REST

Update project configuration

Update your RagEngineConfig to the Scaled tier

Console

RAG Engine API

`CorpusTypeConfig`

`RagVectorDbConfig`

`VertexRagStore`

`RagEngineConfig`

Update your `RagEngineConfig` to the Scaled tier

Update your `RagEngineConfig` to the Basic tier

Update your `RagEngineConfig` to the Unprovisioned tier