Retrieval-augmented generation (RAG) gives large language models (LLM) access to external knowledge sources, such as documents and databases. By using RAG, LLMs can generate more accurate and informative responses based on the data that the external knowledge sources contain.
Example syntax
This section provides syntax to create a RAG corpus.
curl
curl -X POST \ -H "Authorization: Bearer $(gcloud auth print-access-token)" \ -H "Content-Type: application/json" \ https://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora\ -d '{ "display_name" : "...", "description": "...", "rag_embedding_model_config": { "vertex_prediction_endpoint": { "endpoint": "..." } } }'
Python
corpus = rag.create_corpus(display_name=..., description=...) print(corpus)
Parameters list
This section lists the following:
Parameters | Examples |
---|---|
See Corpus management parameters. | See Corpus management examples. |
See File management parameters. | See File management examples. |
Corpus management parameters
For information about a RAG corpus, see Corpus management.
Create a RAG corpus
This table lists the parameters used to create a RAG corpus.
Parameters | |
---|---|
|
Optional: The display name of the RAG corpus. |
|
Optional: The description of the RAG corpus. |
|
Optional: The embedding model to use for the RAG corpus. |
|
Optional: The Weaviate instance's HTTPS or HTTP endpoint. |
|
Optional: The Weaviate collection that the RAG corpus maps to. |
|
Optional: The Vertex AI Feature Store Format: |
|
Optional: The Secret Manager secret version resource name that stores the API key. Format: |
|
This field helps you to set the choice of
a vector database that you would like to associate with your RAG corpus, and
it must be set during the |
|
This is the name used to
create the Pinecone index that's used with the RAG corpus. You can set the
name during the |
|
This the full resource name of the secret that is stored in
Secret Manager, which contains your Pinecone API key. You can set the
name during the |
|
This field helps you to set the
choice of a vector database that you would like to associate with your RAG corpus, and
it must be set during the |
|
This is the resource name of the Vector Search that's used with the RAG corpus. You can set the name during the |
|
This is the resource name of the Vector Search index endpoint that's used with the RAG corpus. You can set the name during the |
Update a RAG corpus
This table lists the parameters used to update a RAG corpus.
Name | Description |
---|---|
display_name |
Optional: string The display name of the RAG corpus. |
description |
Optional: string The display of the RAG corpus. |
rag_vector_db_config.weaviate.http_endpoint |
Optional: string The Weaviate instance's HTTPS or HTTP endpoint. |
rag_vector_db_config.weaviate.collection_name |
Optional: string The Weaviate collection that the RAG corpus maps to. |
rag_vector_db_config.vertex_feature_store.feature_view_resource_name |
Optional: string The Vertex AI Feature Store featureview that the RAG corpus maps to. Format: projects/{project}/locations/{location}/featureOnlineStores/{feature_online_store}/featureViews/{feature_view} |
api_auth.api_key_config.api_key_secret_version |
Optional: string The Secret Manager secret version resource name that stores the API key. Format: projects/{project}/secrets/{secret}/versions/{version} |
|
This field helps you to set the choice of
a vector database that you would like to associate with your RAG corpus, and
it must be set during the |
|
This is the name used to
create the Pinecone index that's used with the RAG corpus. You can set the
name during the |
|
This the full resource name of the secret that is stored in
Secret Manager, which contains your Pinecone API key. You can set the
name during the |
|
This field helps you to set the
choice of a vector database that you would like to associate with your RAG corpus, and
it must be set during the |
|
This is the resource name of the Vector Search that's used with the RAG corpus. You can set the name during the |
|
This is the resource name of the Vector Search index endpoint that's used with the RAG corpus. You can set the name during the |
List RAG corpora
This table lists the parameters used to list RAG corpora.
Parameters | |
---|---|
|
Optional: The standard list page size. |
|
Optional: The standard list page token. Typically obtained from |
Get a RAG corpus
This table lists parameters used to get a RAG corpus.
Parameters | |
---|---|
|
The ID of the |
Delete a RAG corpus
This table lists parameters used to delete a RAG corpus.
Parameters | |
---|---|
|
The ID of the |
File management parameters
For information about a RAG file, see File management.
Upload a RAG file
This table lists parameters used to upload a RAG file.
Parameters | |
---|---|
|
The ID of the |
|
Optional: The display name of the RagCorpus. |
|
Optional: The description of the RagCorpus. |
Import RAG files
This table lists parameters used to import a RAG file.
Parameters | |
---|---|
|
The ID of the |
|
Cloud Storage URI that contains the upload file. |
|
Optional: The type of the Google Drive resource. |
|
Optional: The ID of the Google Drive resource. |
|
Optional: Number of tokens each chunk should have. |
|
Optional: Number of tokens overlap between two chunks. |
|
Optional: Number that represents a limit to restrict the rate at which RAG Engine calls the embedding model during the indexing process. Default limit is |
|
The ID of the |
|
Optional: The standard list page size. |
|
Optional: The standard list page token. Typically obtained from |
Get a RAG file
This table lists parameters used to get a RAG file.
Parameters | |
---|---|
|
The ID of the |
Delete a RAG file
This table lists parameters used to delete a RAG file.
Parameters | |
---|---|
|
The ID of the |
Retrieval and prediction
This section lists the retrieval and prediction parameters.
Retrieval parameters
This table lists retrieval parameters.
Parameter | Description |
---|---|
similarity_top_k |
Controls the maximum number of contexts that are retrieved. |
vector_distance_threshold |
Only contexts with a distance smaller than the threshold are considered. |
Prediction parameters
This table lists prediction parameters.
Parameters | |
---|---|
|
LLM model for content generation. |
|
The name of the RagCorpus resource. Format: |
|
The text to LLM for content generation. Maximum value: |
|
Optional: Only contexts with a vector distance smaller than the threshold are returned. |
|
Optional: The number of top contexts to retrieve. |
Corpus management examples
This section provides examples of how to use the API to manage your RAG corpus.
Create a RAG corpus example
This code sample demonstrates how to create a RAG corpus.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- CORPUS_DISPLAY_NAME: The display name of the
RagCorpus
. - CORPUS_DESCRIPTION: The description of the
RagCorpus
. - RAG_EMBEDDING_MODEL_CONFIG_ENDPOINT: The embedding model of the
RagCorpus
.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora
Request JSON body:
{ "display_name" : "CORPUS_DISPLAY_NAME", "description": "CORPUS_DESCRIPTION", "rag_embedding_model_config_endpoint": "RAG_EMBEDDING_MODEL_CONFIG_ENDPOINT" }
To send your request, choose one of these options:
curl
Save the request body in a file named request.json
,
and execute the following command:
curl -X POST \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora"
PowerShell
Save the request body in a file named request.json
,
and execute the following command:
$headers = @{ }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora" | Select-Object -Expand Content
The following example demonstrates how to create a RAG corpus by using the REST API.
// Either your first party publisher model or fine-tuned endpoint
// Example: projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/textembedding-gecko@003
// or
// Example: projects/${PROJECT_ID}/locations/${LOCATION}/endpoints/12345
ENDPOINT_NAME=${RAG_EMBEDDING_MODEL_CONFIG_ENDPOINT}
// Corpus display name
// Such as "my_test_corpus"
CORPUS_DISPLAY_NAME=YOUR_CORPUS_DISPLAY_NAME
// CreateRagCorpus
// Input: ENDPOINT, PROJECT_ID, CORPUS_DISPLAY_NAME
// Output: CreateRagCorpusOperationMetadata
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://${ENDPOINT}/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora \
-d '{
"display_name" : '\""${CORPUS_DISPLAY_NAME}"\"',
"rag_embedding_model_config" : {
"vertex_prediction_endpoint": {
"endpoint": '\""${ENDPOINT_NAME}"\"'
}
}
}'
// Poll the operation status.
// The last component of the RagCorpus "name" field is the server-generated
// rag_corpus_id: (only Bold part)
// projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora/7454583283205013504.
OPERATION_ID=OPERATION_ID
poll_op_wait ${OPERATION_ID}
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Update a RAG corpus example
You can update your RAG corpus with a new display name, description, and vector database configuration. However, you can't change the following parameters in your RAG corpus:
- The vector database type. For example, you can't change the vector database from Weaviate to Vertex AI Feature Store.
- If you're using the managed database option, you can't update the vector database configuration.
These examples demonstrate how to update a RAG corpus.
Python
To learn how to install or update the Vertex AI SDK, see Install the Vertex AI SDK. For more information, see the Python API reference documentation.
from vertexai.preview import rag
import vertexai
# TODO(developer): Update and un-comment on the following lines:
# PROJECT_ID = "YOUR_PROJECT_ID"
# corpus_name = "YOUR_CORPUS_NAME"
# e.g. "projects/1234567890/locations/us-central1/ragCorpora/1234567890'"
# display_name = "test_corpus"
# description = "Corpus Description"
# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")
corpus = rag.update_corpus(
corpus_name=corpus_name,
display_name=display_name,
description=description,
)
print(corpus)
REST
Before using any of the request data, make the following replacements:
PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
CORPUS_ID: The corpus ID of your RAG corpus.
CORPUS_DISPLAY_NAME: The display name of the RAG corpus.
CORPUS_DESCRIPTION: The description of the RAG corpus.
HTTP method and URL:
PATCH https://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora/${CORPUS_ID}
Request JSON body:
{
"display_name" : ${CORPUS_DISPLAY_NAME},
"description": ${CORPUS_DESCRIPTION}
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X PATH \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \ "https://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora/${CORPUS_ID}"
```
* { Powershell }
Save the request body in a file named request.json, and execute the following command:
```sh
$headers = @{ }
Invoke-WebRequest `
-Method PATCH `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora/${CORPUS_ID}
" | Select-Object -Expand Content
```
List RAG corpora example
This code sample demonstrates how to list all of the RAG corpora.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- PAGE_SIZE: The standard list page size. You may adjust the number of
RagCorpora
to return per page by updating thepage_size
parameter. - PAGE_TOKEN: The standard list page token. Obtained typically using
ListRagCorporaResponse.next_page_token
of the previousVertexRagDataService.ListRagCorpora
call.
HTTP method and URL:
GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN"
PowerShell
Execute the following command:
$headers = @{ }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN" | Select-Object -Expand Content
RagCorpora
under the given PROJECT_ID
.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Get a RAG corpus example
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the
RagCorpus
resource.
HTTP method and URL:
GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID"
PowerShell
Execute the following command:
$headers = @{ }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID" | Select-Object -Expand Content
RagCorpus
resource.
The get
and list
commands are used in an example to demonstrate how
RagCorpus
uses the rag_embedding_model_config
field, which points to the
embedding model you have chosen.
// Server-generated rag_corpus_id in CreateRagCorpus
RAG_CORPUS_ID=RAG_CORPUS_ID
// GetRagCorpus
// Input: ENDPOINT, PROJECT_ID, RAG_CORPUS_ID
// Output: RagCorpus
curl -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://${ENDPOINT}/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora/${RAG_CORPUS_ID}
// ListRagCorpora
curl -sS -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://${ENDPOINT}/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora"
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Delete a RAG corpus example
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the
RagCorpus
resource.
HTTP method and URL:
DELETE https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID
To send your request, choose one of these options:
curl
Execute the following command:
curl -X DELETE \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID"
PowerShell
Execute the following command:
$headers = @{ }
Invoke-WebRequest `
-Method DELETE `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID" | Select-Object -Expand Content
DeleteOperationMetadata
.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
File management examples
This section provides examples of how to use the API to manage RAG files.
Upload a RAG file example
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the
RagCorpus
resource. - INPUT_FILE: The path of a local file.
- FILE_DISPLAY_NAME: The display name of the
RagFile
. - RAG_FILE_DESCRIPTION: The description of the
RagFile
.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:upload
Request JSON body:
{ "rag_file": { "display_name": "FILE_DISPLAY_NAME", "description": "RAG_FILE_DESCRIPTION" } }
To send your request, choose one of these options:
curl
Save the request body in a file named INPUT_FILE
,
and execute the following command:
curl -X POST \
-H "Content-Type: application/json; charset=utf-8" \
-d @INPUT_FILE \
"https://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:upload"
PowerShell
Save the request body in a file named INPUT_FILE
,
and execute the following command:
$headers = @{ }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile INPUT_FILE `
-Uri "https://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:upload" | Select-Object -Expand Content
RagFile
resource. The last component of the RagFile.name
field is the server-generated rag_file_id
.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Import RAG files example
Files and folders can be imported from Drive or
Cloud Storage. You can use response.metadata
to view partial
failures, request time, and response time in the SDK's response
object.
The response.skipped_rag_files_count
refers to the number of files that
were skipped during import. A file is skipped when the following conditions are
met:
- The file has already been imported.
- The file hasn't changed.
- The chunking configuration for the file hasn't changed.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the
RagCorpus
resource. - GCS_URIS: A list of Cloud Storage locations. Example:
gs://my-bucket1, gs://my-bucket2
. - DRIVE_RESOURCE_ID: The ID of the Drive resource. Examples:
https://drive.google.com/file/d/ABCDE
https://drive.google.com/corp/drive/u/0/folders/ABCDEFG
- DRIVE_RESOURCE_TYPE: Type of the Drive resource. Options:
RESOURCE_TYPE_FILE
- FileRESOURCE_TYPE_FOLDER
- Folder- CHUNK_SIZE: Optional: Number of tokens each chunk should have.
- CHUNK_OVERLAP: Optional: Number of tokens overlap between chunks.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import
Request JSON body:
{ "import_rag_files_config": { "gcs_source": { "uris": GCS_URIS }, "google_drive_source": { "resource_ids": { "resource_id": DRIVE_RESOURCE_ID, "resource_type": DRIVE_RESOURCE_TYPE }, } } }
To send your request, choose one of these options:
curl
Save the request body in a file named request.json
,
and execute the following command:
curl -X POST \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import"
PowerShell
Save the request body in a file named request.json
,
and execute the following command:
$headers = @{ }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import" | Select-Object -Expand Content
ImportRagFilesOperationMetadata
resource.
The following sample demonstrates how to import a file from
Cloud Storage. Use the max_embedding_requests_per_min
control field
to limit the rate at which RAG Engine calls the embedding model during the
ImportRagFiles
indexing process. The field has a default value of 1000
calls
per minute.
// Cloud Storage bucket/file location.
// Such as "gs://rag-e2e-test/"
GCS_URIS=YOUR_GCS_LOCATION
// Enter the QPM rate to limit RAG's access to your embedding model
// Example: 1000
EMBEDDING_MODEL_QPM_RATE=MAX_EMBEDDING_REQUESTS_PER_MIN_LIMIT
// ImportRagFiles
// Import a single Cloud Storage file or all files in a Cloud Storage bucket.
// Input: ENDPOINT, PROJECT_ID, RAG_CORPUS_ID, GCS_URIS
// Output: ImportRagFilesOperationMetadataNumber
// Use ListRagFiles to find the server-generated rag_file_id.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://${ENDPOINT}/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora/${RAG_CORPUS_ID}/ragFiles:import \
-d '{
"import_rag_files_config": {
"gcs_source": {
"uris": '\""${GCS_URIS}"\"'
},
"rag_file_chunking_config": {
"chunk_size": 512
},
"max_embedding_requests_per_min": '"${EMBEDDING_MODEL_QPM_RATE}"'
}
}'
// Poll the operation status.
// The response contains the number of files imported.
OPERATION_ID=OPERATION_ID
poll_op_wait ${OPERATION_ID}
The following sample demonstrates how to import a file from
Drive. Use the max_embedding_requests_per_min
control field to
limit the rate at which RAG Engine calls the embedding model during the
ImportRagFiles
indexing process. The field has a default value of 1000
calls
per minute.
// Google Drive folder location.
FOLDER_RESOURCE_ID=YOUR_GOOGLE_DRIVE_FOLDER_RESOURCE_ID
// Enter the QPM rate to limit RAG's access to your embedding model
// Example: 1000
EMBEDDING_MODEL_QPM_RATE=MAX_EMBEDDING_REQUESTS_PER_MIN_LIMIT
// ImportRagFiles
// Import all files in a Google Drive folder.
// Input: ENDPOINT, PROJECT_ID, RAG_CORPUS_ID, FOLDER_RESOURCE_ID
// Output: ImportRagFilesOperationMetadataNumber
// Use ListRagFiles to find the server-generated rag_file_id.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://${ENDPOINT}/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora/${RAG_CORPUS_ID}/ragFiles:import \
-d '{
"import_rag_files_config": {
"google_drive_source": {
"resource_ids": {
"resource_id": '\""${FOLDER_RESOURCE_ID}"\"',
"resource_type": "RESOURCE_TYPE_FOLDER"
}
},
"max_embedding_requests_per_min": '"${EMBEDDING_MODEL_QPM_RATE}"'
}
}'
// Poll the operation status.
// The response contains the number of files imported.
OPERATION_ID=OPERATION_ID
poll_op_wait ${OPERATION_ID}
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Get a RAG file example
This code sample demonstrates how to get a RAG file.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the
RagCorpus
resource. - RAG_FILE_ID: The ID of the
RagFile
resource.
HTTP method and URL:
GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID"
PowerShell
Execute the following command:
$headers = @{ }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID" | Select-Object -Expand Content
RagFile
resource.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
List RAG files example
This code sample demonstrates how to list RAG files.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the
RagCorpus
resource. - PAGE_SIZE: The standard list page size. You may adjust the number of
RagFiles
to return per page by updating thepage_size
parameter. - PAGE_TOKEN: The standard list page token. Obtained typically using
ListRagFilesResponse.next_page_token
of the previousVertexRagDataService.ListRagFiles
call.
HTTP method and URL:
GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN
To send your request, choose one of these options:
curl
Execute the following command:
curl -X GET \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN"
PowerShell
Execute the following command:
$headers = @{ }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN" | Select-Object -Expand Content
RagFiles
under the given RAG_CORPUS_ID
.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Delete a RAG file example
This code sample demonstrates how to delete a RAG file.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the
RagCorpus
resource. - RAG_FILE_ID: The ID of the
RagFile
resource. Format:projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file_id}
.
HTTP method and URL:
DELETE https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID
To send your request, choose one of these options:
curl
Execute the following command:
curl -X DELETE \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID"
PowerShell
Execute the following command:
$headers = @{ }
Invoke-WebRequest `
-Method DELETE `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID" | Select-Object -Expand Content
DeleteOperationMetadata
resource.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Retrieval query
When a user asks a question or provides a prompt, the retrieval component in RAG searches through its knowledge base to find information that is relevant to the query.
REST
Before using any of the request data, make the following replacements:
- LOCATION: The region to process the request.
- PROJECT_ID: Your project ID.
- RAG_CORPUS_RESOURCE: The name of the
RagCorpus
resource. Format:projects/{project}/locations/{location}/ragCorpora/{rag_corpus}
. - VECTOR_DISTANCE_THRESHOLD: Only contexts with a vector distance smaller than the threshold are returned.
- TEXT: The query text to get relevant contexts.
- SIMILARITY_TOP_K: The number of top contexts to retrieve.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts
Request JSON body:
{ "vertex_rag_store": { "rag_resources": { "rag_corpus": "RAG_CORPUS_RESOURCE", }, "vector_distance_threshold": 0.8 }, "query": { "text": "TEXT", "similarity_top_k": SIMILARITY_TOP_K } }
To send your request, choose one of these options:
curl
Save the request body in a file named request.json
,
and execute the following command:
curl -X POST \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts"
PowerShell
Save the request body in a file named request.json
,
and execute the following command:
$headers = @{ }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts" | Select-Object -Expand Content
RagFiles
.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Prediction
The prediction generates a grounded response using the retrieved contexts.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- MODEL_ID: LLM model for content generation. Example:
gemini-1.5-pro-002
- GENERATION_METHOD: LLM method for content generation. Options:
generateContent
,streamGenerateContent
- INPUT_PROMPT: The text sent to the LLM for content generation. Try to use a prompt relevant to the uploaded rag Files.
- RAG_CORPUS_RESOURCE: The name of the
RagCorpus
resource. Format:projects/{project}/locations/{location}/ragCorpora/{rag_corpus}
. - SIMILARITY_TOP_K: Optional: The number of top contexts to retrieve.
- VECTOR_DISTANCE_THRESHOLD: Optional: Contexts with a vector distance smaller than the threshold are returned.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD
Request JSON body:
{ "contents": { "role": "user", "parts": { "text": "INPUT_PROMPT" } }, "tools": { "retrieval": { "disable_attribution": false, "vertex_rag_store": { "rag_resources": { "rag_corpus": "RAG_CORPUS_RESOURCE", }, "similarity_top_k": SIMILARITY_TOP_K, "vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD } } } }
To send your request, choose one of these options:
curl
Save the request body in a file named request.json
,
and execute the following command:
curl -X POST \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD"
PowerShell
Save the request body in a file named request.json
,
and execute the following command:
$headers = @{ }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD" | Select-Object -Expand Content
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
What's next
- To learn more about supported generation models, see Generative AI models that support RAG.
- To learn more about supported embedding models, see Embedding models.
- To learn more about open models, see Open models.
- To learn more about RAG Engine, see
RAG Engine overview.