LlamaIndex on Vertex AI for RAG API

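Retrieval-augmented generation (RAG) gives large language models (LLMs) access to external knowledge sources such as documents and databases. With RAG, an LLM can ground its responses in the data those sources contain, producing more accurate and informative answers.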

Example syntax

Syntax to create a RAG corpus.

curl

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora" \
  -d '{
  "display_name" : "...",
  "description": "...",
  "rag_embedding_model_config": {
    "vertex_prediction_endpoint": {
      "endpoint": "..."
    }
  }
}'

Python

corpus = rag.create_corpus(display_name=..., description=...)
print(corpus)

Parameters list

See examples for implementation details.

Corpus management

For information about RAG corpora, see Index management.

Create a RagCorpus

Parameters
`display_name`	Optional: `string`. The display name of the `RagCorpus`.
`description`	Optional: `string`. The description of the `RagCorpus`.
`rag_embedding_model_config.vertex_prediction_endpoint.endpoint`	Optional: `string`. The embedding model to use for the `RagCorpus`.
`rag_vector_db_config.weaviate.http_endpoint`	Optional: `string`. The HTTPS or HTTP endpoint of the Weaviate instance.
`rag_vector_db_config.weaviate.collection_name`	Optional: `string`. The Weaviate collection that the `RagCorpus` maps to.
`rag_vector_db_config.vertex_feature_store.feature_view_resource_name`	Optional: `string`. The Vertex AI Feature Store `FeatureView` that the `RagCorpus` maps to. Format: `projects/{project}/locations/{location}/featureOnlineStores/{feature_online_store}/featureViews/{feature_view}`
`api_auth.api_key_config.api_key_secret_version`	Optional: `string`. The resource name of the Secret Manager secret version that stores the API key. Format: `projects/{project}/secrets/{secret}/versions/{version}`
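
Because every field in the table above is optional, it can help to assemble the request body programmatically and leave out whatever is unset. The following is a minimal sketch, not part of the SDK: `build_create_corpus_body` is a hypothetical helper, and the project name and model path in the usage lines are placeholder values.

```python
import json


def build_create_corpus_body(display_name=None, description=None, embedding_endpoint=None):
    """Assemble a CreateRagCorpus request body, skipping unset optional fields."""
    body = {}
    if display_name is not None:
        body["display_name"] = display_name
    if description is not None:
        body["description"] = description
    if embedding_endpoint is not None:
        # Nested exactly as in the parameter path
        # rag_embedding_model_config.vertex_prediction_endpoint.endpoint
        body["rag_embedding_model_config"] = {
            "vertex_prediction_endpoint": {"endpoint": embedding_endpoint}
        }
    return body


# Placeholder values for illustration only.
body = build_create_corpus_body(
    display_name="my_test_corpus",
    embedding_endpoint=(
        "projects/my-project/locations/us-central1/"
        "publishers/google/models/text-embedding-004"
    ),
)
print(json.dumps(body, indent=2))
```

The printed JSON could be saved as request.json and sent with the curl syntax shown earlier.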

List RagCorpora

Parameters
`page_size`	Optional: `int`. The standard list page size.
`page_token`	Optional: `string`. The standard list page token. Typically obtained from `ListRagCorporaResponse.next_page_token` of the previous `VertexRagDataService.ListRagCorpora` call.

Get a RagCorpus

Parameters
`rag_corpus_id`	`string`. The ID of the `RagCorpus` resource. Format: `projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}`
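
The resource name format above is plain string structure, so it can be built and validated locally before any request is sent. A minimal sketch follows; `corpus_name` and `parse_corpus_name` are hypothetical helper names, not SDK functions.

```python
import re

# Mirrors projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}
_CORPUS_RE = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/ragCorpora/(?P<rag_corpus_id>[^/]+)$"
)


def corpus_name(project, location, rag_corpus_id):
    """Build a RagCorpus resource name from its three components."""
    return f"projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}"


def parse_corpus_name(name):
    """Split a RagCorpus resource name into its components, or raise ValueError."""
    match = _CORPUS_RE.match(name)
    if match is None:
        raise ValueError(f"not a RagCorpus resource name: {name!r}")
    return match.groupdict()


name = corpus_name("my-project", "us-central1", "7454583283205013504")
print(name)
print(parse_corpus_name(name)["rag_corpus_id"])
```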

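Delete a RagCorpus

Parameters
`rag_corpus_id`	`string`. The ID of the `RagCorpus` resource. Format: `projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}`

File management

For information about RAG files, see File management.

Upload a RagFile

Parameters
`rag_corpus_id`	`string`. The ID of the `RagCorpus` resource. Format: `projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}`
`display_name`	Optional: `string`. The display name of the `RagFile`.
`description`	Optional: `string`. The description of the `RagFile`.

Import RagFiles

Parameters
`rag_corpus_id`	`string`. The ID of the `RagCorpus` resource. Format: `projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}`
`gcs_source.uris`	`list`. The Cloud Storage URIs that contain the files to import.
`google_drive_source.resource_ids.resource_id`	Optional: `string`. The ID of the Google Drive resource.
`google_drive_source.resource_ids.resource_type`	Optional: `string`. The type of the Google Drive resource.
`rag_file_chunking_config.chunk_size`	Optional: `int`. The number of tokens each chunk should have.
`rag_file_chunking_config.chunk_overlap`	Optional: `int`. The number of tokens that overlap between two chunks.
`max_embedding_requests_per_min`	Optional: `int`. The maximum rate at which LlamaIndex on Vertex AI for RAG calls the embedding model during indexing. The default limit is `1000`. For more about rate limits, see LlamaIndex on Vertex AI for RAG quotas.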
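
To see how `chunk_size` and `chunk_overlap` interact, consider a sliding window over a token list: each chunk holds up to `chunk_size` tokens, and consecutive chunks share `chunk_overlap` tokens. The sketch below is illustrative only. It splits on whitespace, which is an assumption made for readability; the service's actual tokenizer and chunking logic are not described in this document.

```python
def chunk_tokens(tokens, chunk_size, chunk_overlap):
    """Split tokens into windows of chunk_size that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the window already reached the end of the input
    return chunks


# Whitespace "tokens" stand in for real model tokens.
words = "a b c d e f g h i j".split()
chunks = chunk_tokens(words, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

With these values the window advances three tokens at a time, so the last token of one chunk is the first token of the next.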

List RagFiles

Parameters
`rag_corpus_id`	`string`. The ID of the `RagCorpus` resource. Format: `projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}`
`page_size`	Optional: `int`. The standard list page size.
`page_token`	Optional: `string`. The standard list page token. Typically obtained from `ListRagFilesResponse.next_page_token` of the previous `VertexRagDataService.ListRagFiles` call.

Get a RagFile

Parameters
`rag_file_id`	`string`. The ID of the `RagFile` resource. Format: `projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file_id}`

Delete a RagFile

Parameters
`rag_file_id`	`string`. The ID of the `RagFile` resource. Format: `projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file_id}`

Retrieval and prediction

Retrieval

Parameter	Description
`similarity_top_k`	Controls the maximum number of contexts that are retrieved.
`vector_distance_threshold`	Only contexts with a distance smaller than the threshold are considered.
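
The two retrieval parameters act as successive filters: the threshold discards contexts that are too far from the query, and top-k caps how many of the remaining contexts are returned. The following sketch shows that combination on hand-written distances. It is an illustration of the semantics described above, not the service's implementation, and `select_contexts` is a hypothetical name.

```python
def select_contexts(scored_contexts, similarity_top_k, vector_distance_threshold=None):
    """Keep the closest contexts.

    scored_contexts is a list of (text, distance) pairs, where a smaller
    distance means the context is closer to the query.
    """
    candidates = sorted(scored_contexts, key=lambda pair: pair[1])
    if vector_distance_threshold is not None:
        # Only contexts strictly below the threshold are considered.
        candidates = [c for c in candidates if c[1] < vector_distance_threshold]
    return candidates[:similarity_top_k]


contexts = [("a", 0.9), ("b", 0.2), ("c", 0.45), ("d", 0.6)]
selected = select_contexts(contexts, similarity_top_k=2, vector_distance_threshold=0.5)
print(selected)
```

Note that a tight threshold can return fewer than `similarity_top_k` contexts, because top-k is only an upper bound.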

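Prediction

Parameters
`model_id`	`string`. The LLM model used for content generation.
`rag_corpora`	`string`. The name of the RagCorpus resource. Format: `projects/{project}/locations/{location}/ragCorpora/{rag_corpus}`
`text`	`string (list)`. The text to send to the LLM for content generation. Maximum: 1 list.
`vector_distance_threshold`	Optional: `double`. Only contexts with a vector distance smaller than the threshold are returned.
`similarity_top_k`	Optional: `int`. The number of top contexts to retrieve.

Examples

The following examples cover corpus management, file management, and retrieval and prediction.

Create a RAG corpus

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
CORPUS_DISPLAY_NAME: The display name of the RagCorpus.
CORPUS_DESCRIPTION: The description of the RagCorpus.
RAG_EMBEDDING_MODEL_CONFIG_ENDPOINT: The embedding model of the RagCorpus.

HTTP method and URL: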

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora

Request JSON body:

{
  "display_name" : "CORPUS_DISPLAY_NAME",
  "description": "CORPUS_DESCRIPTION",
  "rag_embedding_model_config": {
    "vertex_prediction_endpoint": {
      "endpoint": "RAG_EMBEDDING_MODEL_CONFIG_ENDPOINT"
    }
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora"

PowerShell

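Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }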

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora" | Select-Object -Expand Content

You should receive a successful status code (2xx).

The following example demonstrates how to create a RAG corpus by using the REST API.

    # Either your first-party publisher model or a fine-tuned endpoint.
    # Example: projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/textembedding-gecko@003
    # or
    # Example: projects/${PROJECT_ID}/locations/${LOCATION}/endpoints/12345
    ENDPOINT_NAME=${RAG_EMBEDDING_MODEL_CONFIG_ENDPOINT}

    # Corpus display name, such as "my_test_corpus".
    CORPUS_DISPLAY_NAME=YOUR_CORPUS_DISPLAY_NAME

    # CreateRagCorpus
    # Input: ENDPOINT, PROJECT_ID, CORPUS_DISPLAY_NAME
    # Output: CreateRagCorpusOperationMetadata
    # ENDPOINT, PROJECT_ID, and LOCATION must already be set in your shell.
    curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    https://${ENDPOINT}/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora \
    -d '{
          "display_name" : '\""${CORPUS_DISPLAY_NAME}"\"',
          "rag_embedding_model_config" : {
                  "vertex_prediction_endpoint": {
                        "endpoint": '\""${ENDPOINT_NAME}"\"'
                  }
          }
      }'

    # Poll the operation status.
    # The last component of the RagCorpus "name" field is the server-generated
    # rag_corpus_id (7454583283205013504 in this example):
    # projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora/7454583283205013504
    # poll_op_wait is a helper that this document does not define.
    OPERATION_ID=OPERATION_ID
    poll_op_wait ${OPERATION_ID}

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# display_name = "test_corpus"
# description = "Corpus Description"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

# Configure embedding model
embedding_model_config = rag.EmbeddingModelConfig(
    publisher_model="publishers/google/models/text-embedding-004"
)

corpus = rag.create_corpus(
    display_name=display_name,
    description=description,
    embedding_model_config=embedding_model_config,
)
print(corpus)

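List RAG corpora

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
PAGE_SIZE: The standard list page size. You can adjust the number of RagCorpora returned per page by updating the page_size parameter.
PAGE_TOKEN: The standard list page token. Typically obtained from ListRagCorporaResponse.next_page_token of the previous VertexRagDataService.ListRagCorpora call.

HTTP method and URL: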

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN

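To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN"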

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN" | Select-Object -Expand Content

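You should receive a successful status code (2xx) and a list of RagCorpora under the given PROJECT_ID.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.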


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

corpora = rag.list_corpora()
print(corpora)

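Get a RAG corpus

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
RAG_CORPUS_ID: The ID of the RagCorpus resource.

HTTP method and URL: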

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID

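To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID"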

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID" | Select-Object -Expand Content

A successful response returns the RagCorpus resource.

The following example uses the get and list commands to show how a RagCorpus uses the rag_embedding_model_config field, which points to the embedding model you chose.

# Server-generated rag_corpus_id from CreateRagCorpus
RAG_CORPUS_ID=RAG_CORPUS_ID

# GetRagCorpus
# Input: ENDPOINT, PROJECT_ID, RAG_CORPUS_ID
# Output: RagCorpus
curl -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://${ENDPOINT}/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora/${RAG_CORPUS_ID}

# ListRagCorpora
curl -sS -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://${ENDPOINT}/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora"

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

corpus = rag.get_corpus(name=corpus_name)
print(corpus)

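Delete a RAG corpus

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
RAG_CORPUS_ID: The ID of the RagCorpus resource.

HTTP method and URL: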

DELETE https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID

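To send your request, choose one of these options:

curl

Execute the following command:

curl -X DELETE \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID"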

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method DELETE `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID" | Select-Object -Expand Content

A successful response returns DeleteOperationMetadata.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag.delete_corpus(name=corpus_name)
print(f"Corpus {corpus_name} deleted.")

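Upload a RAG file

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
RAG_CORPUS_ID: The ID of the RagCorpus resource.
INPUT_FILE: The path of a local file.
FILE_DISPLAY_NAME: The display name of the RagFile.
RAG_FILE_DESCRIPTION: The description of the RagFile.

HTTP method and URL: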

POST https://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:upload

Request JSON body:

{
 "rag_file": {
  "display_name": "FILE_DISPLAY_NAME",
  "description": "RAG_FILE_DESCRIPTION"
 }
}

To send your request, choose one of these options:

curl

Save the request body in a file named INPUT_FILE, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @INPUT_FILE \
     "https://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:upload"

PowerShell

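Save the request body in a file named INPUT_FILE, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }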

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile INPUT_FILE `
    -Uri "https://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:upload" | Select-Object -Expand Content

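A successful response returns the RagFile resource. The last component of the RagFile.name field is the server-generated rag_file_id.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.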


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"
# path = "path/to/local/file.txt"
# display_name = "file_display_name"
# description = "file description"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag_file = rag.upload_file(
    corpus_name=corpus_name,
    path=path,
    display_name=display_name,
    description=description,
)
print(rag_file)

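Import RAG files

You can import files and folders from Google Drive or Cloud Storage.

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
RAG_CORPUS_ID: The ID of the RagCorpus resource.
GCS_URIS: A list of Cloud Storage locations. Example: gs://my-bucket1, gs://my-bucket2.
DRIVE_RESOURCE_ID: The ID of the Google Drive resource. Examples:

https://drive.google.com/file/d/ABCDE
https://drive.google.com/corp/drive/u/0/folders/ABCDEFG

DRIVE_RESOURCE_TYPE: The type of the Google Drive resource. Options:

RESOURCE_TYPE_FILE - File
RESOURCE_TYPE_FOLDER - Folder

CHUNK_SIZE (optional): The number of tokens each chunk should have.
CHUNK_OVERLAP (optional): The number of tokens that overlap between chunks.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import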

Request JSON body:

{
  "import_rag_files_config": {
    "gcs_source": {
      "uris": GCS_URIS
    },
    "google_drive_source": {
      "resource_ids": {
        "resource_id": "DRIVE_RESOURCE_ID",
        "resource_type": "DRIVE_RESOURCE_TYPE"
      }
    }
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import" | Select-Object -Expand Content

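A successful response returns the ImportRagFilesOperationMetadata resource.

The following example demonstrates how to import files from Cloud Storage. Use the max_embedding_requests_per_min control field to limit the rate at which LlamaIndex on Vertex AI for RAG calls the embedding model during the ImportRagFiles indexing process. The field defaults to 1000 calls per minute.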

# Cloud Storage bucket or file location,
# such as "gs://rag-e2e-test/".
GCS_URIS=YOUR_GCS_LOCATION

# Enter the QPM rate to limit RAG's access to your embedding model.
# Example: 1000
EMBEDDING_MODEL_QPM_RATE=MAX_EMBEDDING_REQUESTS_PER_MIN_LIMIT

# ImportRagFiles
# Import a single Cloud Storage file or all files in a Cloud Storage bucket.
# Input: ENDPOINT, PROJECT_ID, RAG_CORPUS_ID, GCS_URIS
# Output: ImportRagFilesOperationMetadataNumber
# Use ListRagFiles to find the server-generated rag_file_id.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://${ENDPOINT}/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora/${RAG_CORPUS_ID}/ragFiles:import \
-d '{
  "import_rag_files_config": {
    "gcs_source": {
      "uris": '\""${GCS_URIS}"\"'
    },
    "rag_file_chunking_config": {
      "chunk_size": 512
    },
    "max_embedding_requests_per_min": '"${EMBEDDING_MODEL_QPM_RATE}"'
  }
}'

# Poll the operation status.
# The response contains the number of files imported.
# poll_op_wait is a helper that this document does not define.
OPERATION_ID=OPERATION_ID
poll_op_wait ${OPERATION_ID}
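
When choosing a value for max_embedding_requests_per_min, a back-of-the-envelope estimate of the shortest possible indexing time can be useful. The sketch below is simple arithmetic under a loudly stated assumption: that each chunk costs exactly one embedding request (`chunks_per_request=1`). The document does not say how the service batches chunks, so treat the result as a rough lower bound, and `min_indexing_minutes` as a hypothetical helper.

```python
import math


def min_indexing_minutes(total_chunks, max_embedding_requests_per_min=1000,
                         chunks_per_request=1):
    """Lower bound on indexing time, in minutes, imposed by the rate limit.

    chunks_per_request=1 is an assumption, not documented service behavior.
    """
    if max_embedding_requests_per_min <= 0:
        raise ValueError("max_embedding_requests_per_min must be positive")
    requests = math.ceil(total_chunks / chunks_per_request)
    return requests / max_embedding_requests_per_min


# 4500 chunks at 900 requests per minute need at least 5 minutes.
estimate = min_indexing_minutes(4500, max_embedding_requests_per_min=900)
print(estimate)
```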

The following example demonstrates how to import files from Google Drive. Use the max_embedding_requests_per_min control field to limit the rate at which LlamaIndex on Vertex AI for RAG calls the embedding model during the ImportRagFiles indexing process. The field defaults to 1000 calls per minute.

# Google Drive folder location.
FOLDER_RESOURCE_ID=YOUR_GOOGLE_DRIVE_FOLDER_RESOURCE_ID

# Enter the QPM rate to limit RAG's access to your embedding model.
# Example: 1000
EMBEDDING_MODEL_QPM_RATE=MAX_EMBEDDING_REQUESTS_PER_MIN_LIMIT

# ImportRagFiles
# Import all files in a Google Drive folder.
# Input: ENDPOINT, PROJECT_ID, RAG_CORPUS_ID, FOLDER_RESOURCE_ID
# Output: ImportRagFilesOperationMetadataNumber
# Use ListRagFiles to find the server-generated rag_file_id.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://${ENDPOINT}/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora/${RAG_CORPUS_ID}/ragFiles:import \
-d '{
  "import_rag_files_config": {
    "google_drive_source": {
      "resource_ids": {
        "resource_id": '\""${FOLDER_RESOURCE_ID}"\"',
        "resource_type": "RESOURCE_TYPE_FOLDER"
      }
    },
    "max_embedding_requests_per_min": '"${EMBEDDING_MODEL_QPM_RATE}"'
  }
}'

# Poll the operation status.
# The response contains the number of files imported.
# poll_op_wait is a helper that this document does not define.
OPERATION_ID=OPERATION_ID
poll_op_wait ${OPERATION_ID}

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"
# paths = ["https://drive.google.com/file/123", "gs://my_bucket/my_files_dir"]  # Supports Google Cloud Storage and Google Drive Links

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

response = rag.import_files(
    corpus_name=corpus_name,
    paths=paths,
    chunk_size=512,  # Optional
    chunk_overlap=100,  # Optional
    max_embedding_requests_per_min=900,  # Optional
)
print(f"Imported {response.imported_rag_files_count} files.")

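Get a RAG file

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
RAG_CORPUS_ID: The ID of the RagCorpus resource.
RAG_FILE_ID: The ID of the RagFile resource.

HTTP method and URL: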

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID

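To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID"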

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID" | Select-Object -Expand Content

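A successful response returns the RagFile resource.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.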


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# file_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}/ragFiles/{rag_file_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag_file = rag.get_file(name=file_name)
print(rag_file)

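List RAG files

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
RAG_CORPUS_ID: The ID of the RagCorpus resource.
PAGE_SIZE: The standard list page size. You can adjust the number of RagFiles returned per page by updating the page_size parameter.
PAGE_TOKEN: The standard list page token. Typically obtained from ListRagFilesResponse.next_page_token of the previous VertexRagDataService.ListRagFiles call.

HTTP method and URL: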

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN

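To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN"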

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN" | Select-Object -Expand Content

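You should receive a successful status code (2xx) along with a list of RagFiles under the given RAG_CORPUS_ID.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.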


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

files = rag.list_files(corpus_name=corpus_name)
for file in files:
    print(file)

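Delete a RAG file

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
RAG_CORPUS_ID: The ID of the RagCorpus resource.
RAG_FILE_ID: The ID of the RagFile resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file_id}.

HTTP method and URL: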

DELETE https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID

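To send your request, choose one of these options:

curl

Execute the following command:

curl -X DELETE \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID"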

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method DELETE `
    -Headers $headers `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID" | Select-Object -Expand Content

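A successful response returns the DeleteOperationMetadata resource.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.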


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# file_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}/ragFiles/{rag_file_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag.delete_file(name=file_name)
print(f"File {file_name} deleted.")

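Retrieval query

When a user asks a question or provides a prompt, the retrieval component in RAG searches its knowledge base for information relevant to the query.

REST

Before using any of the request data, make the following replacements:

LOCATION: The region to process the request.
PROJECT_ID: Your project ID.
RAG_CORPUS_RESOURCE: The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}.
VECTOR_DISTANCE_THRESHOLD: Only contexts with a vector distance smaller than the threshold are returned.
TEXT: The query text to get relevant contexts for.
SIMILARITY_TOP_K: The number of top contexts to retrieve.

HTTP method and URL: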

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts

Request JSON body:

{
  "vertex_rag_store": {
    "rag_resources": {
      "rag_corpus": "RAG_CORPUS_RESOURCE"
    },
    "vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD
  },
  "query": {
    "text": "TEXT",
    "similarity_top_k": SIMILARITY_TOP_K
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts"

PowerShell

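Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }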

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts" | Select-Object -Expand Content

You should receive a successful status code (2xx) along with a list of related RagFiles.
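
The request body for retrieveContexts can also be generated from code rather than edited by hand, which keeps the optional numeric fields out of the JSON when they are unset. This is a minimal sketch: `build_retrieve_contexts_body` is a hypothetical helper, the field layout follows the request body shown in this section, and the corpus name in the usage lines is a placeholder.

```python
import json


def build_retrieve_contexts_body(rag_corpus, text, similarity_top_k=None,
                                 vector_distance_threshold=None):
    """Assemble a retrieveContexts request body, omitting unset optional fields."""
    vertex_rag_store = {"rag_resources": {"rag_corpus": rag_corpus}}
    if vector_distance_threshold is not None:
        vertex_rag_store["vector_distance_threshold"] = vector_distance_threshold
    query = {"text": text}
    if similarity_top_k is not None:
        query["similarity_top_k"] = similarity_top_k
    return {"vertex_rag_store": vertex_rag_store, "query": query}


# Placeholder corpus name for illustration only.
request_body = build_retrieve_contexts_body(
    rag_corpus="projects/my-project/locations/us-central1/ragCorpora/123",
    text="What is RAG and why it is helpful?",
    similarity_top_k=10,
    vector_distance_threshold=0.5,
)
print(json.dumps(request_body, indent=2))
```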

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# rag_corpus_id = "9183965540115283968" # Only one corpus is supported at this time

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

response = rag.retrieval_query(
    rag_resources=[
        rag.RagResource(
            rag_corpus=rag_corpus_id,
            # Optional: supply IDs from `rag.list_files()`.
            # rag_file_ids=["rag-file-1", "rag-file-2", ...],
        )
    ],
    text="What is RAG and why it is helpful?",
    similarity_top_k=10,  # Optional
    vector_distance_threshold=0.5,  # Optional
)
print(response)

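Prediction

Prediction controls the LLM method used to generate content.

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
MODEL_ID: The LLM model used for content generation. Example: gemini-1.5-pro-002
GENERATION_METHOD: The LLM method used for content generation. Options: generateContent, streamGenerateContent
INPUT_PROMPT: The text sent to the LLM for content generation. Try a question related to the uploaded RAG files.
RAG_CORPUS_RESOURCE: The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}.
SIMILARITY_TOP_K (optional): The number of top contexts to retrieve.
VECTOR_DISTANCE_THRESHOLD (optional): Only contexts with a vector distance smaller than the threshold are returned.

HTTP method and URL: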

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD

Request JSON body:

{
 "contents": {
  "role": "user",
  "parts": {
    "text": "INPUT_PROMPT"
  }
 },
 "tools": {
  "retrieval": {
   "disable_attribution": false,
   "vertex_rag_store": {
    "rag_resources": {
      "rag_corpus": "RAG_CORPUS_RESOURCE"
    },
    "similarity_top_k": SIMILARITY_TOP_K,
    "vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD
   }
  }
 }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD"

PowerShell

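Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }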

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD" | Select-Object -Expand Content

A successful response returns the generated content with citations.
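
Since GENERATION_METHOD accepts only the two options listed above, a small check before building the URL can catch typos early. The sketch below is illustrative: `generation_url` is a hypothetical helper that fills in the URL template from this section, and the project name is a placeholder.

```python
# The two options listed for GENERATION_METHOD in this section.
GENERATION_METHODS = ("generateContent", "streamGenerateContent")


def generation_url(project_id, location, model_id, generation_method):
    """Fill in the prediction URL template, rejecting unknown methods."""
    if generation_method not in GENERATION_METHODS:
        raise ValueError(
            f"generation_method must be one of {GENERATION_METHODS}, "
            f"got {generation_method!r}"
        )
    return (
        f"https://{location}-aiplatform.googleapis.com/v1beta1"
        f"/projects/{project_id}/locations/{location}"
        f"/publishers/google/models/{model_id}:{generation_method}"
    )


url = generation_url("my-project", "us-central1", "gemini-1.5-pro-002", "generateContent")
print(url)
```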

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
from vertexai.preview.generative_models import GenerativeModel, Tool
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# rag_corpus_id = "9183965540115283968" # Only one corpus is supported at this time

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[
                rag.RagResource(
                    rag_corpus=rag_corpus_id,  # Currently only 1 corpus is allowed.
                    # Optional: supply IDs from `rag.list_files()`.
                    # rag_file_ids=["rag-file-1", "rag-file-2", ...],
                )
            ],
            similarity_top_k=3,  # Optional
            vector_distance_threshold=0.5,  # Optional
        ),
    )
)

rag_model = GenerativeModel(
    model_name="gemini-1.5-flash-002", tools=[rag_retrieval_tool]
)
response = rag_model.generate_content("Why is the sky blue?")
print(response.text)

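What's next

To learn more about supported generation models, see Supported models.
To learn more about supported embedding models, see Supported embedding models.
To learn more about LlamaIndex on Vertex AI for RAG, see the LlamaIndex on Vertex AI for RAG overview.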