Resource: RagCorpus
A RagCorpus is a container for RagFiles. A project can have multiple RagCorpora.
name
string
Output only. The resource name of the RagCorpus.
displayName
string
Required. The display name of the RagCorpus. The name can be up to 128 characters long and can consist of any UTF-8 characters.
description
string
Optional. The description of the RagCorpus.
Optional. Immutable. The embedding model config of the RagCorpus.
Optional. Immutable. The Vector DB config of the RagCorpus.
Output only. Timestamp when this RagCorpus was created.
A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".
Output only. Timestamp when this RagCorpus was last updated.
A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".
Output only. RagCorpus state.
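The creation and update timestamps described above can carry up to nine fractional digits, which is finer than the microsecond resolution of Python's `datetime`. As a minimal sketch, assuming the value arrives as a string in exactly the "Zulu" form shown in the examples, the whole seconds and the nanoseconds can be parsed separately. The helper name `parse_zulu_timestamp` is hypothetical and is not part of any Vertex AI client library.

```python
import re
from datetime import datetime, timezone

# Matches e.g. 2014-10-02T15:01:23Z or 2014-10-02T15:01:23.045123456Z
_ZULU_PATTERN = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?Z$"
)


def parse_zulu_timestamp(value):
    """Return (datetime truncated to whole seconds in UTC, nanoseconds)."""
    match = _ZULU_PATTERN.match(value)
    if match is None:
        raise ValueError(f"not an RFC3339 Zulu timestamp: {value!r}")
    base, fraction = match.groups()
    seconds = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S").replace(
        tzinfo=timezone.utc
    )
    # Right-pad the fraction to nine digits so "5" means 500000000 ns.
    nanos = int(fraction.ljust(9, "0")) if fraction else 0
    return seconds, nanos


if __name__ == "__main__":
    for text in ("2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z"):
        print(text, "->", parse_zulu_timestamp(text))
```

Keeping the nanoseconds as a separate integer avoids silently discarding the last three fractional digits.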
JSON representation |
---|
{ "name": string, "displayName": string, "description": string, "ragEmbeddingModelConfig": { object ( |
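As an illustration of the writable fields listed above, the following sketch builds a request body containing only `displayName` and `description` and enforces the 128-character limit on the display name. The function name `build_rag_corpus_body` and the sample values are hypothetical. Counting characters with `len()` (Unicode code points) is an assumption about how the service measures the limit.

```python
import json

MAX_DISPLAY_NAME_CHARS = 128  # limit stated in the displayName description


def build_rag_corpus_body(display_name, description=None):
    """Return a dict holding the client-settable RagCorpus fields.

    Output-only fields such as name are deliberately left out, since the
    service populates them.
    """
    if not display_name:
        raise ValueError("displayName is required")
    if len(display_name) > MAX_DISPLAY_NAME_CHARS:
        raise ValueError(
            f"displayName has {len(display_name)} characters; "
            f"the limit is {MAX_DISPLAY_NAME_CHARS}"
        )
    body = {"displayName": display_name}
    if description is not None:
        body["description"] = description
    return body


if __name__ == "__main__":
    body = build_rag_corpus_body("support-articles", "FAQ documents")
    print(json.dumps(body, indent=2))
```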
RagEmbeddingModelConfig
Config for the embedding model to use for RAG.
model_config
The model config to use. model_config can be only one of the following:
The Vertex AI Prediction Endpoint that either refers to a publisher model or an endpoint that is hosting a 1P fine-tuned text embedding model. Endpoints hosting non-1P fine-tuned text embedding models are currently not supported. This is used for dense vector search.
Configuration for hybrid search.
JSON representation |
---|
{ // Union field |
VertexPredictionEndpoint
Config representing a model hosted on Vertex Prediction Endpoint.
endpoint
string
Required. The endpoint resource name. Format: projects/{project}/locations/{location}/publishers/{publisher}/models/{model}
or projects/{project}/locations/{location}/endpoints/{endpoint}
model
string
Output only. The resource name of the model that is deployed on the endpoint. Present only when the endpoint is not a publisher model. Pattern: projects/{project}/locations/{location}/models/{model}
modelVersionId
string
Output only. Version ID of the model that is deployed on the endpoint. Present only when the endpoint is not a publisher model.
JSON representation |
---|
{ "endpoint": string, "model": string, "modelVersionId": string } |
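The `endpoint` field accepts two resource name formats. A minimal sketch of a client-side check that tells them apart is shown below. The function name `classify_endpoint` and the sample resource names are hypothetical placeholders, and the assumption that each path segment is any non-empty string without a slash is looser than whatever the service actually enforces.

```python
import re

_SEGMENT = r"[^/]+"  # assumption: any non-empty segment without "/"

_PUBLISHER_MODEL = re.compile(
    rf"^projects/{_SEGMENT}/locations/{_SEGMENT}"
    rf"/publishers/{_SEGMENT}/models/{_SEGMENT}$"
)
_ENDPOINT = re.compile(
    rf"^projects/{_SEGMENT}/locations/{_SEGMENT}/endpoints/{_SEGMENT}$"
)


def classify_endpoint(endpoint):
    """Return 'publisher_model' or 'endpoint' for a valid resource name."""
    if _PUBLISHER_MODEL.match(endpoint):
        return "publisher_model"
    if _ENDPOINT.match(endpoint):
        return "endpoint"
    raise ValueError(f"unrecognized endpoint resource name: {endpoint!r}")


if __name__ == "__main__":
    samples = [
        "projects/my-project/locations/us-central1/publishers/google/models/my-embedding-model",
        "projects/my-project/locations/us-central1/endpoints/1234567890",
    ]
    for name in samples:
        print(classify_endpoint(name), "<-", name)
```

According to the field descriptions, `model` and `modelVersionId` are present only for the second form, so a classifier like this can also signal which output fields to expect.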
HybridSearchConfig
Config for hybrid search.
Optional. The configuration for sparse embedding generation. If this field is not set, the default behavior depends on the vector database chosen for the RagCorpus.
Required. The Vertex AI Prediction Endpoint that hosts the embedding model for dense embedding generation.
JSON representation |
---|
{ "sparseEmbeddingConfig": { object ( |
SparseEmbeddingConfig
Configuration for sparse embedding generation.
model
The model to use for sparse embedding generation. model can be only one of the following:
Use the BM25 scoring algorithm.
JSON representation |
---|
{ // Union field |
Bm25
Message for BM25 parameters.
multilingual
boolean
Optional. Use multilingual tokenizer if set to true.
k1
number
Optional. The parameter to control term frequency saturation. It determines the scaling between the matching term frequency and final score. k1 is in the range of [1.2, 3]. The default value is 1.2.
b
number
Optional. The parameter to control document length normalization. It determines how much the document length affects the final score. b is in the range of [0, 1]. The default value is 0.75.
JSON representation |
---|
{ "multilingual": boolean, "k1": number, "b": number } |
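To show how `k1` and `b` shape a score, the sketch below uses the textbook Okapi BM25 term weight. This is an assumption for illustration only: the source does not state the exact formula the service applies. The function name `bm25_term_score` and the `idf`, `tf`, `doc_len`, and `avg_doc_len` inputs are hypothetical.

```python
def bm25_term_score(tf, doc_len, avg_doc_len, idf=1.0, k1=1.2, b=0.75):
    """Textbook BM25 contribution of one query term to one document.

    tf          -- occurrences of the term in the document
    doc_len     -- document length in tokens
    avg_doc_len -- average document length in the collection
    """
    # Ranges and defaults follow the k1 and b field descriptions above.
    if not 1.2 <= k1 <= 3:
        raise ValueError("k1 must be in the range [1.2, 3]")
    if not 0 <= b <= 1:
        raise ValueError("b must be in the range [0, 1]")
    # b = 0 disables length normalization; b = 1 applies it fully.
    length_norm = 1 - b + b * (doc_len / avg_doc_len)
    # Larger k1 lets repeated occurrences keep adding to the score for longer.
    return idf * tf * (k1 + 1) / (tf + k1 * length_norm)


if __name__ == "__main__":
    for tf in (1, 2, 5, 20):
        low = bm25_term_score(tf, doc_len=100, avg_doc_len=100, k1=1.2)
        high = bm25_term_score(tf, doc_len=100, avg_doc_len=100, k1=3)
        print(f"tf={tf:>2}  k1=1.2: {low:.3f}  k1=3: {high:.3f}")
```

In this form the score for a term is bounded above by `idf * (k1 + 1)`, which is why `k1` is described as controlling term frequency saturation.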
RagVectorDbConfig
Config for the Vector DB to use for RAG.
Authentication config for the chosen Vector DB.
vector_db
The config for the Vector DB. vector_db can be only one of the following:
The config for the RAG-managed Vector DB.
The config for Weaviate.
The config for Pinecone.
The config for Vertex Feature Store.
The config for Vertex Vector Search.
JSON representation |
---|
{ "apiAuth": { object ( |
RagManagedDb
This type has no fields.
The config for the default RAG-managed Vector DB.
Weaviate
The config for Weaviate.
httpEndpoint
string
Weaviate DB instance HTTP endpoint, for example 34.56.78.90:8080. Vertex RAG only supports HTTP connections to Weaviate. This value cannot be changed after it's set.
collectionName
string
The corresponding collection this corpus maps to. This value cannot be changed after it's set.
JSON representation |
---|
{ "httpEndpoint": string, "collectionName": string } |
Pinecone
The config for Pinecone.
indexName
string
Pinecone index name. This value cannot be changed after it's set.
JSON representation |
---|
{ "indexName": string } |
VertexFeatureStore
The config for Vertex Feature Store.
featureViewResourceName
string
The resource name of the FeatureView. Format: projects/{project}/locations/{location}/featureOnlineStores/{featureOnlineStore}/featureViews/{featureView}
JSON representation |
---|
{ "featureViewResourceName": string } |
VertexVectorSearch
The config for Vertex Vector Search.
indexEndpoint
string
The resource name of the Index Endpoint. Format: projects/{project}/locations/{location}/indexEndpoints/{indexEndpoint}
index
string
The resource name of the Index. Format: projects/{project}/locations/{location}/indexes/{index}
JSON representation |
---|
{ "indexEndpoint": string, "index": string } |
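The two resource name formats above share a common `projects/{project}/locations/{location}` prefix. As a small sketch, the VertexVectorSearch object can be assembled from its parts. The function name `vertex_vector_search_config` and the sample IDs are hypothetical, and placing the index and the index endpoint in the same project and location is an assumption made here for brevity, not a rule stated in the source.

```python
import json


def vertex_vector_search_config(project, location, index_endpoint_id, index_id):
    """Build a dict matching the VertexVectorSearch JSON representation."""
    # Assumption: both resources live under the same project and location.
    parent = f"projects/{project}/locations/{location}"
    return {
        "indexEndpoint": f"{parent}/indexEndpoints/{index_endpoint_id}",
        "index": f"{parent}/indexes/{index_id}",
    }


if __name__ == "__main__":
    config = vertex_vector_search_config(
        "my-project", "us-central1", "1111111111", "2222222222"
    )
    print(json.dumps(config, indent=2))
```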
ApiAuth
The generic reusable API auth config.
auth_config
The auth config. auth_config can be only one of the following:
The API secret.
JSON representation |
---|
{ // Union field |
CorpusStatus
State
RagCorpus life state.
Enums | |
---|---|
UNKNOWN | This state is not supposed to happen. |
INITIALIZED | The RagCorpus resource entry is initialized, but validation has not yet been performed. |
ACTIVE | The RagCorpus is provisioned successfully and is ready to serve. |
ERROR | The RagCorpus is in a problematic situation. See the errorMessage field for details. |
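Client code often needs to decide whether a corpus can serve requests yet. The sketch below models the four states as a Python `Enum` and treats only ACTIVE as ready. The class name `CorpusState` and the helper `is_ready` are hypothetical. Mapping unrecognized strings to UNKNOWN is a defensive assumption, not documented behavior.

```python
from enum import Enum


class CorpusState(Enum):
    """The RagCorpus life states listed in the table above."""

    UNKNOWN = "UNKNOWN"
    INITIALIZED = "INITIALIZED"
    ACTIVE = "ACTIVE"
    ERROR = "ERROR"


def is_ready(state_name):
    """Return True only when the state string is ACTIVE."""
    try:
        state = CorpusState(state_name)
    except ValueError:
        # Assumption: treat any value this client does not know as UNKNOWN.
        state = CorpusState.UNKNOWN
    return state is CorpusState.ACTIVE


if __name__ == "__main__":
    for name in ("INITIALIZED", "ACTIVE", "ERROR"):
        print(name, "ready" if is_ready(name) else "not ready")
```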
Methods |
---|
Creates a RagCorpus. |
Deletes a RagCorpus. |
Gets a RagCorpus. |
Lists RagCorpora in a Location. |
Updates a RagCorpus. |