REST Resource: projects.locations.ragCorpora

Resource: RagCorpus

A RagCorpus is a container for RagFiles. A project can have multiple RagCorpora.

Fields
name string

Output only. The resource name of the RagCorpus.

displayName string

Required. The display name of the RagCorpus. The name can be up to 128 characters long and can consist of any UTF-8 characters.

description string

Optional. The description of the RagCorpus.

ragEmbeddingModelConfig (deprecated) object (RagEmbeddingModelConfig)

Optional. Immutable. The embedding model config of the RagCorpus.

ragVectorDbConfig (deprecated) object (RagVectorDbConfig)

Optional. Immutable. The Vector DB config of the RagCorpus.

createTime string (Timestamp format)

Output only. Timestamp when this RagCorpus was created.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".

updateTime string (Timestamp format)

Output only. Timestamp when this RagCorpus was last updated.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".

corpusStatus object (CorpusStatus)

Output only. RagCorpus state.

backend_config Union type
The backend config of the RagCorpus. It can be a data store or a retrieval engine. backend_config can be only one of the following:
vectorDbConfig object (RagVectorDbConfig)

Optional. Immutable. The config for the Vector DBs.

vertexAiSearchConfig object (VertexAiSearchConfig)

Optional. Immutable. The config for the Vertex AI Search.
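
The createTime and updateTime strings can carry up to nine fractional digits, which is finer than the microsecond resolution of Python's datetime. The following is a minimal sketch of one way to parse them while keeping the nanoseconds separately. parse_timestamp is a hypothetical helper written for this illustration, not part of any client library.

```python
import re
from datetime import datetime, timezone

# RFC 3339 UTC "Zulu" timestamp with zero to nine fractional digits.
_TIMESTAMP = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?Z$"
)


def parse_timestamp(value):
    """Return (datetime in UTC at whole seconds, nanoseconds as int)."""
    match = _TIMESTAMP.match(value)
    if match is None:
        raise ValueError(f"not an RFC 3339 Zulu timestamp: {value!r}")
    seconds = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    seconds = seconds.replace(tzinfo=timezone.utc)
    # Right-pad the fraction to nine digits so "045" means 45,000,000 ns.
    nanos = int((match.group(2) or "").ljust(9, "0"))
    return seconds, nanos


created, created_nanos = parse_timestamp("2014-10-02T15:01:23.045123456Z")
print(created.isoformat(), created_nanos)
```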

JSON representation
{
  "name": string,
  "displayName": string,
  "description": string,
  "ragEmbeddingModelConfig": {
    object (RagEmbeddingModelConfig)
  },
  "ragVectorDbConfig": {
    object (RagVectorDbConfig)
  },
  "createTime": string,
  "updateTime": string,
  "corpusStatus": {
    object (CorpusStatus)
  },

  // backend_config
  "vectorDbConfig": {
    object (RagVectorDbConfig)
  },
  "vertexAiSearchConfig": {
    object (VertexAiSearchConfig)
  }
  // Union type
}
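
As an illustration of the representation above, the following sketch assembles a RagCorpus request body as a plain dictionary and enforces that at most one backend_config member is set. build_rag_corpus and the sample values are assumptions made for this example. Output-only fields (name, createTime, updateTime, corpusStatus) are left out because the service populates them. The length check counts Unicode code points, which may not match how the service counts characters.

```python
import json

# Members of the backend_config union, as listed in the RagCorpus fields.
BACKEND_CONFIG_FIELDS = ("vectorDbConfig", "vertexAiSearchConfig")


def build_rag_corpus(display_name, description=None, **backend):
    """Return a RagCorpus body as a dict (hypothetical helper)."""
    if not display_name or len(display_name) > 128:
        raise ValueError("displayName is required, up to 128 characters")
    unknown = set(backend) - set(BACKEND_CONFIG_FIELDS)
    if unknown:
        raise ValueError(f"not a backend_config member: {sorted(unknown)}")
    if len(backend) > 1:
        raise ValueError("backend_config can be only one of its members")
    body = {"displayName": display_name}
    if description is not None:
        body["description"] = description
    body.update(backend)
    return body


corpus = build_rag_corpus(
    "support-articles",  # illustrative value
    description="Help-center articles for retrieval",
    vectorDbConfig={"ragManagedDb": {}},
)
print(json.dumps(corpus, indent=2))
```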

RagVectorDbConfig

Config for the Vector DB to use for RAG.

Fields
apiAuth object (ApiAuth)

Authentication config for the chosen Vector DB.

ragEmbeddingModelConfig object (RagEmbeddingModelConfig)

Optional. Immutable. The embedding model config of the Vector DB.

vector_db Union type
The config for the Vector DB. vector_db can be only one of the following:
ragManagedDb object (RagManagedDb)

The config for the RAG-managed Vector DB.

weaviate object (Weaviate)

The config for Weaviate.

pinecone object (Pinecone)

The config for Pinecone.

vertexFeatureStore object (VertexFeatureStore)

The config for the Vertex Feature Store.

vertexVectorSearch object (VertexVectorSearch)

The config for the Vertex Vector Search.

JSON representation
{
  "apiAuth": {
    object (ApiAuth)
  },
  "ragEmbeddingModelConfig": {
    object (RagEmbeddingModelConfig)
  },

  // vector_db
  "ragManagedDb": {
    object (RagManagedDb)
  },
  "weaviate": {
    object (Weaviate)
  },
  "pinecone": {
    object (Pinecone)
  },
  "vertexFeatureStore": {
    object (VertexFeatureStore)
  },
  "vertexVectorSearch": {
    object (VertexVectorSearch)
  }
  // Union type
}
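
Because vector_db is a union, a RagVectorDbConfig should carry only one of the member keys shown in the JSON representation. A small sketch of a client-side check follows. selected_vector_db is a hypothetical helper, and the sample index name is a placeholder.

```python
# Member keys of the vector_db union, taken from the JSON representation.
VECTOR_DB_FIELDS = (
    "ragManagedDb",
    "weaviate",
    "pinecone",
    "vertexFeatureStore",
    "vertexVectorSearch",
)


def selected_vector_db(config):
    """Return the single vector_db member set in config, or None."""
    present = [field for field in VECTOR_DB_FIELDS if field in config]
    if len(present) > 1:
        raise ValueError(f"vector_db can be only one of: {present}")
    return present[0] if present else None


config = {"pinecone": {"indexName": "my-index"}}  # placeholder index name
print(selected_vector_db(config))
```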

RagManagedDb

This type has no fields.

The config for the default RAG-managed Vector DB.

Weaviate

The config for Weaviate.

Fields
httpEndpoint string

Weaviate DB instance HTTP endpoint, for example 34.56.78.90:8080. Vertex RAG only supports HTTP connections to Weaviate. This value cannot be changed after it's set.

collectionName string

The corresponding collection this corpus maps to. This value cannot be changed after it's set.

JSON representation
{
  "httpEndpoint": string,
  "collectionName": string
}

Pinecone

The config for Pinecone.

Fields
indexName string

Pinecone index name. This value cannot be changed after it's set.

JSON representation
{
  "indexName": string
}

VertexFeatureStore

The config for the Vertex Feature Store.

Fields
featureViewResourceName string

The resource name of the FeatureView. Format: projects/{project}/locations/{location}/featureOnlineStores/{featureOnlineStore}/featureViews/{featureView}

JSON representation
{
  "featureViewResourceName": string
}

VertexVectorSearch

The config for the Vertex Vector Search.

Fields
indexEndpoint string

The resource name of the Index Endpoint. Format: projects/{project}/locations/{location}/indexEndpoints/{indexEndpoint}

index string

The resource name of the Index. Format: projects/{project}/locations/{location}/indexes/{index}

JSON representation
{
  "indexEndpoint": string,
  "index": string
}
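
Both fields are full resource names that share the same projects/{project}/locations/{location} parent. The sketch below builds a VertexVectorSearch object from its parts using the formats given above. The helper name and all IDs are placeholders chosen for illustration.

```python
def vertex_vector_search(project, location, index_endpoint_id, index_id):
    """Build a VertexVectorSearch dict from resource IDs (hypothetical helper)."""
    parent = f"projects/{project}/locations/{location}"
    return {
        "indexEndpoint": f"{parent}/indexEndpoints/{index_endpoint_id}",
        "index": f"{parent}/indexes/{index_id}",
    }


# Placeholder project, location, and IDs.
vvs = vertex_vector_search("my-project", "us-central1", "1234", "5678")
print(vvs["indexEndpoint"])
print(vvs["index"])
```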

ApiAuth

The generic, reusable API auth config.

Fields
auth_config Union type
The auth config. auth_config can be only one of the following:
apiKeyConfig object (ApiKeyConfig)

The API secret.

JSON representation
{

  // auth_config
  "apiKeyConfig": {
    object (ApiKeyConfig)
  }
  // Union type
}

RagEmbeddingModelConfig

Config for the embedding model to use for RAG.

Fields
model_config Union type
The model config to use. model_config can be only one of the following:
vertexPredictionEndpoint object (VertexPredictionEndpoint)

The Vertex AI Prediction Endpoint, which refers either to a publisher model or to an endpoint hosting a 1P fine-tuned text embedding model. Endpoints hosting non-1P fine-tuned text embedding models are currently not supported. This is used for dense vector search.

hybridSearchConfig object (HybridSearchConfig)

Configuration for hybrid search.

JSON representation
{

  // model_config
  "vertexPredictionEndpoint": {
    object (VertexPredictionEndpoint)
  },
  "hybridSearchConfig": {
    object (HybridSearchConfig)
  }
  // Union type
}

VertexPredictionEndpoint

Config representing a model hosted on a Vertex Prediction Endpoint.

Fields
endpoint string

Required. The endpoint resource name. Format: projects/{project}/locations/{location}/publishers/{publisher}/models/{model} or projects/{project}/locations/{location}/endpoints/{endpoint}

model string

Output only. The resource name of the model that is deployed on the endpoint. Present only when the endpoint is not a publisher model. Pattern: projects/{project}/locations/{location}/models/{model}

modelVersionId string

Output only. Version ID of the model that is deployed on the endpoint. Present only when the endpoint is not a publisher model.

JSON representation
{
  "endpoint": string,
  "model": string,
  "modelVersionId": string
}

HybridSearchConfig

Config for hybrid search.

Fields
sparseEmbeddingConfig object (SparseEmbeddingConfig)

Optional. The configuration for sparse embedding generation. If this field is not set, the default behavior depends on the vector database chosen for the RagCorpus.

denseEmbeddingModelPredictionEndpoint object (VertexPredictionEndpoint)

Required. The Vertex AI Prediction Endpoint that hosts the embedding model for dense embedding generation.

JSON representation
{
  "sparseEmbeddingConfig": {
    object (SparseEmbeddingConfig)
  },
  "denseEmbeddingModelPredictionEndpoint": {
    object (VertexPredictionEndpoint)
  }
}
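
To show how the nested types fit together, the sketch below assembles a RagEmbeddingModelConfig that selects the hybridSearchConfig member, with a dense endpoint in the publisher-model format and an optional Bm25 sparse config. The helper names and the project, location, publisher, and model values are illustrative assumptions. The output-only model and modelVersionId fields are omitted.

```python
import json


def publisher_model(project, location, publisher, model):
    """Format a publisher-model endpoint name as documented for endpoint."""
    return (
        f"projects/{project}/locations/{location}"
        f"/publishers/{publisher}/models/{model}"
    )


def hybrid_embedding_config(dense_endpoint, bm25=None):
    """Return a RagEmbeddingModelConfig dict using hybridSearchConfig."""
    hybrid = {
        "denseEmbeddingModelPredictionEndpoint": {"endpoint": dense_endpoint}
    }
    if bm25 is not None:
        # sparseEmbeddingConfig is optional; bm25 is its only union member.
        hybrid["sparseEmbeddingConfig"] = {"bm25": bm25}
    return {"hybridSearchConfig": hybrid}


# Placeholder identifiers; substitute real values for your project.
endpoint = publisher_model("my-project", "us-central1", "google", "my-model")
embedding_config = hybrid_embedding_config(
    endpoint, bm25={"multilingual": True}
)
print(json.dumps(embedding_config, indent=2))
```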

SparseEmbeddingConfig

Configuration for sparse embedding generation.

Fields
model Union type
The model to use for sparse embedding generation. model can be only one of the following:
bm25 object (Bm25)

Use BM25 scoring algorithm.

JSON representation
{

  // model
  "bm25": {
    object (Bm25)
  }
  // Union type
}

Bm25

Parameters for the BM25 scoring algorithm.

Fields
multilingual boolean

Optional. Use multilingual tokenizer if set to true.

k1 number

Optional. The parameter to control term frequency saturation. It determines the scaling between the matching term frequency and final score. k1 is in the range of [1.2, 3]. The default value is 1.2.

b number

Optional. The parameter to control document length normalization. It determines how much the document length affects the final score. b is in the range of [0, 1]. The default value is 0.75.

JSON representation
{
  "multilingual": boolean,
  "k1": number,
  "b": number
}
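
To make the roles of k1 and b concrete, the sketch below validates the documented ranges and evaluates the term-frequency component of the textbook Okapi BM25 formula. This is the standard formulation, shown as an assumption about what the parameters control. The service's actual scoring may differ, and both function names are hypothetical.

```python
def validate_bm25(k1=1.2, b=0.75):
    """Check k1 and b against the documented ranges and return a Bm25 dict."""
    if not 1.2 <= k1 <= 3:
        raise ValueError("k1 must be in the range [1.2, 3]")
    if not 0 <= b <= 1:
        raise ValueError("b must be in the range [0, 1]")
    return {"k1": k1, "b": b}


def bm25_tf_component(tf, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """Textbook BM25 term-frequency factor (without the IDF weight)."""
    # b scales how strongly document length changes the normalization.
    length_norm = 1 - b + b * doc_len / avg_doc_len
    # k1 sets how quickly repeated matches stop adding to the score.
    return tf * (k1 + 1) / (tf + k1 * length_norm)


average = bm25_tf_component(tf=1, doc_len=100, avg_doc_len=100)
longer = bm25_tf_component(tf=1, doc_len=200, avg_doc_len=100)
print(average, longer)  # the longer document gets the smaller factor
```

With b set to 0 the document length drops out of the factor entirely, and as tf grows the factor approaches k1 + 1.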

VertexAiSearchConfig

Config for the Vertex AI Search.

Fields
servingConfig string

Vertex AI Search Serving Config resource full name. For example, projects/{project}/locations/{location}/collections/{collection}/engines/{engine}/servingConfigs/{servingConfig} or projects/{project}/locations/{location}/collections/{collection}/dataStores/{dataStore}/servingConfigs/{servingConfig}.

JSON representation
{
  "servingConfig": string
}

CorpusStatus

RagCorpus status.

Fields
state enum (State)

Output only. RagCorpus life state.

errorStatus string

Output only. The error status. Populated only when the state field is ERROR.

JSON representation
{
  "state": enum (State),
  "errorStatus": string
}

State

RagCorpus life state.

Enums
UNKNOWN This state is not supposed to happen.
INITIALIZED RagCorpus resource entry is initialized but has not yet been validated.
ACTIVE RagCorpus is provisioned successfully and is ready to serve.
ERROR RagCorpus is in a problematic situation. See the errorStatus field for details.
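
A caller would typically inspect corpusStatus before using a corpus. The sketch below interprets a RagCorpus dictionary according to the states listed above. is_ready is a hypothetical helper, and treating a missing state as UNKNOWN is an assumption of this example.

```python
def is_ready(corpus):
    """Return True if ACTIVE, False if not yet ready; raise on ERROR."""
    status = corpus.get("corpusStatus", {})
    state = status.get("state", "UNKNOWN")  # assumption: missing means UNKNOWN
    if state == "ACTIVE":
        return True
    if state == "ERROR":
        # errorStatus is populated only when the state is ERROR.
        detail = status.get("errorStatus", "no errorStatus provided")
        raise RuntimeError(f"RagCorpus is in ERROR state: {detail}")
    return False  # INITIALIZED or UNKNOWN


print(is_ready({"corpusStatus": {"state": "ACTIVE"}}))
print(is_ready({"corpusStatus": {"state": "INITIALIZED"}}))
```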

Methods

create

Creates a RagCorpus.

delete

Deletes a RagCorpus.

get

Gets a RagCorpus.

list

Lists RagCorpora in a Location.

patch

Updates a RagCorpus.