Understanding RagManagedDb

This page introduces RagManagedDb, its underlying technology, and how Vertex AI RAG Engine uses it. It also describes the tiers that you can use to tune performance, which might affect your costs, and explains how to delete your Vertex AI RAG Engine data, which stops billing.

Overview

Vertex AI RAG Engine uses RagManagedDb, an enterprise-ready, fully managed Google Spanner instance. Vertex AI RAG Engine uses RagManagedDb for resource storage, and you can optionally use it as the vector database for your RAG corpora.

Through Spanner, Vertex AI RAG Engine offers a consistent, highly available, and highly scalable database to support your application. To learn more about Google Spanner, see Spanner.

Vertex AI RAG Engine stores your RAG corpus and RAG file resource metadata in RagManagedDb, regardless of which vector database you choose. Vector databases are used only to store and retrieve embeddings. In addition to resource storage, RagManagedDb can store and manage the vector representations of your documents. The vector database then retrieves relevant documents based on their semantic similarity to a given query.
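
To make "semantic similarity" concrete, the following Python sketch ranks a few stored embeddings against a query embedding. It is a toy illustration only: this page doesn't describe how RagManagedDb performs its search, so cosine similarity is used here as a common stand-in, and the two-dimensional vectors, document IDs, and function names are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, embeddings, k=1):
    """Return the IDs of the k stored embeddings most similar to the query."""
    ranked = sorted(
        embeddings,
        key=lambda doc_id: cosine_similarity(query, embeddings[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Made-up embeddings; a real embedding model produces much longer vectors.
embeddings = {
    "doc-a": [1.0, 0.0],
    "doc-b": [0.0, 1.0],
    "doc-c": [0.7, 0.7],
}

closest = top_k([0.9, 0.1], embeddings, k=2)
print(closest)
```

A production vector database typically uses an index rather than scanning every embedding, but the ranking idea is the same.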

Manage tiers

Vertex AI RAG Engine lets you scale your RagManagedDb instance to match your usage and performance requirements by choosing between two tiers. Optionally, a third tier lets you delete your Vertex AI RAG Engine data.

The tier is a project-level setting in the RagEngineConfig resource, and it affects RAG corpora that use RagManagedDb. The following tiers are available in RagEngineConfig:

Scaled tier: This tier offers production-scale performance along with autoscaling functionality. It's suitable for customers with large amounts of data or performance-sensitive workloads. Internally, this tier sets the Spanner instance to an autoscaling configuration with a minimum of 1 node (1,000 processing units) and a maximum of 10 nodes (10,000 processing units).
Basic tier (default): This cost-effective, low-compute tier might be suitable for the following cases:
- Experimenting with RagManagedDb.
- Small data sizes.
- Latency-insensitive workloads.
- Using Vertex AI RAG Engine only with other vector databases.
To offer the Basic tier, RagManagedDb sets the underlying Spanner instance to a fixed configuration of 100 processing units, which is equivalent to 0.1 nodes.
Unprovisioned tier: This tier deletes the RagManagedDb and its underlying Spanner instance. The Unprovisioned tier disables the Vertex AI RAG Engine service and deletes your data held within this service regardless of the vector database used for your RagCorpora. This stops the billing of the service. For more information on billing, see Vertex AI RAG Engine billing.

After the data is deleted, it can't be recovered. To start using Vertex AI RAG Engine again, you must update the tier by calling the UpdateRagEngineConfig API.
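
The capacity figures for each tier can be summarized in a short Python sketch. It encodes the tier-to-capacity mapping described on this page and converts processing units to nodes at 1,000 processing units per node. The `TierCapacity` class and `TIERS` table are hypothetical names used for illustration and aren't part of any Vertex AI SDK. Representing the Unprovisioned tier as zero capacity is an assumption that stands in for "the Spanner instance is deleted."

```python
from dataclasses import dataclass
from typing import Tuple

PROCESSING_UNITS_PER_NODE = 1000  # per this page: 1 node = 1,000 processing units


@dataclass(frozen=True)
class TierCapacity:
    """Hypothetical description of the Spanner capacity behind a tier."""

    min_processing_units: int
    max_processing_units: int

    @property
    def autoscaling(self) -> bool:
        # A tier autoscales only if its capacity can vary.
        return self.min_processing_units != self.max_processing_units

    def node_range(self) -> Tuple[float, float]:
        """Return (minimum, maximum) capacity expressed in nodes."""
        return (
            self.min_processing_units / PROCESSING_UNITS_PER_NODE,
            self.max_processing_units / PROCESSING_UNITS_PER_NODE,
        )


TIERS = {
    "scaled": TierCapacity(1000, 10000),  # autoscaling, 1 to 10 nodes
    "basic": TierCapacity(100, 100),      # fixed at 100 processing units
    "unprovisioned": TierCapacity(0, 0),  # assumption: instance deleted
}

for name, tier in TIERS.items():
    low, high = tier.node_range()
    print(f"{name}: {low} to {high} nodes, autoscaling={tier.autoscaling}")
```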

Get the project configuration

The following code samples demonstrate how to use the GetRagEngineConfig API for each type of tier:

Version 1 (v1) API code samples.
v1beta1 API code samples.
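
As a rough illustration of what a GetRagEngineConfig call might look like over REST, the following Python sketch builds the request without sending it. The URL pattern is an assumption based on the usual Vertex AI REST conventions, and the function name, project ID, location, and token values are hypothetical placeholders. Confirm the exact endpoint against the v1 or v1beta1 code samples before relying on it.

```python
import urllib.request


def build_get_rag_engine_config_request(
    project_id, location, access_token, api_version="v1"
):
    """Build (but don't send) a GET request for the project's RagEngineConfig."""
    # Assumed resource name: one RagEngineConfig per project and location.
    name = f"projects/{project_id}/locations/{location}/ragEngineConfig"
    # Assumed regional endpoint pattern for Vertex AI.
    url = f"https://{location}-aiplatform.googleapis.com/{api_version}/{name}"
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Authorization": f"Bearer {access_token}"},
    )


# Placeholder values for illustration only.
get_request = build_get_rag_engine_config_request(
    "my-project", "us-central1", "ACCESS_TOKEN"
)
print(get_request.get_method(), get_request.full_url)
```

To send the request, pass it to `urllib.request.urlopen` with a valid OAuth 2.0 access token, for example one printed by `gcloud auth print-access-token`.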

Update the project configuration

The following code samples demonstrate how to use the UpdateRagEngineConfig API for each type of tier:

Version 1 (v1) API code samples.
v1beta1 API code samples.
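
The following Python sketch shows one way an UpdateRagEngineConfig request might be assembled over REST. It only builds the request and doesn't send it. The PATCH method, the URL pattern, and the `ragManagedDbConfig` body with a `scaled`, `basic`, or `unprovisioned` field are assumptions about the request shape, and the function name and placeholder values are hypothetical. Check the v1 or v1beta1 code samples for the authoritative format.

```python
import json
import urllib.request

# Tier names from this page; each is assumed to be a field of ragManagedDbConfig.
VALID_TIERS = ("scaled", "basic", "unprovisioned")


def build_update_rag_engine_config_request(
    project_id, location, tier, access_token, api_version="v1"
):
    """Build (but don't send) a PATCH request that selects a RagManagedDb tier."""
    if tier not in VALID_TIERS:
        raise ValueError(f"tier must be one of {VALID_TIERS}, got {tier!r}")
    # Assumed resource name: one RagEngineConfig per project and location.
    name = f"projects/{project_id}/locations/{location}/ragEngineConfig"
    url = f"https://{location}-aiplatform.googleapis.com/{api_version}/{name}"
    # Assumed body shape: the chosen tier is set as an empty object.
    body = {"ragManagedDbConfig": {tier: {}}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values for illustration only.
update_request = build_update_rag_engine_config_request(
    "my-project", "us-central1", "scaled", "ACCESS_TOKEN"
)
print(update_request.get_method(), update_request.full_url)
print(update_request.data.decode("utf-8"))
```

Take care with the `unprovisioned` value: as described on this page, the Unprovisioned tier deletes your Vertex AI RAG Engine data, and the data can't be recovered.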

What's next

To learn how to use the RAG API v1, the default, see RAG API v1.
To learn how to use the RAG API v1beta1, see RAG API v1beta1.
To learn more about RagManagedDb and how to manage your tier configuration as well as the RAG corpus-level retrieval strategy, see Use RagManagedDb with Vertex AI RAG Engine.