Introduction to vector search
This document provides an overview of vector search in BigQuery. Vector search lets you search embeddings to identify semantically similar entities.
Embeddings are high-dimensional numerical vectors that represent a given entity, like a piece of text or an audio file. Machine learning (ML) models use embeddings to encode semantics about such entities to make it easier to reason about and compare them. For example, a common operation in clustering, classification, and recommendation models is to measure the distance between vectors in an embedding space to find items that are most semantically similar.
To perform a vector search, you use the VECTOR_SEARCH function and optionally a vector index. When a vector index is used, VECTOR_SEARCH uses the Approximate Nearest Neighbor search technique to help improve vector search performance, with the trade-off of reduced recall, so the results are more approximate. Brute force is used to return exact results when a vector index isn't available, and you can choose to use brute force to get exact results even when a vector index is available.
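As a minimal sketch, assuming a hypothetical mydataset.products table with an ARRAY<FLOAT64> embedding column and a hypothetical mydataset.search_queries table that holds query embeddings in a column of the same name, you might create an index and then compare an approximate search with a forced brute-force search:

```sql
-- Create a vector index on the embedding column (hypothetical table and
-- column names). The indexed column must be of type ARRAY<FLOAT64>.
CREATE VECTOR INDEX my_index
ON mydataset.products(embedding)
OPTIONS (index_type = 'IVF', distance_type = 'COSINE');

-- Approximate search: with the index available, VECTOR_SEARCH uses ANN and
-- scans only a fraction of the index lists.
SELECT query.id AS query_id, base.id AS product_id, distance
FROM VECTOR_SEARCH(
  TABLE mydataset.products,
  'embedding',
  (SELECT id, embedding FROM mydataset.search_queries),
  top_k => 5,
  distance_type => 'COSINE',
  options => '{"fraction_lists_to_search": 0.01}');

-- Exact search: request brute force even though the index exists.
SELECT query.id AS query_id, base.id AS product_id, distance
FROM VECTOR_SEARCH(
  TABLE mydataset.products,
  'embedding',
  (SELECT id, embedding FROM mydataset.search_queries),
  top_k => 5,
  distance_type => 'COSINE',
  options => '{"use_brute_force": true}');
```

In this sketch, fraction_lists_to_search controls how much of the index is scanned (a speed versus recall trade-off), while use_brute_force returns exact results at the cost of comparing against every row.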
Use cases
The combination of embedding generation and vector search enables many interesting use cases, with retrieval-augmented generation (RAG) being the canonical one. Some other possible use cases are as follows:
- Given a batch of new support cases, find several similar resolved cases for each. Pass information about the resolved cases to a large language model (LLM) to use as context when summarizing and suggesting resolutions for the new support cases. A sketch of this pattern appears after this list.
- Given an audit log entry, find the most closely matching entries in the past 30 days.
- Generate embeddings from patient profile data, then use vector search to find patients with similar profiles in order to explore successful treatment plans prescribed to that patient cohort.
- Given the embeddings representing pre-accident moments from all the sensors and cameras in a fleet of school buses, find similar moments from all other vehicles in the fleet for further analysis, tuning, and retraining of the models that govern the safety feature engagements.
- Given a picture, find the most closely related images in a BigQuery object table, and pass those images to a model to generate captions.
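For example, the first use case could be sketched as follows. All dataset, table, column, and model names here are hypothetical, and the sketch assumes a remote embedding model and a table of already-embedded resolved cases exist:

```sql
-- Embed the new support cases with a hypothetical remote embedding model.
-- The text to embed must be provided in a column named content.
CREATE OR REPLACE TABLE mydataset.new_case_embeddings AS
SELECT
  case_id,
  content,
  ml_generate_embedding_result AS embedding
FROM ML.GENERATE_EMBEDDING(
  MODEL mydataset.embedding_model,
  (SELECT case_id, description AS content FROM mydataset.new_cases),
  STRUCT(TRUE AS flatten_json_output));

-- For each new case, retrieve the five most similar resolved cases. The
-- matches (and their resolutions) can then be passed to an LLM as context.
SELECT
  query.case_id AS new_case_id,
  base.case_id AS resolved_case_id,
  base.resolution,
  distance
FROM VECTOR_SEARCH(
  TABLE mydataset.resolved_case_embeddings,
  'embedding',
  (SELECT case_id, embedding FROM mydataset.new_case_embeddings),
  top_k => 5,
  distance_type => 'COSINE');
```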
Pricing
The CREATE VECTOR INDEX statement and the VECTOR_SEARCH function use BigQuery compute pricing. For the CREATE VECTOR INDEX statement, only the indexed column is considered in the bytes processed.
There is no charge for the processing required to build and refresh your vector indexes when the total size of indexed table data is below your per-organization limit. To support indexing beyond this limit, you must provide your own reservation for handling the index management jobs. Vector indexes incur storage costs when they are active. You can find the index storage size in the INFORMATION_SCHEMA.VECTOR_INDEXES view.

If the vector index is not yet at 100% coverage, you are still charged for all index storage that is reported in the INFORMATION_SCHEMA.VECTOR_INDEXES view.
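For example, a query along the following lines reports that information. The project and dataset names are placeholders, and the sketch assumes the view exposes coverage_percentage and total_storage_bytes columns:

```sql
-- List vector indexes in a dataset with their coverage and storage size.
SELECT
  table_name,
  index_name,
  index_status,
  coverage_percentage,
  total_storage_bytes
FROM `myproject.mydataset.INFORMATION_SCHEMA.VECTOR_INDEXES`;
```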
Quotas and limits
For more information, see Vector index limits.
Limitations
- Queries that contain the VECTOR_SEARCH function aren't accelerated by BigQuery BI Engine.
- BigQuery data security and governance rules apply to the use of VECTOR_SEARCH. For more information, see the Limitations section in VECTOR_SEARCH. These rules don't apply to vector index generation.
What's next
- Learn more about creating a vector index.
- Try the Search embeddings with vector search tutorial to learn how to create a vector index, and then do a vector search for embeddings both with and without the index.
- Try the Perform semantic search and retrieval-augmented generation tutorial to learn how to do the following tasks:
  - Generate text embeddings.
  - Create a vector index on the embeddings.
  - Perform a vector search with the embeddings to search for similar text.
  - Perform retrieval-augmented generation (RAG) by using vector search results to augment the prompt input and improve results.
- Try the Parse PDFs in a retrieval-augmented generation pipeline tutorial to learn how to create a RAG pipeline based on parsed PDF content.