This page describes how to perform a search that mixes full-text and non-text data.
Perform a mixed full-text and non-text search
Search indexes support
full-text, exact match, and numeric columns. You can combine text and non-text
conditions in the WHERE
clause similarly to multi-column search queries. The
query optimizer tries to optimize non-text predicates with a search index. If
that's impossible, Spanner evaluates the condition for every row
that matches the search index. Referenced columns not stored in the search index
are fetched from the base table.
For example, consider the following example:
CREATE TABLE Albums (
AlbumId STRING(MAX) NOT NULL,
Title STRING(MAX),
Rating FLOAT64,
Genres ARRAY<STRING(MAX)>,
Likes INT64,
Cover BYTES(MAX),
Title_Tokens TOKENLIST AS (TOKENIZE_FULLTEXT(Title)) HIDDEN,
Rating_Tokens TOKENLIST AS (TOKENIZE_NUMBER(Rating)) HIDDEN,
Genres_Tokens TOKENLIST AS (TOKEN(Genres)) HIDDEN
) PRIMARY KEY(AlbumId);
CREATE SEARCH INDEX AlbumsIndex
ON Albums(Title_Tokens, Rating_Tokens, Genres_Tokens)
STORING Likes;
The behavior of queries on this table include the following:
Rating
andGenres
are included in the search index. Spanner accelerates conditions using search index posting lists.SELECT Album FROM Albums WHERE Rating > 4 AND ARRAY_INCLUDES_ANY(Genres, ['jazz'])
The query can combine conjunctions, disjunctions, and negations in any way, including mixing full-text and non-text predicates. This query is fully accelerated by the search index.
SELECT Album FROM Albums WHERE (SEARCH(Title_Tokens, 'car') OR Rating > 4) AND NOT ARRAY_INCLUDES_ANY(Genres, ['jazz'])
Likes
is stored in the index, but the schema doesn't request Spanner to build a token index for its possible values. Therefore, the full-text predicate onTitle
and non-text predicate onRating
is accelerated, but the predicate onLikes
isn't. In Spanner, the query fetches all documents with the term "car" in theTitle
and a rating more than 4, then it filters documents that don't have at least 1000 likes. This query uses a lot of resources if almost all albums have the term "car" in their title and almost all of them have a rating of 5, but few albums have 1000 likes. In such cases, indexingLikes
similarly toRating
saves resources.SELECT Album FROM Albums WHERE SEARCH(Title_Tokens, 'car') AND Rating > 4 AND Likes >= 1000
Cover
isn't stored in the index. The following query does a back join betweenAlbumsIndex
andAlbums
to fetchCover
for all matching albums.SELECT AlbumId, Cover FROM Albums WHERE SEARCH(Title_Tokens, 'car') AND Rating > 4
What's next
- Learn about full-text search queries.
- Learn how to rank search results.
- Learn how to perform a substring search.
- Learn how to paginate search results.
- Learn how to search multiple columns.