This page describes the SEARCH
function and the enhanced query mode, which are
used to perform full-text search
queries on Spanner tables.
Query a search index
Spanner provides the
SEARCH
function to use for search index queries. An example use case would be an
application where users enter text in a search box and the application
sends the user input directly into the SEARCH
function. The 'SEARCH' function
would then use a search index to find that text.
The SEARCH
function requires two arguments:
- A search index name
- A raw search query
The SEARCH
function only works when a search index is defined. The SEARCH
function can be combined with any arbitrary SQL constructs, such as filters,
aggregations, or joins.
The SEARCH
function can't be used with transaction queries.
The following query uses the SEARCH
function to return all albums that have
either friday
or monday
in the title:
SELECT AlbumId
FROM Albums
WHERE SEARCH(AlbumTitle_Tokens, 'friday OR monday')
Raw search queries and the rquery language
The second argument of the SEARCH
function is a raw search query.
Spanner uses a domain-specific language (DSL) called rquery.
The rquery language follows the same rules as the plain-text tokenizer when splitting the input string into distinct terms. This includes segmentation of Asian languages.
For information about using rquery, see rquery syntax.
Enhanced query mode
Spanner offers two full-text search modes: a basic token-based
search and a more advanced mode called enhance_query
. When enabled,
enhance_query
expands the search query to include related terms and synonyms,
increasing the likelihood of finding relevant results.
To enable this option, set the optional argument enhance_query=>true
in the
SEARCH
function. For example, the search query hotl cal
matches the album
Hotel California
.
SELECT AlbumId
FROM Albums
WHERE SEARCH(AlbumTitle_Tokens, 'hotl cal', enhance_query=>true)
The enhance_query
mode is a query-time option. It doesn't affect tokenization.
You can use the same search index with or without enhance_query
.
Google is continuously improving the query enhancement algorithms. As a
result, a query with enhance_query == true
might yield slightly different
results over time.
When the enhance_query
mode is enabled, it might increase the number of terms
that the SEARCH
function is looking for which could slightly elevate latency.
For example, the following query uses a three-second timeout and fails if
enhance_query
is unavailable:
@{require_enhance_query=true, enhance_query_timeout_ms=3000}
SELECT AlbumId
FROM Albums
WHERE SEARCH(AlbumTitle_Tokens, 'fast car', enhance_query=>true)
For more information about using the enhance_query
option, see the
SEARCH function.
SQL query requirements
There are several conditions that a SQL query must meet to use a search index. If these conditions aren't met, the query uses either an alternative query plan or fails if no alternative plan exists.
Queries must meet the following conditions:
SEARCH
andSEARCH_SUBSTRING
functions require a search index. Spanner doesn't support these functions in queries against the base table or secondary indexes.Partitioned indexes must have all partition columns bound by an equality condition in the
WHERE
clause of the query.For example, if a search index is defined as
PARTITION BY x, y
, the query must have a conjunct in theWHERE
clause ofx = <parameter or constant> AND y = <parameter or constant>
. That search index isn't considered by the query optimizer if such a condition is missing.All
TOKENLIST
columns referenced bySEARCH
andSEARCH_SUBSTRING
operators must be indexed in the same search index.For example, consider the following table and index definition:
CREATE TABLE Albums ( AlbumId STRING(MAX) NOT NULL, AlbumTitle STRING(MAX), AlbumStudio STRING(MAX), AlbumTitle_Tokens TOKENLIST AS (TOKENIZE_FULLTEXT(AlbumTitle)) HIDDEN, AlbumStudio_Tokens TOKENLIST AS (TOKENIZE_FULLTEXT(AlbumStudio)) HIDDEN ) PRIMARY KEY(AlbumId); CREATE SEARCH INDEX AlbumsTitleIndex ON Albums(AlbumTitle_Tokens); CREATE SEARCH INDEX AlbumsStudioIndex ON Albums(AlbumStudio_Tokens);
The following query fails because there's no single search index that indexes both
AlbumTitle_Tokens
andAlbumStudio_Tokens
:SELECT AlbumId FROM Albums WHERE SEARCH(AlbumTitle_Tokens, @p1) AND SEARCH(AlbumStudio_Tokens, @p2)
If the sort order column is nullable, both the schema and the query must exclude rows where the sort order column is NULL. For details, see Search index sort order.
If the search index is NULL filtered, the query must include the same NULL-filtering expression that's used in an index. See NULL-filtered search indexes for details.
Search indexes and search functions aren't supported in DML, partitioned DML, or partitioned queries.
Search indexes and search functions are typically used in read-only transactions. If application requirements allow stale results, we recommend running search queries with a staleness duration of 10 seconds or longer. For more information, see Read stale data. This is particularly useful for large search queries that fan out to multiple Paxos groups.
Search indexes and search functions aren't recommended in read-write transactions. During execution, search queries lock an entire index partition; as a result, a high rate of search queries in read-write transactions might cause lock conflicts leading to latency spikes. By default, search indexes aren't automatically selected in read-write transactions. If a query is forced to use a search index in a read-write transaction it fails by default. It will also fail if the query contains any of the search functions. This behavior can be overridden with the
@{ALLOW_SEARCH_INDEXES_IN_TRANSACTION=TRUE}
statement-level hint (but queries are still prone to lock conflicts).
Once index eligibility conditions are met, the query optimizer tries to
accelerate non-text query conditions (like Rating > 4
). If the search index
doesn't include the appropriate TOKENLIST
column, the condition isn't
accelerated and remains a
residual condition.
Query parameters
rquery and other parameters like OFFSET
are specified as either a literal or
a query parameter.
We recommend using query parameters for full-text search rather than string
literals.
Index selection
Spanner typically selects the most efficient index for a query
using cost-based modeling. However, the FORCE_INDEX
hint explicitly instructs
Spanner to use a specific search index. For example, the
following shows how to force Spanner to use the AlbumsIndex
:
SELECT AlbumId
FROM Albums @{FORCE_INDEX=AlbumsIndex}
WHERE SEARCH(AlbumTitle_Tokens, "fifth symphony")
If the specified search index isn't eligible, the query fails, even if there are other eligible search indexes.
Snippets in search results
A snippet is a piece of text extracted from a given string that gives users a sense of what a search result contains, and the reason why the result is relevant to their query.
For example, Gmail uses snippets to indicate the portion of an email that matches the search query:
Having the database generate a snippet has several benefits:
- Convenience: You don't need to implement logic to generate snippets from a search query.
- Efficiency: Snippets reduce the output size from the server.
The SNIPPET
function creates the snippet. It returns the relevant portion of
the original string value along with positions of characters to highlight. The
client can then choose how to display the snippet to the end user (for example,
using highlighted or bold text). The SNIPPET
function removes all HTML tags
from the original string.
For example, the following uses SNIPPET
to retrieve text from AlbumTitle
:
SELECT AlbumId, SNIPPET(AlbumTitle, "Fast Car")
FROM Albums
WHERE SEARCH(AlbumTitle_Tokens, "Fast Car")
What's next
- Learn how to rank search results.
- Learn how to perform a substring search.
- Learn how to paginate search results.
- Learn how to mix full-text and non-text queries.
- Learn how to search multiple columns.