Introduction to search in BigQuery
BigQuery search indexes let you use GoogleSQL to efficiently find unique data elements that are buried in unstructured text and semi-structured JSON data, without having to know the table schemas in advance.
With search indexes, BigQuery provides a powerful columnar store and text search in one platform, enabling efficient row lookups when you need to find individual rows of data. A common use case is log analytics. For example, you might want to identify the rows of data associated with a user for General Data Protection Regulation (GDPR) reporting, or to find specific error codes in a text payload.
BigQuery stores and manages your indexes, so that when data becomes
available in BigQuery, you can immediately retrieve it with the
SEARCH
function
or other operators and functions,
such as the equal (=
), IN
, or LIKE
operators and certain string and JSON
functions. To optimize your searches, read about
best practices.
Use cases
BigQuery search indexes help you perform the following tasks:
- Search system, network, or application logs stored in BigQuery tables.
- Identify data elements for deletion to comply with regulatory processes.
- Support developer troubleshooting.
- Perform security audits.
- Create a dashboard that requires highly selective search filters.
- Search pre-processed data for exact matches.
For more information, see Create a search index and Search with an index.
Pricing
There is no charge for the processing required to build and refresh your search
indexes when the total size of indexed tables in your organization is below
your region's
limit. To support indexing beyond this limit,
you need to
provide your own reservation
for handling the index-management jobs.
Search indexes incur storage costs when they are active.
You can find the index storage size in the
INFORMATION_SCHEMA.SEARCH_INDEXES
view.
Roles and permissions
To create a search index, you need the
bigquery.tables.createIndex
IAM permission
on the table where you're creating the index. To drop a search index, you need
the bigquery.tables.deleteIndex
permission. Each of the following predefined
IAM roles includes the permissions that you need to work with
search indexes:
- BigQuery Data Owner (
roles/bigquery.dataOwner
) - BigQuery Data Editor (
roles/bigquery.dataEditor
) - BigQuery Admin (
roles/bigquery.admin
)
Limitations
- You cannot create a search index directly on a view or materialized view, but
calling the
SEARCH
function on a view of an indexed table makes use of the underlying search index. - If you rename a table after you create a search index on it, the index becomes invalid.
- The
SEARCH
function is designed for point lookups. Fuzzy searching, typo correction, wildcards, and other types of document searches are not available. - If the search index is not yet at 100% coverage, you are still charged for all
index storage that is reported in the
INFORMATION_SCHEMA.SEARCH_INDEXES
view. - Queries that contain the
SEARCH
function are not accelerated by BigQuery BI Engine. Search indexes are not used when the indexed table is modified by a DML statement, but they can be used when the predicate that is optimizable by search indexes is part of a subquery in a DML statement.
- A search index is not used in the following query:
DELETE FROM my_dataset.indexed_table WHERE SEARCH(user_id, '123');
- A search index can be used in the following query:
DELETE FROM my_dataset.other_table WHERE user_id IN ( SELECT user_id FROM my_dataset.indexed_table WHERE SEARCH(user_id, '123') );
Search indexes are not used when the query references Materialized Views.
Search indexes are not used in a multi-statement transaction query.
What's next
- Learn more about creating a search index.
- Learn more about searching in a table with a search index.