Introduction to search in BigQuery

BigQuery search indexes let you use Google Standard SQL to easily find unique data elements that are buried in unstructured text and semi-structured JSON data, without having to know the table schemas in advance.

With search indexes, BigQuery provides a powerful columnar store and text search in one platform, enabling efficient row lookups when you need to find individual rows of data. A common use case is log analytics. For example, you might want to identify the rows of data associated with a user for General Data Protection Regulation (GDPR) reporting, or to find specific error codes in a text payload.

BigQuery stores and manages the indexes, so when data becomes available in BigQuery, you can immediately retrieve it with the SEARCH function.

Use cases

BigQuery search indexes help you perform the following tasks:

  • Search system, network, or application logs stored in BigQuery tables.
  • Find and remove data elements to comply with regulatory processes.
  • Support developer troubleshooting.
  • Perform security audits.
  • Create a dashboard that requires highly selective search filters.

For more information, see Create a search index and Search with an index.

Pricing

During the preview phase, the processing required to build the search index is free. However, search indexes incur storage costs, and you are charged for the queries that you run. You can find the number of bytes stored in the SEARCH_INDEXES view. There is no charge for the processing required to refresh and maintain the index.

Roles and permissions

To create a search index, you need the bigquery.tables.createIndex IAM permission on the table where you're creating the index. To drop a search index, you need the bigquery.tables.deleteIndex permission. Each of the following predefined IAM roles includes the permissions that you need to work with search indexes:

  • BigQuery Admin (roles/bigquery.admin)
  • BigQuery Data Editor (roles/bigquery.dataEditor)
  • BigQuery Data Owner (roles/bigquery.dataOwner)

Limitations

  • You cannot create a search index directly on a view or materialized view, but calling the SEARCH function on a view of an indexed table makes use of the underlying search index.
  • Search indexes are designed specifically to optimize the SEARCH function. Other types of functions, such as CONTAINS_SUBSTR, are not improved by adding a search index.
  • You cannot create a search index on a table if any of its columns have a policy tag or if the table has a row-level access policy. You can add policy tags or row-level access policies to the table after you create the index.
  • If you rename a base table after you create a search index on it, the index becomes invalid.
  • The SEARCH function is designed for point lookups. Fuzzy searching, typo correction, wildcards, and other types of document searches are not available.
  • If the search index is not yet at 100% coverage, you are still charged for all index storage that is reported in the INFORMATION_SCHEMA.SEARCH_INDEXES view.

Open issues

The following are known issues that the BigQuery team is working on:

  • Attempts to create a search index on a table that is being concurrently modified, such as through DML or streaming ingestions, might stall or fail under certain circumstances.
  • Sometimes when base table content is deleted—such as with full table content deletion, full partition deletion, or expiration—the index is not updated, but you are still charged for the existing index storage. You can drop the search index to avoid any unwanted cost.

What's next