Search with an index

The SEARCH function provides tokenized search on data. This page provides examples of searching in BigQuery. For more details about the syntax of the SEARCH function, see the reference section on Search functions.

SEARCH is designed to be used with an index to optimize point lookups. Although the SEARCH function works for columns of a table that are not indexed, its performance will be greatly improved with an index. Savings in bytes processed and slot milliseconds are maximized when the number of search results make up a relatively small fraction of the total rows because less data is scanned. To determine whether an index was used for a query, see search index usage.

Create a table with a search index

The following table called Logs will be used in examples to show different ways of using the SEARCH function. This example table is quite small, but in practice the performance gains you get with SEARCH improve with the size of the dataset.

WITH Logs AS
 (SELECT 'INFO' as Level, '65.177.8.234' as Source, 'Entry Foo-Bar created' as Message UNION ALL
  SELECT 'WARNING', '132.249.240.10', 'Entry Foo-Bar already exists, created by 65.177.8.234' UNION ALL
  SELECT 'INFO', '94.60.64.181', 'Entry Foo-Bar deleted' UNION ALL
  SELECT 'SEVERE', '4.113.82.10', 'Entry Foo-Bar does not exist, deleted by 94.60.64.181' UNION ALL
  SELECT 'INFO', '181.94.60.64', 'Entry Foo-Baz created')
SELECT * FROM Logs;
+---------+----------------+-------------------------------------------------------+
| Level   | Source         | Message                                               |
+---------+----------------+-------------------------------------------------------+
| INFO    | 65.177.8.234   | Entry Foo-Bar created                                 |
| WARNING | 132.249.240.10 | Entry Foo-Bar already exists, created by 65.177.8.234 |
| INFO    | 94.60.64.181   | Entry Foo-Bar deleted                                 |
| SEVERE  | 4.113.82.10    | Entry Foo-Bar does not exist, deleted by 94.60.64.181 |
| INFO    | 181.94.60.64   | Entry Foo-Baz created                                 |
+---------+----------------+-------------------------------------------------------+

Create a search index on the logs table.

CREATE SEARCH INDEX my_index ON Logs(ALL COLUMNS);

Search an entire table

The following query searches across all columns of the Logs table for the value bar and returns the rows that contain this value, regardless of capitalization.

SELECT * FROM Logs WHERE SEARCH(Logs, 'bar');
+---------+----------------+-------------------------------------------------------+
| Level   | Source         | Message                                               |
+---------+----------------+-------------------------------------------------------+
| INFO    | 65.177.8.234   | Entry Foo-Bar created                                 |
| WARNING | 132.249.240.10 | Entry Foo-Bar already exists, created by 65.177.8.234 |
| INFO    | 94.60.64.181   | Entry Foo-Bar deleted                                 |
| SEVERE  | 4.113.82.10    | Entry Foo-Bar does not exist, deleted by 94.60.64.181 |
+---------+----------------+-------------------------------------------------------+

The following query searches across all columns of the Logs table for the value `94.60.64.181` and returns the rows that contain this value. The backticks allow for an exact search, which is why the last row of the Logs table which contains 181.94.60.64 is omitted.

SELECT * FROM Logs WHERE SEARCH(Logs, '`94.60.64.181`');
+---------+----------------+-------------------------------------------------------+
| Level   | Source         | Message                                               |
+---------+----------------+-------------------------------------------------------+
| INFO    | 94.60.64.181   | Entry Foo-Bar deleted                                 |
| SEVERE  | 4.113.82.10    | Entry Foo-Bar does not exist, deleted by 94.60.64.181 |
+---------+----------------+-------------------------------------------------------+

Search a subset of columns

SEARCH makes it easy to specify a subset of columns within which to search for data. The following query searches the Message column of the Logs table for the value 94.60.64.181 and returns the rows that contain this value.

SELECT * FROM Logs WHERE SEARCH(Message, '`94.60.64.181`');
+---------+----------------+-------------------------------------------------------+
| Level   | Source         | Message                                               |
+---------+----------------+-------------------------------------------------------+
| SEVERE  | 4.113.82.10    | Entry Foo-Bar does not exist, deleted by 94.60.64.181 |
+---------+----------------+-------------------------------------------------------+

The following query searches both the Source and Message columns of the Logs table. It returns the rows that contain the value 94.60.64.181 from either column.

SELECT * FROM Logs WHERE SEARCH((Source, Message), '`94.60.64.181`');
+---------+----------------+-------------------------------------------------------+
| Level   | Source         | Message                                               |
+---------+----------------+-------------------------------------------------------+
| INFO    | 94.60.64.181   | Entry Foo-Bar deleted                                 |
| SEVERE  | 4.113.82.10    | Entry Foo-Bar does not exist, deleted by 94.60.64.181 |
+---------+----------------+-------------------------------------------------------+

If a table table has many columns and you want to search most of them, it may be easier to specify only the columns to exclude from the search. The following query searches across all columns of the Logs table except for the Message column. It returns the rows of any columns other than Message that contains the value 94.60.64.181.

SELECT *
FROM Logs
WHERE SEARCH(
  (SELECT AS STRUCT Logs.* EXCEPT (Message)), '`94.60.64.181`');
+---------+----------------+-------------------------------------------------------+
| Level   | Source         | Message                                               |
+---------+----------------+-------------------------------------------------------+
| INFO    | 94.60.64.181   | Entry Foo-Bar deleted                                 |
+---------+----------------+-------------------------------------------------------+

Search index usage

To determine whether a search index was used for a query, look at the query's execution details and confirm that the search index is listed.

Query execution specifying search index.

More detailed information on search index usage is available through the searchStatistics field in the Jobs.Get API method. The indexUsageMode field in searchStatistics indicates whether a search index was used with the following values:

  • UNUSED: no search index was used.
  • PARTIALLY_USED: part of the query used search indexes and part did not.
  • FULLY_USED: every SEARCH function in the query used a search index.

When indexUsageMode is UNUSED or PARTIALLY_USED, the indexUnusuedReasons field contains information about why search indexes were not used in the query.

To view searchStatistics for a query, run the bq show command.

bq show --format=prettyjson -j JOB_ID

Example

Suppose you run a query that calls the SEARCH function on data in a table. You can view the job details of the query to find the job ID, then run the bq show command to see more information:

bq show --format=prettyjson --j my_project:US.bquijob_123x456_789y123z456c

The output contains many fields, including searchStatistics, which looks similar to the following. In this example, indexUsageMode indicates that the index was not used. The reason is that the table doesn't have a search index. To solve this problem, create a search index on the table. See the indexUnusedReason code field for a list of all reasons a search index might not be used in a query.

"searchStatistics": {
  "indexUnusedReasons": [
    {
      "baseTable": {
        "datasetId": "my_dataset",
        "projectId": "my_project",
        "tableId": "my_table"
      },
      "code": "INDEX_CONFIG_NOT_AVAILABLE",
      "message": "There is no search index configuration for the base table `my_project:my_dataset.my_table`."
    }
  ],
  "indexUsageMode": "UNUSED"
},