Data Catalog search syntax

This document describes the syntax for Data Catalog search queries.

Simple predicates

In its simplest form, a search query comprises a single predicate. For example, the predicate foo matches the following Data Catalog entities:

  • An entity with the description "This is the foo script."
  • An entity with the name foo.bar.

Qualified predicates

You can qualify a predicate by prefixing it with a key that restricts the matching to a specific piece of metadata. For example, name:foo will select entities whose names match the predicate foo.

Data Catalog supports the following qualifiers:

Qualifier Description
name:x Matches x as a substring of the data asset ID.
displayname:x Matches x as a substring of the data asset display name.
column:x Matches x as a substring of the column name in the schema of the data asset.
description:x Matches x as a token in the data asset description.
labels:bar Matches BigQuery data assets that have a label (with any value) whose key contains bar as a substring.
labels:bar:x Matches x as a token in the value of a label bar attached to a BigQuery data asset.
type=<type> Matches data assets of a specific object type or subtype. Subtypes can be added with the format <type>.<sub-type>.
Types and subtypes include:
  • type=table matches all tables.
  • type=dataset matches all BigQuery datasets.
  • type=table.view or type=view matches all views.
  • type=tag_template matches all tag templates.
  • type=entry_group matches all entry groups.
  • type=data_stream matches all Pub/Sub topics.
projectid:bar Matches data assets within Cloud projects that match bar as a substring in the ID.
orgid:bar Matches data assets within Cloud organizations that match bar as a substring in the ID.
system=<system> Matches all data assets from a specified system.
Systems include:
  • system=bigquery matches all data assets from BigQuery.
  • system=cloud_pubsub matches all data assets from Pub/Sub.
  • system=data_catalog matches all data assets created in Data Catalog.
tag:x Matches data assets where x matches any substring in <tag_template_project_id>.<tag_template_id>.<tag_field_id>.
Examples:
  • tag:data_owner matches data assets that have the data_owner tag.
  • tag:data_gov_template matches data assets that have been tagged with the data_gov_template tag template.
  • tag:mycloudproject.data_gov_template matches data assets tagged with the data_gov_template template in the mycloudproject project.
tag:key:val Matches key as a substring of the tag field ID, tag template ID, or Cloud project ID of a tag template. When the tag field is of type string, matches val as a token in the tag value. When the tag field is of type boolean, enum, or double, matches val exactly against the tag value.
Permitted operators:
  • string: ":"
  • boolean and enum: "="
  • double: "=", "<", ">", "<=", ">="
  • timestamp: ":", "=", "<", ">", "<=", ">="
Examples:
  • string: tag:data_owner:@mail.com matches data assets whose data_owner tag values match @mail.com.
  • boolean: tag:data_gov_template.hasPII=true matches data assets whose hasPII boolean tag in the data_gov_template template is true.
  • enum: tag:certification_level_1=HIGHEST matches data assets whose certification_level_1 enum tag has the value HIGHEST.
  • double: tag:datascore=9 matches data assets with datascore double tags that have value 9.
  • timestamp: tag:expiredDate:2019-01-01 matches data assets that have an expiredDate tag of 2019-01-01.
  • timestamp: tag:expiredDate<2019-02 matches data assets that have an expiredDate tag prior to 2019-02-01T00:00:00.
createtime Finds data assets that were created within, prior to, or after a given date or time.
Examples:
  • createtime:2019-01-01 matches data assets created on 2019-01-01.
  • createtime<2019-02 matches data assets created prior to 2019-02-01T00:00:00.
  • createtime>2019-02 matches data assets created after 2019-02-01T00:00:00.
updatetime Finds data assets that were updated within, prior to, or after a given date or time.
Examples:
  • updatetime:2019-01-01 matches data assets updated on 2019-01-01.
  • updatetime<2019-02 matches data assets updated prior to 2019-02-01T00:00:00.
  • updatetime>2019-02 matches data assets updated after 2019-02-01T00:00:00.
policytag:x Matches x as a substring of the policy tag display name. Finds all assets that use the matching policy tag or its descendants.
policytagid=x Matches x as a policy tag or taxonomy ID. Finds all assets that use the matching policy tag or its descendants.
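
The permitted operators for tag:key:val predicates depend on the tag field type, which is easy to get wrong when queries are assembled in code. The following is a minimal sketch of a query-string builder that enforces the operator list above. The names PERMITTED_OPERATORS and tag_predicate are hypothetical helpers for illustration, not part of any Data Catalog client library, and the function only builds a string; it does not call the service.

```python
# Hypothetical helper for illustration; not part of any Data Catalog library.
# Permitted operators per tag field type, copied from the list above.
PERMITTED_OPERATORS = {
    "string": {":"},
    "boolean": {"="},
    "enum": {"="},
    "double": {"=", "<", ">", "<=", ">="},
    "timestamp": {":", "=", "<", ">", "<=", ">="},
}


def tag_predicate(key: str, op: str, value: object, field_type: str) -> str:
    """Build a tag:key<op>val predicate string.

    Raises ValueError if the operator is not permitted for the field type.
    """
    allowed = PERMITTED_OPERATORS.get(field_type)
    if allowed is None:
        raise ValueError(f"unknown tag field type: {field_type!r}")
    if op not in allowed:
        raise ValueError(
            f"operator {op!r} is not permitted for {field_type} tag fields"
        )
    # Render Python booleans in lowercase, as in the hasPII=true example.
    if isinstance(value, bool):
        value = "true" if value else "false"
    return f"tag:{key}{op}{value}"


print(tag_predicate("data_gov_template.hasPII", "=", True, "boolean"))
print(tag_predicate("expiredDate", "<", "2019-02", "timestamp"))
```

Rejecting a disallowed operator before the query is sent keeps mistakes such as using = on a string tag field visible at the point where the query is built.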

Logical operators

A query can consist of several predicates combined with logical operators. If you don't specify an operator, logical AND is implied. For example, foo bar returns entities that match both predicate foo and predicate bar.

Both logical AND and logical OR are supported. For example, foo OR bar returns entities that match predicate foo or predicate bar.

You can negate a predicate with a - or NOT prefix. For example, -name:foo returns all entities with names that do not match the predicate foo.
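
As an illustration of how these operators combine into a query string, here is a small sketch in Python. The helpers qualified, all_of, any_of, and negate are hypothetical names introduced for this example. They only concatenate strings according to the rules in this section and do not group or nest expressions, since operator precedence is not described here.

```python
# Hypothetical string helpers for illustration; they do not call Data Catalog.

def qualified(key: str, value: str) -> str:
    """Prefix a predicate with a qualifier, for example name:foo."""
    return f"{key}:{value}"


def all_of(*predicates: str) -> str:
    """Join predicates with the implied logical AND (a space)."""
    return " ".join(predicates)


def any_of(*predicates: str) -> str:
    """Join predicates with an explicit logical OR."""
    return " OR ".join(predicates)


def negate(predicate: str) -> str:
    """Negate a predicate with the - prefix."""
    return f"-{predicate}"


print(all_of("foo", "bar"))
print(any_of("foo", "bar"))
print(negate(qualified("name", "foo")))
```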

Abbreviated syntax

An abbreviated search syntax is also available, using | for OR operators and , for AND operators.

For example, to search for entries inside one of many projects using the OR operator, you can use:

projectid:(pid1|pid2|pid3|pid4)

Instead of:

projectid:pid1 OR projectid:pid2 OR projectid:pid3 OR projectid:pid4

To search for entries with matching column names:

  • AND: column:(name1, name2, name3)
  • OR: column:(name1|name2|name3)

This abbreviated syntax works for all of the qualified predicates listed above.
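
To make the equivalence between the abbreviated and long forms concrete, the following sketch expands a single abbreviated predicate into its long form. The function expand_abbreviated and its regular expression are assumptions made for this example, not part of Data Catalog. The sketch handles one key:(...) group at a time, writes AND as the implied space, and refuses groups that mix | and , because that case is not covered in this document.

```python
import re

# Hypothetical helper for illustration; not part of any Data Catalog library.
# Matches a single qualifier followed by a parenthesized value list.
_ABBREVIATED = re.compile(r"^(?P<key>\w+):\((?P<body>[^()]*)\)$")


def expand_abbreviated(predicate: str) -> str:
    """Expand key:(a|b) or key:(a, b) into the long form.

    Predicates that are not in abbreviated form are returned unchanged.
    """
    match = _ABBREVIATED.match(predicate.strip())
    if match is None:
        return predicate
    key, body = match.group("key"), match.group("body")
    if "|" in body and "," in body:
        raise ValueError("mixing | and , in one group is not handled by this sketch")
    if "|" in body:
        values, joiner = body.split("|"), " OR "
    else:
        # A comma means AND, written here as the implied space.
        values, joiner = body.split(","), " "
    return joiner.join(f"{key}:{v.strip()}" for v in values if v.strip())


print(expand_abbreviated("projectid:(pid1|pid2|pid3|pid4)"))
print(expand_abbreviated("column:(name1, name2, name3)"))
```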