Search syntax for Dataplex Universal Catalog

This document describes the syntax for Dataplex Universal Catalog search queries. Before you read it, make sure that you understand the metadata management concepts in Dataplex Universal Catalog, such as entries, aspects, aspect types, entry groups, and entry types. For more information, see About metadata management in Dataplex Universal Catalog.

Dataplex Universal Catalog offers two search modes: keyword search and semantic search (Preview).

Keyword search lets you find resources using specific keywords, filters, and a defined syntax.

Semantic search extends keyword search to support natural language queries. It lets you find resources using everyday language, eliminating the need for complex syntax.

This document covers syntax for both keyword and semantic search.

To launch a Dataplex Universal Catalog search query in the Google Cloud console, go to the Dataplex Universal Catalog Search page and select Dataplex Universal Catalog as the search platform.

For more information, see Search for resources in Dataplex Universal Catalog.

You can find assets by entering a term or phrase without any specific syntax. Dataplex Universal Catalog performs a broad search by matching your query against several metadata fields, including the following:

  • Name, display name, or description of a resource
  • Type of a resource
  • Project ID
  • Overview description
  • Column name (or nested column name) in the schema of a resource
  • Column description
  • Fully qualified name
  • Contacts
  • Aspects

Search with query syntax

For more precise searches, you can construct a query using specific syntax, including qualifiers, logical operators, and aspect searches.

Qualified predicates

You can qualify a predicate by prefixing it with a key that restricts the matching to a specific piece of metadata:

  • An equal sign (=) restricts the search to an exact match.
  • A colon (:) after the key matches the predicate to either a substring or a token within the value in the search results.

Tokenization splits the stream of text into a series of tokens, with each token usually corresponding to a single word.

For example:

  • name:foo selects resources with names that contain the foo substring, like foo1 and barfoo.
  • description:foo selects resources with the foo token in the description, such as a description that reads "bar and foo".
  • location=foo matches resources in a specified location with foo as the location name.
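The difference between the three kinds of matching can be illustrated with a small model. This is a minimal sketch, not the service's implementation: the function names are hypothetical, the tokenizer is a simplified assumption (it splits on any non-word character), and case handling is not modeled.

```python
import re


def tokenize(text: str) -> list[str]:
    """Split text into word tokens (an assumed, simplified tokenizer)."""
    return re.findall(r"\w+", text)


def matches_substring(value: str, term: str) -> bool:
    """Model of key:term for substring-matched keys such as name."""
    return term in value


def matches_token(value: str, term: str) -> bool:
    """Model of key:term for token-matched keys such as description."""
    return term in tokenize(value)


def matches_exact(value: str, term: str) -> bool:
    """Model of key=term."""
    return value == term


print(matches_substring("barfoo", "foo"))   # substring match on a name
print(matches_token("bar and foo", "foo"))  # token match on a description
print(matches_token("barfoo", "foo"))       # no standalone foo token
```

Under these assumptions, barfoo satisfies the substring check but not the token check, because foo never appears as a separate word.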

The behavior of these qualifiers can vary slightly between search modes, as detailed in the following sections.

Keyword search qualifiers

The predicate keys type, system, location, and orgid support only the exact match (=) qualifier, not the substring qualifier (:). For example, type=foo or orgid=number.

Dataplex Universal Catalog supports the following qualifiers for keyword search:

Qualifier Description
name:x Matches x as a substring of the resource ID.
displayname:x Matches x as a substring of the resource display name.
column:x Matches x as a substring of the column name (or nested column name) in the schema of the resource.
description:x Matches x as a token in the resource description.
label:bar Matches BigQuery resources that have a label (with some value) and the label key has bar as a substring.
label=bar Matches BigQuery resources that have a label (with some value) and the label key equals bar as a string.
label:bar:x Matches x as a substring in the value of a label with key bar attached to a BigQuery resource.
label=foo:bar Matches BigQuery resources that have a label where the key equals foo and the value equals bar.
label.foo=bar Matches BigQuery resources that have a label where the key equals foo and the value equals bar.
label.foo Matches BigQuery resources that have a label whose key equals foo as a string.
type=TYPE Matches resources of a specific entry type or its type alias.
projectid:bar Matches resources within Google Cloud projects that match bar as a substring in the ID.
parent:x Matches x as a substring of the hierarchical path of a resource. The parent path is the fully_qualified_name of the parent resource.
orgid=number Matches resources within a Google Cloud organization with the exact ID value of number.
system=SYSTEM Matches resources from a specified system.
location=LOCATION

Matches resources in a specified location with an exact name. For example, location=us-central1 matches assets hosted in Iowa.

BigQuery Omni assets support this qualifier by using the BigQuery Omni location name. For example, location=aws-us-east-1 matches BigQuery Omni assets in Northern Virginia.

createtime

Finds resources that were created within, before, or after a given date or time.

For example:

  • createtime:2019-01-01 matches resources created on 2019-01-01.
  • createtime<2019-02 matches resources created before 2019-02-01T00:00:00.
  • createtime>2019-02 matches resources created after 2019-02-01T00:00:00.

Timestamp format: YYYY-MM-DDThh:mm:ss

All timestamps must be in GMT; time zones are not supported. Partial timestamps, hyphen (-) date separators, and slash (/) date separators are supported.

For example:

  • 2010-10-22T05:36:24
  • 2010-10-22T05:36
  • 2010-10-22T05
  • 2010-10-22
  • 2010-10
  • 2010
  • 2010/10/22

updatetime

Finds resources that were updated within, before, or after a given date or time.

For example:

  • updatetime:2019-01-01 matches resources updated on 2019-01-01.
  • updatetime<2019-02 matches resources updated before 2019-02-01T00:00:00.
  • updatetime>2019-02 matches resources updated after 2019-02-01T00:00:00.

Timestamp format: YYYY-MM-DDThh:mm:ss

All timestamps must be in GMT; time zones are not supported. Partial timestamps, hyphen (-) date separators, and slash (/) date separators are supported.

For example:

  • 2010-10-22T05:36:24
  • 2010-10-22T05:36
  • 2010-10-22T05
  • 2010-10-22
  • 2010-10
  • 2010
  • 2010/10/22

fully_qualified_name:x Matches x as a substring of fully_qualified_name.
fully_qualified_name=x Matches x as fully_qualified_name.
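
The partial timestamp forms listed for createtime and updatetime can be read as "the start of the period named". The following sketch, assuming a hypothetical helper named parse_partial_timestamp, shows one way to resolve each documented form to a full GMT timestamp, so that 2019-02 becomes 2019-02-01T00:00:00. It only illustrates the documented examples and is not how the service parses queries.

```python
from datetime import datetime

# Formats taken from the examples in this section, most specific first.
_FORMATS = [
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%dT%H:%M",
    "%Y-%m-%dT%H",
    "%Y-%m-%d",
    "%Y-%m",
    "%Y",
]


def parse_partial_timestamp(text: str) -> datetime:
    """Resolve a full or partial timestamp to a naive datetime in GMT.

    Missing components default to the start of the period.
    """
    normalized = text.replace("/", "-")  # slash date separators are allowed
    for fmt in _FORMATS:
        try:
            return datetime.strptime(normalized, fmt)
        except ValueError:
            continue
    raise ValueError(f"unsupported timestamp: {text!r}")


print(parse_partial_timestamp("2019-02").isoformat())
print(parse_partial_timestamp("2010/10/22").isoformat())
```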

Semantic search qualifiers

The predicate keys type, system, location, and description, and aspect search (excluding has) support only the exact match (=) qualifier, not the substring qualifier (:). For example, type=foo.

Dataplex Universal Catalog supports the following qualifiers for semantic search:

Qualifier Description
name:x Matches x as a substring of the resource ID or resource display name.
displayname:x Matches x as a substring of the resource display name.
column:x Matches x as a substring of the column name (or nested column name) in the schema of the resource.
description:x Matches x as a token in the resource description.
labels:bar Matches BigQuery resources that have a label (with some value) and the label key has bar as a substring.
labels=bar Matches BigQuery resources that have a label (with some value) and the label key equals bar as a string.
labels.bar:x Matches x as a substring in the value of a label with key bar attached to a BigQuery resource.
labels.foo=bar Matches BigQuery resources that have a label where the key equals foo and the value equals bar.
type=TYPE Matches resources of a specific entry type or its type alias.
projectid:bar Matches resources within Google Cloud projects that match bar as a substring in the ID.
parent:x Matches x as a substring of the hierarchical path of a resource.
system=SYSTEM Matches resources from a specified system.
location=LOCATION

Matches resources in a specified location with an exact name. For example, location=us-central1 matches assets hosted in Iowa.

BigQuery Omni assets support this qualifier by using the BigQuery Omni location name. For example, location=aws-us-east-1 matches BigQuery Omni assets in Northern Virginia.

createtime

Finds resources that were created within, before, or after a given date or time.

For example:

  • createtime:2019-01-01 matches all resources created on 2019-01-01.
  • createtime<2019-02 matches all resources created before 2019-02-01T00:00:00.
  • createtime>2019-02 matches all resources created after 2019-02-01T00:00:00.
  • createtime>-30d matches all resources created in the last 30 days.
  • createtime<=-30d matches all resources created 30 days ago or earlier.
  • createtime=-1d matches all resources created on the previous day.

Timestamp format: YYYY-MM-DDThh:mm:ss

All timestamps must be in GMT; time zones are not supported. Partial timestamps, hyphen (-) date separators, and slash (/) date separators are supported.

For example:

  • 2010-10-22T05:36:24
  • 2010-10-22T05:36
  • 2010-10-22T05
  • 2010-10-22
  • 2010-10
  • 2010
  • 2010/10/22

updatetime

Finds resources that were updated within, before, or after a given date or time.

For example:

  • updatetime:2019-01-01 matches all resources updated on 2019-01-01.
  • updatetime<2019-02 matches all resources updated before 2019-02-01T00:00:00.
  • updatetime>2019-02 matches all resources updated after 2019-02-01T00:00:00.
  • updatetime>-30d matches all resources updated in the last 30 days.
  • updatetime<-30d matches all resources updated 30 days ago or earlier.
  • updatetime=-1d matches all resources updated on the previous day.
  • updatetime>=-30d matches all resources updated in the last 30 days.
  • updatetime<=-30d matches all resources updated 30 days ago or earlier.

Timestamp format: YYYY-MM-DDThh:mm:ss

All timestamps must be in GMT; time zones are not supported. Partial timestamps, hyphen (-) date separators, and slash (/) date separators are supported.

For example:

  • 2010-10-22T05:36:24
  • 2010-10-22T05:36
  • 2010-10-22T05
  • 2010-10-22
  • 2010-10
  • 2010
  • 2010/10/22
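
The relative offsets used in the semantic search examples, such as -30d, can be thought of as a cutoff counted back from the current time. The following is a minimal sketch of that reading. The helper names are hypothetical, only the d (days) unit that appears in the examples is handled, and the cutoff is measured from the exact current instant, which is an assumption: the service might round to day boundaries.

```python
import re
from datetime import datetime, timedelta, timezone


def relative_cutoff(offset: str, now: datetime) -> datetime:
    """Turn an offset such as '-30d' into the instant that many days before now."""
    match = re.fullmatch(r"-(\d+)d", offset)
    if match is None:
        raise ValueError(f"unsupported offset: {offset!r}")
    return now - timedelta(days=int(match.group(1)))


def created_after(offset: str, created: datetime, now: datetime) -> bool:
    """Model of createtime>OFFSET: created more recently than the cutoff."""
    return created > relative_cutoff(offset, now)


now = datetime(2024, 3, 31, 12, 0, tzinfo=timezone.utc)
print(relative_cutoff("-30d", now).isoformat())
print(created_after("-30d", datetime(2024, 3, 15, tzinfo=timezone.utc), now))
```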

To search for entries based on their attached aspects, use the following query syntax.

Qualifier Description
aspect:x Matches x as a substring of the full path to the aspect type of an aspect that is attached to the entry, in the format projectid.location.ASPECT_TYPE_ID
aspect=x Matches x as the full path to the aspect type of an aspect that is attached to the entry, in the format projectid.location.ASPECT_TYPE_ID
aspect:xOPERATORvalue

Searches for aspect field values. Matches x as a substring of the full path to the aspect type and field name of an aspect that is attached to the entry, in the format projectid.location.ASPECT_TYPE_ID.FIELD_NAME

The list of supported operators depends on the type of field in the aspect, as follows:

  • String: = (exact match) and : (substring)
  • All number types: =, :, <, >, <=, >=, =>, =<
  • Enum: =
  • Datetime: same as for numbers, but the values to compare are treated as datetimes instead of numbers
  • Boolean: =

Only top-level fields of the aspect are searchable.

For example, all of the following queries match entries where the value of the is-enrolled field in the employee-info aspect is true. Other entries that match on the substring are also returned.

  • aspect:example-project.us-central1.employee-info.is-enrolled=true
  • aspect:example-project.us-central1.employee=true
  • aspect:employee=true

Qualifier Description
has:x Matches x as a substring of the full path to the aspect type of an aspect that is attached to the entry, in the format projectid.location.ASPECT_TYPE_ID
has=x Matches x as the full path to the aspect type of an aspect that is attached to the entry, in the format projectid.location.ASPECT_TYPE_ID
has:xOPERATORvalue

Searches for aspect field values. Matches x as a substring of the full path to the aspect type and field name of an aspect that is attached to the entry, in the following formats:

  • Syntax for system aspect types:

    • ASPECT_TYPE_ID.FIELD_NAME
    • dataplex-types.ASPECT_TYPE_ID.FIELD_NAME
    • dataplex-types.LOCATION.ASPECT_TYPE_ID.FIELD_NAME

    For example, the following queries match entries where the value of the type field in the bigquery-dataset aspect is default:

    • bigquery-dataset.type=default
    • dataplex-types.bigquery-dataset.type=default
    • dataplex-types.global.bigquery-dataset.type=default
  • Syntax for custom aspect types:

    • If the aspect is created in the global region: PROJECT_ID.ASPECT_TYPE_ID.FIELD_NAME
    • If the aspect is created in a specific region: PROJECT_ID.REGION.ASPECT_TYPE_ID.FIELD_NAME

    For example, the following queries match entries where the value of the is-enrolled field in the employee-info aspect is true.

    • example-project.us-central1.employee-info.is-enrolled=true
    • example-project.employee-info.is-enrolled=true

    The list of supported operators depends on the type of field in the aspect, as follows:

    • String: = (exact match)
    • All number types: =, :, <, >, <=, >=, =>, =<
    • Enum: =
    • Datetime: same as for numbers, but the values to compare are treated as datetimes instead of numbers
    • Boolean: =

Only top-level fields of the aspect are searchable.
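
The path formats for system and custom aspect types can be assembled mechanically. The following sketch uses two hypothetical helpers to build the paths shown in the examples above. It only joins strings and does not check that the aspect type or field exists.

```python
from typing import Optional


def custom_aspect_field_path(
    project_id: str,
    aspect_type_id: str,
    field_name: str,
    region: Optional[str] = None,
) -> str:
    """Build PROJECT_ID[.REGION].ASPECT_TYPE_ID.FIELD_NAME.

    Omit region for aspect types created in the global region.
    """
    parts = [project_id]
    if region is not None:
        parts.append(region)
    parts += [aspect_type_id, field_name]
    return ".".join(parts)


def system_aspect_field_paths(
    aspect_type_id: str, field_name: str, location: str = "global"
) -> list[str]:
    """Return the three accepted spellings for a system aspect type field."""
    short = f"{aspect_type_id}.{field_name}"
    return [
        short,
        f"dataplex-types.{short}",
        f"dataplex-types.{location}.{short}",
    ]


print(custom_aspect_field_path("example-project", "employee-info", "is-enrolled", "us-central1") + "=true")
print([p + "=default" for p in system_aspect_field_paths("bigquery-dataset", "type")])
```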

Logical operators

A query can consist of several predicates with logical operators. If you don't specify an operator, logical AND is implied. For example, foo bar returns resources that match both predicate foo and predicate bar.

Logical AND and logical OR are supported. For example, foo OR bar.

You can negate a predicate with a - (hyphen) or NOT prefix. For example, -name:foo returns resources with names that don't match the predicate foo.
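
When queries are generated by a script, small helpers keep the operators consistent. The following sketch assumes three hypothetical functions that compose predicate strings using the rules above: a space for the implied AND, the OR keyword, and the hyphen prefix for negation. It does not handle nesting or operator precedence.

```python
def all_of(*predicates: str) -> str:
    """Join predicates with the implied logical AND (a space)."""
    return " ".join(predicates)


def any_of(*predicates: str) -> str:
    """Join predicates with logical OR."""
    return " OR ".join(predicates)


def negate(predicate: str) -> str:
    """Negate a predicate with the hyphen prefix."""
    return f"-{predicate}"


print(all_of("foo", "bar"))
print(any_of("foo", "bar"))
print(all_of("description:foo", negate("name:foo")))
```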

Abbreviated syntax

An abbreviated search syntax is also available, using | (vertical bar) for OR operators and , (comma) for AND operators.

For example, to search for entries inside one of many projects using the OR operator, you can use the following abbreviated syntax:

projectid:(id1|id2|id3|id4)

The same search without using abbreviated syntax looks like the following:

projectid:id1 OR projectid:id2 OR projectid:id3 OR projectid:id4

To search for entries with matching column names, use the following:

  • AND: column:(name1, name2, name3)
  • OR: column:(name1|name2|name3)

This abbreviated syntax works for the qualified predicates except for label in keyword search.
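
The relationship between the abbreviated and full forms can be sketched as a simple expansion. The expand_abbreviated function below is a hypothetical helper that handles a single key:(...) group whose values are separated only by | or only by commas. It writes AND as the implied operator (a space), and it does not model the label exception.

```python
import re

_ABBREVIATED = re.compile(r"(\w+):\(([^()]*)\)")


def expand_abbreviated(query: str) -> str:
    """Expand key:(a|b) to 'key:a OR key:b' and key:(a, b) to 'key:a key:b'."""
    match = _ABBREVIATED.fullmatch(query.strip())
    if match is None:
        raise ValueError(f"not an abbreviated predicate: {query!r}")
    key, body = match.groups()
    if "|" in body and "," in body:
        raise ValueError("mixing | and , is not handled by this sketch")
    if "|" in body:
        separator, joiner = "|", " OR "
    else:
        separator, joiner = ",", " "  # a space is the implied logical AND
    values = [value.strip() for value in body.split(separator)]
    return joiner.join(f"{key}:{value}" for value in values)


print(expand_abbreviated("projectid:(id1|id2|id3|id4)"))
print(expand_abbreviated("column:(name1, name2, name3)"))
```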

What's next