Search syntax for Dataplex Catalog

This document describes the syntax for Dataplex search queries. Before you read this document, it is important that you understand the concepts of Dataplex Catalog such as data entries, aspects, aspect types, entry groups, and entry types. For more information, see Dataplex Catalog overview.

To launch a Dataplex Catalog search query in the Google Cloud console, go to the Dataplex Search page and select Dataplex Catalog as the search mode.

Go to Search

For more information, see Search for data assets in Dataplex Catalog.

In its simplest form, a Dataplex Catalog search query consists of a single predicate. Such a predicate can match several pieces of metadata:

  • A substring of a name, display name, or description of a data asset
  • A substring of the type of a data asset
  • A substring of a column name (or nested column name) in the schema of a data asset
  • A substring of a project ID
  • A string from an overview description

For example, the predicate foo matches the following entities:

  • Data asset with the name foo.bar
  • Data asset with the display name Foo Bar
  • Data asset with the description This is the foo script
  • Data asset with the exact type foo
  • Column foo_bar in the schema of a data asset
  • Nested column foo_bar in the schema of a data asset
  • Project prod-foo-bar
  • Data asset with an overview containing the word foo

Qualified predicates

You can qualify a predicate by prefixing it with a key that restricts the matching to a specific piece of metadata:

  • An equal sign (=) restricts the search to an exact match.
  • A colon (:) after the key matches the predicate to either a substring or a token within the value in the search results.

Tokenization splits the stream of text into a series of tokens, with each token usually corresponding to a single word.

For example:

  • name:foo selects entities with names that contain the foo substring, like foo1 and barfoo.
  • description:foo selects entities with the foo token in the description, like bar and foo.
  • location=foo matches data assets in a specified location with foo as the location name.

The predicate keys type, system, location, and orgid support only the exact match (=) qualifier, not the substring qualifier (:). For example, type=foo or orgid=number.

Dataplex Catalog supports the following qualifiers:

Qualifier Description
name:x Matches x as a substring of the data asset ID.
displayname:x Match x as a substring of the data asset display name.
column:x Matches x as a substring of the column name (or nested column name) in the schema of the data asset.
description:x Matches x as a token in the data asset description.
label:bar Matches BigQuery data assets that have a label (with some value) and the label key has bar as a substring.
label=bar Matches BigQuery data assets that have a label (with some value) and the label key equals bar as a string.
label:bar:x Matches x as a substring in the value of a label with key bar attached to a BigQuery data asset.
label=foo:bar Matches BigQuery data assets where the key equals foo and the key value equals bar.
label.foo=bar Matches BigQuery data assets where the key equals foo and the key value equals bar.
label.foo Matches BigQuery data assets that have a label whose key equals foo as a string.
type=TYPE Matches data assets of a specific entry type or its type alias.
projectid:bar Matches data assets within Google Cloud projects that match bar as a substring in the ID.
parent:x Matches x as a substring of the hierarchical path of a data asset. The parent path is a fully_qualified_name of the parent resource.
orgid=number Matches data assets within a Google Cloud organization with the exact ID value of number.
system=SYSTEM Matches data assets from a specified system.
location=LOCATION

Matches data assets in a specified location with an exact name. For example, location=us-central1 matches assets hosted in Iowa.

BigQuery Omni assets support this qualifier by using the BigQuery Omni location name. For example, location=aws-us-east-1 matches BigQuery Omni assets in Northern Virginia.

createtime

Finds data assets that were created within, before, or after a given date or time.

For example:

  • createtime:2019-01-01 matches data assets created on 2019-01-01.
  • createtime<2019-02 matches data assets created before 2019-02-01T00:00:00.
  • createtime>2019-02 matches data assets created after 2019-02-01T00:00:00.

Timestamp format: YYYY-MM-DDThh:mm:ss

All timestamps must be in GMT; time zones are not supported. Partial timestamps, hyphen (-) date separators, and slash (/) date separators are supported.

For example:

  • 2010-10-22T05:36:24
  • 2010-10-22T05:36
  • 2010-10-22T05
  • 2010-10-22
  • 2010-10
  • 2010
  • 2010/10/22
updatetime

Finds data assets that were updated within, before, or after a given date or time.

For example:

  • updatetime:2019-01-01 matches data assets updated on 2019-01-01.
  • updatetime<2019-02 matches data assets updated before 2019-02-01T00:00:00.
  • updatetime>2019-02 matches data assets updated after 2019-02-01T00:00:00.

Timestamp format: YYYY-MM-DDThh:mm:ss

All timestamps must be in GMT; time zones are not supported. Partial timestamps, hyphen (-) date separators, and slash (/) date separators are supported.

For example:

  • 2010-10-22T05:36:24
  • 2010-10-22T05:36
  • 2010-10-22T05
  • 2010-10-22
  • 2010-10
  • 2010
  • 2010/10/22
fully_qualified_name:x Matches x as a substring of fully_qualified_name.
fully_qualified_name=x Matches x as fully_qualified_name.

To search for entries based on their attached aspects, use the following query syntax.

Qualifier Description
aspect:x Matches x as a substring of the full path to the aspect type of an aspect that is attached to the entry, in the format projectid.location.ASPECT_TYPE_ID
aspect=x Matches x as the full path to the aspect type of an aspect that is attached to the entry, in the format projectid.location.ASPECT_TYPE_ID
aspect:xOPERATORvalue

Searches for aspect field values. Matches x as a substring of the full path to the aspect type and field name of an aspect that is attached to the entry, in the format projectid.location.ASPECT_TYPE_ID.FIELD_NAME

The list of supported operators depends on the type of field in the aspect, as follows:

  • String: = (exact match) and : (substring)
  • All number types: =, :, <, >, <=, >=, =>, =<
  • Enum: =
  • Datetime: same as for numbers, but the values to compare are treated as datetimes instead of numbers
  • Boolean: =

Only top-level fields of the aspect are searchable.

For example, all of the following queries match entries where the value of the is-enrolled field in the employee-info aspect is true. Other entries that match on the substring are also returned.

  • aspect:example-project.us-central1.employee-info.is-enrolled=true
  • aspect:example-project.us-central1.employee=true
  • aspect:employee=true

Logical operators

A query can consist of several predicates with logical operators. If you don't specify an operator, logical AND is implied. For example, foo bar returns entities that match both predicate foo and predicate bar.

Logical AND and logical OR are supported. For example, foo OR bar.

You can negate a predicate with a - (hyphen) or NOT prefix. For example, -name:foo returns entities with names that don't match the predicate foo.

Logical operators aren't case-sensitive. For example, both or and OR are acceptable.

Abbreviated syntax

An abbreviated search syntax is also available, using | (vertical bar) for OR operators and , (comma) for AND operators.

For example, to search for entries inside one of many projects using the OR operator, you can use the following abbreviated syntax:

projectid:(id1|id2|id3|id4)

The same search without using abbreviated syntax looks like the following:

projectid:id1 OR projectid:id2 OR projectid:id3 OR projectid:id4

To search for entries with matching column names, use the following:

  • AND: column:(name1, name2, name3)
  • OR: column:(name1|name2|name3)

This abbreviated syntax works for the qualified predicates except for label.

What's next