This document describes the syntax for Dataplex search queries. Before you read this document, it is important that you understand the concepts of Dataplex Catalog such as data entries, aspects, aspect types, entry groups, and entry types. For more information, see Dataplex Catalog overview.
To launch a Dataplex Catalog search query in the Google Cloud console, go to the Dataplex Search page and select Dataplex Catalog as the search mode.
For more information, see Search for data assets in Dataplex Catalog.
Simple search
In its simplest form, a Dataplex Catalog search query consists of a single predicate. Such a predicate can match several pieces of metadata:
- A substring of a name, display name, or description of a data asset
- A substring of the type of a data asset
- A substring of a column name (or nested column name) in the schema of a data asset
- A substring of a project ID
- A string from an overview description
For example, the predicate foo
matches the following entities:
- Data asset with the name
foo.bar
- Data asset with the display name
Foo Bar
- Data asset with the description
This is the foo script
- Data asset with the exact type
foo
- Column
foo_bar
in the schema of a data asset - Nested column
foo_bar
in the schema of a data asset - Project
prod-foo-bar
- Data asset with an overview containing the word
foo
Qualified predicates
You can qualify a predicate by prefixing it with a key that restricts the matching to a specific piece of metadata:
- An equal sign (
=
) restricts the search to an exact match. - A colon (
:
) after the key matches the predicate to either a substring or a token within the value in the search results.
Tokenization splits the stream of text into a series of tokens, with each token usually corresponding to a single word.
For example:
name:foo
selects entities with names that contain thefoo
substring, likefoo1
andbarfoo
.description:foo
selects entities with thefoo
token in the description, likebar
andfoo
.location=foo
matches data assets in a specified location withfoo
as the location name.
The predicate keys type
, system
, location
, and orgid
support only the
exact match (=
) qualifier, not the substring qualifier (:
). For example,
type=foo
or orgid=number
.
Dataplex Catalog supports the following qualifiers:
Qualifier | Description |
---|---|
name:x |
Matches x as a substring of the data asset ID. |
displayname:x |
Match x as a substring of the data asset display name. |
column:x |
Matches x as a substring of the column name (or nested
column name) in the schema of the data asset. |
description:x |
Matches x as a token in the data asset description. |
label:bar |
Matches BigQuery data assets that have a label (with some
value) and the label key has bar as a substring. |
label=bar |
Matches BigQuery data assets that have a label (with
some value) and the label key equals bar as a string. |
label:bar:x |
Matches x as a substring in the value of a label with
key bar attached to a BigQuery data asset. |
label=foo:bar |
Matches BigQuery data assets where the key equals
foo and the key value equals bar . |
label.foo=bar |
Matches BigQuery data assets where the key equals
foo and the key value equals bar . |
label.foo |
Matches BigQuery data assets that have a label whose
key equals foo as a string. |
type=TYPE |
Matches data assets of a specific entry type or its type alias. |
projectid:bar |
Matches data assets within Google Cloud projects that match
bar as a substring in the ID. |
parent:x |
Matches x as a substring of the hierarchical path of a
data asset. The parent path is a fully_qualified_name of
the parent resource. |
orgid=number |
Matches data assets within a Google Cloud organization with
the exact ID value of number . |
system=SYSTEM |
Matches data assets from a specified system. |
location=LOCATION |
Matches data assets in a specified location with an exact name.
For example, BigQuery Omni assets support this qualifier by using the
BigQuery Omni location name.
For example,
|
createtime |
Finds data assets that were created within, before, or after a given date or time. For example:
Timestamp format: All timestamps must be in GMT; time zones are not supported. Partial
timestamps, hyphen ( For example:
|
updatetime |
Finds data assets that were updated within, before, or after a given date or time. For example:
Timestamp format: All timestamps must be in GMT; time zones are not supported. Partial
timestamps, hyphen ( For example:
|
fully_qualified_name:x |
Matches x as a substring of fully_qualified_name . |
fully_qualified_name=x |
Matches x as fully_qualified_name . |
Aspect search
To search for entries based on their attached aspects, use the following query syntax.
Qualifier | Description |
---|---|
aspect:x |
Matches x as a substring of the full path to the aspect
type of an aspect that is attached to the entry, in the format
projectid.location.ASPECT_TYPE_ID |
aspect=x |
Matches x as the full path to the aspect
type of an aspect that is attached to the entry, in the format
projectid.location.ASPECT_TYPE_ID |
aspect:xOPERATORvalue |
Searches for aspect field values. Matches The list of supported operators depends on the type of field in the aspect, as follows:
Only top-level fields of the aspect are searchable. For example, all of the following queries match entries where the value
of the
|
Logical operators
A query can consist of several predicates with logical operators. If you don't
specify an operator, logical AND
is implied. For example, foo bar
returns
entities that match both predicate foo
and predicate bar
.
Logical AND
and logical OR
are supported. For example, foo OR bar
.
You can negate a predicate with a -
(hyphen) or NOT
prefix. For example,
-name:foo
returns entities with names that don't match the predicate foo
.
Logical operators aren't case-sensitive. For example, both or
and OR
are
acceptable.
Abbreviated syntax
An abbreviated search syntax is also available, using |
(vertical bar) for
OR
operators and ,
(comma) for AND
operators.
For example, to search for entries inside one of many projects using the OR
operator, you can use the following abbreviated syntax:
projectid:(id1|id2|id3|id4)
The same search without using abbreviated syntax looks like the following:
projectid:id1 OR projectid:id2 OR projectid:id3 OR projectid:id4
To search for entries with matching column names, use the following:
- AND:
column:(name1, name2, name3)
- OR:
column:(name1|name2|name3)
This abbreviated syntax works for the
qualified predicates except for label
.
What's next
- Learn how to search for data assets in Dataplex Catalog
- Learn more about Dataplex Catalog.
- Learn how to enrich entries with metadata by using aspects.
- Learn how to manage entries and ingest custom sources.