This document describes how to use search in Dataplex Catalog to search for resources such as BigQuery datasets, Cloud SQL instances, and others. For more information about the Google Cloud assets that are supported in Dataplex Catalog, see Supported Google Cloud sources.
Search scope
The search results in Dataplex Catalog respect permissions that you have over the corresponding resources in source systems.
For example, if you have BigQuery metadata read access to an object, that object appears in your Dataplex Catalog search results. If you have access to a BigQuery table but not to the dataset containing that table, the table still shows up as expected in the Dataplex Catalog search.
The search results include only those resources that belong to the same VPC-SC perimeter as the project under which search is performed. When using the Google Cloud console, this is the project that is selected in the console.
To broaden the scope of your search results beyond the resources within your project's VPC Service Controls perimeter, use VPC Service Controls ingress and egress rules. These rules facilitate private and efficient data exchange across your organization. You can configure ingress and egress rules using the Google Cloud console or through JSON or YAML files. Refer to the following YAML example and consult the VPC Service Controls documentation to tailor the rule to your specific requirements.
egressPolicies:
- egressFrom:
    identityType: ANY_USER_ACCOUNT
  egressTo:
    # Specify which resources should be present in the search results. In this example,
    # BigQuery.
    operations:
    - methodSelectors:
      - method: '*'
      serviceName: bigquery.googleapis.com
    # Specify project ids under which the search is performed.
    resources:
    - projects/SEARCH_PROJECT_ID
ingressPolicies:
- ingressFrom:
    identityType: ANY_USER_ACCOUNT
    sources:
    - accessLevel: '*'
  ingressTo:
    # Specify which resources should be present in the search results. In this example,
    # BigQuery.
    operations:
    - methodSelectors:
      - method: '*'
      serviceName: bigquery.googleapis.com
    # Specify project ids to expose in search results.
    resources:
    - projects/INGRESS_PROJECT_ID
For more information about Dataplex Catalog Identity and Access Management roles, see Dataplex IAM roles.
Recall limitations in search
Dataplex Catalog search queries don't guarantee full recall. Results that match your query might not be returned, even in subsequent result pages. Additionally, the set of returned results can vary when you repeat the same search query.
Filters
Filters let you narrow down the search results. All filters are grouped in sections:
- Systems such as BigQuery, Cloud SQL, and others. The Dataplex system contains custom entries.
- Aspects (tags) list all aspects available to you.
- Project lists all projects available to you.
- Type aliases describe resource types, such as databases, datasets, models, tables, views, services, and custom types.
- Datasets come from BigQuery.
You can combine filters from multiple sections to find assets that match at
least one condition from every selected section. Multiple filters that are
selected within a single section are evaluated using the OR
logical operator.
For example, consider the following filter combination. These search filters
are selected: systems BigQuery, type aliases table and view, aspects
My aspect type 1 and My aspect type 2, project my-test-project, and datasets
test_bq_dataset.
Dataplex Catalog looks for the following assets:
- BigQuery tables in test_bq_dataset with aspect My aspect type 1
- BigQuery tables in test_bq_dataset with aspect My aspect type 2
- BigQuery views in test_bq_dataset with aspect My aspect type 1
- BigQuery views in test_bq_dataset with aspect My aspect type 2
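The combination rule described above (OR within a section, AND across sections) can be sketched in a few lines of Python. This is only an illustrative model of the matching logic, not how Dataplex Catalog implements it; the Asset class, the section names, and the matches function are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Asset:
    """Hypothetical, simplified view of a catalog asset."""
    name: str
    system: str
    type_alias: str
    project: str
    dataset: str
    aspects: frozenset = field(default_factory=frozenset)


def matches(asset, selected):
    """Return True if the asset satisfies every selected filter section.

    `selected` maps a section name to the set of values selected in it.
    A section with no selected values imposes no constraint.
    """
    values = {
        "systems": {asset.system},
        "type_aliases": {asset.type_alias},
        "projects": {asset.project},
        "datasets": {asset.dataset},
        "aspects": set(asset.aspects),
    }
    # OR within a section: any overlap is enough.
    # AND across sections: every non-empty section must overlap.
    return all(
        not chosen or bool(values[section] & set(chosen))
        for section, chosen in selected.items()
    )


selected = {
    "systems": {"BigQuery"},
    "type_aliases": {"table", "view"},
    "aspects": {"My aspect type 1", "My aspect type 2"},
    "projects": {"my-test-project"},
    "datasets": {"test_bq_dataset"},
}

view = Asset("v1", "BigQuery", "view", "my-test-project",
             "test_bq_dataset", frozenset({"My aspect type 2"}))
model = Asset("m1", "BigQuery", "model", "my-test-project",
              "test_bq_dataset", frozenset({"My aspect type 1"}))

print(matches(view, selected))   # True: a view with one of the two aspects
print(matches(model, selected))  # False: "model" is not a selected type alias
```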
Filter by aspect value
The Aspects filters let you query for assets tagged using a specific
template. You can use the Customize menu to further refine results and
filter by specific aspect values. The aspect value filter conditions depend on
that aspect field's data type. For example, for the datetime and number
fields, you can specify a specific date or a range.
Filter visibility
The filters Systems, Type aliases, Project, and Datasets are displayed depending on the current query in the Search field.
Before you begin
Before you search for resources, make sure you have the required roles and enable the API.
Required roles
This section describes the roles and permissions required to search for resources and to access the search results.
For more information about granting roles, see Manage access.
You might also be able to get the required permissions through custom roles or other predefined roles.
Required roles for searching entries
To search for entries, you need at least one of the Dataplex Catalog IAM roles on the project that is used for search. Permissions on search results are checked independently of the selected project.
Required roles for accessing search results
The search results in Dataplex Catalog are scoped according to your role. To search for an asset in Dataplex Catalog, you must have permissions to access the corresponding resource in the source system. For more information, see the Search scope section of this document.
For example, to search for BigQuery datasets, tables, views, and models, you need respective permissions for those entries. For more information, see BigQuery permissions.
The following list describes the minimum permissions required:
- To search for a table, you need bigquery.tables.get permission for that table.
- To search for a dataset, you need bigquery.datasets.get permission for that dataset.
- To search for metadata for a dataset or a table, you need the BigQuery Metadata Viewer role (roles/bigquery.metadataViewer).
As another example, to search for Cloud SQL instances, databases, schemas, tables, and views, you need respective permissions on those entries. For more information, see Cloud SQL roles and permissions.
To search for custom entries, you need the Dataplex Catalog Viewer role (roles/dataplex.catalogViewer).
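As a rough illustration of how the BigQuery permissions listed in this section scope the results, the following Python sketch filters a set of candidate results by the minimum permission for each resource type. It is a toy model under stated assumptions: the MIN_PERMISSION table, the visible_results function, and the resource names are hypothetical, and the real checks are performed by Dataplex Catalog and the source system.

```python
# Minimum permission per resource type, taken from the list in this section.
MIN_PERMISSION = {
    "table": "bigquery.tables.get",
    "dataset": "bigquery.datasets.get",
}


def visible_results(candidates, granted):
    """Keep only the candidates the caller is allowed to see.

    candidates: list of (resource_type, resource_name) tuples.
    granted: dict mapping a resource name to the set of permissions
             the caller holds on that resource.
    """
    visible = []
    for resource_type, name in candidates:
        needed = MIN_PERMISSION.get(resource_type)
        # Each resource is checked on its own; a table can be visible
        # even when its parent dataset is not.
        if needed is not None and needed in granted.get(name, set()):
            visible.append(name)
    return visible


candidates = [
    ("dataset", "test_bq_dataset"),
    ("table", "test_bq_dataset.orders"),
]
granted = {"test_bq_dataset.orders": {"bigquery.tables.get"}}

print(visible_results(candidates, granted))  # ['test_bq_dataset.orders']
```

The example mirrors the case described under Search scope: access to a table without access to its dataset still surfaces the table.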
Enable the API
Enable the Dataplex API.
Search for resources
Console
To search for resources, follow these steps:
1. In the Google Cloud console, go to the Dataplex Search page.
2. For Choose search platform, select Dataplex Catalog as the search mode.
   Selecting Dataplex Catalog lets you search over the Dataplex Catalog metadata storage. Selecting Data Catalog lets you search over your Data Catalog repository, if you're an existing Data Catalog user.
3. In the search field, enter your query, or use the Filters panel to refine the search parameters.
   You can manually add the following filters:
   - Add a project filter: in Project, click Add project. Search for a specific project, select the project, and then click Open.
   - Add an aspect type filter: in Aspects, click the Add more aspect types menu. Search for a specific template, select it, and then click OK.
4. Optional: In addition to the assets available to you, you can search for resources that are publicly available in Google Cloud by selecting Include public datasets.
Use the following tips to construct a search query:
- Enclose your search expression in quotes if it contains spaces. For
  example, "search terms".
- You can precede a keyword with NOT to match the logical negation of the
  keyword:term filter. You can also use AND and OR Boolean operators to
  combine search expressions. The AND, OR, and NOT operators aren't
  case-sensitive. For example, NOT column:term lists all columns except
  those that match the specified term.
For a list of keywords and other terms you can use in a Dataplex Catalog search expression, see Search syntax.
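The tips above can be turned into small helpers that assemble a query string. This is a minimal sketch: the helper names (quote, negate, combine) are hypothetical, it covers only the quoting and Boolean rules described in this section, and it does not validate keywords against the full search syntax.

```python
def quote(expression):
    """Wrap the expression in double quotes if it contains whitespace."""
    if any(ch.isspace() for ch in expression):
        return f'"{expression}"'
    return expression


def negate(keyword_filter):
    """Prefix a keyword:term filter with NOT."""
    return f"NOT {keyword_filter}"


def combine(operator, *expressions):
    """Join expressions with AND or OR (operators aren't case-sensitive)."""
    op = operator.upper()
    if op not in ("AND", "OR"):
        raise ValueError(f"unsupported operator: {operator!r}")
    return f" {op} ".join(expressions)


query = combine("and", quote("search terms"), negate("column:term"))
print(query)  # "search terms" AND NOT column:term
```

The resulting string can be pasted into the search field or passed to the gcloud or REST interfaces.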
gcloud
To search for resources, use the gcloud dataplex entries search command.
REST
To search for resources, use the searchEntries method.
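As a hedged illustration of the REST call, the following Python sketch only builds the request URL and sends nothing. The endpoint shape (a POST to projects/PROJECT_ID/locations/global:searchEntries on dataplex.googleapis.com), the global location, and the query and pageSize parameter names are assumptions here; confirm them against the searchEntries reference before use. The search_entries_url helper is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed API root; check the Dataplex REST reference.
API_ROOT = "https://dataplex.googleapis.com/v1"


def search_entries_url(project_id, query, page_size=None):
    """Build the URL for a searchEntries request (assumed to use POST).

    project_id is the project under which the search is performed.
    """
    parent = f"projects/{project_id}/locations/global"
    params = {"query": query}
    if page_size is not None:
        params["pageSize"] = str(page_size)
    # urlencode percent-encodes the query so spaces and colons are safe.
    return f"{API_ROOT}/{parent}:searchEntries?{urlencode(params)}"


url = search_entries_url("my-test-project", "NOT column:term", page_size=20)
decoded = parse_qs(urlsplit(url).query)
print(decoded["query"][0])  # NOT column:term
```

An authenticated client would send this request with an OAuth 2.0 access token in the Authorization header; that part is omitted here.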
View details of an entry
Console
Use Dataplex Catalog search to view the details of an entry.
1. In the Google Cloud console, go to the Dataplex Search page.
2. Select Dataplex Catalog as the search mode.
3. In the search box, enter the name of an entry.
4. Click the entry.
The entry details page opens. The page includes the following sections:
- Entry details: includes information such as the entry type, system, platform, fully qualified name, creation time, last modification time, description, and stewards.
- Overview: an overview of the entry, if available.
- Aspects: the required and optional aspects defined for the entry. For more information, see Categories of aspects.
gcloud
To view the details of an entry, use the gcloud dataplex entries lookup command.
REST
To view the details of an entry, use the lookupEntry method.
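For orientation, here is a small Python sketch that builds the URL for a lookupEntry request without sending it. The endpoint shape (a GET on projects/PROJECT_ID/locations/LOCATION:lookupEntry with an entry query parameter) and the entry resource name format are assumptions to verify against the lookupEntry reference. The helper names and the example identifiers (my-entry-group, my-entry, us-central1) are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def entry_name(project_id, location, entry_group, entry_id):
    """Assumed resource name format for a catalog entry."""
    return (
        f"projects/{project_id}/locations/{location}"
        f"/entryGroups/{entry_group}/entries/{entry_id}"
    )


def lookup_entry_url(project_id, location, entry):
    """Build the URL for a lookupEntry request (assumed to use GET)."""
    parent = f"projects/{project_id}/locations/{location}"
    query = urlencode({"entry": entry})
    return f"https://dataplex.googleapis.com/v1/{parent}:lookupEntry?{query}"


name = entry_name("my-test-project", "us-central1", "my-entry-group", "my-entry")
lookup_url = lookup_entry_url("my-test-project", "us-central1", name)
print(parse_qs(urlsplit(lookup_url).query)["entry"][0] == name)  # True
```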
What's next
- Understand search syntax for Dataplex Catalog.
- Learn more about Dataplex Catalog.
- Learn how to enrich entries with metadata using aspects.
- Learn how to manage entries and ingest custom sources.