Search for data assets in Dataplex Catalog

Use Dataplex Catalog search to find data assets such as BigQuery datasets, Cloud SQL instances, and others. For more information about the Google Cloud assets that Dataplex Catalog supports, see Supported Google Cloud sources.

Search scope

The search results in Dataplex Catalog respect the permissions that you have on the corresponding resources in the source systems.

For example, if you have BigQuery metadata read access to an object, that object appears in your Dataplex Catalog search results. If you have access to a BigQuery table but not to the dataset containing that table, the table still shows up as expected in the Dataplex Catalog search.

The search results include only those resources that belong to the same VPC-SC perimeter as the project under which search is performed. When using the Google Cloud console, this is the project that is selected in the console.

For more information about Dataplex Catalog IAM roles, see Dataplex IAM roles.

Recall limitations in search

Dataplex Catalog search queries don't guarantee full recall. Results that match your query might not be returned, even in subsequent result pages. Additionally, the set of returned results can vary when you repeat the same search query.

Date-sharded tables

Dataplex Catalog aggregates date-sharded tables into a single logical entry. This entry has the same schema as the table shard with the most recent date. The entry derives its access level from the dataset it belongs to. Dataplex Catalog search shows these logical entries only if you have access to the dataset that contains them. Individual date-sharded tables are not visible in Dataplex Catalog search, even if they are present in Dataplex Catalog and can be tagged.
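The aggregation rule described above can be illustrated with a minimal sketch. This is not Dataplex Catalog's implementation: the `_YYYYMMDD` suffix pattern is assumed from the common BigQuery date-sharding naming convention, and the function name `aggregate_shards`, the naming of the logical entry after the shard prefix, and the sample tables are all hypothetical.

```python
import re
from datetime import datetime

# Assumed shard naming: <prefix>_YYYYMMDD (common BigQuery convention).
SHARD_RE = re.compile(r"^(?P<prefix>.+)_(?P<date>\d{8})$")


def aggregate_shards(tables):
    """Collapse date-sharded tables into one logical entry per prefix.

    `tables` maps a table name to its schema (a list of column names).
    Each logical entry takes the schema of its most recent shard.
    """
    latest = {}  # prefix -> (shard date, schema)
    result = {}
    for name, schema in tables.items():
        match = SHARD_RE.match(name)
        if match is None:
            result[name] = schema  # not sharded: stays as its own entry
            continue
        try:
            shard_date = datetime.strptime(match["date"], "%Y%m%d")
        except ValueError:
            result[name] = schema  # suffix is not a valid calendar date
            continue
        prefix = match["prefix"]
        if prefix not in latest or shard_date > latest[prefix][0]:
            latest[prefix] = (shard_date, schema)
    # Hypothetical choice: name the logical entry after the shard prefix.
    for prefix, (_, schema) in latest.items():
        result[prefix] = schema
    return result


tables = {
    "events_20240101": ["id"],
    "events_20240102": ["id", "country"],
    "users": ["user_id"],
}
logical_entries = aggregate_shards(tables)
```

In this sketch, the two `events_*` shards collapse into a single `events` entry that carries the schema of the `20240102` shard, while `users` is left unchanged.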

Filters

Filters let you narrow the search results. Filters are grouped into the following sections:

  • Systems: source systems such as BigQuery, Cloud SQL, and others. The Dataplex system contains custom entries.
  • Aspects: all aspects (tags) available to you.
  • Project: all projects available to you.
  • Type aliases: resource types, such as databases, datasets, models, tables, views, services, and custom types.
  • Datasets: datasets that come from BigQuery.

You can combine filters from multiple sections. Dataplex Catalog then returns assets that match at least one selected filter in every section that has a selection, which means that sections are combined using the AND logical operator. Multiple filters selected within a single section are evaluated using the OR logical operator.

For example, consider the filter combination in the following image. These search filters are selected: system BigQuery, type aliases table and view, aspects My aspect type 1 and My aspect type 2, project my-test-project, and dataset test_bq_dataset.

Search filters showing multiple selections.

Dataplex Catalog looks for the following assets in my-test-project:

  • BigQuery tables in test_bq_dataset with aspect My aspect type 1
  • BigQuery tables in test_bq_dataset with aspect My aspect type 2
  • BigQuery views in test_bq_dataset with aspect My aspect type 1
  • BigQuery views in test_bq_dataset with aspect My aspect type 2
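The filter semantics described above (OR within a section, AND across sections) can be sketched as a small predicate. This is only an illustration of the logic, not how Dataplex Catalog evaluates filters internally; the function `matches`, the section keys, and the sample assets are hypothetical.

```python
def matches(asset, selected):
    """Return True if `asset` satisfies the selected filters.

    `asset` maps a section name to the set of values the asset has.
    `selected` maps a section name to the set of values that are selected.
    Within a section, any one value may match (OR); every section that
    has a selection must match (AND).
    """
    return all(
        asset.get(section, set()) & values
        for section, values in selected.items()
        if values  # sections with no selection don't constrain results
    )


selected = {
    "system": {"BigQuery"},
    "type": {"table", "view"},
    "aspect": {"My aspect type 1", "My aspect type 2"},
    "project": {"my-test-project"},
    "dataset": {"test_bq_dataset"},
}

view_with_aspect = {
    "system": {"BigQuery"},
    "type": {"view"},
    "aspect": {"My aspect type 2"},
    "project": {"my-test-project"},
    "dataset": {"test_bq_dataset"},
}
# Same asset, but of a type that isn't selected in the Type aliases section.
model_with_aspect = dict(view_with_aspect, type={"model"})
```

Here `matches(view_with_aspect, selected)` holds because the view satisfies one value in every section, while `matches(model_with_aspect, selected)` fails on the type section alone.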

Filter by aspect value

The Aspects filters let you query for assets that have aspects of a specific aspect type. You can use the Customize menu to further refine results and filter by specific aspect values. The available filter conditions depend on the data type of the aspect field. For example, for datetime and number fields, you can specify a single value or a range.

Filter visibility

The filters Systems, Type aliases, Project, and Datasets are displayed depending on the current query in the Search field.

Before you begin

Before you search for data assets, complete the following prerequisites.

Required roles

The search results in Dataplex Catalog are scoped according to your role. To search for an asset in Dataplex Catalog, you must have permissions to access the corresponding resource in the source system. For more information, see the Search scope section of this document.

For example, to search for BigQuery datasets, tables, views, and models, you need the respective permissions for those entries. For more information, see BigQuery permissions. The following list describes the minimum permissions required:

  • To search for a table, you need bigquery.tables.get permission for that table.
  • To search for a dataset, you need bigquery.datasets.get permission for that dataset.
  • To search for metadata for a dataset or a table, you need the BigQuery Metadata Viewer role (roles/bigquery.metadataViewer).

As another example, to search for Cloud SQL instances, databases, schemas, tables, and views, you need the respective permissions on those entries. For more information, see Cloud SQL roles and permissions.

To search for custom entries, you need the Dataplex Catalog Viewer role (roles/dataplex.catalogViewer).

For more information about granting roles, see Manage access.

You might also be able to get the required permissions through custom roles or other predefined roles.

Enable the API

Enable the Dataplex API.

Search for data assets

Console

To search for data assets, follow these steps:

  1. In the Google Cloud console, go to the Dataplex Search page.

  2. For Choose search platform, select Dataplex Catalog as the search mode.

    Selecting Dataplex Catalog lets you search over the Dataplex Catalog metadata storage. Selecting Data Catalog lets you search over your Data Catalog repository, if you're an existing Data Catalog user.

  3. In the search field, enter your query, or use the Filters panel to refine the search parameters.

    You can manually add the following filters:

    • Add a project filter: in Project, click Add project. Search for a specific project, select the project, and then click Open.
    • Add an aspect type filter: in Aspects, click the Add more aspect types menu. Search for a specific aspect type, select it, and then click OK.

  4. Optional: To also search data assets that are publicly available in Google Cloud, in addition to the assets available to you, select Include public datasets.

Use the following tips to construct a search query:

  • Enclose your search expression in quotes if it contains spaces. For example, "search terms".
  • You can precede a keyword with NOT to match the logical negation of the keyword:term filter. You can also use AND and OR Boolean operators to combine search expressions. The AND, OR, and NOT operators aren't case-sensitive.

    For example, NOT column:term returns all assets except those with a column that matches the specified term. For a list of keywords and other terms you can use in a Dataplex Catalog search expression, see Search syntax.
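The query tips above can be sketched as a few string helpers. This is a minimal illustration, not an official query builder: the helper names `quote`, `keyword`, `negate`, and `combine` are hypothetical, and only the `column` keyword from this document is used. Check Search syntax before relying on any particular combination of operators.

```python
def quote(term):
    """Wrap a search expression in double quotes if it contains spaces."""
    return f'"{term}"' if any(ch.isspace() for ch in term) else term


def keyword(name, value):
    """Build a keyword:term filter, for example column:term."""
    return f"{name}:{value}"


def negate(expression):
    """Prefix an expression with NOT to match its logical negation."""
    return f"NOT {expression}"


def combine(operator, *expressions):
    """Join expressions with the AND or OR Boolean operator."""
    op = operator.upper()  # the operators aren't case-sensitive
    if op not in {"AND", "OR"}:
        raise ValueError("operator must be AND or OR")
    return f" {op} ".join(expressions)


query = combine(
    "and",
    quote("search terms"),
    negate(keyword("column", "term")),
)
```

The resulting `query` string is `"search terms" AND NOT column:term`, which you can paste into the search field or pass to the API.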

gcloud

To search for data assets, use the gcloud dataplex entries search command.

REST

To search for data assets, use the searchEntries method.
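As a rough sketch of what a searchEntries call might look like, the following Python snippet only builds the request URL and doesn't send anything. The endpoint shape (`projects/*/locations/*:searchEntries`), the `global` location, and the `query` and `pageSize` parameter names reflect one reading of the public REST reference and should be verified there. The function name `search_entries_url` and the project ID are hypothetical.

```python
from urllib.parse import quote, urlencode

# Assumed API root for the Dataplex REST API; verify in the REST reference.
API_ROOT = "https://dataplex.googleapis.com/v1"


def search_entries_url(project, query, location="global", page_size=None):
    """Build the URL for a searchEntries request.

    The request is assumed to be an HTTP POST whose search parameters
    are passed in the query string.
    """
    params = {"query": query}
    if page_size is not None:
        params["pageSize"] = page_size
    name = f"projects/{project}/locations/{location}"
    # quote_via=quote encodes spaces as %20 instead of "+".
    return f"{API_ROOT}/{name}:searchEntries?{urlencode(params, quote_via=quote)}"


url = search_entries_url("my-test-project", '"search terms"', page_size=10)
```

To send the request, you would POST to `url` with an OAuth 2.0 bearer token in the `Authorization` header. Authentication is omitted from this sketch.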

View details of an entry

Console

Use Dataplex Catalog search to view the details of an entry.

  1. In the Google Cloud console, go to the Dataplex Search page.

  2. Select Dataplex Catalog as the search mode.

  3. In the search box, enter the name of an entry.

  4. Click the entry.

    The entry details page opens. The page includes the following sections:

    • Entry details: includes information such as the entry type, system, platform, fully qualified name, creation time, last modification time, description, and stewards.
    • Overview: an overview of the entry, if available.
    • Aspects: the required and optional aspects defined for the entry. For more information, see Categories of aspects.

gcloud

To view the details of an entry, use the gcloud dataplex entries lookup command.

REST

To view the details of an entry, use the lookupEntry method.
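The following Python sketch shows how a lookupEntry request URL might be assembled. It doesn't send a request. The endpoint shape (`projects/*/locations/*:lookupEntry`) and the `entry` and `view` parameter names reflect one reading of the public REST reference and should be verified there. The function name `lookup_entry_url` and the project, entry group, and entry IDs are hypothetical.

```python
from urllib.parse import quote, urlencode

# Assumed API root for the Dataplex REST API; verify in the REST reference.
API_ROOT = "https://dataplex.googleapis.com/v1"


def lookup_entry_url(project, location, entry_name, view=None):
    """Build the URL for a lookupEntry request (assumed to be an HTTP GET).

    `entry_name` is the full resource name of the entry to look up.
    """
    params = {"entry": entry_name}
    if view is not None:
        params["view"] = view
    name = f"projects/{project}/locations/{location}"
    return f"{API_ROOT}/{name}:lookupEntry?{urlencode(params, quote_via=quote)}"


# Hypothetical entry in a hypothetical entry group.
entry_name = (
    "projects/my-test-project/locations/us"
    "/entryGroups/my-group/entries/my-entry"
)
lookup_url = lookup_entry_url("my-test-project", "us", entry_name)
```

As with any Google Cloud REST call, the request also needs an OAuth 2.0 bearer token in the `Authorization` header, which this sketch leaves out.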

What's next