Add metadata to a BigQuery table

Learn how to get started with metadata management in Dataplex Universal Catalog.

This quickstart shows you how to add metadata to a BigQuery table. You do the following:

  1. Create a BigQuery dataset and table based on a public dataset.

  2. Create a template that defines a set of related metadata fields.

    The template is called an aspect type. The set of related metadata fields, which describe the business and technical metadata for your data assets, is called an aspect.

  3. Add metadata to the table.

    In Dataplex Universal Catalog, each data asset is represented as an entry. To attach metadata to a data asset, you add aspects to the entry.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

    Go to project selector

  3. Verify that billing is enabled for your Google Cloud project.

  4. Enable the Dataplex and BigQuery APIs.

    Roles required to enable APIs

    To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

    Enable the APIs

  5. Make sure that you have the following roles on the project: Dataplex Catalog Admin (roles/dataplex.catalogAdmin), BigQuery Data Owner (roles/bigquery.dataOwner), and BigQuery Job User (roles/bigquery.jobUser).

    Check for the roles

    1. In the Google Cloud console, go to the IAM page.

      Go to IAM
    2. Select the project.
    3. In the Principal column, find all rows that identify you or a group that you're included in. To learn which groups you're included in, contact your administrator.

    4. For all rows that specify or include you, check the Role column to see whether the list of roles includes the required roles.

    Grant the roles

    1. In the Google Cloud console, go to the IAM page.

      Go to IAM
    2. Select the project.
    3. Click Grant access.
    4. In the New principals field, enter your user identifier. This is typically the email address for a Google Account.

    5. In the Select a role list, select a role.
    6. To grant additional roles, click Add another role and add each additional role.
    7. Click Save.

Create a dataset and a table

  1. In the Google Cloud console, go to the BigQuery Studio page.

    Go to BigQuery Studio

  2. Create a dataset:

    1. In the Explorer pane, find your project. Click View actions, and then click Create dataset.

    2. In the Dataset ID field, enter catalog_demo_dataset.

      Keep the default values for the other fields.

    3. Click Create dataset.

  3. Copy a public table to your dataset:

    1. In the Explorer pane, search for the table named bigquery-public-data.new_york_citibike.citibike_stations. You might need to set the search scope to include the bigquery-public-data project.

      This table is part of the NYC Citi Bike Trips dataset, a public dataset that contains data about a bike share program.

    2. Select the citibike_stations table.

    3. Click Copy. Enter the following information:

      • Project: select your project.
      • Dataset: select catalog_demo_dataset.
      • Table: enter bike_stations.
    4. Click Copy.

  4. In the Explorer pane, locate the catalog_demo_dataset dataset, and confirm that the bike_stations table is listed in the dataset.
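
The console steps above can also be sketched with the BigQuery client library for Python. This is a minimal sketch, not the method this quickstart prescribes. It assumes that the google-cloud-bigquery package is installed, that application default credentials are configured, and that the dataset is created in the default US multi-region, where the public table is assumed to be stored. The function names are illustrative.

```python
"""Sketch: create catalog_demo_dataset and copy the public Citi Bike table into it."""

SOURCE_TABLE = "bigquery-public-data.new_york_citibike.citibike_stations"
DATASET_ID = "catalog_demo_dataset"
TABLE_ID = "bike_stations"


def destination_table(project_id: str) -> str:
    """Return the fully qualified ID of the copied table."""
    return f"{project_id}.{DATASET_ID}.{TABLE_ID}"


def create_dataset_and_copy_table(project_id: str) -> str:
    """Create the dataset if it doesn't exist, then copy the public table into it."""
    # Imported lazily so that destination_table works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    # exists_ok=True makes the call safe to repeat.
    client.create_dataset(f"{project_id}.{DATASET_ID}", exists_ok=True)
    copy_job = client.copy_table(SOURCE_TABLE, destination_table(project_id))
    copy_job.result()  # Wait for the copy job to finish.
    return destination_table(project_id)
```

To run the sketch, call create_dataset_and_copy_table with your own project ID, for example create_dataset_and_copy_table("my-project"), where my-project is a placeholder.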

Define a metadata template: create an aspect type

  1. In the Google Cloud console, go to the Catalog page in Dataplex Universal Catalog.

    Go to Catalog

  2. Click the Aspect types & tag templates tab, and then click the Custom tab.

  3. Click Create aspect type.

  4. In the Aspect type ID field, enter data-governance-demo.

  5. For Location, select global.

  6. In the Template section, click Add field. Use the information in the following table to add several fields to the aspect type:

    Name                  Type           Is required  Description
    source-of-data-asset  Text           No           -
    retention-date        Date and time  No           -
    data-classification   Enum           Yes          -
    has-pii               Boolean        Yes          Whether the data asset has personally identifiable information

    For the data-classification field, add the enum values Public, Sensitive, and Confidential.
  7. Click Save.
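
As a rough illustration of what this template defines, the following sketch expresses the same four fields as a Python dictionary shaped like the metadata template of an aspect type. The layout (recordFields, enumValues, constraints, annotations) and the type names (string, datetime, enum, bool) reflect one reading of the Dataplex API and should be treated as assumptions; check the API reference before you rely on them. The field helper is hypothetical.

```python
import json


def field(name, field_type, index, required=False, description=None, enum_values=None):
    """Build one record field of the template (hypothetical helper)."""
    entry = {"name": name, "type": field_type, "index": index}
    if required:
        entry["constraints"] = {"required": True}
    if description:
        entry["annotations"] = {"description": description}
    if enum_values:
        # Number the enum values starting at 1, in the order given.
        entry["enumValues"] = [
            {"name": value, "index": i} for i, value in enumerate(enum_values, start=1)
        ]
    return entry


def build_metadata_template():
    """Return the data-governance-demo template as a plain dictionary."""
    return {
        "name": "data-governance-demo",
        "type": "record",
        "recordFields": [
            field("source-of-data-asset", "string", 1),
            field("retention-date", "datetime", 2),
            field(
                "data-classification",
                "enum",
                3,
                required=True,
                enum_values=["Public", "Sensitive", "Confidential"],
            ),
            field(
                "has-pii",
                "bool",
                4,
                required=True,
                description="Whether the data asset has personally identifiable information",
            ),
        ],
    }


# Print the template as JSON so that it is easy to inspect.
print(json.dumps(build_metadata_template(), indent=2))
```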

Add metadata to the table: add an aspect to the entry

  1. In the Google Cloud console, go to the Dataplex Universal Catalog Search page.

    Go to Search

  2. In the search box, enter catalog_demo_dataset.

  3. Select the bike_stations table.

  4. Add a custom aspect to the entry:

    1. In the Tags & aspects section, next to Optional tags & aspects, click Add.

    2. Select the data-governance-demo aspect type.

      This creates an aspect that uses your aspect type as a template.

    3. Enter the following values:

      • Source of data asset: Copied from NYC Citi Bike Trips public dataset
      • Retention date: enter a date.
      • Data classification: Public
      • Has PII: False
    4. Click Save.

  5. To see the metadata values that you added, in the Tags & aspects section, select the data-governance-demo aspect.
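
To make the relationship between an aspect type and an aspect more concrete, the following sketch checks a set of aspect values against the required fields and enum values defined in this quickstart, and then builds a key for the aspect. The validate_aspect and aspect_key functions are hypothetical and are not part of Dataplex Universal Catalog. The PROJECT.LOCATION.ASPECT_TYPE_ID key format, the my-project project ID, and the retention date are assumptions chosen for illustration.

```python
REQUIRED_FIELDS = {"data-classification", "has-pii"}
ALLOWED_FIELDS = REQUIRED_FIELDS | {"source-of-data-asset", "retention-date"}
CLASSIFICATIONS = {"Public", "Sensitive", "Confidential"}


def validate_aspect(values: dict) -> list:
    """Return a list of problems; an empty list means the values fit the template."""
    problems = []
    for name in sorted(REQUIRED_FIELDS - set(values)):
        problems.append(f"missing required field: {name}")
    for name in sorted(set(values) - ALLOWED_FIELDS):
        problems.append(f"unknown field: {name}")
    classification = values.get("data-classification")
    if classification is not None and classification not in CLASSIFICATIONS:
        problems.append(f"invalid data-classification: {classification}")
    if "has-pii" in values and not isinstance(values["has-pii"], bool):
        problems.append("has-pii must be a boolean")
    return problems


def aspect_key(project_id, aspect_type_id="data-governance-demo", location="global"):
    """Build an aspect key in the assumed PROJECT.LOCATION.ASPECT_TYPE_ID format."""
    return f"{project_id}.{location}.{aspect_type_id}"


# The values entered in the steps above; the retention date is an arbitrary example.
aspect_values = {
    "source-of-data-asset": "Copied from NYC Citi Bike Trips public dataset",
    "retention-date": "2030-01-01T00:00:00Z",
    "data-classification": "Public",
    "has-pii": False,
}

print(aspect_key("my-project"), validate_aspect(aspect_values))
```

Because data-classification and has-pii are marked as required in the aspect type, leaving either one out is reported as a problem, while the two optional fields can be omitted.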

Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.

Delete the project

The easiest way to stop billing is to delete the project that you created for this quickstart.

  1. In the Google Cloud console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Delete individual resources

If you want to reuse your project, delete the resources that you created.

  1. Delete the dataset:

    1. In the Google Cloud console, go to the BigQuery Studio page.

      Go to BigQuery Studio

    2. In the Explorer pane, search for the catalog_demo_dataset dataset.

    3. Click View actions, and then click Delete. Confirm when prompted.

  2. Delete the aspect type:

    1. In the Google Cloud console, go to the Catalog page in Dataplex Universal Catalog.

      Go to Catalog

    2. Click the Aspect types & tag templates tab, and then click the Custom tab.

    3. Click the data-governance-demo aspect type.

    4. Click Delete. Confirm when prompted.
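
The same cleanup can be sketched with the Python client libraries. This is a sketch under stated assumptions rather than a definitive procedure: it assumes that the google-cloud-bigquery and google-cloud-dataplex packages are installed, that application default credentials are configured, and that the aspect type was created in the global location. The function names are illustrative.

```python
"""Sketch: delete the dataset and the aspect type created in this quickstart."""

DATASET_ID = "catalog_demo_dataset"
ASPECT_TYPE_ID = "data-governance-demo"


def aspect_type_name(project_id: str, location: str = "global") -> str:
    """Return the full resource name of the aspect type."""
    return f"projects/{project_id}/locations/{location}/aspectTypes/{ASPECT_TYPE_ID}"


def delete_quickstart_resources(project_id: str) -> None:
    """Delete the dataset (with its tables), then the aspect type."""
    # Imported lazily so that aspect_type_name works without the client libraries.
    from google.cloud import bigquery, dataplex_v1

    bigquery.Client(project=project_id).delete_dataset(
        f"{project_id}.{DATASET_ID}", delete_contents=True, not_found_ok=True
    )
    operation = dataplex_v1.CatalogServiceClient().delete_aspect_type(
        name=aspect_type_name(project_id)
    )
    operation.result()  # Wait for the long-running delete to finish.
```

The dataset is deleted first, which mirrors the order of the console steps above.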

What's next