Add metadata to a BigQuery table
Learn how to get started with metadata management in Dataplex Universal Catalog.
This quickstart shows you how to add metadata to a BigQuery table. In this quickstart, you do the following:
- Create a BigQuery dataset and table based on a public dataset.
- Create a template that defines a set of related metadata fields.
  The template is called an aspect type. The set of related metadata fields, which describe the business and technical metadata for your data assets, is called an aspect.
- Add metadata to the table.
In Dataplex Universal Catalog, each data asset is represented as an entry. To attach metadata to a data asset, you add aspects to the entry.
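The relationship between entries, aspect types, and aspects can be pictured with a small Python model. This is only an illustrative sketch: the class names and the add_aspect method are hypothetical and are not part of any Dataplex client library.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AspectType:
    """A template: a named set of metadata field names."""
    aspect_type_id: str
    field_names: tuple


@dataclass
class Aspect:
    """One filled-in instance of an aspect type."""
    aspect_type: AspectType
    values: dict


@dataclass
class Entry:
    """A data asset as the catalog sees it, plus the aspects attached to it."""
    name: str
    aspects: dict = field(default_factory=dict)

    def add_aspect(self, aspect: Aspect) -> None:
        # Reject values for fields that the template does not define.
        unknown = set(aspect.values) - set(aspect.aspect_type.field_names)
        if unknown:
            raise ValueError(f"fields not in template: {sorted(unknown)}")
        # In this toy model, aspects are keyed by aspect type ID.
        self.aspects[aspect.aspect_type.aspect_type_id] = aspect


governance = AspectType("data-governance-demo", ("has-pii", "data-classification"))
table_entry = Entry("bike_stations")
table_entry.add_aspect(
    Aspect(governance, {"has-pii": False, "data-classification": "Public"})
)
```

In this model, the aspect type plays the role of the template, and the entry collects the aspects that describe one data asset.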
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
  Roles required to select or create a project
  - Select a project: Selecting a project doesn't require a specific IAM role; you can select any project that you've been granted a role on.
  - Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
- Verify that billing is enabled for your Google Cloud project.
- Enable the Dataplex and BigQuery APIs.
  Roles required to enable APIs
  To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.
- Make sure that you have the following roles on the project: Dataplex Catalog Admin, BigQuery Data Owner, BigQuery Job User
  Check for the roles
  - In the Google Cloud console, go to the IAM page.
  - Select the project.
  - In the Principal column, find all rows that identify you or a group that you're included in. To learn which groups you're included in, contact your administrator.
  - For all rows that specify or include you, check the Role column to see whether the list of roles includes the required roles.
  Grant the roles
  - In the Google Cloud console, go to the IAM page.
  - Select the project.
  - Click Grant access.
  - In the New principals field, enter your user identifier. This is typically the email address for a Google Account.
  - In the Select a role list, select a role.
  - To grant additional roles, click Add another role and add each additional role.
  - Click Save.
Create a dataset and a table
- In the Google Cloud console, go to the BigQuery Studio page.
- Create a dataset:
  - In the Explorer pane, find your project. Click View actions, and then click Create dataset.
  - In the Dataset ID field, enter catalog_demo_dataset.
  - Keep the default values for the other fields.
  - Click Create dataset.
- Copy a public table to your dataset:
  - In the Explorer pane, search for the table named bigquery-public-data.new_york_citibike.citibike_stations. You might need to set the search scope to include the bigquery-public-data project.
    This table is part of the NYC Citi Bike Trips dataset, a public dataset that contains data about a bike share program.
  - Select the citibike_stations table.
  - Click Copy. Enter the following information:
    - Project: select your project.
    - Dataset: select catalog_demo_dataset.
    - Table: enter bike_stations.
  - Click Copy.
  - In the Explorer pane, locate the catalog_demo_dataset dataset, and confirm that the bike_stations table is listed in the dataset.
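If you prefer to script these steps, the following is a minimal sketch that uses the BigQuery Python client library (google-cloud-bigquery). It assumes that the library is installed, that Application Default Credentials are configured, and that the new dataset is created in the same location as the public table. The function names and the my-project placeholder are illustrative and don't come from this quickstart.

```python
SOURCE_TABLE = "bigquery-public-data.new_york_citibike.citibike_stations"
DATASET_ID = "catalog_demo_dataset"
TABLE_ID = "bike_stations"


def destination_table(project_id: str) -> str:
    """Return the fully qualified ID of the copied table."""
    return f"{project_id}.{DATASET_ID}.{TABLE_ID}"


def create_dataset_and_copy_table(project_id: str) -> None:
    """Create the demo dataset, then copy the public table into it."""
    # Imported here so that the helper above works without the library installed.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)

    # exists_ok=True makes the call safe to repeat.
    dataset = bigquery.Dataset(f"{project_id}.{DATASET_ID}")
    client.create_dataset(dataset, exists_ok=True)

    # Start the copy job and wait for it to finish.
    copy_job = client.copy_table(SOURCE_TABLE, destination_table(project_id))
    copy_job.result()

    # Confirm that the table now exists in the dataset.
    table = client.get_table(destination_table(project_id))
    print(f"Copied {table.num_rows} rows to {destination_table(project_id)}")


if __name__ == "__main__":
    # Replace my-project with your project ID before you call
    # create_dataset_and_copy_table("my-project").
    print(destination_table("my-project"))
```

The copy runs as a BigQuery job, which is why the BigQuery Job User role is listed in the prerequisites.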
Define a metadata template: create an aspect type
- In the Google Cloud console, go to the Dataplex Universal Catalog Catalog page.
- Click the Aspect types & tag templates tab, and then click the Custom tab.
- Click Create aspect type.
- In the Aspect type ID field, enter data-governance-demo.
- For Location, select global.
- In the Template section, click Add field. Use the information in the following table to add several fields to the aspect type:

  | Name | Type | Is required | Description |
  | --- | --- | --- | --- |
  | source-of-data-asset | Text | No | - |
  | retention-date | Date and time | No | - |
  | data-classification | Enum. Add the values Public, Sensitive, and Confidential. | Yes | - |
  | has-pii | Boolean | Yes | Whether the data asset has personally identifiable information |

- Click Save.
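To make the effect of the Type and Is required settings concrete, the following sketch restates the four fields of this aspect type as plain Python data and checks a set of values against them. The dictionary layout and the validate function are hypothetical illustrations. They are not the Dataplex API schema, and Dataplex performs its own validation when you save an aspect.

```python
from datetime import datetime

# Mirrors the fields defined in this section. The layout is illustrative only.
TEMPLATE = {
    "source-of-data-asset": {"type": str, "required": False},
    "retention-date": {"type": datetime, "required": False},
    "data-classification": {
        "type": str,
        "required": True,
        "enum": ("Public", "Sensitive", "Confidential"),
    },
    "has-pii": {"type": bool, "required": True},
}


def validate(values: dict) -> list:
    """Return a list of problems; an empty list means the values fit the template."""
    problems = []
    for name, spec in TEMPLATE.items():
        if name not in values:
            # Optional fields can be left out; required fields cannot.
            if spec["required"]:
                problems.append(f"missing required field: {name}")
            continue
        value = values[name]
        if not isinstance(value, spec["type"]):
            problems.append(f"wrong type for {name}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"invalid value for {name}: {value!r}")
    for name in values:
        if name not in TEMPLATE:
            problems.append(f"unknown field: {name}")
    return problems


if __name__ == "__main__":
    print(validate({"data-classification": "Public", "has-pii": False}))
```

Only data-classification and has-pii are required, so an aspect that sets just those two fields is accepted by this check.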
Add metadata to the table: add an aspect to the entry
- In the Google Cloud console, go to the Dataplex Universal Catalog Search page.
- In the search box, enter catalog_demo_dataset.
- Select the bike_stations table.
- Add a custom aspect to the entry:
  - In the Tags & aspects section, next to Optional tags & aspects, click Add.
  - Select the data-governance-demo aspect type.
    This creates an aspect that uses your aspect type as a template.
  - Enter the following values:
    - Source of data asset: Copied from NYC Citi Bike Trips public dataset
    - Retention date: enter a date.
    - Data classification: Public
    - Has PII: False
  - Click Save.
- To see the metadata values that you added, in the Tags & aspects section, select the data-governance-demo aspect.
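The values entered in the console can also be pictured as structured data. The following sketch builds a dictionary that holds the same values. It rests on several assumptions that you should verify against the Dataplex API reference before relying on them: that an aspect on an entry is keyed by project.location.aspectTypeId, that its values sit under a data key, and that date-time values are RFC 3339 strings. The function names and the my-project placeholder are hypothetical.

```python
from datetime import datetime, timezone


def aspect_key(project_id: str, location: str, aspect_type_id: str) -> str:
    """Build the key that identifies the aspect on the entry (assumed format)."""
    return f"{project_id}.{location}.{aspect_type_id}"


def build_aspect(project_id: str, retention: datetime) -> dict:
    """Return the quickstart's aspect values as a nested dictionary."""
    key = aspect_key(project_id, "global", "data-governance-demo")
    # Format the retention date as an RFC 3339 UTC timestamp (assumed format).
    retention_text = retention.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        key: {
            "data": {
                "source-of-data-asset": "Copied from NYC Citi Bike Trips public dataset",
                "retention-date": retention_text,
                "data-classification": "Public",
                "has-pii": False,
            }
        }
    }


if __name__ == "__main__":
    # The date is an arbitrary example value.
    example = build_aspect("my-project", datetime(2030, 1, 1, tzinfo=timezone.utc))
    print(example)
```

The sketch only builds the data structure. It doesn't call Dataplex, so nothing is written to your entry.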
Clean up
To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.
Delete the project
The easiest way to stop billing is to delete the project that you created for this quickstart.
- In the Google Cloud console, go to the Manage resources page.
- In the project list, select the project that you want to delete, and then click Delete.
- In the dialog, type the project ID, and then click Shut down to delete the project.
Delete individual resources
If you want to reuse your project, delete the resources that you created.
- Delete the dataset:
  - In the Google Cloud console, go to the BigQuery Studio page.
  - In the Explorer pane, search for the catalog_demo_dataset dataset.
  - Click View actions, and then click Delete. Confirm when prompted.
- Delete the aspect type:
  - In the Google Cloud console, go to the Dataplex Universal Catalog Catalog page.
  - Click the Aspect types & tag templates tab, and then click the Custom tab.
  - Click the data-governance-demo aspect type.
  - Click Delete. Confirm when prompted.
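The dataset deletion can also be scripted. The following is a minimal sketch that uses the BigQuery Python client library (google-cloud-bigquery), under the assumption that the library is installed and Application Default Credentials are configured. The function names and the my-project placeholder are illustrative. The sketch removes only the dataset and its tables, so you still delete the aspect type separately.

```python
DATASET_ID = "catalog_demo_dataset"


def dataset_path(project_id: str) -> str:
    """Return the fully qualified ID of the demo dataset."""
    return f"{project_id}.{DATASET_ID}"


def delete_demo_dataset(project_id: str) -> None:
    """Delete the demo dataset together with the tables it contains."""
    # Imported here so that the helper above works without the library installed.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    client.delete_dataset(
        dataset_path(project_id),
        delete_contents=True,  # Also removes the bike_stations table.
        not_found_ok=True,  # Don't raise an error if the dataset is already gone.
    )


if __name__ == "__main__":
    # Replace my-project with your project ID before you call
    # delete_demo_dataset("my-project").
    print(dataset_path("my-project"))
```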
What's next
- Learn more about metadata management.
- Learn how to search for resources.
- Learn how to work with aspects and aspect types.