Quickstart for tagging tables

In this quickstart, you:

  1. Create a BigQuery dataset, and then copy public taxi data to a new table in your dataset.
  2. Create a tag template with a schema that defines four tag fields of distinct types (string, double, boolean, and enumerated).
  3. Lookup the Data Catalog entry for your table.
  4. Attach the tag to your table.

Before you begin

  1. Set up your project:

    1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
    2. In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.

      Go to project selector

    3. Enable the Data Catalog and BigQuery APIs.

      Enable the APIs

    4. Install and initialize the Cloud SDK.

  2. Add a public dataset to your project.

    1. Go to BigQuery in the Google Cloud Console.
    2. In the left Explorer pane, click + ADD DATA and select Explore public datasets from the drop-down list.

      Explore public datasets

    3. In the Datasets panel, search for "New York taxi trips" and click the relevant search result.

    4. In the dataset overview panel, click VIEW DATASET.

  3. Create a new dataset. You must be the dataset owner to attach a tag to a table in the dataset as shown in this quickstart.

    1. From BigQuery in the Google Cloud Console.
    2. In the left Explorer pane, click your project ID, then click CREATE DATASET.
    3. In the Create Dataset dialog:
      • For Dataset ID, enter "demo_dataset".
      • For Data location, accept the Default location, which sets the dataset location to US multi-region.
      • For Default data expiration, choose one of the following options:
        • Never: (Default) Tables created in the dataset are never automatically deleted. You must delete them manually.
        • Number of days after table creation: Any table created in the dataset is deleted after the specified days from its creation time. This value is applied if you do not set a table expiration when the table is created.
      • For Encryption, leave the Google-managed key option selected.
      • Click Create dataset.
  4. Copy a public New York Taxi table to your demo_dataset.

    1. From BigQuery in the Google Cloud Console, in the left Explorer pane, search for "tlc_yellow_trips" tables and select one of them, such as tlc_yellow_trips_2017. Then click COPY.
    2. In the Destination section of the Copy table dialog:

      1. Select your project from the Project name drop-down list.
      2. Select "demo_dataset" from the Dataset name drop-down list.
      3. Insert "trips" for the Table name, then click COPY.
    3. In the left Explorer pane, confirm that the trips table is listed in your demo_dataset.

      You add Data Catalog tags to the table in the next section.

Create a tag template and attach the tag to your table

Console

You can create a tag template from the Data Catalog UI in the Google Cloud Console.

  1. Select "Create tag template" to open the Create template page. Fill in the template form to define a "Demo Tag Template".
    1. Template ID: demo_tag_template
    2. Template display name: Demo Tag Template
    3. Location: Select a location from the drop-down list
  2. Next create four tag fields (previously called tag "attributes"). Click "Add field" to open the New field dialog. Create four fields with the values listed below. Note that the "source" field defines a required tag field.
      • Field ID: source
      • Make this field required: Checked
      • Field display name: Source of data asset
      • Type: String
      • Click Done
      • Field ID: num_rows
      • Make this field required: Not checked
      • Field display name: Number of rows in the data asset
      • Type: Double
      • Click Done
      • Field ID: has_pii
      • Make this field required: Not checked
      • Field display name: Has PII
      • Type: Boolean
      • Click Done
      • Field ID: pii_type
      • Make this field required: Not checked
      • Field display name: PII type
      • Type: Enumerated
        Add 3 values:
        1. EMAIL_ADDRESS
        2. US_SOCIAL_SECURITY_NUMBER
        3. NONE
      • Click Done

    The completed tag template form should list the four tag fields:

    Click CREATE. The Data Catalog Tag template page displays template details and fields.

  3. To attach a tag to a table in your dataset, open the Data Catalog UI, insert "demo_dataset" in the Search box, then click Search.
  4. The demo_dataset and the trips table that you copied into the dataset are displayed in the search results. Click the trips link.
  5. The Entry details page opens. Click Attach Tags.
  6. In the Attach Tags panel:
    1. Under Choose what to tag, select the trips table and click OK.
    2. Under Choose the tag templates, search for and select Demo Tag Template and click OK.
    3. Under Fill in tag values, fill in the following values for each field:
      • Source of data asset: Copied from tlc_yellow_trips_2017
      • Number of rows in the data asset: 113496874
      • Has PII: FALSE
      • PII type: NONE

      Click Save. The tag fields are now listed in the Tags section below BigQuery dataset details.

gcloud

Run the gcloud data-catalog tag-templates create command shown below to create a tag template with the following four tag fields (also called "attributes"):

  1. display_name: Source of data asset
    id: source
    required: TRUE
    type: String
  2. display_name: Number of rows in the data asset
    id: num_rows
    required: FALSE
    type: Double
  3. display_name: Has PII
    id: has_pii
    required: FALSE
    type: Boolean
  4. display_name: PII type
    id: pii_type
    required: FALSE
    type: Enumerated
    values:
    1. EMAIL_ADDRESS
    2. US_SOCIAL_SECURITY_NUMBER
    3. NONE
# -------------------------------
# Create a Tag Template.
# -------------------------------
gcloud data-catalog tag-templates create demo_template \
    --location=us-central1 \
    --display-name="Demo Tag Template" \
    --field=id=source,display-name="Source of data asset",type=string,required=TRUE \
    --field=id=num_rows,display-name="Number of rows in the data asset",type=double \
    --field=id=has_pii,display-name="Has PII",type=bool \
    --field=id=pii_type,display-name="PII type",type='enum(EMAIL_ADDRESS|US_SOCIAL_SECURITY_NUMBER|NONE)'

# -------------------------------
# Lookup the Data Catalog entry for the table.
# -------------------------------
ENTRY_NAME=$(gcloud data-catalog entries lookup '//bigquery.googleapis.com/projects/PROJECT_ID/datasets/DATASET/tables/TABLE' --format="value(name)")

# -------------------------------
# Attach a Tag to the table.
# -------------------------------

# Create the Tag file.
cat > tag_file.json << EOF
  {
    "source": "BigQuery",
    "num_rows": 1000,
    "has_pii": true,
    "pii_type": "EMAIL_ADDRESS"
  }
EOF

gcloud data-catalog tags create --entry=${ENTRY_NAME} \
    --tag-template=demo_template --tag-template-location=us-central1 --tag-file=tag_file.json

Go

Before trying this sample, follow the Go setup instructions in the Data Catalog quickstart using client libraries. For more information, see the Data Catalog Go API reference documentation.


// The datacatalog_quickstart application demonstrates how to define a tag
// template, populate values in the template, and attach a tag based on the
// template to a BigQuery table.
package main

import (
	"context"
	"flag"
	"fmt"
	"log"
	"strings"
	"time"

	datacatalog "cloud.google.com/go/datacatalog/apiv1"
	datacatalogpb "google.golang.org/genproto/googleapis/cloud/datacatalog/v1"
)

func main() {
	projectID := flag.String("project_id", "", "Cloud Project ID, used for session creation.")
	location := flag.String("location", "us-central1", "data catalog region to use for the quickstart")
	table := flag.String("table", "myproject.mydataset.mytable", "bigquery table to tag in project.dataset.table format")

	flag.Parse()

	ctx := context.Background()
	client, err := datacatalog.NewClient(ctx)
	if err != nil {
		log.Fatalf("datacatalog.NewClient: %v", err)
	}
	defer client.Close()

	// Create the tag template.
	tmpl, err := createQuickstartTagTemplate(ctx, client, *projectID, *location)
	if err != nil {
		log.Fatalf("createQuickstartTagTemplate: %v", err)
	}
	fmt.Printf("Created tag template: %s\n", tmpl.GetName())

	// Convert a BigQuery resource identifier into the equivalent datacatalog
	// format.
	resource, err := convertBigQueryResourceRepresentation(*table)
	if err != nil {
		log.Fatalf("couldn't parse --table flag (%s): %v", *table, err)
	}

	// Lookup the entry metadata for the BQ table resource.
	entry, err := client.LookupEntry(ctx, &datacatalogpb.LookupEntryRequest{
		TargetName: &datacatalogpb.LookupEntryRequest_LinkedResource{
			LinkedResource: resource,
		},
	})
	if err != nil {
		log.Fatalf("client.LookupEntry: %v", err)
	}
	fmt.Printf("Successfully looked up table entry: %s\n", entry.GetName())

	// Create a tag based on the template, and apply it to the entry.
	tag, err := createQuickstartTag(ctx, client, "my-quickstart-tag", tmpl.GetName(), entry.GetName())
	if err != nil {
		log.Fatalf("couldn't create tag: %v", err)
	}
	fmt.Printf("Created tag: %s", tag.GetName())
}

// createQuickstartTagTemplate registers a tag template in datacatalog.
func createQuickstartTagTemplate(ctx context.Context, client *datacatalog.Client, projectID, location string) (*datacatalogpb.TagTemplate, error) {
	loc := fmt.Sprintf("projects/%s/locations/%s", projectID, location)

	// Define the tag template.
	template := &datacatalogpb.TagTemplate{
		DisplayName: "Quickstart Tag Template",
		Fields: map[string]*datacatalogpb.TagTemplateField{
			"source": {
				DisplayName: "Source of data asset",
				Type: &datacatalogpb.FieldType{
					TypeDecl: &datacatalogpb.FieldType_PrimitiveType_{
						PrimitiveType: datacatalogpb.FieldType_STRING,
					},
				},
			},
			"num_rows": {
				DisplayName: "Number of rows in data asset",
				Type: &datacatalogpb.FieldType{
					TypeDecl: &datacatalogpb.FieldType_PrimitiveType_{
						PrimitiveType: datacatalogpb.FieldType_DOUBLE,
					},
				},
			},
			"has_pii": {
				DisplayName: "Has PII",
				Type: &datacatalogpb.FieldType{
					TypeDecl: &datacatalogpb.FieldType_PrimitiveType_{
						PrimitiveType: datacatalogpb.FieldType_BOOL,
					},
				},
			},
			"pii_type": {
				DisplayName: "PII Type",
				Type: &datacatalogpb.FieldType{
					TypeDecl: &datacatalogpb.FieldType_EnumType_{
						EnumType: &datacatalogpb.FieldType_EnumType{
							AllowedValues: []*datacatalogpb.FieldType_EnumType_EnumValue{
								{DisplayName: "EMAIL"},
								{DisplayName: "SOCIAL SECURITY NUMBER"},
								{DisplayName: "NONE"},
							},
						},
					},
				},
			},
		},
	}

	//Construct the creation request using the template definition.
	req := &datacatalogpb.CreateTagTemplateRequest{
		Parent:        loc,
		TagTemplateId: "quickstart_tag_template",
		TagTemplate:   template,
	}

	return client.CreateTagTemplate(ctx, req)

}

// createQuickstartTag populates tag values according to the template, and attaches
// the tag to the designeated entry.
func createQuickstartTag(ctx context.Context, client *datacatalog.Client, tagID, templateName, entryName string) (*datacatalogpb.Tag, error) {
	tag := &datacatalogpb.Tag{
		Name:     fmt.Sprintf("%s/tags/%s", entryName, tagID),
		Template: templateName,
		Fields: map[string]*datacatalogpb.TagField{
			"source": {
				Kind: &datacatalogpb.TagField_StringValue{StringValue: "Copied from tlc_yellow_trips_2018"},
			},
			"num_rows": {
				Kind: &datacatalogpb.TagField_DoubleValue{DoubleValue: 113496874},
			},
			"has_pii": {
				Kind: &datacatalogpb.TagField_BoolValue{BoolValue: false},
			},
			"pii_type": {
				Kind: &datacatalogpb.TagField_EnumValue_{
					EnumValue: &datacatalogpb.TagField_EnumValue{
						DisplayName: "NONE",
					},
				},
			},
		},
	}

	req := &datacatalogpb.CreateTagRequest{
		Parent: entryName,
		Tag:    tag,
	}
	return client.CreateTag(ctx, req)
}

// convertBigQueryResourceRepresentation converts a table identifier in standard sql form
// (project.datadata.table) into the representation used within data catalog.
func convertBigQueryResourceRepresentation(table string) (string, error) {
	parts := strings.Split(table, ".")
	if len(parts) != 3 {
		return "", fmt.Errorf("specified table string is not in expected project.dataset.table format: %s", table)
	}
	return fmt.Sprintf("//bigquery.googleapis.com/projects/%s/datasets/%s/tables/%s", parts[0], parts[1], parts[2]), nil
}

Java

Before trying this sample, follow the Java setup instructions in the Data Catalog quickstart using client libraries. For more information, see the Data Catalog Java API reference documentation.

import com.google.cloud.datacatalog.v1.CreateTagRequest;
import com.google.cloud.datacatalog.v1.CreateTagTemplateRequest;
import com.google.cloud.datacatalog.v1.DataCatalogClient;
import com.google.cloud.datacatalog.v1.Entry;
import com.google.cloud.datacatalog.v1.FieldType;
import com.google.cloud.datacatalog.v1.FieldType.EnumType;
import com.google.cloud.datacatalog.v1.FieldType.EnumType.EnumValue;
import com.google.cloud.datacatalog.v1.FieldType.PrimitiveType;
import com.google.cloud.datacatalog.v1.LocationName;
import com.google.cloud.datacatalog.v1.LookupEntryRequest;
import com.google.cloud.datacatalog.v1.Tag;
import com.google.cloud.datacatalog.v1.TagField;
import com.google.cloud.datacatalog.v1.TagTemplate;
import com.google.cloud.datacatalog.v1.TagTemplateField;
import java.io.IOException;

public class Quickstart {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "my-project";
    String tagTemplateId = "my_tag_template";
    createTags(projectId, tagTemplateId);
  }

  public static void createTags(String projectId, String tagTemplateId) throws IOException {
    // Currently, Data Catalog stores metadata in the us-central1 region.
    String location = "us-central1";

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (DataCatalogClient dataCatalogClient = DataCatalogClient.create()) {

      // -------------------------------
      // Create a Tag Template.
      // -------------------------------
      TagTemplateField sourceField =
          TagTemplateField.newBuilder()
              .setDisplayName("Source of data asset")
              .setType(FieldType.newBuilder().setPrimitiveType(PrimitiveType.STRING).build())
              .build();

      TagTemplateField numRowsField =
          TagTemplateField.newBuilder()
              .setDisplayName("Number of rows in data asset")
              .setType(FieldType.newBuilder().setPrimitiveType(PrimitiveType.DOUBLE).build())
              .build();

      TagTemplateField hasPiiField =
          TagTemplateField.newBuilder()
              .setDisplayName("Has PII")
              .setType(FieldType.newBuilder().setPrimitiveType(PrimitiveType.BOOL).build())
              .build();

      TagTemplateField piiTypeField =
          TagTemplateField.newBuilder()
              .setDisplayName("PII type")
              .setType(
                  FieldType.newBuilder()
                      .setEnumType(
                          EnumType.newBuilder()
                              .addAllowedValues(
                                  EnumValue.newBuilder().setDisplayName("EMAIL").build())
                              .addAllowedValues(
                                  EnumValue.newBuilder()
                                      .setDisplayName("SOCIAL SECURITY NUMBER")
                                      .build())
                              .addAllowedValues(
                                  EnumValue.newBuilder().setDisplayName("NONE").build())
                              .build())
                      .build())
              .build();

      TagTemplate tagTemplate =
          TagTemplate.newBuilder()
              .setDisplayName("Demo Tag Template")
              .putFields("source", sourceField)
              .putFields("num_rows", numRowsField)
              .putFields("has_pii", hasPiiField)
              .putFields("pii_type", piiTypeField)
              .build();

      CreateTagTemplateRequest createTagTemplateRequest =
          CreateTagTemplateRequest.newBuilder()
              .setParent(
                  LocationName.newBuilder()
                      .setProject(projectId)
                      .setLocation(location)
                      .build()
                      .toString())
              .setTagTemplateId(tagTemplateId)
              .setTagTemplate(tagTemplate)
              .build();

      // Create the Tag Template.
      tagTemplate = dataCatalogClient.createTagTemplate(createTagTemplateRequest);

      // -------------------------------
      // Lookup Data Catalog's Entry referring to the table.
      // -------------------------------
      String linkedResource =
          String.format(
              "//bigquery.googleapis.com/projects/%s/datasets/test_dataset/tables/test_table",
              projectId);
      LookupEntryRequest lookupEntryRequest =
          LookupEntryRequest.newBuilder().setLinkedResource(linkedResource).build();
      Entry tableEntry = dataCatalogClient.lookupEntry(lookupEntryRequest);

      // -------------------------------
      // Attach a Tag to the table.
      // -------------------------------
      TagField sourceValue =
          TagField.newBuilder().setStringValue("Copied from tlc_yellow_trips_2017").build();
      TagField numRowsValue = TagField.newBuilder().setDoubleValue(113496874).build();
      TagField hasPiiValue = TagField.newBuilder().setBoolValue(false).build();
      TagField piiTypeValue =
          TagField.newBuilder()
              .setEnumValue(TagField.EnumValue.newBuilder().setDisplayName("NONE").build())
              .build();

      Tag tag =
          Tag.newBuilder()
              .setTemplate(tagTemplate.getName())
              .putFields("source", sourceValue)
              .putFields("num_rows", numRowsValue)
              .putFields("has_pii", hasPiiValue)
              .putFields("pii_type", piiTypeValue)
              .build();

      CreateTagRequest createTagRequest =
          CreateTagRequest.newBuilder().setParent(tableEntry.getName()).setTag(tag).build();

      dataCatalogClient.createTag(createTagRequest);
      System.out.printf("Tag created successfully");
    }
  }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the Data Catalog quickstart using client libraries. For more information, see the Data Catalog Node.js API reference documentation.

// Import the Google Cloud client library and create a client.
const {DataCatalogClient} = require('@google-cloud/datacatalog').v1;
const datacatalog = new DataCatalogClient();

async function quickstart() {
  // Common fields.
  let request;
  let responses;

  /**
   * TODO(developer): Uncomment the following lines before running the sample.
   */
  // const projectId = 'my_project'; // Google Cloud Platform project
  // const datasetId = 'demo_dataset';
  // const tableId = 'trips';

  // Currently, Data Catalog stores metadata in the
  // us-central1 region.
  const location = 'us-central1';

  // Create Fields.
  const fieldSource = {
    displayName: 'Source of data asset',
    type: {
      primitiveType: 'STRING',
    },
  };

  const fieldNumRows = {
    displayName: 'Number of rows in data asset',
    type: {
      primitiveType: 'DOUBLE',
    },
  };

  const fieldHasPII = {
    displayName: 'Has PII',
    type: {
      primitiveType: 'BOOL',
    },
  };

  const fieldPIIType = {
    displayName: 'PII type',
    type: {
      enumType: {
        allowedValues: [
          {
            displayName: 'EMAIL',
          },
          {
            displayName: 'SOCIAL SECURITY NUMBER',
          },
          {
            displayName: 'NONE',
          },
        ],
      },
    },
  };

  // Create Tag Template.
  const tagTemplateId = 'demo_tag_template';

  const tagTemplate = {
    displayName: 'Demo Tag Template',
    fields: {
      source: fieldSource,
      num_rows: fieldNumRows,
      has_pii: fieldHasPII,
      pii_type: fieldPIIType,
    },
  };

  const tagTemplatePath = datacatalog.tagTemplatePath(
    projectId,
    location,
    tagTemplateId
  );

  // Delete any pre-existing Template with the same name.
  try {
    request = {
      name: tagTemplatePath,
      force: true,
    };
    await datacatalog.deleteTagTemplate(request);
    console.log(`Deleted template: ${tagTemplatePath}`);
  } catch (error) {
    console.log(`Cannot delete template: ${tagTemplatePath}`);
  }

  // Create the Tag Template request.
  const locationPath = datacatalog.locationPath(projectId, location);

  request = {
    parent: locationPath,
    tagTemplateId: tagTemplateId,
    tagTemplate: tagTemplate,
  };

  // Execute the request.
  responses = await datacatalog.createTagTemplate(request);
  const createdTagTemplate = responses[0];
  console.log(`Created template: ${createdTagTemplate.name}`);

  // Lookup Data Catalog's Entry referring to the table.
  responses = await datacatalog.lookupEntry({
    linkedResource:
      '//bigquery.googleapis.com/projects/' +
      `${projectId}/datasets/${datasetId}/tables/${tableId}`,
  });
  const entry = responses[0];
  console.log(`Entry name: ${entry.name}`);
  console.log(`Entry type: ${entry.type}`);
  console.log(`Linked resource: ${entry.linkedResource}`);

  // Attach a Tag to the table.
  const tag = {
    name: entry.name,
    template: createdTagTemplate.name,
    fields: {
      source: {
        stringValue: 'copied from tlc_yellow_trips_2017',
      },
      num_rows: {
        doubleValue: 113496874,
      },
      has_pii: {
        boolValue: false,
      },
      pii_type: {
        enumValue: {
          displayName: 'NONE',
        },
      },
    },
  };

  request = {
    parent: entry.name,
    tag: tag,
  };

  // Create the Tag.
  await datacatalog.createTag(request);
  console.log(`Tag created for entry: ${entry.name}`);
}
quickstart();

Python

Before trying this sample, follow the Python setup instructions in the Data Catalog quickstart using client libraries. For more information, see the Data Catalog Python API reference documentation.

# Import required modules.
from google.cloud import datacatalog_v1

# TODO: Set these values before running the sample.
# Google Cloud Platform project.
project_id = "my_project"
# Set dataset_id to the ID of existing dataset.
dataset_id = "demo_dataset"
# Set table_id to the ID of existing table.
table_id = "trips"
# Tag template to create.
tag_template_id = "example_tag_template"

# For all regions available, see:
# https://cloud.google.com/data-catalog/docs/concepts/regions
location = "us-central1"

# Use Application Default Credentials to create a new
# Data Catalog client. GOOGLE_APPLICATION_CREDENTIALS
# environment variable must be set with the location
# of a service account key file.
datacatalog_client = datacatalog_v1.DataCatalogClient()

# Create a Tag Template.
tag_template = datacatalog_v1.types.TagTemplate()

tag_template.display_name = "Demo Tag Template"

tag_template.fields["source"] = datacatalog_v1.types.TagTemplateField()
tag_template.fields["source"].display_name = "Source of data asset"
tag_template.fields[
    "source"
].type_.primitive_type = datacatalog_v1.types.FieldType.PrimitiveType.STRING

tag_template.fields["num_rows"] = datacatalog_v1.types.TagTemplateField()
tag_template.fields["num_rows"].display_name = "Number of rows in data asset"
tag_template.fields[
    "num_rows"
].type_.primitive_type = datacatalog_v1.types.FieldType.PrimitiveType.DOUBLE

tag_template.fields["has_pii"] = datacatalog_v1.types.TagTemplateField()
tag_template.fields["has_pii"].display_name = "Has PII"
tag_template.fields[
    "has_pii"
].type_.primitive_type = datacatalog_v1.types.FieldType.PrimitiveType.BOOL

tag_template.fields["pii_type"] = datacatalog_v1.types.TagTemplateField()
tag_template.fields["pii_type"].display_name = "PII type"

for display_name in ["EMAIL", "SOCIAL SECURITY NUMBER", "NONE"]:
    enum_value = datacatalog_v1.types.FieldType.EnumType.EnumValue(
        display_name=display_name
    )
    tag_template.fields["pii_type"].type_.enum_type.allowed_values.append(
        enum_value
    )

expected_template_name = datacatalog_v1.DataCatalogClient.tag_template_path(
    project_id, location, tag_template_id
)

# Create the Tag Template.
try:
    tag_template = datacatalog_client.create_tag_template(
        parent=f"projects/{project_id}/locations/{location}",
        tag_template_id=tag_template_id,
        tag_template=tag_template,
    )
    print(f"Created template: {tag_template.name}")
except OSError as e:
    print(f"Cannot create template: {expected_template_name}")
    print(f"{e}")

# Lookup Data Catalog's Entry referring to the table.
resource_name = (
    f"//bigquery.googleapis.com/projects/{project_id}"
    f"/datasets/{dataset_id}/tables/{table_id}"
)
table_entry = datacatalog_client.lookup_entry(
    request={"linked_resource": resource_name}
)

# Attach a Tag to the table.
tag = datacatalog_v1.types.Tag()

tag.template = tag_template.name
tag.name = "my_super_cool_tag"

tag.fields["source"] = datacatalog_v1.types.TagField()
tag.fields["source"].string_value = "Copied from tlc_yellow_trips_2018"

tag.fields["num_rows"] = datacatalog_v1.types.TagField()
tag.fields["num_rows"].double_value = 113496874

tag.fields["has_pii"] = datacatalog_v1.types.TagField()
tag.fields["has_pii"].bool_value = False

tag.fields["pii_type"] = datacatalog_v1.types.TagField()
tag.fields["pii_type"].enum_value.display_name = "NONE"

tag = datacatalog_client.create_tag(parent=table_entry.name, tag=tag)
print(f"Created tag: {tag.name}")

REST & CMD LINE

REST & CMD LINE

If you do not have access to Cloud Client libraries for your language or want to test the API using REST requests, see the following examples and refer to the Data Catalog REST API documentation.

1. Create a tag template.

Before using any of the request data, make the following replacements:

  • project-id: your GCP project ID

HTTP method and URL:

POST https://datacatalog.googleapis.com/v1/projects/project-id/locations/us-central1/tagTemplates?tagTemplateId=demo_tag_template

Request JSON body:


{
  "displayName":"Demo Tag Template",
  "fields":{
    "source":{
      "displayName":"Source of data asset",
      "isRequired": "true",
      "type":{
        "primitiveType":"STRING"
      }
    },
    "num_rows":{
      "displayName":"Number of rows in data asset",
      "isRequired": "false",
      "type":{
        "primitiveType":"DOUBLE"
      }
    },
    "has_pii":{
      "displayName":"Has PII",
      "isRequired": "false",
      "type":{
        "primitiveType":"BOOL"
      }
    },
    "pii_type":{
      "displayName":"PII type",
      "isRequired": "false",
      "type":{
        "enumType":{
          "allowedValues":[
            {
              "displayName":"EMAIL_ADDRESS"
            },
            {
              "displayName":"US_SOCIAL_SECURITY_NUMBER"
            },
            {
              "displayName":"NONE"
            }
          ]
        }
      }
    }
  }
}

To send your request, expand one of these options:

You should receive a JSON response similar to the following:

{
  "name":"projects/project-id/locations/us-central1/tagTemplates/demo_tag_template",
  "displayName":"Demo Tag Template",
  "fields":{
    "num_rows":{
      "displayName":"Number of rows in data asset",
      "isRequired": "false",
      "type":{
        "primitiveType":"DOUBLE"
      }
    },
    "has_pii":{
      "displayName":"Has PII",
      "isRequired": "false",
      "type":{
        "primitiveType":"BOOL"
      }
    },
    "pii_type":{
      "displayName":"PII type",
      "isRequired": "false",
      "type":{
        "enumType":{
          "allowedValues":[
            {
              "displayName":"EMAIL_ADDRESS"
            },
            {
              "displayName":"NONE"
            },
            {
              "displayName":"US_SOCIAL_SECURITY_NUMBER"
            }
          ]
        }
      }
    },
    "source":{
      "displayName":"Source of data asset",
      "isRequired":"true",
      "type":{
        "primitiveType":"STRING"
      }
    }
  }
}

2. Lookup the Data Catalog entry-id for your BigQuery table.

Before using any of the request data, make the following replacements:

  • project-id: GCP project ID

HTTP method and URL:

GET https://datacatalog.googleapis.com/v1/entries:lookup?linkedResource=//bigquery.googleapis.com/projects/project-id/datasets/demo_dataset/tables/trips

Request JSON body:

Request body is empty.

To send your request, expand one of these options:

You should receive a JSON response similar to the following:

{
  "name": "projects/project-id/locations/US/entryGroups/@bigquery/entries/entry-id",
  "type": "TABLE",
  "schema": {
    "columns": [
      {
        "type": "STRING",
        "description": "A code indicating the TPEP provider that provided the record. 1= ",
        "mode": "REQUIRED",
        "column": "vendor_id"
      },
      ...
    ]
  },
  "sourceSystemTimestamps": {
    "createTime": "2019-01-25T01:45:29.959Z",
    "updateTime": "2019-03-19T23:20:26.540Z"
  },
  "linkedResource": "//bigquery.googleapis.com/projects/project-id/datasets/demo_dataset/tables/trips",
  "bigqueryTableSpec": {
    "tableSourceType": "BIGQUERY_TABLE"
  }
}

3. Create a tag from the template and attach it to your BigQuery table.

Before using any of the request data, make the following replacements:

  • project-id: GCP project ID
  • entry-id: Data Catalog entry ID for the Demo Dataset trips table (returned in the lookup results in the previous step).

HTTP method and URL:

POST https://datacatalog.googleapis.com/v1/projects/project-id/locations/us-central1/entryGroups/@bigquery/entries/entry-id/tags

Request JSON body:

{
  "template":"projects/project-id/locations/us-central1/tagTemplates/demo_tag_template",
  "fields":{
    "source":{
      "stringValue":"Copied from tlc_yellow_trips_2017"
    },
    "num_rows":{
      "doubleValue":113496874
    },
    "has_pii":{
      "boolValue":false
    },
    "pii_type":{
      "enumValue":{
        "displayName":"NONE"
      }
    }
  }
}

To send your request, expand one of these options:

You should receive a JSON response similar to the following:

{
  "name":"projects/project-id/locations/US/entryGroups/@bigquery/entries/entry-id/tags/tag-id",
  "template":"projects/project-id/locations/us-central1/tagTemplates/demo_tag_template",
  "fields":{
    "pii_type":{
      "displayName":"PII type",
      "enumValue":{
        "displayName":"NONE"
      }
    },
    "has_pii":{
      "displayName":"Has PII",
      "boolValue":false
    },
    "source":{
      "displayName":"Source of data asset",
      "stringValue":"Copied from tlc_yellow_trips_2017"
    },
    "num_rows":{
      "displayName":"Number of rows in data asset",
      "doubleValue":113496874
    }
  },
  "templateDisplayName":"Demo Tag Template"
}
Caution: Renaming the table in BigQuery deletes all the tags attached to it and its columns.

Clean up

To avoid incurring charges to your Google Cloud account for the resources used in this page, follow these steps.

Deleting the project

The easiest way to eliminate billing is to delete the project that you created for the tutorial.

To delete the project:

  1. In the Cloud Console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Deleting the dataset

  1. If necessary, open the BigQuery web UI.

    Go to the BigQuery web UI

  2. In the navigation panel, in the Resources section, click the demo_dataset dataset you created.

  3. In the details panel, on the right side, click Delete dataset. This action deletes the dataset, the table, and all the data.

  4. In the Delete dataset dialog box, confirm the delete command by typing the name of your dataset (demo_dataset) and then click Delete.

Deleting the tag template

  1. Open the Data Catalog UI in the Google Cloud Console. Under Tag Template, click Manage tag templates.

  2. Click Demo Tag Template.

  3. On the Tag Template page, click Delete to delete the Demo Tag Template.

What's next