Labeling Datasets

To help organize your projects, you can add arbitrary labels to your datasets. After labeling your datasets, you can search for datasets based on label values. For example, you can use labels to group datasets by purpose, environment, department, and so on.

Before you begin

Labeling a dataset

Labels are key:value pairs that you can attach to a dataset. BigQuery labels have the same characteristics as labels in Compute Engine and Cloud Resource Manager:

  • Label keys and values can be no longer than 63 characters.
  • Label keys and values can contain only lowercase letters, numbers, underscores, hyphens, and international characters.
  • Label keys and values cannot exceed 128 bytes in size.
  • Label keys must begin with a letter.
  • Label keys must be unique within a dataset.

Each of your datasets can have up to 64 labels. Datasets do not have any default labels.

Labels must have a key, but can have an empty value. Labels with no value can be used as tags.

Creating a label

To create a label for a dataset:

Command-line

Use the bq update command with the set_label flag.

  • bq update --set_label [KEY1:VALUE1] --set_label [KEY2:VALUE2] --set_label [KEYN:VALUEN] [DATASET_ID]

Where [KEY1:VALUE1], [KEY2:VALUE2], and [KEYN:VALUEN] are label key:value pairs that you want to create, and [DATASET_ID] is a valid dataset ID.

For example, to add a label to track departments, you can create a label with a department key:

  • bq update --set_label department:shipping [DATASET_ID]

API

To create a label using the API, add a labels field to the dataset's configuration object when creating or updating the dataset.

{
  "kind": "bigquery#dataset",
  ...
  "friendlyName": string,
  "description": string,
  "labels": {
    "[KEY1]": "[VALUE1]",
    "[KEY2]": "[VALUE2]"
  }
   ...
}

Where [KEY1] and [KEY2] are label keys, and [VALUE1] and [VALUE2] are the corresponding label values that you want to create.

Updating a label

To update a label for a dataset:

Command-line

Use the bq update command with the set_label flag.

  • bq update --set_label [KEY1:VALUE1] --set_label [KEY2:VALUE2] --set_label [KEYN:VALUEN] [DATASET_ID]

Where [KEY1:VALUE1], [KEY2:VALUE2], and [KEYN:VALUEN] are label key:value pairs that you want to update, and [DATASET_ID] is a valid dataset ID.

For example, to update a label that tracks departments, specify a new value for the department key:

  • bq update --set_label department:receiving [DATASET_ID]

API

  1. Modify the labels field in the dataset's configuration object.

    {
      "kind": "bigquery#dataset",
      ...
      "friendlyName": string,
      "description": string,
      "labels": {
        "[KEY1]": "[VALUE1]",
        "[KEY2]": "[VALUE2]"
      }
       ...
    }
    

    Where [KEY1] and [KEY2] are label keys, and [VALUE1] and [VALUE2] are the corresponding label values that you want to create.

  2. To replace all labels in the dataset, call Datasets.update().

  3. Alternately, to replace only the labels specified and otherwise preserve existing ones, call Datasets.patch().

If there are concurrent calls to update() and delete(), BigQuery executes the last one received.

Deleting a label

To delete a label from a dataset:

Command-line

Use the bq update command with the clear_label flag.

  • bq update --clear_label [KEY1] --clear_label [KEY2] --clear_label [KEYN] [DATASET_ID]

Where [KEY1], [KEY2], and [KEYN] are label keys that you want to delete, and [DATASET_ID] is a valid dataset ID.

For example, to delete a label that tracks departments, specify the department key:

  • bq update --clear_label department [DATASET_ID]

API

To remove all labels from a dataset:

  1. Remove the labels field from the dataset's configuration object.

    {
      "kind": "bigquery#dataset",
      ...
      "friendlyName": string,
      "description": string,
       ...
    }
    
  2. Call Datasets.update().

To delete specific labels from a dataset and otherwise preserve the other existing labels:

  1. Change the value of the labels you want to delete to null:

    {
      "kind": "bigquery#dataset",
      ...
      "friendlyName": string,
      "description": string,
      "labels": {
        "[KEY1]": null,
        "[KEY2]": null
      }
       ...
    }
    

    Where [KEY1] and [KEY2] are label keys that you want to delete.

  2. Call Datasets.patch().

If there are concurrent calls to update() and delete(), BigQuery executes the last one received.

Deleting a label's value

A label that has a key but an empty value can be used as a tag. To delete a label's value, specify a label with an empty value:

Command-line

Use the bq update command with the set_label flag. Specify the key, followed by a colon, but leave the value unspecified.

  • bq update --set_label [KEY1]: --set_label [KEY2]: --set_label [KEYN]: [DATASET_ID]

Where [KEY1], [KEY2], and [KEYN] are label keys that you want to keep without values, and [DATASET_ID] is a valid dataset ID.

For example, to delete the value associated with the department key, specify the department key with an empty value:

  • bq update --set_label department: [DATASET_ID]

API

To delete a label's value, but retain the label's key:

  1. Change the value of the labels you want to delete to the empty string (""):

    {
      "kind": "bigquery#dataset",
      ...
      "friendlyName": string,
      "description": string,
      "labels": {
        "[KEY1]": "",
        "[KEY2]": ""
      }
       ...
    }
    

    Where [KEY1] and [KEY2] are label keys that you want to keep, but with no values attached to them.

  2. Call Datasets.patch().

If there are concurrent calls to update() and delete(), BigQuery executes the last one received.

Viewing a dataset's labels

To view a dataset's labels:

Command-line

Use the bq show command with the dataset ID.

  • bq show [DATASET_ID]

The command-line tool returns the labels associated with that dataset.

API

Call the Datasets.get() command. The response includes all labels associated with that dataset.

Alternately, you can use Datasets.list() to view the labels for multiple datasets. You can also use filters to return a group of datasets that share common labels.

Filtering datasets using labels

To filter for a set of datasets based on labels, create a filter specification for use in the bq command-line tool or the BigQuery API that uses the following syntax:

field[:value][ field[:value]...

Where:

  • field is expressed as labels.[KEY] where [KEY] is a label key.
  • value is an optional label value.

The filter specification has the following characteristics:

  • Only the AND logical operator is supported. Space-separated comparisons are treated as having implicit AND operators.
  • The only field currently eligible for filtering is "labels.key" where "key" is the name of a label.
  • The filter can include up to ten expressions.
  • Filtering is case-sensitive.

Filter specification examples

To search for all datasets that have a department key with a value of shipping, use the following specification:

labels.department:shipping

To search for multiple key:value pairs, separate them with a space. For example, to search for all datasets where the value of the department key is shipping and the value of the location key is usa, use the following specification:

labels.department:shipping labels.location:usa

You can filter on the presence of a key alone, rather than matching against a key:value pair. The following filter specification returns all datasets with labels named department, without regard to the values that might be associated with the keys:

labels.department

An equivalent specification uses an asterisk to represent all possible values associated with the department key:

labels.department:*

Generating filtered lists of datasets

To generate a filtered list of datasets:

Command-line

Use the bq ls command with the filter flag where [FILTER_SPECIFICATION] is a valid filter specification and [DATASET_ID] is a valid dataset ID:

  • bq ls --filter "[FILTER_SPECIFICATION]" [DATASET_ID]

The command-line tool returns a list of datasets that meet the filter requirements.

For example, the following command returns a list of datasets that have a department key with a value of shipping:

  • bq ls --filter "labels.department:shipping" [DATASET_ID]

API

  1. Add a filter property that specifies the filter you want to use:

    {
      "kind": "bigquery#dataset",
      ...
      "friendlyName": string,
      "description": string,
      "filter": "[FILTER_SPECIFICATION]",
       ...
    }
    

    Where [FILTER_SPECIFICATION] is a valid filter specification.

  2. Call the Datasets.list() command. The response includes all labels associated with that dataset.

What's next

Send feedback about...

BigQuery Documentation