Managing Table Data

This document describes how to manage table data in BigQuery. You can work with BigQuery table data in the following ways:

  • Load data into a table
  • Browse (or preview) table data
  • Query table data
  • Append to or overwrite table data
  • Modify table data using the Data Manipulation Language
  • Copy table data
  • Export table data

For information on managing table schemas, see Modifying table schemas.

Loading data into a table

You can load data when you create a table, or you can create an empty table and load the data later. When you load data, you can use schema auto-detect for supported data formats, or you can specify the schema.
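As a sketch, a load job from Cloud Storage with schema auto-detect might look like the following bq command. The dataset, table, bucket, and file names here are placeholders, not values from this document:

```shell
# Load a CSV file from Cloud Storage into mydataset.mytable,
# letting BigQuery infer the schema from the data.
bq load --autodetect --source_format=CSV \
    mydataset.mytable \
    gs://example-bucket/data.csv
```

To supply an explicit schema instead, replace --autodetect with an inline schema definition such as name:STRING,age:INTEGER.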

For more information on loading data, see the documentation for your source data's format and location.

Browsing table data

You can browse table data by:

  • Using the GCP Console or the classic BigQuery web UI
  • Using the command-line tool's bq head command
  • Calling the tabledata.list API method

Required permissions

At the dataset level, browsing table data requires READER access to the dataset that contains the table being browsed.

Instead of using dataset-level permissions, you can use a project-level IAM role that includes the bigquery.tables.getData permission. All predefined, project-level IAM roles include bigquery.tables.getData except bigquery.user, bigquery.jobUser, and bigquery.metadataViewer.

In addition, because the bigquery.user role has bigquery.datasets.create permissions, a user assigned to the bigquery.user role can read the data in any table in any dataset that user creates. When a user assigned to the bigquery.user role creates a dataset, that user is given OWNER access to the dataset. OWNER access to a dataset gives the user full control over it and all the tables in it.

For more information on IAM roles and permissions in BigQuery, see Access control. For more information on dataset-level roles, see Primitive roles for datasets.
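As a sketch, you can inspect a dataset's access entries with the bq show command to verify who has READER (or higher) access; mydataset is a placeholder:

```shell
# Show dataset metadata in JSON, including its "access" entries,
# which list the roles granted at the dataset level.
bq show --format=prettyjson mydataset
```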

Browsing table data

To browse table data:

Console

  1. Open the BigQuery web UI in the GCP Console.
    Go to the BigQuery web UI

  2. In the navigation panel, in the Resources section, expand your project and select a dataset. This displays the tables and views in the dataset.

  3. Click a table in the list.

  4. Click Details and note the value in Number of rows. You may need this value to control the starting point for your results using the CLI or API.

  5. Click Preview. A sample set of data is displayed.

    Table preview

Classic UI

  1. In the classic BigQuery web UI, in the navigation pane, click the blue arrow to the left of your dataset to expand it, or double-click the dataset name. This displays the tables and views in the dataset.

  2. Click a table in the list.

  3. Click Details and note the value in Number of Rows. You may need this value to control the starting point for your results using the CLI or API.

  4. Click Preview. A sample set of data is displayed.

    Table preview

Command-line

Issue the bq head command with the --max_rows flag to list all fields in a particular number of table rows. If --max_rows is not specified, the default is 100.

To browse a subset of fields in the table (including nested and repeated fields), use the --selected_fields flag and enter the fields as a comma-separated list.

To specify the number of rows to skip before displaying table data, use the --start_row=[INTEGER] flag (or the -s shortcut). The default value is 0. You can retrieve the number of rows in a table by using the bq show command to retrieve table information.

If the table you're browsing is in a project other than your default project, add the project ID to the command in the following format: [PROJECT_ID]:[DATASET].[TABLE].

bq head --max_rows [INTEGER1] --start_row [INTEGER2] --selected_fields "[FIELDS]" [PROJECT_ID]:[DATASET].[TABLE]

Where:

  • [INTEGER1] is the number of rows to display.
  • [INTEGER2] is the number of rows to skip before displaying data.
  • [FIELDS] is a comma-separated list of fields.
  • [PROJECT_ID] is your project ID.
  • [DATASET] is the name of the dataset containing the table.
  • [TABLE] is the name of the table to browse.

Examples:

Enter the following command to list all fields in the first 10 rows in mydataset.mytable. mydataset is in your default project.

bq head --max_rows=10 mydataset.mytable

Enter the following command to list all fields in the first 100 rows in mydataset.mytable. mydataset is in myotherproject, not your default project.

bq head myotherproject:mydataset.mytable

Enter the following command to display only field1 and field2 in mydataset.mytable. The command uses the --start_row flag to skip to row 100. mydataset.mytable is in your default project.

bq head --start_row 100 --selected_fields "field1,field2" mydataset.mytable

Because the bq head command does not create a query job, bq head commands do not appear in your query history, and you are not charged for them.

API

Browse through a table's data by calling tabledata.list. Specify the name of the table in the tableId parameter.

Configure these optional parameters to control the output:

  • maxResults — Maximum number of results to return
  • selectedFields — Comma-separated list of fields to return; if unspecified, all fields are returned
  • startIndex — Zero-based index of the starting row to read

Values are returned wrapped in a JSON object that you must parse, as described in the tabledata.list reference documentation.
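As an illustrative sketch, the same parameters can be passed to tabledata.list in a raw REST call; PROJECT_ID, DATASET, TABLE, field1, and field2 are placeholders:

```shell
# Call the tabledata.list REST endpoint directly, returning 10 rows
# starting at row 0 and limiting output to two fields.
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://www.googleapis.com/bigquery/v2/projects/PROJECT_ID/datasets/DATASET/tables/TABLE/data?maxResults=10&startIndex=0&selectedFields=field1,field2"
```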

C#

Before trying this sample, follow the C# setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery C# API reference documentation.


using Google.Api.Gax;
using Google.Apis.Bigquery.v2.Data;
using Google.Cloud.BigQuery.V2;
using System;
using System.Collections.Generic;
using System.Linq;

public class BigQueryBrowseTable
{
    public void BrowseTable(
        string projectId = "your-project-id"
    )
    {
        BigQueryClient client = BigQueryClient.Create(projectId);
        TableReference tableReference = new TableReference()
        {
            TableId = "shakespeare",
            DatasetId = "samples",
            ProjectId = "bigquery-public-data"
        };
        // Load all rows from a table
        PagedEnumerable<TableDataList, BigQueryRow> result = client.ListRows(
            tableReference: tableReference,
            schema: null
        );
        // Print the first 10 rows
        foreach (BigQueryRow row in result.Take(10))
        {
            Console.WriteLine($"{row["corpus"]}: {row["word_count"]}");
        }
    }
}

Go

Before trying this sample, follow the Go setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Go API reference documentation.

The Google Cloud Client Library for Go paginates automatically by default, so you do not need to implement pagination yourself. For example:

table := client.Dataset(datasetID).Table(tableID)
it := table.Read(ctx)
for {
	var row []bigquery.Value
	err := it.Next(&row)
	if err == iterator.Done {
		break
	}
	if err != nil {
		return err
	}
	fmt.Println(row)
}

Java

Before trying this sample, follow the Java setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Java API reference documentation.

TableId tableIdObject = TableId.of(datasetName, tableName);
// This example reads 100 rows per RPC call. If there is no need to limit the
// number of rows per call, simply omit the option.
TableResult tableData =
    bigquery.listTableData(tableIdObject, TableDataListOption.pageSize(100));
for (FieldValueList row : tableData.iterateAll()) {
  // do something with the row
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Node.js API reference documentation.

The Google Cloud Client Library for Node.js paginates automatically by default, so you do not need to implement pagination yourself. For example:


// Import the Google Cloud client library
const {BigQuery} = require('@google-cloud/bigquery');

/**
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// const datasetId = "my_dataset";
// const tableId = "my_table";

async function browseRows() {
  // Displays rows from "my_table" in "my_dataset".

  // Create a client
  const bigqueryClient = new BigQuery();

  // List rows in the table
  const [rows] = await bigqueryClient
    .dataset(datasetId)
    .table(tableId)
    .getRows();

  console.log('Rows:');
  rows.forEach(row => console.log(row));
}

browseRows();

PHP

Before trying this sample, follow the PHP setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery PHP API reference documentation.

Pagination happens automatically in the Google Cloud Client Library for PHP using the generator function rows, which fetches the next page of results during iteration.

use Google\Cloud\BigQuery\BigQueryClient;

/** Uncomment and populate these variables in your code */
// $projectId = 'The Google project ID';
// $datasetId = 'The BigQuery dataset ID';
// $tableId   = 'The BigQuery table ID';
// $maxResults = 10;

$maxResults = 10;
$startIndex = 0;

$options = [
    'maxResults' => $maxResults,
    'startIndex' => $startIndex
];
$bigQuery = new BigQueryClient([
    'projectId' => $projectId,
]);
$dataset = $bigQuery->dataset($datasetId);
$table = $dataset->table($tableId);
$numRows = 0;
foreach ($table->rows($options) as $row) {
    print('---');
    foreach ($row as $column => $value) {
        printf('%s: %s' . PHP_EOL, $column, $value);
    }
    $numRows++;
}

Python

Before trying this sample, follow the Python setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Python API reference documentation.

# from google.cloud import bigquery
# client = bigquery.Client()

dataset_ref = client.dataset("samples", project="bigquery-public-data")
table_ref = dataset_ref.table("shakespeare")
table = client.get_table(table_ref)  # API call

# Load all rows from a table
rows = client.list_rows(table)
assert len(list(rows)) == table.num_rows

# Load the first 10 rows
rows = client.list_rows(table, max_results=10)
assert len(list(rows)) == 10

# Specify selected fields to limit the results to certain columns
fields = table.schema[:2]  # first two columns
rows = client.list_rows(table, selected_fields=fields, max_results=10)
assert len(rows.schema) == 2
assert len(list(rows)) == 10

# Use the start index to load an arbitrary portion of the table
rows = client.list_rows(table, start_index=10, max_results=10)

# Print row data in tabular format
format_string = "{!s:<16} " * len(rows.schema)
field_names = [field.name for field in rows.schema]
print(format_string.format(*field_names))  # prints column headers
for row in rows:
    print(format_string.format(*row))  # prints row data

Ruby

Before trying this sample, follow the Ruby setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Ruby API reference documentation.

Pagination happens automatically in the Google Cloud Client Library for Ruby using Table#data and Data#next.

require "google/cloud/bigquery"

def browse_table
  bigquery = Google::Cloud::Bigquery.new project_id: "bigquery-public-data"
  dataset  = bigquery.dataset "samples"
  table    = dataset.table "shakespeare"

  # Load all rows from a table
  rows = table.data

  # Load the first 10 rows
  rows = table.data max: 10

  # Print row data
  rows.each { |row| puts row }
end

Querying table data

After you load your data into BigQuery, you can query the data in your tables. BigQuery supports two types of queries: interactive and batch.

By default, BigQuery runs interactive queries, which means that the query is executed as soon as possible.

BigQuery also offers batch queries. BigQuery queues each batch query on your behalf and starts the query as soon as idle resources are available, usually within a few minutes.
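For example, the bq query command's --batch flag submits a query as a batch query; the query below is only a placeholder:

```shell
# Queue the query as a batch query; it runs when idle resources
# are available rather than immediately.
bq query --batch --use_legacy_sql=false \
    'SELECT word, word_count FROM `bigquery-public-data.samples.shakespeare` LIMIT 10'
```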

You can run interactive and batch queries by using the GCP Console, the classic BigQuery web UI, the command-line tool's bq query command, or the API.

For more information on querying BigQuery tables, see Introduction to querying BigQuery data.

In addition to querying data stored in BigQuery tables, you can query data stored externally. For more information, see Introduction to external data sources.

Appending to and overwriting table data

You can overwrite table data using a load or query operation. You can append additional data to an existing table by performing a load-append operation or by appending query results to the table.

For more information on appending to or overwriting a table when loading data, see the documentation for your source data format.

To append to or overwrite a table using query results, specify a destination table and set the write disposition to either:

  • Append to table — Appends the query results to an existing table.
  • Overwrite table — Overwrites an existing table with the same name using the query results.

For more information on using query results to append to or overwrite data, see Writing query results.
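With the CLI, these write dispositions correspond to the bq query flags --append_table and --replace; mydataset.mytable2 is a placeholder destination:

```shell
# Append query results to an existing destination table.
bq query --use_legacy_sql=false \
    --destination_table mydataset.mytable2 \
    --append_table \
    'SELECT word FROM `bigquery-public-data.samples.shakespeare` LIMIT 10'

# Overwrite the destination table with the query results instead.
bq query --use_legacy_sql=false \
    --destination_table mydataset.mytable2 \
    --replace \
    'SELECT word FROM `bigquery-public-data.samples.shakespeare` LIMIT 10'
```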

Modifying table data

Using the standard SQL dialect, you can modify data in a table using DML statements. DML statements allow you to perform bulk row updates, insertions, and deletions. For examples on using DML, see Updating table data using the Data Manipulation Language.
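As a sketch, a DML UPDATE can be run from the command line like this; the table name, column names, and filter are placeholders:

```shell
# Run a standard SQL DML statement that updates matching rows in place.
bq query --use_legacy_sql=false \
    'UPDATE mydataset.mytable SET status = "archived" WHERE created < "2017-01-01"'
```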

The legacy SQL dialect does not support DML statements. To update or delete data using legacy SQL, you must delete the table and then recreate it with new data. Alternatively, you can write a query that modifies the data and write the query results to a new destination table.

Copying table data

You can copy a table by:

  • Using the GCP Console or the classic BigQuery web UI
  • Using the command-line tool's bq cp command
  • Calling the jobs.insert API method and configuring a copy job

For more information on copying tables, see Copying a table.
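As a sketch, the command-line form is as follows; the source and destination table names are placeholders:

```shell
# Copy mydataset.mytable to a new table in another dataset.
bq cp mydataset.mytable mydataset2.mytable2
```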

Exporting table data

You can export table data in CSV, JSON, or Avro format. Currently, you must export data into a Cloud Storage bucket. Exporting to your local machine is not supported; however, you can download and save query results using the GCP Console or the classic BigQuery web UI.
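For example, an export to a Cloud Storage bucket might look like the following bq extract command; the bucket and file names are placeholders:

```shell
# Export table data to a Cloud Storage bucket as newline-delimited JSON.
bq extract --destination_format=NEWLINE_DELIMITED_JSON \
    mydataset.mytable \
    gs://example-bucket/myfile.json
```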

For more information, see Exporting table data.

Next steps
