bq Command-Line Tool Reference

This document details the bq command-line tool's commands and flags. For information on using the CLI, see Using the bq Command-Line Tool.

Global flags

You can use the following global flags with the bq command-line tool.

--api

The API endpoint to call. The default value is https://www.googleapis.com.

--api_version

The API version to use. The default is v2.

--apilog

Log all API requests and responses to the file specified by this flag. You can also use stdout and stderr. Specifying the empty string ('') will direct to stdout.

--application_default_credential_file

The path to your application default credentials file. For more information, see Setting Up Authentication for Server to Server Production Applications. The default value is ''.

--bigqueryrc

The path to the command-line tool's configuration file. The configuration file specifies new defaults for any flags, which can be overridden by specifying the flag on the command line. If the --bigqueryrc flag is not specified, the BIGQUERYRC environment variable is used. If that is not set, the path ~/.bigqueryrc is used. The default value is $HOME/.bigqueryrc.
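As a sketch, a configuration file might look like the following; the project ID and flag values are placeholders. Global flags go at the top of the file, and command-specific flags go in bracketed sections named for the command.

```shell
# Write an example configuration (normally saved as ~/.bigqueryrc).
cat > bigqueryrc.example <<'EOF'
--project_id=my-sample-project
--format=prettyjson

[query]
--use_legacy_sql=false
EOF
cat bigqueryrc.example
```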

--ca_certificates_file

The location of your CA certificate file. The default value is ''.

--credential_file

The filename used for storing your OAuth token. The default value is $HOME/.bigquery.v2.token.

--dataset_id

The default dataset to use for requests. This flag is ignored when not applicable. You can set the value to [PROJECT_ID]:[DATASET] or [DATASET]. If [PROJECT_ID] is missing, the default project is used. You can override this setting by specifying the --project_id flag. The default value is ''.

--debug_mode

Show tracebacks on Python exceptions. The default value is false.

--disable_ssl_validation

Disables HTTPS certificate validation. The default value is false.

--discovery_file

The filename of the JSON file to read for discovery. The default value is ''.

--enable_gdrive

When set to true, a new OAuth token with GDrive scope is requested. When set to false, a new OAuth token without GDrive scope is requested.

--fingerprint_job_id

Whether to use a job ID that is derived from a fingerprint of the job configuration. This will prevent the same job from running multiple times accidentally. The default value is false.

--format

Specifies the format of the command's output. Options include:

  • pretty: formatted table output
  • sparse: simpler table output
  • prettyjson: easy-to-read JSON format
  • json: maximally compact JSON
  • csv: CSV format with header
  • none: no output

pretty, sparse, and prettyjson are intended to be human-readable. json and csv are for passing to another program. If the --format flag is absent, an appropriate output format is chosen based on the command.
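For example, the same ls command can produce human-readable or machine-readable output depending on the flag value; these commands assume the bq tool is installed and authenticated.

```shell
# Human-readable table of datasets in the default project:
bq --format=pretty ls --datasets

# Compact JSON suitable for piping to another program such as jq:
bq --format=json ls --datasets
```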

--headless

Specifies whether to run the bq session without user interaction. When set to true, interaction is disabled. For example, debug_mode will not break into the debugger, and the frequency of informational printing is lowered. The default value is false.

--job_id

The unique job ID to use for the request. If not specified in a job creation request, a job ID is generated. This flag applies only to commands that create jobs: cp, extract, load, and query. For more information, see Generating a job ID.

--job_property

An additional key-value pair to include in the properties field of the job configuration. Repeat this flag to specify additional properties.

--location

A string corresponding to your region or multi-region location. The --location flag is supported only by certain commands; all other commands ignore it.

If your dataset was created in the US or EU multi-region location, the --location flag is optional for the commands that support it.

--max_rows_per_request

An integer that specifies the maximum number of rows to return per read.

--project_id

The project ID to use for requests. The default value is ''.

--proxy_address

The name or IP address of the proxy host to use for connecting to GCP. The default value is ''.

--proxy_password

The password to use when authenticating with the proxy host. The default value is ''.

--proxy_port

The port number to use to connect to the proxy host. The default value is ''.

--proxy_username

The user name to use when authenticating with the proxy host. The default value is ''.

--quiet or -q

If set to true, ignore status updates while jobs are running. The default value is false.

--service_account

Use this service account email address for authorization. For example: 1234567890@developer.gserviceaccount.com. The default value is ''.

--service_account_credential_file

The file used as a credential store for service accounts. This flag must be set if you're using a service account.

--service_account_private_key_file

The file that contains the service account private key. This flag is required if the --service_account flag is specified. The default value is ''.

--service_account_private_key_password

The private key password. The password must match the password you set on the key when you create it. The default value is notasecret.

--synchronous_mode or --sync

If set to true, wait for the command to complete before returning, and use the job completion status as the error code. If set to false, the job is created, and successful completion status is used for the error code. The default value is true.

--trace

A tracing token, specified as token:[TOKEN], to include in API requests.

--use_gce_service_account

Specify this flag to use service account credentials instead of stored credentials when you're running on a Google Compute Engine instance. For more information, see Creating and Enabling Service Accounts for Instances. The default value is false.

Command-specific flags

You can use the following command flags in the bq command-line tool.

bq cancel

The cancel command is used to cancel jobs. The cancel command has no command-specific flags.

For more information on using the cancel command, see Cancelling jobs.

The cancel command uses the following global flags.

--job_id

The unique job ID to use for the cancellation request. You can specify the job ID without using the --job_id flag, for example: bq cancel [JOB_ID].

--synchronous_mode or --sync

When specified, wait for the command to complete before returning. If set to false, the command returns immediately. The default value is true.

bq cp

The cp command is used to copy tables. The cp command uses the following command-specific flags.

For more information on using the cp command, see Copying tables.

--append_table or -a

When specified, copy a table and append it to an existing table. The default value is false.

--destination_kms_key

The Cloud KMS key used for encryption of the destination table data.

--force or -f

When specified, if the destination table exists, overwrite it and don't prompt. The default value is false.

--no_clobber or -n

When specified, if the destination table exists, do not overwrite it. The default value is false.
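Assuming hypothetical tables mydataset.mytable and mydataset.mytable_copy, a typical copy might look like:

```shell
# Copy a table, overwriting the destination without prompting:
bq cp -f mydataset.mytable mydataset.mytable_copy

# Append the source table to an existing destination table instead:
bq cp -a mydataset.mytable mydataset.mytable_copy
```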

bq extract

The extract command is used to export table data to Google Cloud Storage.

For more information on using the extract command, see Exporting Table Data.

The extract command uses the following command-specific flags.

--compression

The compression type to use for exported files. Possible values include GZIP and NONE. The default value is NONE.

--destination_format

The format for the exported data. Possible values include:

  • CSV
  • NEWLINE_DELIMITED_JSON
  • AVRO

The default value is CSV.

--field_delimiter or -F

The character that indicates the boundary between columns in the output file. Both \t and tab are allowed for tab delimiters.

--print_header

When specified, print header rows for formats that have headers. The default value is true.
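Putting these flags together, a sketch of an export to compressed JSON (the dataset, table, and bucket names are placeholders):

```shell
# Export a table to Cloud Storage as gzipped newline-delimited JSON:
bq extract \
  --destination_format=NEWLINE_DELIMITED_JSON \
  --compression=GZIP \
  mydataset.mytable \
  gs://mybucket/mytable.json.gz
```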

bq head

The head command displays rows in a table.

For more information on using the head command, see Browsing table data.

The head command uses the following command-specific flags.

--job or -j

Specify this flag with a valid job ID to read the results of a query job. The default value is false.

--max_rows or -n

An integer indicating the number of rows to print when showing table data. The default value is 100.

--selected_fields or -c

A comma-separated list indicating a subset of fields (including nested and repeated fields) to return when showing table data. If not specified, all columns are retrieved.

--start_row or -s

An integer indicating the number of rows to skip before showing table data. The default value is 0 (start at the first row).

--table or -t

Specify this flag with a table ID to read rows from a table. The default value is false.
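For example, the following sketch prints rows 101 through 110 of a hypothetical table, returning only two columns:

```shell
# Skip the first 100 rows, print the next 10, and return a subset of fields:
bq head --start_row=100 --max_rows=10 --selected_fields=name,number mydataset.mytable
```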

bq insert

The insert command allows you to insert rows of newline-delimited JSON formatted data using the streaming buffer. This command is intended for testing purposes only. To stream data into BigQuery, use the insertAll API method.

For more information, see Streaming Data into BigQuery.

The insert command uses the following command-specific flags.

--ignore_unknown_values or -i

When specified, ignore any values in a row that are not present in the table's schema.

--skip_invalid_rows or -s

When specified, attempt to insert any valid rows, even if invalid rows are present.

--template_suffix or -x

When specified, treat the destination table as a base template, and insert the rows into an instance table named {destination}{templateSuffix}. BigQuery manages creating the instance table using the schema of the base template.
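A minimal sketch of the workflow, assuming a hypothetical table mydataset.mytable whose schema matches the fields in the file:

```shell
# Create a small newline-delimited JSON file (values are placeholders):
cat > rows.json <<'EOF'
{"name": "alpha", "number": 1}
{"name": "beta", "number": 2}
EOF

# Stream the rows into the table, skipping any invalid ones:
bq insert --skip_invalid_rows mydataset.mytable rows.json
```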

bq load

The load command loads data into a table.

For more information on loading data from a local source using the load command, see Loading Data into BigQuery from a Local Data Source.

The load command uses the following command-specific flags.

--allow_jagged_rows

When specified, allow missing trailing optional columns in CSV data.

--allow_quoted_newlines

When specified, allow quoted newlines in CSV data.

--autodetect

When specified, enable schema auto-detection for CSV and JSON data.

--destination_kms_key

The Cloud KMS key for encryption of the destination table data.

--encoding or -E

The character encoding used in the data. Possible values include:

  • ISO-8859-1 (also known as Latin-1)
  • UTF-8

--field_delimiter or -F

The character that indicates the boundary between columns in the data. Both \t and tab are allowed for tab delimiters.

--ignore_unknown_values

When specified, allow and ignore extra, unrecognized values in CSV or JSON data.

--max_bad_records

An integer that specifies the maximum number of bad records allowed before the entire job fails. The default value is 0.

--null_marker

An optional custom string that represents a NULL value in CSV data.

--projection_fields

If used with --source_format set to DATASTORE_BACKUP, a comma-separated list indicating which entity properties to load from a Cloud Datastore backup. Property names are case sensitive and must refer to top-level properties. The default value is ''.

--quote

The quote character to use to enclose records. The default value is the double quote ("). To indicate no quote character, use an empty string.

--replace

When specified, erase existing contents before loading new data. The default value is false.

--schema

Either the path to a local JSON schema file or a comma-separated list of column definitions in the form [FIELD]:[DATA_TYPE],[FIELD]:[DATA_TYPE].
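The two forms of --schema can be sketched as follows; the table name, field names, and types are illustrative:

```shell
# Inline form: comma-separated [FIELD]:[DATA_TYPE] pairs.
bq load --schema=name:STRING,number:INTEGER mydataset.mytable ./mydata.csv

# File form: a JSON array of field definitions.
cat > schema.json <<'EOF'
[
  {"name": "name", "type": "STRING", "mode": "REQUIRED"},
  {"name": "number", "type": "INTEGER", "mode": "NULLABLE"}
]
EOF
bq load --schema=schema.json mydataset.mytable ./mydata.csv
```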

--schema_update_option

When appending data to a table (in a load job or a query job), or when overwriting a table partition, specifies how to update the schema of the destination table. Possible values include:

  • ALLOW_FIELD_ADDITION: Allow new fields to be added
  • ALLOW_FIELD_RELAXATION: Allow relaxing REQUIRED fields to NULLABLE

Repeat this flag to specify multiple schema update options.

--skip_leading_rows

An integer that specifies the number of rows to skip at the beginning of the source file.

--source_format

The format of the source data. Possible values include:

  • CSV
  • NEWLINE_DELIMITED_JSON
  • AVRO
  • DATASTORE_BACKUP
  • PARQUET (experimental)

--time_partitioning_expiration

An integer that specifies (in seconds) when a time-based partition should be deleted. The expiration time is creation time plus the expiration value. A negative number indicates no expiration.

--time_partitioning_field

The field used to determine how to create a time-based partition. If time-based partitioning is enabled without this value, the table is partitioned based on the load time.

--time_partitioning_type

Enables time-based partitioning on a table and sets the partition type. Currently, the only possible value is DAY which generates one partition per day.
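The flags above can be combined in a single load; the following sketch assumes a hypothetical table mydataset.mytable:

```shell
# Create a sample CSV with a header row (contents are placeholders):
cat > mydata.csv <<'EOF'
name,number
alpha,1
beta,2
EOF

# Load it into a daily-partitioned table, skipping the header row and
# tolerating up to 10 bad records; the schema is given positionally:
bq load \
  --source_format=CSV \
  --skip_leading_rows=1 \
  --max_bad_records=10 \
  --time_partitioning_type=DAY \
  mydataset.mytable \
  ./mydata.csv \
  name:STRING,number:INTEGER
```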

bq ls

The ls command lists objects in a collection.

The ls command uses the following command-specific flags.

--all or -a

When specified, show all results: jobs from all users or all datasets (including hidden ones). This flag is not needed when listing transfer configurations or transfer runs.

--datasets or -d

When specified, list datasets. The default value is false.

--filter

List datasets that match the filter expression. Use a space-separated list of label keys and values in the form labels.[KEY]:[VALUE]. For transfer configurations, a filter expression in the form dataSourceIds:[DATA_SOURCES] lists transfer configurations for specified data sources. Possible values include:

  • adwords
  • dcm_dt (DoubleClick Campaign Manager)
  • dfp_dt (DoubleClick for Publishers)
  • youtube_channel
  • youtube_content_owner

Also, for transfer runs, a filter expression in the form states:[STATE] lists transfer runs with the specified states. Possible values include:

  • SUCCEEDED
  • FAILED
  • PENDING
  • RUNNING
  • CANCELLED
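Two illustrative uses of --filter; the label, location, and data source values are placeholders:

```shell
# List datasets labeled env:prod in the default project:
bq ls --filter 'labels.env:prod'

# List transfer configurations for a given data source in the US:
bq ls --transfer_config --transfer_location=us --filter='dataSourceIds:adwords'
```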

--jobs or -j

When specified, list jobs. The default value is false.

--max_results or -n

An integer indicating the maximum number of results.

--message_type

To list transfer run log messages of a particular type, specify messageTypes:[MESSAGE_TYPE]. Possible values include:

  • INFO
  • WARNING
  • ERROR

--page_token or -k

When specified, list items starting from this page token.

--projects or -p

When specified, show all projects. The default value is false.

--run_attempt

Set this flag to LATEST to list only the latest runs for a transfer.

--transfer_config

When specified, list transfer configurations. When using this flag, you must also specify --transfer_location. The default value is false.

--transfer_location

Lists transfer configurations in the specified location. You set the transfer location when the transfer is created.

--transfer_log

When specified, list transfer log messages for the specified transfer run. The default value is false.

--transfer_run

When specified, list transfer runs. The default value is false.

bq mk

The mk command creates a dataset, table, view, or transfer configuration.

The mk command uses the following command-specific flags.

--data_location

(Legacy) Specifies the location of the dataset. Use the --location global flag instead.

--data_source

Specifies the data source for a transfer configuration. Possible values include:

  • adwords
  • dcm_dt (DoubleClick Campaign Manager)
  • dfp_dt (DoubleClick for Publishers)
  • youtube_channel
  • youtube_content_owner

The default value is ''.

--dataset or -d

When specified, creates a dataset. The default value is false.

--default_table_expiration

An integer that specifies the default lifetime, in seconds, for newly-created tables in a dataset. The expiration time is set to the current time plus this value.

--description

The description of the dataset or table.

--destination_kms_key

The Cloud KMS key used to encrypt the table data.

--display_name

The display name for the transfer configuration. The default value is ''.

--end_time

A timestamp that specifies the end time for a range of transfer runs. The format for the timestamp is RFC3339 UTC "Zulu".

--expiration

An integer that specifies the table or view's expiration time. The expiration time is set to the current time plus this value.

--external_table_definition

Specifies a table definition used to create an external table. The value can be either an inline table definition or a path to a file containing a JSON table definition. The format of an inline definition is schema@format=uri.

--force or -f

When specified, ignore already exists errors and overwrite the table without prompting. The default value is false.

--label

A label to set on the table. The format is [KEY]:[VALUE]. Repeat this flag to specify multiple labels.

--params or -p

The parameters for a transfer configuration in JSON format: {"[PARAMETER]":"[VALUE]"}. The parameters vary depending on the data source. For more information, see Introduction to BigQuery Data Transfer Service.

--refresh_window_days

An integer that specifies the refresh window for a transfer configuration in days. The default value is 0.

--schema

The path to a local JSON schema file or a comma-separated list of column definitions in the form [FIELD]:[DATA_TYPE],[FIELD]:[DATA_TYPE]. The default value is ''.

--start_time

A timestamp that specifies the start time for a range of transfer runs. The format for the timestamp is RFC3339 UTC "Zulu".

--table or -t

When specified, create a table. The default value is false.

--target_dataset

The target dataset for a transfer configuration. The default value is ''.

--time_partitioning_expiration

An integer that specifies (in seconds) when a time-based partition should be deleted. The expiration time is creation time plus the expiration value. A negative number indicates no expiration.

--time_partitioning_field

The field used to determine how to create a time-based partition. If time-based partitioning is enabled without this value, the table is partitioned based on the load time.

--time_partitioning_type

Enables time-based partitioning on a table and sets the partition type. Currently, the only possible value is DAY which generates one partition per day.

--transfer_config

When specified, creates a transfer configuration.

--transfer_run

When specified, creates transfer runs for a time range. The default value is false.

--use_legacy_sql

When set to false, uses a standard SQL query to create a view. The default value is true (uses legacy SQL).

--view

When specified, creates a view. The default value is ''.

--view_udf_resource

The Cloud Storage URI or the path to a local code file that is loaded and evaluated immediately as a user-defined function resource used by a view's SQL query. Repeat this flag to specify multiple files.

bq mkdef

The mkdef command creates a table definition in JSON format for data stored in Google Cloud Storage or Google Drive.

For more information on using the mkdef command, see Creating a Table Definition File for an External Data Source.

The mkdef command uses the following command-specific flags.

--autodetect

When specified, use schema auto-detection for CSV and JSON data.

--source_format

The format of the source data. Possible values include:

  • CSV
  • NEWLINE_DELIMITED_JSON
  • AVRO
  • DATASTORE_BACKUP
  • GOOGLE_SHEETS

The default value is CSV.
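A minimal sketch, assuming CSV files at a placeholder Cloud Storage path; the resulting definition file can later be passed to bq mk or bq query:

```shell
# Generate a table definition for CSV files in Cloud Storage, letting
# BigQuery detect the schema, and save the JSON definition to a file:
bq mkdef --autodetect --source_format=CSV 'gs://mybucket/data/*.csv' > table_def.json
```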

bq partition

The partition command is used to convert date-named tables (ending in [YYYYmmdd]) into partitioned tables.

For more information on using the partition command, see Converting date-sharded tables into ingestion-time partitioned tables.

The partition command uses the following command-specific flags.

--no_clobber or -n

When specified, do not overwrite an existing partition. The default value is false.

--time_partitioning_expiration

An integer that specifies (in seconds) when a time-based partition should be deleted. The expiration time is creation time plus the expiration value. A negative number indicates no expiration.

--time_partitioning_type

Enables time-based partitioning on a table and sets the partition type. Currently, the only possible value is DAY which generates one partition per day.

bq query

The query command creates a query job that runs the supplied SQL query.

For more information on using the query command, see Running Interactive and Batch Queries.

The query command uses the following command-specific flags.

--allow_large_results

When specified, enables large destination table sizes for legacy SQL queries.

--append_table

When specified, append data to a destination table. The default value is false.

--batch

When specified, run the query in batch mode. The default value is false.

--destination_kms_key

The Cloud KMS key used to encrypt the destination table data.

--destination_schema

The path to a local JSON schema file or a comma-separated list of column definitions in the form [FIELD]:[DATA_TYPE],[FIELD]:[DATA_TYPE]. The default value is ''.

--destination_table

The name of the destination table for writing query results. The default value is ''.

--dry_run

When specified, the query is validated but not run.

--external_table_definition

The table name and schema definition used in an external table query. The schema can be a path to a local JSON schema file or a comma-separated list of column definitions in the form [FIELD]:[DATA_TYPE],[FIELD]:[DATA_TYPE]. The format for supplying the table name and schema is: [TABLE]::[PATH_TO_FILE] or [TABLE]::[SCHEMA]@[SOURCE_FORMAT]=[CLOUD_STORAGE_URI]. Repeat this flag to query multiple tables.

--flatten_results

When specified, flatten nested and repeated fields in the results for legacy SQL queries. The default value is true.

--label

A label to apply to a query job in the form [KEY]:[VALUE]. Repeat this flag to specify multiple labels.

--max_rows or -n

An integer specifying the number of rows to return in the query results. The default value is 100.

--maximum_bytes_billed

An integer that limits the bytes billed for the query. If the query exceeds the limit, it fails without incurring a charge. If not specified, the bytes billed are set to the project default.

--min_completion_ratio

[Experimental] A number between 0 and 1.0 that specifies the minimum fraction of data that must be scanned before a query returns. If not set, the default server value 1.0 is used.

--parameter

Either a JSON file containing a list of query parameters, or a query parameter in the form [NAME]:[TYPE]:[VALUE]. An empty name creates a positional parameter. [TYPE] may be omitted to assume a STRING value in the form: name::value or ::value. NULL produces a null value. Repeat this flag to specify multiple parameters.
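For example, the following sketch binds a named parameter in a standard SQL query against the public bigquery-public-data.samples.shakespeare table; the parameter name and value are illustrative:

```shell
# Run a parameterized standard SQL query; @min_count is bound by --parameter:
bq query \
  --use_legacy_sql=false \
  --parameter='min_count:INT64:100' \
  'SELECT word, word_count
   FROM `bigquery-public-data.samples.shakespeare`
   WHERE word_count >= @min_count
   ORDER BY word_count DESC'
```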

--replace

If specified, overwrite the destination table with the query results. The default value is false.

--require_cache

If specified, run the query only if results can be retrieved from the cache.

--rpc

If specified, use the rpc-style query API instead of the REST API jobs.insert method. The default value is false.

--schema_update_option

When appending data to a table (in a load job or a query job), or when overwriting a table partition, specifies how to update the schema of the destination table. Possible values include:

  • ALLOW_FIELD_ADDITION: Allow new fields to be added
  • ALLOW_FIELD_RELAXATION: Allow relaxing REQUIRED fields to NULLABLE

Repeat this flag to specify multiple schema update options.

--start_row or -s

An integer that specifies the first row to return in the query result. The default value is 0.

--time_partitioning_expiration

An integer that specifies (in seconds) when a time-based partition should be deleted. The expiration time is creation time plus the expiration value. A negative number indicates no expiration.

--time_partitioning_field

The field used to determine how to create a time-based partition. If time-based partitioning is enabled without this value, the table is partitioned based on the load time.

--time_partitioning_type

Enables time-based partitioning on a table and sets the partition type. Currently, the only possible value is DAY which generates one partition per day.

--udf_resource

The Cloud Storage URI or the path to a local code file that is loaded and evaluated immediately as a user-defined function resource used by a SQL query. Repeat this flag to specify multiple files.

--use_cache

When specified, caches the query results. The default value is true.

--use_legacy_sql

When set to false, runs a standard SQL query. The default value is true (uses legacy SQL).
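Combining several of the flags above, a sketch of writing query results to a table (the dataset and table names are placeholders):

```shell
# Write standard SQL query results to a table, replacing existing contents:
bq query \
  --use_legacy_sql=false \
  --destination_table=mydataset.results \
  --replace \
  'SELECT 17 AS answer'
```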

bq rm

The rm command deletes a dataset, table, view, or transfer configuration.

The rm command uses the following command-specific flags.

--dataset or -d

When specified, deletes a dataset. The default value is false.

--force or -f

When specified, deletes a table, view, or dataset without prompting. The default value is false.

--recursive or -r

When specified, deletes a dataset and any tables (and table data) in it. The default value is false.

--table or -t

When specified, deletes a table. The default value is false.

--transfer_config

When specified, deletes a transfer configuration. The default value is false.

bq show

The show command displays information about an object.

The show command uses the following command-specific flags.

--dataset or -d

When specified, displays information about a dataset. The default value is false.

--encryption_service_account

When specified, displays the service account for a user if it exists, or creates one if it doesn't exist. The default value is false.

--job or -j

If specified, show information about a job. The default value is false.

--schema

When specified, displays only the table's schema. The default value is false.

--transfer_config

When specified, displays information about a transfer configuration. The default value is false.

--transfer_run

When specified, displays information about a transfer run. The default value is false.

--view

When specified, displays information about a view. The default value is false.

bq update

The update command updates a dataset, table, view, or transfer configuration.

The update command uses the following command-specific flags.

--clear_label

Removes a label using the format [KEY]:. Repeat this flag to remove multiple labels.

--dataset or -d

Updates a dataset. The default value is false.

--default_table_expiration

An integer that updates the default lifetime, in seconds, for newly-created tables in a dataset. The expiration time is set to the current time plus this value. Specify 0 to remove the existing expiration.

--description

Updates the description of a dataset, table or view.

--display_name

Updates the display name for a transfer configuration. The default value is ''.

--etag

Update the resource only if the etag matches.

--expiration

An integer that updates the expiration time in seconds for a table or view. Specifying 0 removes the expiration time.

--external_table_definition

Updates an external table with the specified table definition. The schema can be a path to a local JSON schema file or a comma-separated list of column definitions in the form [FIELD]:[DATA_TYPE],[FIELD]:[DATA_TYPE]. The format for supplying the table name and schema is: [TABLE]::[PATH_TO_FILE] or [TABLE]::[SCHEMA]@[SOURCE_FORMAT]=[CLOUD_STORAGE_URI].

--params or -p

Updates parameters for a transfer configuration in JSON format: {"[PARAMETER]":"[VALUE]"}. The parameters vary depending on the data source. For more information, see Introduction to BigQuery Data Transfer Service.

--refresh_window_days

An integer that specifies an updated refresh window (in days) for a transfer configuration.

--schema

The path to a local JSON schema file or a comma-separated list of column definitions in the form [FIELD]:[DATA_TYPE],[FIELD]:[DATA_TYPE]. The default value is ''.

--set_label

A label to update in the form [KEY]:[VALUE]. Repeat this flag to update multiple labels.

--source

The path to a local JSON file containing a payload used to update a resource.

--table or -t

When specified, updates a table. The default value is false.

--target_dataset

When specified, updates the target dataset for a transfer configuration. The default value is ''.

--time_partitioning_expiration

An integer that updates (in seconds) when a time-based partition should be deleted. The expiration time is creation time plus the expiration value. A negative number indicates no expiration.

--time_partitioning_field

Updates the field used to determine how to create a time-based partition. If time-based partitioning is enabled without this value, the table is partitioned based on the load time.

--time_partitioning_type

Updates the time-based partitioning type for a table. Currently, the only possible value is DAY which generates one partition per day.

--transfer_config

When specified, updates a transfer configuration. The default value is false.

--update_credentials

When specified, updates the transfer configuration credentials. The default value is false.

--use_legacy_sql

When set to false, updates the SQL query for a view from legacy SQL to standard SQL. The default value is true (uses legacy SQL).

--view

When specified, updates the SQL query for a view. The default value is ''.

--view_udf_resource

Updates the Cloud Storage URI or the path to a local code file that is loaded and evaluated immediately as a user-defined function resource in a view's SQL query. Repeat this flag to specify multiple files.

bq wait

The wait command waits a specified number of seconds for a job to finish.

The wait command uses the --job_id global flag and the following command-specific flags.

[INTEGER]

An integer value greater than or equal to 0 that specifies the wait time, in seconds. This value is not a flag; you specify the integer directly on the command line. If you enter 0, the command polls for job completion and returns immediately. If you do not specify an integer value, the command waits forever.

--fail_on_error

When specified, after the wait time elapses, exit with an error if the job is still running or ended in failure. The default value is true.

--wait_for_status

When specified, wait for a particular job status before exiting. Possible values include:

  • PENDING
  • RUNNING
  • DONE

The default value is DONE.
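Two illustrative invocations; the job ID shown is a placeholder:

```shell
# Wait up to 600 seconds for a job to finish:
bq wait bqjob_r1234_000056789_1 600

# Return as soon as the job reaches the RUNNING state:
bq wait --wait_for_status=RUNNING bqjob_r1234_000056789_1
```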
