Running Interactive and Batch Queries

This document describes how to query data using interactive and batch queries.

Required permissions

Jobs are actions that BigQuery executes on your behalf to load data, export data, query data, or copy data.

When you use the BigQuery web UI or CLI to load, export, query, or copy data, a job resource is automatically created, scheduled, and run. You can also programmatically create a load, export, query, or copy job. When you create a job programmatically, BigQuery schedules and runs the job for you.

Because jobs can potentially take a long time to complete, they execute asynchronously and can be polled for their status. Shorter actions, such as listing resources or getting metadata, are not managed by a job resource.

Running a query job requires bigquery.jobs.create permissions. In order for the query job to complete successfully, the user or group must also have access to the dataset containing the tables referenced by the query.

You can set bigquery.jobs.create permissions at the project level by granting any of the following predefined IAM roles:

  • bigquery.user
  • bigquery.jobUser
  • bigquery.admin

If you grant a user or group the bigquery.user role at the project level, by default, no access is granted to any of the datasets, tables, or views in the project. bigquery.user gives users the ability to create their own datasets and to run query jobs against datasets they have been given access to. If you assign the bigquery.user or bigquery.jobUser role, you must also assign access controls to each dataset the user or group needs to access that wasn't created by the user.

When you assign access to a dataset, the minimum access required for a user to run a query is "Can view".

For more information on IAM roles in BigQuery, see Access Control.

Running interactive queries

By default, BigQuery runs interactive queries, which means that the query is executed as soon as possible. Interactive queries count towards your concurrent rate limit and your daily limit.

Query results are always saved to either a temporary or permanent table. You can choose whether to append or overwrite data in an existing table and whether to create a new table if none exists by that name.
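These append/overwrite/create choices correspond to the job configuration's write and create dispositions. The helper below is an illustrative sketch (the function itself is not part of any BigQuery API); the disposition string values are the actual API constants:

```python
def dispositions(append=False, overwrite=False, create_if_missing=True):
    """Map table-write choices to (writeDisposition, createDisposition).

    The returned strings are the constants used in the job's
    configuration.query.writeDisposition and createDisposition fields.
    """
    if append and overwrite:
        raise ValueError("choose either append or overwrite, not both")
    if append:
        write = "WRITE_APPEND"      # add rows to the existing table
    elif overwrite:
        write = "WRITE_TRUNCATE"    # replace the table's contents
    else:
        write = "WRITE_EMPTY"       # fail if the table already has data
    create = "CREATE_IF_NEEDED" if create_if_missing else "CREATE_NEVER"
    return write, create
```

For example, appending to a table that must already exist maps to `("WRITE_APPEND", "CREATE_NEVER")`.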

To run an interactive query that writes to a temporary table:

Web UI

  1. Go to the BigQuery web UI.

  2. Click Compose query.

  3. Enter a valid BigQuery SQL query in the New Query text area.

  4. Click Show Options.

  5. For Processing Location, click Unspecified and choose your data's location. You can leave the processing location set to Unspecified if your data is in the US or EU multi-region location, because BigQuery automatically detects the processing location when the data is in the US or the EU.

  6. Click Run query.

This creates a query job that writes the output to a temporary table.

Command-line

Enter the bq query command and include your query text. Supply the --location flag and set the value to your location.

You can specify the following optional flags. This list includes some of the most common flags. For a complete list of query command flags, see bq query in the bq command-line tool reference.

Specify the:

  • --destination_table flag to create a permanent table based on the query results. To write the query results to a table that is not in your default project, add the project ID to the dataset name in the following format: [PROJECT_ID]:[DATASET]. If --destination_table is unspecified, a query job is generated that writes the output to a temporary (cache) table.
  • --append_table flag to append the query results to a destination table.
  • --destination_kms_key flag to use a Cloud KMS key to encrypt the destination table data.
  • --use_legacy_sql=false flag to use standard SQL syntax. You can set a default syntax for the command-line tool using the .bigqueryrc file.
  • --label flag to apply a label to the query job in the form [KEY]:[VALUE]. Repeat this flag to specify multiple labels.
  • --max_rows or -n flag to specify the number of rows to return in the query results.
  • --maximum_bytes_billed flag to limit the bytes billed for the query. If the query goes beyond the limit, it fails (without incurring a charge). If not specified, the bytes billed is set to the project default.
  • --udf_resource flag to load and evaluate a code file to be used as a user-defined function resource. You can specify a Cloud Storage URI or the path to a local code file. Repeat this flag to specify multiple files.
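The flags above can be combined into a single bq invocation. The helper below is a hypothetical sketch (it is not part of the bq tool) that assembles such an argument list; the flag names are the real bq query flags listed above:

```python
def build_bq_query_args(query, location=None, destination_table=None,
                        append=False, batch=False, labels=None,
                        maximum_bytes_billed=None, use_legacy_sql=False):
    """Assemble an argument list for a `bq query` invocation."""
    args = ["bq"]
    # Global flags such as --location go before the `query` command.
    if location:
        args.append("--location=%s" % location)
    args += ["query", "--use_legacy_sql=%s" % str(use_legacy_sql).lower()]
    if batch:
        args.append("--batch")
    if destination_table:
        args += ["--destination_table", destination_table]
    if append:
        args.append("--append_table")
    # --label is repeated once per [KEY]:[VALUE] pair.
    for key, value in (labels or {}).items():
        args += ["--label", "%s:%s" % (key, value)]
    if maximum_bytes_billed is not None:
        args += ["--maximum_bytes_billed", str(maximum_bytes_billed)]
    args.append(query)
    return args
```

Passing the resulting list to, say, subprocess.run would execute the query with the chosen flags.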

Enter the following command to run an interactive query using standard SQL syntax:

bq --location=[LOCATION] query --use_legacy_sql=false '[QUERY]'

Where:

  • [LOCATION] is the name of the location where the query is processed. The --location flag is optional if your data is in the US or the EU multi-region location. For example, if you are using BigQuery in the Tokyo region, set the flag's value to asia-northeast1. You can set a default value for the location using the .bigqueryrc file.
  • [QUERY] is a query in standard SQL syntax.

Examples:

Enter the following command to write interactive query results to a destination table named mytable in mydataset. The dataset is in your default project. The query retrieves data from the USA Name Data public dataset.

bq --location=US query --destination_table mydataset.mytable --use_legacy_sql=false 'SELECT name,number FROM `bigquery-public-data.usa_names.usa_1910_current` WHERE gender = "M" ORDER BY number DESC'

Enter the following command to write interactive query results to a destination table named mytable in mydataset. The dataset is in myotherproject, not your default project. The query retrieves data from the USA Name Data public dataset, a non-partitioned table.

bq --location=US query --destination_table myotherproject:mydataset.mytable --use_legacy_sql=false 'SELECT name,number FROM `bigquery-public-data.usa_names.usa_1910_current` WHERE gender = "M" ORDER BY number DESC'

For more information, see Using the bq command-line tool.

API

To run a query using the API, insert a new job and populate the jobs#configuration.query property. Specify your location in the location property in the jobReference section of the job resource.

Poll for results by calling getQueryResults. Poll until jobComplete equals true. Check for errors and warnings in the errors list.
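The polling loop can be sketched as follows. Here get_query_results is a stand-in callable for the jobs.getQueryResults API call (it is not a client-library function), and the response shape follows the REST response fields named above:

```python
import time


def poll_query_results(get_query_results, job_id, interval=1.0, timeout=60.0):
    """Poll a getQueryResults-style call until jobComplete is true.

    `get_query_results` is any callable taking a job ID and returning a
    dict shaped like the jobs.getQueryResults REST response.
    """
    deadline = time.monotonic() + timeout
    while True:
        response = get_query_results(job_id)
        if response.get("jobComplete"):
            # Surface any errors reported by the job before returning rows.
            errors = response.get("errors")
            if errors:
                raise RuntimeError("query failed: %r" % errors)
            return response.get("rows", [])
        if time.monotonic() >= deadline:
            raise TimeoutError("job %s did not complete in time" % job_id)
        time.sleep(interval)
```

A production client would also page through results and back off between polls; this sketch only shows the jobComplete loop.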

C#

For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.

public BigQueryResults AsyncQuery(string query, BigQueryClient client)
{
    BigQueryJob job = client.CreateQueryJob(query,
        parameters: null,
        options: new QueryOptions { UseQueryCache = false });

    // Wait for the job to complete.
    job.PollUntilCompleted();

    // Then we can fetch the results, either via the job or by accessing
    // the destination table.
    return client.GetQueryResults(job.Reference.JobId);
}

Go

For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.

q := client.Query(
	"SELECT name FROM `bigquery-public-data.usa_names.usa_1910_2013` " +
		"WHERE state = \"TX\" " +
		"LIMIT 100")
// Location must match that of the dataset(s) referenced in the query.
q.Location = "US"

job, err := q.Run(ctx)
if err != nil {
	return err
}

// Wait until async querying is done.
status, err := job.Wait(ctx)
if err != nil {
	return err
}
if err := status.Err(); err != nil {
	return err
}

it, err := job.Read(ctx)
if err != nil {
	return err
}
for {
	var row []bigquery.Value
	err := it.Next(&row)
	if err == iterator.Done {
		break
	}
	if err != nil {
		return err
	}
	fmt.Println(row)
}

Java

For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.

// BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
String query = "SELECT corpus FROM `bigquery-public-data.samples.shakespeare` GROUP BY corpus;";
QueryJobConfiguration queryConfig = QueryJobConfiguration.newBuilder(query).build();

// Print the results.
for (FieldValueList row : bigquery.query(queryConfig).iterateAll()) {
  for (FieldValue val : row) {
    System.out.printf("%s,", val.toString());
  }
  System.out.printf("\n");
}

Node.js

For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.

// Imports the Google Cloud client library
const BigQuery = require('@google-cloud/bigquery');

/**
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// const projectId = "your-project-id";
// const sqlQuery = "SELECT * FROM publicdata.samples.natality LIMIT 5;";

// Creates a client
const bigquery = new BigQuery({
  projectId: projectId,
});

// Query options list: https://cloud.google.com/bigquery/docs/reference/v2/jobs/query
const options = {
  query: sqlQuery,
  useLegacySql: false, // Use standard SQL syntax for queries.
};

let job;

// Runs the query as a job
bigquery
  .createQueryJob(options)
  .then(results => {
    job = results[0];
    console.log(`Job ${job.id} started.`);
    return job.promise();
  })
  .then(() => {
    // Get the job's status
    return job.getMetadata();
  })
  .then(metadata => {
    // Check the job's status for errors
    const errors = metadata[0].status.errors;
    if (errors && errors.length > 0) {
      throw errors;
    }
  })
  .then(() => {
    console.log(`Job ${job.id} completed.`);
    return job.getQueryResults();
  })
  .then(results => {
    const rows = results[0];
    console.log('Rows:');
    rows.forEach(row => console.log(row));
  })
  .catch(err => {
    console.error('ERROR:', err);
  });

PHP

For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.

use Google\Cloud\BigQuery\BigQueryClient;
use Google\Cloud\Core\ExponentialBackoff;

/**
 * Run a BigQuery query as a job.
 * Example:
 * ```
 * $query = 'SELECT TOP(corpus, 10) as title, COUNT(*) as unique_words ' .
 *          'FROM `bigquery-public-data.samples.shakespeare`';
 * run_query_as_job($projectId, $query, false);
 * ```.
 *
 * @param string $projectId The Google project ID.
 * @param string $query     A SQL query to run.
 * @param bool $useLegacySql Specifies whether to use BigQuery's legacy SQL
 *        syntax or standard SQL syntax for this query.
 */
function run_query_as_job($projectId, $query, $useLegacySql)
{
    $bigQuery = new BigQueryClient([
        'projectId' => $projectId,
    ]);
    $jobConfig = $bigQuery->query($query)->useLegacySql($useLegacySql);
    $job = $bigQuery->startQuery($jobConfig);

    $backoff = new ExponentialBackoff(10);
    $backoff->execute(function () use ($job) {
        print('Waiting for job to complete' . PHP_EOL);
        $job->reload();
        if (!$job->isComplete()) {
            throw new Exception('Job has not yet completed', 500);
        }
    });
    $queryResults = $job->queryResults();

    $i = 0;
    foreach ($queryResults as $row) {
        printf('--- Row %s ---' . PHP_EOL, ++$i);
        foreach ($row as $column => $value) {
            printf('%s: %s' . PHP_EOL, $column, json_encode($value));
        }
    }
    printf('Found %s row(s)' . PHP_EOL, $i);
}

Python

For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.

# from google.cloud import bigquery
# client = bigquery.Client()

query = (
    'SELECT name FROM `bigquery-public-data.usa_names.usa_1910_2013` '
    'WHERE state = "TX" '
    'LIMIT 100')
query_job = client.query(
    query,
    # Location must match that of the dataset(s) referenced in the query.
    location='US')  # API request - starts the query

for row in query_job:  # API request - fetches results
    # Row values can be accessed by field name or index
    assert row[0] == row.name == row['name']
    print(row)

Ruby

For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.

# project_id   = "your google cloud project id"
# query_string = "query string to execute (using bigquery query syntax)"

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new project: project_id

puts "Running query"
query_job = bigquery.query_job query_string

puts "Waiting for query to complete"
query_job.wait_until_done!

puts "Query results:"
query_job.query_results.each do |row|
  puts row.inspect
end

Running batch queries

BigQuery also offers batch queries. BigQuery queues each batch query on your behalf, and starts the query as soon as idle resources are available, usually within a few minutes. If BigQuery hasn't started the query within 24 hours, BigQuery changes the job priority to interactive. Batch queries don't count towards your concurrent rate limit, which can make it easier to start many queries at once.

To run a batch query:

Web UI

  1. Go to the BigQuery web UI.

  2. Click the Compose query button.

  3. Enter a valid BigQuery SQL query in the New Query text area.

  4. Click the Show Options button.

  5. Select the Batch option in the Query Priority section.

  6. For Processing Location, click Unspecified and choose your data's location. You can leave the processing location set to Unspecified if your data is in the US or EU multi-region location, because BigQuery automatically detects the processing location when the data is in the US or the EU.

  7. Click the Run query button.

Command-line

Enter the bq query command and include your query text. Specify the --batch flag to run a batch query. Supply the --location flag and set the value to your location.

You can specify the following optional flags. This list includes some of the most common flags. For a complete list of query command flags, see bq query in the bq command-line tool reference.

Specify the:

  • --destination_table flag to create a permanent table based on the query results. To write the query results to a table that is not in your default project, add the project ID to the dataset name in the following format: [PROJECT_ID]:[DATASET]. If --destination_table is unspecified, a query job is generated that writes the output to a temporary (cache) table.
  • --append_table flag to append the query results to a destination table.
  • --destination_kms_key flag to use a Cloud KMS key to encrypt the destination table data.
  • --use_legacy_sql=false flag to use standard SQL syntax. You can set a default syntax for the command-line tool using the .bigqueryrc file.
  • --label flag to apply a label to the query job in the form [KEY]:[VALUE]. Repeat this flag to specify multiple labels.
  • --max_rows or -n flag to specify the number of rows to return in the query results.
  • --maximum_bytes_billed flag to limit the bytes billed for the query. If the query goes beyond the limit, it fails (without incurring a charge). If not specified, the bytes billed is set to the project default.
  • --udf_resource flag to load and evaluate a code file to be used as a user-defined function resource. You can specify a Cloud Storage URI or the path to a local code file. Repeat this flag to specify multiple files.

Enter the following command to run a batch query using standard SQL syntax:

bq --location=[LOCATION] query --batch --use_legacy_sql=false '[QUERY]'

Where:

  • [LOCATION] is the name of the location where the query is processed. The --location flag is optional if your data is in the US or the EU multi-region location. For example, if you are using BigQuery in the Tokyo region, set the flag's value to asia-northeast1. You can set a default value for the location using the .bigqueryrc file.
  • [QUERY] is a query in standard SQL syntax.

Examples:

Enter the following command to write batch query results to a destination table named mytable in mydataset. The dataset is in your default project. The query retrieves data from the USA Name Data public dataset.

bq --location=US query --batch --destination_table mydataset.mytable --use_legacy_sql=false 'SELECT name,number FROM `bigquery-public-data.usa_names.usa_1910_current` WHERE gender = "M" ORDER BY number DESC'

Enter the following command to write batch query results to a destination table named mytable in mydataset. The dataset is in myotherproject, not your default project. The query retrieves data from the USA Name Data public dataset, a non-partitioned table.

bq --location=US query --batch --destination_table myotherproject:mydataset.mytable --use_legacy_sql=false 'SELECT name,number FROM `bigquery-public-data.usa_names.usa_1910_current` WHERE gender = "M" ORDER BY number DESC'

For more information, see Using the bq command-line tool.

API

To run a query using the API, insert a new job and populate the jobs#configuration.query property. Specify your location in the location property in the jobReference section of the job resource.

When you populate the query job properties, include the configuration.query.priority property, with the value set to BATCH.
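As a sketch, a jobs.insert request body with batch priority might look like the following (a Python dict standing in for the JSON payload; the project ID is a placeholder you would replace with your own):

```python
# Minimal jobs.insert request body for a batch-priority query job.
job_body = {
    "jobReference": {
        "projectId": "my-project",  # placeholder: your project ID
        "location": "US",           # must match the queried data's location
    },
    "configuration": {
        "query": {
            "query": (
                "SELECT corpus "
                "FROM `bigquery-public-data.samples.shakespeare` "
                "GROUP BY corpus;"
            ),
            "useLegacySql": False,
            # Queued until idle resources are available; doesn't count
            # toward the concurrent rate limit.
            "priority": "BATCH",
        }
    },
}
```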

Go

For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.

	// Build an aggregate table.
	q := client.Query(`
		SELECT
  			corpus,
  			SUM(word_count) as total_words,
  			COUNT(1) as unique_words
		FROM ` + "`bigquery-public-data.samples.shakespeare`" + `
		GROUP BY corpus;`)
	q.Priority = bigquery.BatchPriority
	q.QueryConfig.Dst = client.Dataset(dstDatasetID).Table(dstTableID)

	// Start the job.
	job, err := q.Run(ctx)
	if err != nil {
		return err
	}
	// Job is started and will progress without interaction.
	// To simulate other work being done, sleep a few seconds.
	time.Sleep(5 * time.Second)
	status, err := job.Status(ctx)
	if err != nil {
		return err
	}

	state := "Unknown"
	switch status.State {
	case bigquery.Pending:
		state = "Pending"
	case bigquery.Running:
		state = "Running"
	case bigquery.Done:
		state = "Done"
	}
	// You can continue to monitor job progress until it reaches
	// the Done state by polling periodically.  In this example,
	// we print the latest status.
	fmt.Printf("Job %s in Location %s currently in state: %s\n", job.ID(), job.Location(), state)

Java

For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.

To run a batch query, set the query priority to QueryJobConfiguration.Priority.BATCH when creating a QueryJobConfiguration.

// BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
String query = "SELECT corpus FROM `bigquery-public-data.samples.shakespeare` GROUP BY corpus;";
QueryJobConfiguration queryConfig =
    QueryJobConfiguration.newBuilder(query)
        // Run at batch priority, which won't count toward concurrent rate
        // limit.
        .setPriority(QueryJobConfiguration.Priority.BATCH)
        .build();

// Location must match that of the dataset(s) referenced in the query.
JobId jobId = JobId.newBuilder().setRandomJob().setLocation("US").build();
String jobIdString = jobId.getJob();

// API request - starts the query.
bigquery.create(JobInfo.newBuilder(queryConfig).setJobId(jobId).build());

// Check on the progress by getting the job's updated state. Once the state
// is `DONE`, the results are ready.
Job queryJob = bigquery.getJob(
    JobId.newBuilder().setJob(jobIdString).setLocation("US").build());
System.out.printf(
    "Job %s in location %s currently in state: %s%n",
    queryJob.getJobId().getJob(),
    queryJob.getJobId().getLocation(),
    queryJob.getStatus().getState().toString());

Python

For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.

# from google.cloud import bigquery
# client = bigquery.Client()

job_config = bigquery.QueryJobConfig()
# Run at batch priority, which won't count toward concurrent rate limit.
job_config.priority = bigquery.QueryPriority.BATCH
sql = """
    SELECT corpus
    FROM `bigquery-public-data.samples.shakespeare`
    GROUP BY corpus;
"""
# Location must match that of the dataset(s) referenced in the query.
location = 'US'

# API request - starts the query
query_job = client.query(sql, location=location, job_config=job_config)

# Check on the progress by getting the job's updated state. Once the state
# is `DONE`, the results are ready.
query_job = client.get_job(
    query_job.job_id, location=location)  # API request - fetches job
print('Job {} is currently in state {}'.format(
    query_job.job_id, query_job.state))
