Managing Jobs, Datasets, and Projects

This document describes how to manage jobs, datasets, and projects.


Jobs

Jobs are used to start all potentially long-running actions, such as queries, table imports, and export requests. Shorter actions, such as list or get requests, are not managed by a job resource.

To perform a job-managed action, create a job of the appropriate type, then periodically request the job resource and examine its status property to learn when the job is complete, and finally check whether it finished successfully. Some wrapper functions manage the status requests for you: for example, jobs.query creates the job and polls for DONE status for a specified period of time.
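As a rough illustration of the wrapper behavior, the sketch below builds a jobs.query request body and interprets the response. It assumes the request carries a timeoutMs field and the response carries jobComplete, jobReference, and rows fields; the helper name summarize_query_response is hypothetical, and the response is treated as a plain dict rather than a client library object.

```python
def summarize_query_response(response):
    """Interpret a jobs.query-style response given as a plain dict.

    If the job did not finish within the timeout, the caller is expected
    to keep polling using the returned job ID.
    """
    job_id = response.get("jobReference", {}).get("jobId")
    if response.get("jobComplete"):
        return {"complete": True, "job_id": job_id,
                "rows": response.get("rows", [])}
    return {"complete": False, "job_id": job_id, "rows": []}


# Hypothetical request body: wait up to 10 seconds for the query to finish.
request_body = {"query": "SELECT 1", "timeoutMs": 10000}
```

When complete is False, the query is still running as an ordinary job and can be polled like any other job.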

Jobs in BigQuery persist forever. This includes jobs that are running or completed, whether they have succeeded or failed. You can only list or get information about jobs that you have started, unless you are a project owner, who can perform all actions on any jobs associated with their project.

Every job is associated with a specific project that you specify; this project is billed for any usage incurred by the job. In order to run a job of any kind, you must have READ permissions on the job's project.

Here is how to run a standard job:

  1. Start the job by calling the jobs.insert method using a unique job ID generated by your client code. The server will generate a job ID for you if you omit it, but we recommend generating it on the client side to allow reliable retry of the jobs.insert call.
  2. Check the job status by calling jobs.get with the job ID and reading the status.state value. When status.state=DONE, the job has stopped running; however, a DONE status does not mean that the job completed successfully, only that it is no longer running.
  3. Check for job success. If the job has a status.errorResult property, the job has failed; this property holds information describing what went wrong in a failed job. If status.errorResult is absent, the job finished successfully, although there might have been some non-fatal errors, such as problems importing a few rows in an import request. Non-fatal errors are listed in the returned job's status.errors list.

See the asynchronous query as an example of starting and polling a job.
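The three steps above can be sketched as a small polling loop. This is a minimal sketch, not the client library's implementation: get_job is a hypothetical callable standing in for a jobs.get call that returns the job resource as a dict, and the poll interval and poll limit are arbitrary choices.

```python
import time


def wait_for_job(get_job, job_id, poll_interval=1.0, max_polls=60,
                 sleep=time.sleep):
    """Poll until the job reaches DONE, then check whether it succeeded.

    get_job is a hypothetical callable wrapping jobs.get; it takes a job ID
    and returns the job resource as a dict.
    Returns (job, non_fatal_errors) on success.
    """
    for _ in range(max_polls):
        job = get_job(job_id)
        status = job.get("status", {})
        if status.get("state") == "DONE":
            # DONE only means the job stopped; errorResult signals failure.
            error = status.get("errorResult")
            if error:
                raise RuntimeError("Job %s failed: %s" % (job_id, error))
            # Non-fatal problems, if any, are listed in status.errors.
            return job, status.get("errors", [])
        sleep(poll_interval)
    raise TimeoutError("Job %s was not DONE after %d polls"
                       % (job_id, max_polls))
```

Passing sleep as a parameter keeps the loop easy to exercise in tests without real delays.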

There is no single-call method to re-run a job; if you want to re-run a specific job:

  1. Call jobs.get to retrieve the resource for the job to re-run.
  2. Remove the id, status, and statistics fields. Change the jobId field to a new value generated by your client code. Change any other fields as necessary.
  3. Call jobs.insert with the modified resource and the new job ID to start the new job.
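The resource edits in step 2 can be sketched as follows. The sketch assumes the job resource is a plain dict and that the job ID lives under jobReference.jobId; prepare_rerun is a hypothetical helper name, and the "rerun_" prefix is only an illustration.

```python
import copy
import uuid


def prepare_rerun(job, new_job_id=None):
    """Return a copy of a job resource that is ready to pass to jobs.insert.

    Assumes the job ID is stored under jobReference.jobId.
    """
    new_job = copy.deepcopy(job)  # leave the caller's resource untouched
    for field in ("id", "status", "statistics"):
        new_job.pop(field, None)
    if new_job_id is None:
        new_job_id = "rerun_%s" % uuid.uuid4().hex
    new_job.setdefault("jobReference", {})["jobId"] = new_job_id
    return new_job
```

The configuration section is copied unchanged, so the new job repeats the original work unless you edit it before calling jobs.insert.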

Running or pending jobs can be cancelled by calling jobs.cancel. Cancelling a running query job may still incur charges, up to the full cost the query would have incurred had it run to completion.

See jobs in the reference section for more information.

Generating a job ID

As a best practice, you should generate a job ID using your client code and send that job ID when you call jobs.insert. If you call jobs.insert without specifying a job ID, BigQuery will create a job ID for you, but you will not be able to check the status of that job until the call returns. Moreover, it may be difficult to tell whether the job was successfully inserted or not. If you use your own job ID, you can check the status of the job at any time and you can retry on the same job ID to ensure that the job starts exactly one time.

The job ID is a string comprising letters (a-z, A-Z), numbers (0-9), underscores (_), or dashes (-), with a maximum length of 1,024 characters. Job IDs must be unique within any given project.

A common approach to generating a unique job ID is to use a human-readable prefix and a suffix consisting of a timestamp or a GUID. For example: "daily_import_job_1447971251". An example of a method that generates GUIDs can be found in the Python UUID module. For an example of using the Python uuid4() method with jobs.insert, see the example code in Loading data from Google Cloud Storage.
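A minimal sketch of the prefix-plus-GUID approach, using the Python uuid module, is shown below. The helper names make_job_id and is_valid_job_id are hypothetical; the validation encodes only the character set and length limit described above.

```python
import re
import uuid

# Letters, digits, underscores, or dashes; at most 1,024 characters.
_JOB_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,1024}")


def is_valid_job_id(job_id):
    """Return True if job_id uses only allowed characters and fits the limit."""
    return _JOB_ID_PATTERN.fullmatch(job_id) is not None


def make_job_id(prefix):
    """Build a job ID from a human-readable prefix and a random GUID suffix."""
    job_id = "%s_%s" % (prefix, uuid.uuid4().hex)
    if not is_valid_job_id(job_id):
        raise ValueError("Invalid job ID: %r" % job_id)
    return job_id
```

For example, make_job_id("daily_import_job") returns the prefix followed by an underscore and 32 hexadecimal characters. Keep the generated ID and reuse it when retrying jobs.insert so that the job starts only once.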


Datasets

A dataset is a grouping mechanism that holds zero or more tables. Datasets are the lowest-level unit of access control; you cannot control access at the table level. Read more about datasets in the reference section. A dataset is contained within a specific project. You can list datasets to which you have access by calling bigquery.datasets.list.

Datasets have no hard limit on how many tables they can contain. If a dataset has tens or hundreds of thousands of tables, it will become slower to enumerate them, whether through an API call, the web UI, or querying __TABLES_SUMMARY__.

Choosing a location

You can optionally choose the geographic location for your dataset when the dataset is created. All tables within the dataset inherit the same location value. Possible options include:

  • "US": United States
  • "EU": European Union

For legal information about the location feature, see the Google Cloud Platform Service Specific Terms.

Location limitations

  • You can only set the geographic location at creation time. After a dataset has been created, the location becomes immutable and can't be changed by the patch or update methods.
  • All tables referenced in a query must be stored in the same location.
  • You can stream data into a US or EU dataset, but inserting data across these locations can increase latency and error rates.
  • When copying a table, the destination dataset must reside in the same location.
  • Google Cloud Logging is unsupported for EU datasets.
  • Google Analytics Premium customers who export their data to BigQuery must use a US-based BigQuery dataset as the destination.
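Because queries and table copies require every dataset involved to share one location, a client-side check before submitting a job can catch mistakes early. The sketch below is illustrative only: check_same_location is a hypothetical helper, and dataset_locations is an assumed mapping from dataset ID to location that your code would build from the dataset resources.

```python
def check_same_location(dataset_locations, dataset_ids):
    """Return the shared location of the given datasets, or raise ValueError.

    dataset_locations is a hypothetical mapping of dataset ID to location,
    for example {"sales": "US", "logs_eu": "EU"}.
    """
    locations = {dataset_locations[dataset_id] for dataset_id in dataset_ids}
    if len(locations) > 1:
        raise ValueError("Datasets span multiple locations: %s"
                         % ", ".join(sorted(locations)))
    return locations.pop() if locations else None
```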

Setting the location

To set the dataset location:

BigQuery web UI

When creating a dataset, select the location from the Data location dropdown.

BigQuery command-line tool

Use the --data_location=<location> flag.

BigQuery API

Set the location property.
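For the API case, a minimal sketch of a dataset resource with the location property set might look like the following. It assumes the resource is sent as the request body of a datasets.insert call and that the dataset is identified by a datasetReference object; dataset_body is a hypothetical helper name.

```python
def dataset_body(project_id, dataset_id, location=None):
    """Build a dataset resource dict, optionally with a location.

    The location can only be set when the dataset is created.
    """
    body = {
        "datasetReference": {
            "projectId": project_id,
            "datasetId": dataset_id,
        }
    }
    if location is not None:
        body["location"] = location  # for example "US" or "EU"
    return body
```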



This sample uses the Google APIs Client Library for Java.

public static void listDatasets(Bigquery bigquery, String projectId) throws IOException {
  Datasets.List datasetRequest = bigquery.datasets().list(projectId);
  DatasetList datasetList = datasetRequest.execute();

  // getDatasets() returns null when the project has no datasets.
  if (datasetList.getDatasets() != null) {
    List<DatasetList.Datasets> datasets = datasetList.getDatasets();
    System.out.println("Dataset list:");

    for (DatasetList.Datasets dataset : datasets) {
      System.out.format("%s\n", dataset.getDatasetReference().getDatasetId());
    }
  }
}


This sample uses the Google Cloud Client Library for Python.

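from google.cloud import bigquery


def list_datasets(project=None):
    """Lists all datasets in a given project.

    If no project is specified, then the currently active project is used.
    """
    bigquery_client = bigquery.Client(project=project)

    datasets = []
    page_token = None

    # Assumes the library version in which list_datasets returns a
    # (results, next_page_token) tuple and datasets expose a name attribute.
    while True:
        results, page_token = bigquery_client.list_datasets(
            page_token=page_token)
        datasets.extend(results)

        if not page_token:
            break

    for dataset in datasets:
        print(dataset.name)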


This sample uses the Google APIs Client Library for C#.

public IEnumerable<DatasetList.DatasetsData> ListDatasets(
    string projectId)
{
    BigqueryService bigquery = CreateAuthorizedClient();
    var datasetRequest =
        new DatasetsResource.ListRequest(bigquery, projectId);
    // Sometimes Datasets will be null instead of an empty list.
    // It's easy to forget that and dereference null.  So, catch
    // that case and return an empty list.
    return datasetRequest.Execute().Datasets ??
        new DatasetList.DatasetsData[] { };
}


This sample uses the Google Cloud Client Library for PHP.

use Google\Cloud\ServiceBuilder;

/**
 * @param string $projectId The Google project ID.
 */
function list_datasets($projectId)
{
    $builder = new ServiceBuilder([
        'projectId' => $projectId,
    ]);
    $bigQuery = $builder->bigQuery();
    $datasets = $bigQuery->datasets();
    foreach ($datasets as $dataset) {
        print($dataset->id() . PHP_EOL);
    }
}


This sample uses the Google Cloud Client Library for Node.js.

// Assumption: BigQuery is the @google-cloud/bigquery module.
var BigQuery = require('@google-cloud/bigquery');

function listDatasets (projectId, callback) {
  var bigquery = BigQuery({
    projectId: projectId
  });

  // See
  bigquery.getDatasets(function (err, datasets) {
    if (err) {
      return callback(err);
    }

    console.log('Found %d dataset(s)!', datasets.length);
    return callback(null, datasets);
  });
}


Projects

A project holds a group of datasets. Projects are created and managed in the APIs console. Jobs are billed to the project to which they are assigned. You can list projects to which you have access by calling bigquery.projects.list.

See projects in the reference section and Managing Projects in the APIs Console help for more information.



This sample uses the Google APIs Client Library for Java.

public static void listProjects(Bigquery bigquery) throws IOException {
  Bigquery.Projects.List projectListRequest = bigquery.projects().list();
  ProjectList projectList = projectListRequest.execute();

  // getProjects() returns null when no projects are accessible.
  if (projectList.getProjects() != null) {
    List<ProjectList.Projects> projects = projectList.getProjects();
    System.out.println("Project list:");

    for (ProjectList.Projects project : projects) {
      System.out.format("%s\n", project.getFriendlyName());
    }
  }
}


This sample uses the Google APIs Client Library for Python.

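import pprint

from googleapiclient.errors import HttpError


def list_projects(bigquery):
    """Prints the projects visible to an authorized BigQuery service object."""
    try:
        projects = bigquery.projects()
        list_reply = projects.list().execute()

        print('Project list:')
        pprint.pprint(list_reply)

    except HttpError as err:
        print('Error in list_projects: %s' % err.content)
        raise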


This sample uses the Google APIs Client Library for .NET.

public IEnumerable<ProjectList.ProjectsData> ListProjects()
{
    BigqueryService bigquery = CreateAuthorizedClient();
    var projectRequest = new ProjectsResource.ListRequest(bigquery);
    // Sometimes Projects will be null instead of an empty list.
    // It's easy to forget that and dereference null.  So, catch
    // that case and return an empty list.
    return projectRequest.Execute().Projects ??
        new ProjectList.ProjectsData[] { };
}


This sample uses the Google Cloud Client Library for PHP.

use Google\Auth\CredentialsLoader;
use Google\Cloud\BigQuery\BigQueryClient;
use Google\Cloud\BigQuery\Connection\Rest;

function list_projects()
{
    $keyFile = CredentialsLoader::fromWellKnownFile();
    $scopes = BigQueryClient::SCOPE;
    $connection = new Rest([
        'scopes' => $scopes,
        'keyFile' => $keyFile,
    ]);
    $result = $connection->send('projects', 'list');
    foreach ($result['projects'] as $project) {
        print($project['id'] . PHP_EOL);
    }
}


This sample uses the Google Cloud Client Library for Node.js.

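var Resource = require('@google-cloud/resource');

function listProjects (callback) {
  // Assumption: the client comes from the @google-cloud/resource module.
  var resource = Resource();

  // See
  resource.getProjects(function (err, projects) {
    if (err) {
      return callback(err);
    }
    console.log('Found %d project(s)!', projects.length);
    return callback(null, projects);
  });
}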
