Running jobs programmatically

To run a BigQuery job programmatically using the REST API or client libraries, you:

  1. Call the jobs.insert method using a unique job ID that is generated by your client code
  2. Periodically request the job resource and examine the status property to learn when the job is complete
  3. Check to see whether the job finished successfully

Required permissions

At a minimum, to run BigQuery jobs, you must be granted the bigquery.jobs.create permission. The following predefined IAM roles include the bigquery.jobs.create permission:

  • bigquery.user
  • bigquery.jobUser
  • bigquery.admin

For more information on IAM roles and permissions in BigQuery, see Predefined roles and permissions.
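
One way to check this permission programmatically is a small sketch like the following, which assumes the google-api-python-client and google-auth packages and uses the Cloud Resource Manager testIamPermissions method; the project ID is whatever your default credentials resolve to:

# Sketch: check whether the caller has bigquery.jobs.create on the current project.
import google.auth
from googleapiclient import discovery

credentials, project_id = google.auth.default()
crm = discovery.build("cloudresourcemanager", "v1", credentials=credentials)

response = crm.projects().testIamPermissions(
    resource=project_id,
    body={"permissions": ["bigquery.jobs.create"]},
).execute()

if "bigquery.jobs.create" in response.get("permissions", []):
    print("bigquery.jobs.create is granted on project {}".format(project_id))
else:
    print("bigquery.jobs.create is missing; ask an admin for a role such as roles/bigquery.jobUser")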

Running jobs

To run a job programmatically:

  1. Start the job by calling the jobs.insert method. When you call the jobs.insert method, include a job resource representation that contains:

    • Your location in the location property in the jobReference section.
    • The job ID generated by your client code. The server generates a job ID for you if you omit it, but it is a best practice to generate the job ID on the client side to allow reliable retry of the jobs.insert call.

      For example:

      {
        "jobReference": {
          "projectId": "my_project",
          "jobId": "job_123",
          "location": "asia-northeast1"
        },
        "configuration": {
          // ..
        }
      }
      

  2. In the configuration section of the job resource, include a child property that specifies the job type — load, query, extract, or copy.

  3. After calling the jobs.insert method, poll the job by calling jobs.get with the job ID and location, and examine the status.state value to learn the state of the job. When status.state is DONE, the job has stopped running; however, a DONE status does not mean that the job completed successfully, only that it is no longer running.

  4. Check for job success. If the status.errorResult property is present, the job has failed, and status.errorResult holds information describing what went wrong. If status.errorResult is absent, the job finished successfully, although there might have been some non-fatal errors, such as problems importing a few rows in a load job. Non-fatal errors are returned in the job's status.errors list. A Python sketch of this complete insert-and-poll flow follows this list.
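
The following is a minimal Python sketch of this flow, assuming the google-api-python-client and google-auth packages; the query, location, and job ID prefix are placeholders chosen for illustration:

# Sketch of the jobs.insert / jobs.get flow described in the steps above.
import time
import uuid

import google.auth
from googleapiclient import discovery

credentials, project_id = google.auth.default()
bigquery = discovery.build("bigquery", "v2", credentials=credentials)

# Step 1: generate the job ID on the client side so the insert can be retried safely.
job_id = "example_job_{}".format(uuid.uuid4())
job_body = {
    "jobReference": {"projectId": project_id, "jobId": job_id, "location": "US"},
    # Step 2: the child property of "configuration" sets the job type (here, a query job).
    "configuration": {
        "query": {"query": "SELECT 1", "useLegacySql": False}
    },
}
bigquery.jobs().insert(projectId=project_id, body=job_body).execute()

# Step 3: poll jobs.get until status.state is DONE.
while True:
    job = bigquery.jobs().get(
        projectId=project_id, jobId=job_id, location="US"
    ).execute()
    if job["status"]["state"] == "DONE":
        break
    time.sleep(1)

# Step 4: DONE only means the job stopped; errorResult tells you whether it succeeded.
if "errorResult" in job["status"]:
    print("Job failed:", job["status"]["errorResult"])
else:
    print("Job succeeded; non-fatal errors:", job["status"].get("errors", []))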

Running jobs using client libraries

To create and run a job using the Cloud Client Libraries for BigQuery:

C#

Before trying this sample, follow the C# setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery C# API reference documentation.


using Google.Cloud.BigQuery.V2;
using System;
using System.Collections.Generic;

public class BigQueryCreateJob
{
    public BigQueryJob CreateJob(string projectId = "your-project-id")
    {
        string query = @"
            SELECT country_name from `bigquery-public-data.utility_us.country_code_iso";

        // Initialize client that will be used to send requests.
        BigQueryClient client = BigQueryClient.Create(projectId);

        QueryOptions queryOptions = new QueryOptions
        {
            JobLocation = "us",
            JobIdPrefix = "code_sample_",
            Labels = new Dictionary<string, string>
            {
                ["example-label"] = "example-value"
            },
            MaximumBytesBilled = 1000000
        };

        BigQueryJob queryJob = client.CreateQueryJob(
            sql: query,
            parameters: null,
            options: queryOptions);

        Console.WriteLine($"Started job: {queryJob.Reference.JobId}");
        return queryJob;
    }
}

Java

Before trying this sample, follow the Java setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Java API reference documentation.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobId;
import com.google.cloud.bigquery.JobInfo;
import com.google.cloud.bigquery.QueryJobConfiguration;
import com.google.common.collect.ImmutableMap;
import java.util.UUID;

// Sample to create a job
public class CreateJob {

  public static void main(String[] args) {
    // TODO(developer): Replace these variables before running the sample.
    String query = "SELECT country_name from `bigquery-public-data.utility_us.country_code_iso`";
    createJob(query);
  }

  public static void createJob(String query) {
    try {
      // Initialize client that will be used to send requests. This client only needs to be created
      // once, and can be reused for multiple requests.
      BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

      // Specify a job configuration to set optional job resource properties.
      QueryJobConfiguration queryConfig =
          QueryJobConfiguration.newBuilder(query)
              .setLabels(ImmutableMap.of("example-label", "example-value"))
              .build();

      // The location and job ID are optional; if they are not specified,
      // the client generates them automatically.
      String jobName = "jobId_" + UUID.randomUUID().toString();
      JobId jobId = JobId.newBuilder().setLocation("us").setJob(jobName).build();

      // Create a job with job ID
      bigquery.create(JobInfo.of(jobId, queryConfig));

      // Get a job that was just created
      Job job = bigquery.getJob(jobId);
      if (job.getJobId().getJob().equals(jobId.getJob())) {
        System.out.print("Job created successfully." + job.getJobId().getJob());
      } else {
        System.out.print("Job was not created");
      }
    } catch (BigQueryException e) {
      System.out.print("Job was not created. \n" + e.toString());
    }
  }
}

Python

Before trying this sample, follow the Python setup instructions in the BigQuery Quickstart Using Client Libraries. For more information, see the BigQuery Python API reference documentation.

from google.cloud import bigquery

# Construct a BigQuery client object.
client = bigquery.Client()

query_job = client.query(
    "SELECT country_name from `bigquery-public-data.utility_us.country_code_iso`",
    # Explicitly force job execution to be routed to a specific processing
    # location.
    location="US",
    # Specify a job configuration to set optional job resource properties.
    job_config=bigquery.QueryJobConfig(
        labels={"example-label": "example-value"}, maximum_bytes_billed=1000000
    ),
    # The client libraries automatically generate a job ID. Override the
    # generated ID with either the job_id_prefix or job_id parameters.
    job_id_prefix="code_sample_",
)  # Make an API request.

print("Started job: {}".format(query_job.job_id))

Generating a job ID

As a best practice, generate a job ID in your client code and send that job ID when you call jobs.insert. If you call jobs.insert without specifying a job ID, BigQuery creates a job ID for you, but you cannot check the status of that job until the call returns, and it might be difficult to tell whether the job was successfully inserted. If you use your own job ID, you can check the status of the job at any time, and you can retry the insert with the same job ID to ensure that the job starts exactly once.

The job ID is a string comprising letters (a-z, A-Z), numbers (0-9), underscores (_), or dashes (-), with a maximum length of 1,024 characters. Job IDs must be unique within a project.

A common approach to generating a unique job ID is to use a human-readable prefix and a suffix consisting of a timestamp or a GUID. For example: daily_import_job_1447971251.
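
As a small sketch using only the Python standard library, either convention can be produced (and checked against the character and length rules above) like this:

# Sketch: build a job ID from a human-readable prefix plus a timestamp or a GUID.
import re
import time
import uuid

timestamp_job_id = "daily_import_job_{}".format(int(time.time()))
guid_job_id = "daily_import_job_{}".format(uuid.uuid4())

# Letters, numbers, underscores, and dashes only, up to 1,024 characters.
assert re.fullmatch(r"[A-Za-z0-9_-]{1,1024}", guid_job_id)

print(timestamp_job_id)  # for example: daily_import_job_1447971251
print(guid_job_id)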

An example of a method that generates GUIDs can be found in the Python UUID module. For an example of using the Python uuid4() method with jobs.insert, see Loading data from Cloud Storage.

Next steps

  • See Running queries for a code example that starts and polls a query job.
  • For more information on creating a job resource representation, see the Jobs overview page in the API reference.