Batch requests

As a result of the deprecation of global batch HTTP endpoints, batch HTTP requests that target only BigQuery APIs will stop working on June 1, 2021. If your application sends batch HTTP requests, replace them with individual HTTP requests before that date.

For information about the deprecation, see the following FAQ section. For documentation about how to batch HTTP requests, see Batching requests.

BigQuery batch HTTP API deprecation FAQ

Why are BigQuery batch HTTP requests deprecated?

Support for global HTTP batch endpoints was based on an architecture using a single shared proxy to receive requests for all APIs. As Google moved toward a more distributed, high-performance architecture where requests go directly to the appropriate API server, we could no longer support these global endpoints.

Deprecating BigQuery batch HTTP requests is the next step. The BigQuery service is also distributed: high-QPS methods are handled by dedicated backends, and regions are isolated from one another, but batch HTTP requests can cause cross-regional request fan-out. This makes batching inefficient and can result in higher processing latency, which runs counter to the original goal of batch HTTP request support.

What specifically is deprecated?

The following ways of sending batch HTTP requests to BigQuery APIs will no longer work:

  • Batch HTTP requests that contain BigQuery API calls and are sent to the global batch endpoint (www.googleapis.com/batch)
  • Batch HTTP requests sent to the BigQuery batch endpoint (the /batch/v2/bigquery path)

How do I migrate?

Most BigQuery users do not use batch HTTP requests. If you are still using batch requests, use the following examples to replace batch HTTP requests with individual HTTP requests.

REST

Send individual HTTP requests as documented in the BigQuery API reference section. Do not combine your requests in batches using the /batch/v2/bigquery path.
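
For example, instead of combining several calls into one batch, send each call as its own request. The following is a minimal sketch of a single datasets.list request, where PROJECT_ID and the authorization token are placeholders:

GET /bigquery/v2/projects/PROJECT_ID/datasets HTTP/1.1
Host: bigquery.googleapis.com
Authorization: Bearer your_auth_token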

JavaScript

If you're using JavaScript, you might start with code that looks like this:

// Notice that the outer batch request contains inner API requests
// for two different APIs.

// Request to urlshortener API
request1 = gapi.client.urlshortener.url.get({"shortUrl":
"http://goo.gl/fbsS"});

// Request to zoo API
request2 = gapi.client.zoo.animals.list();

// Request to urlshortener API
request3 = gapi.client.urlshortener.url.get({"shortUrl":
"https://goo.gl/XYFuPH"});

// Request to zoo API
request4 = gapi.client.zoo.animal().get({"name": "giraffe"});

// Create a batch request object.
batchRequest = gapi.client.newBatch();
// Add the four individual requests to the batch.
batchRequest.add(request1);
batchRequest.add(request2);
batchRequest.add(request3);
batchRequest.add(request4);
// Execute the batch request and print the response.
batchRequest.then(x => console.log(x));

Replace the preceding code block with a code block that looks like the following:

// Request to urlshortener API
request1 = gapi.client.urlshortener.url.get({"shortUrl": "http://goo.gl/fbsS"});

// Request to zoo API
request2 = gapi.client.zoo.animals.list();

// Request to urlshortener API
request3 = gapi.client.urlshortener.url.get({"shortUrl": "https://goo.gl/XYFuPH"});

// Request to zoo API
request4 = gapi.client.zoo.animal().get({"name": "giraffe"});

// Wait for the 4 individual requests and print their responses.
Promise.all([request1, request2, request3, request4])
    .then(x => console.log(x));

Python

If you're using Python, you might start with code that looks like this:

from googleapiclient.discovery import build
from googleapiclient.http import BatchHttpRequest

def insert_animal(request_id, response, exception):
  if exception is not None:
    pass  # Do something with the exception.
  else:
    pass  # Do something with the response.

service = build('farm', 'v2')
batch = service.new_batch_http_request(callback=insert_animal)
batch.add(service.animals().insert(name="sheep"))
batch.add(service.animals().insert(name="pig"))
batch.add(service.animals().insert(name="llama"))
batch.execute()

Replace the preceding code block with a code block that looks like the following:

# Use a thread pool to send the individual requests concurrently.
from concurrent.futures import ThreadPoolExecutor as PoolExecutor

from googleapiclient.discovery import build

def run_it(request):
  # Execute a single request and print its response.
  print(request.execute())

service = build('farm', 'v2')
request1 = service.animals().insert(name="sheep")
request2 = service.animals().insert(name="pig")
request3 = service.animals().insert(name="llama")

with PoolExecutor(max_workers=4) as executor:
  for _ in executor.map(run_it, [request1, request2, request3]):
    pass

Other languages

Similar to the previous examples, replace BatchRequest calls with individual requests.

Get support for migrating

For help with migration, you can ask questions on Stack Overflow. Google engineers monitor and answer questions with the tag google-bigquery. Use this tag when asking questions. We aim to answer all questions in a reasonable period of time.

Batching requests

This document shows how to batch API calls together to reduce the number of HTTP connections your client has to make.

This document is specifically about making a batch request by sending an HTTP request. If, instead, you're using a Google client library to make a batch request, see the client library's documentation.

Overview

Each HTTP connection your client makes results in a certain amount of overhead. The BigQuery API supports batching, which allows your client to put several API calls into a single HTTP request.

Examples of situations when you might want to use batching:

  • You've just started using the API and you have a lot of data to upload.
  • A user made changes to data while your application was offline (disconnected from the Internet), so your application needs to synchronize its local data with the server by sending a lot of updates and deletes.

In each case, instead of sending each call separately, you can group them together into a single HTTP request. All the inner requests must go to the same Google API.

You're limited to 1000 calls in a single batch request. If you must make more calls than that, use multiple batch requests.
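
If you use a Google client library rather than raw HTTP, one way to stay under this limit is to split your calls into chunks and send one batch request per chunk. The following is a minimal sketch using the Python client library and the fictional Farm API from the earlier examples; BATCH_LIMIT and handle_response are illustrative names:

from googleapiclient.discovery import build

BATCH_LIMIT = 1000  # Maximum number of calls allowed in a single batch request.

def handle_response(request_id, response, exception):
  # Handle each individual response or exception here.
  pass

service = build('farm', 'v2')
requests = [service.animals().insert(name="animal%d" % i) for i in range(2500)]

# Split the calls into chunks of at most BATCH_LIMIT calls and send one
# batch request per chunk.
for start in range(0, len(requests), BATCH_LIMIT):
  batch = service.new_batch_http_request(callback=handle_response)
  for request in requests[start:start + BATCH_LIMIT]:
    batch.add(request)
  batch.execute()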

Note: The batch system for the BigQuery API uses the same syntax as the OData batch processing system, but the semantics differ.

Batch details

A batch request consists of multiple API calls combined into one HTTP request, which can be sent to the batchPath specified in the API discovery document. The default path is /batch/api_name/api_version. This section describes the batch syntax in detail; later, there's an example.
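
For example, you can look up an API's batchPath in its discovery document. The following is a minimal sketch that fetches the BigQuery API's discovery document from the public Discovery service URL and prints the batchPath field:

import json
from urllib.request import urlopen

# Fetch the discovery document for the BigQuery API and print its batch path.
DISCOVERY_URL = "https://www.googleapis.com/discovery/v1/apis/bigquery/v2/rest"

with urlopen(DISCOVERY_URL) as response:
  discovery_doc = json.load(response)

print(discovery_doc.get("batchPath"))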

Note: A set of n requests batched together counts toward your usage limit as n requests, not as one request. The batch request is separated into a set of requests before processing.

Format of a batch request

A batch request is a single standard HTTP request containing multiple BigQuery API calls, using the multipart/mixed content type. Within that main HTTP request, each of the parts contains a nested HTTP request.

Each part begins with its own Content-Type: application/http HTTP header. It can also have an optional Content-ID header. However, the part headers are just there to mark the beginning of the part; they're separate from the nested request. After the server unwraps the batch request into separate requests, the part headers are ignored.

The body of each part is a complete HTTP request, with its own verb, URL, headers, and body. The nested HTTP request must contain only the path portion of the URL; full URLs are not allowed in batch requests.

The HTTP headers for the outer batch request, except for the Content- headers such as Content-Type, apply to every request in the batch. If you specify a given HTTP header in both the outer request and an individual call, then the individual call header's value overrides the outer batch request header's value. The headers for an individual call apply only to that call.

For example, if you provide an Authorization header for a specific call, then that header applies only to that call. If you provide an Authorization header for the outer request, then that header applies to all of the individual calls unless they override it with Authorization headers of their own.

When the server receives the batched request, it applies the outer request's query parameters and headers (as appropriate) to each part, and then treats each part as if it were a separate HTTP request.

Response to a batch request

The server's response is a single standard HTTP response with a multipart/mixed content type; each part is the response to one of the requests in the batched request, in the same order as the requests.

Like the parts in the request, each response part contains a complete HTTP response, including a status code, headers, and body. And like the parts in the request, each response part is preceded by a Content-Type header that marks the beginning of the part.

If a given part of the request had a Content-ID header, then the corresponding part of the response has a matching Content-ID header, with the original value preceded by the string response-, as shown in the following example.

Note: The server might perform your calls in any order. Don't count on their being executed in the order in which you specified them. If you want to ensure that two calls occur in a given order, you can't send them in a single request; instead, send the first one by itself, then wait for the response to the first one before sending the second one.
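
For example, with the Python client library and the fictional Farm API used in the following example, enforcing an order means executing the dependent calls one at a time instead of batching them; this is only an illustrative sketch:

from googleapiclient.discovery import build

service = build('farm', 'v2')

# Execute the first call and wait for its response before sending the
# second call, which guarantees that the two calls happen in order.
first_response = service.animals().insert(name="sheep").execute()
second_response = service.animals().insert(name="pig").execute()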

Example

The following example shows the use of batching with a generic (fictional) demo API called the Farm API. However, the same concepts apply to the BigQuery API.

Example batch request

POST /batch/farm/v1 HTTP/1.1
Authorization: Bearer your_auth_token
Host: www.googleapis.com
Content-Type: multipart/mixed; boundary=batch_foobarbaz
Content-Length: total_content_length

--batch_foobarbaz
Content-Type: application/http
Content-ID: <item1:12930812@barnyard.example.com>

GET /farm/v1/animals/pony

--batch_foobarbaz
Content-Type: application/http
Content-ID: <item2:12930812@barnyard.example.com>

PUT /farm/v1/animals/sheep
Content-Type: application/json
Content-Length: part_content_length
If-Match: "etag/sheep"

{
  "animalName": "sheep",
  "animalAge": 5,
  "peltColor": "green"
}

--batch_foobarbaz
Content-Type: application/http
Content-ID: <item3:12930812@barnyard.example.com>

GET /farm/v1/animals
If-None-Match: "etag/animals"

--batch_foobarbaz--

Example batch response

This is the response to the example request in the previous section.

HTTP/1.1 200 OK
Content-Length: response_total_content_length
Content-Type: multipart/mixed; boundary=batch_foobarbaz

--batch_foobarbaz
Content-Type: application/http
Content-ID: <response-item1:12930812@barnyard.example.com>

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: response_part_1_content_length
ETag: "etag/pony"

{
  "kind": "farm#animal",
  "etag": "etag/pony",
  "selfLink": "/farm/v1/animals/pony",
  "animalName": "pony",
  "animalAge": 34,
  "peltColor": "white"
}

--batch_foobarbaz
Content-Type: application/http
Content-ID: <response-item2:12930812@barnyard.example.com>

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: response_part_2_content_length
ETag: "etag/sheep"

{
  "kind": "farm#animal",
  "etag": "etag/sheep",
  "selfLink": "/farm/v1/animals/sheep",
  "animalName": "sheep",
  "animalAge": 5,
  "peltColor": "green"
}

--batch_foobarbaz
Content-Type: application/http
Content-ID: <response-item3:12930812@barnyard.example.com>

HTTP/1.1 304 Not Modified
ETag: "etag/animals"

--batch_foobarbaz--