HTTP guidelines

This document describes how Google APIs work with different HTTP versions and implementations. If you're using one of our generated or hand-crafted client libraries (our recommended approach for most use cases), you don't need to worry about this, as the provided code deals with the low-level issues of communicating with the server.

However, if you are an experienced developer and need to write custom code that accesses an API's REST interface directly through a third-party HTTP client library of your choice, you should understand at least some of the semantics documented here (as relevant to your chosen API), as well as what your HTTP library provides.

Working with wire protocols (HTTP/*)

This section describes the supported wire protocols (typically a version of HTTP) that Cloud APIs can use to communicate between clients and servers, and how we recommend you use them. The details of how requests and responses are structured are covered in the next section.

HTTP semantics

When developing API client code, follow standard HTTP protocol semantics. Server-side proxies or API stacks may only support a subset of standard HTTP features, and may also support their backward-compatible versions.

HTTP protocol semantics that need to be handled by the server-side implementations of APIs are controlled by the server stack. Only rely on such semantics, such as caching support, if they are explicitly documented as part of the API spec.

HTTP versions

Clients may use any HTTP/* protocols, as allowed by the client platform or their client-side network, or as negotiated with the server-side proxy. Supported protocols include HTTP/1.0, HTTP/1.1, SPDY/*, HTTP/2, and QUIC.

Some API features may only be supported by newer versions of HTTP protocols, such as server-push and priorities; some are only fully specified with HTTP/2, such as full-duplex streaming. Be aware of the limitations of different HTTP versions if you require any of these features as part of the API spec.

Generally we recommend HTTP/2 for better performance as well as resilience to network failures.

Channels

Channels refer to the L4 network connections (TCP or UDP sockets). Client applications should not make any assumption on how channels are used in the runtime to serve HTTP requests. In almost all cases, channels are terminated by proxies on behalf of the server process.

For HTTP/1.1 clients, always reuse TCP connections (Connection: Keep-Alive); the HTTP client library will likely manage a connection pool too for better performance. Do not pipeline requests over the same TCP connection. See HTTP and TCP for further information.
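As a minimal sketch of connection reuse, the following Python example sends two sequential requests over one persistent HTTP/1.1 connection without pipelining. The local test server, its handler, and the function name are assumptions made for illustration only; they stand in for a real API endpoint. The handler returns the client's TCP port so that reuse of the same connection is observable.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class EchoPortHandler(BaseHTTPRequestHandler):
    """Hypothetical local handler that replies with the client's TCP port."""

    protocol_version = "HTTP/1.1"  # allow persistent connections

    def do_GET(self):
        body = str(self.client_address[1]).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def fetch_twice_on_one_connection():
    """Send two requests back to back on a single TCP connection."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), EchoPortHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    ports = []
    try:
        for _ in range(2):
            conn.request("GET", "/")
            response = conn.getresponse()
            # Read the full response before sending the next request:
            # requests are sequential, not pipelined.
            ports.append(response.read().decode())
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return ports
```

Both entries in the returned list should name the same client port, which indicates that the second request reused the first request's connection.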

Modern browsers all speak SPDY/*, HTTP/2, or QUIC, which multiplex requests over a single channel. The traditional connection limit (2-10) should never be a concern except when the server implementation throttles the number of concurrent HTTP requests from a single client, for instance, 100 HTTP/2 streams against a single origin.

HTTPS

Clients can access an API via HTTPS or HTTP, as supported by the API spec. TLS negotiation and TLS versions are transparent to client applications. By default, Google APIs only accept HTTPS traffic.

Request/Response Formats

Request URLs

JSON-REST mapping supports URL encoded request data, and the HTTP request and response body use application/json as the Content-Type.

The HTTP body uses a JSON array to support streamed RPC methods, and the JSON array may contain any number of JSON messages or an error-status JSON message.

Long Request URLs

The URL has a practical length limitation, typically set to 16 KB by default, though this may vary depending on the server. If your API uses GET requests with URLs that exceed this length, the requests may not reach the destination API server and will be rejected by the Google Front End (GFE) with the error message "Your client has issued a malformed or illegal request."

To bypass the limitation, the client code should use a POST request with Content-Type of application/x-www-form-urlencoded along with HTTP header X-HTTP-Method-Override: GET. This approach also works for DELETE requests.
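The workaround above can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: the function name, the 16 KB default threshold, and the example host are hypothetical, and the request object is only built, not sent.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request

# Assumption: treat 16 KB as the limit; the real limit may vary by server.
ASSUMED_URL_LIMIT = 16 * 1024


def build_get_request(url, limit=ASSUMED_URL_LIMIT):
    """Return a GET request, or a POST with a method override for long URLs."""
    if len(url) <= limit:
        return Request(url, method="GET")
    parts = urlsplit(url)
    # Move the already URL-encoded query string into the request body.
    base_url = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return Request(
        base_url,
        data=parts.query.encode("ascii"),
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "X-HTTP-Method-Override": "GET",
        },
    )
```

A caller would pass the returned object to its HTTP client of choice. Short URLs are left untouched so that ordinary GET semantics, such as caching where the API documents it, are not affected.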

HTTP Methods (Verbs)

If the request URLs follow the REST model, their HTTP methods are specified as part of the API specification. In particular, every API method must comply with the requirements of HTTP protocol based on the specific HTTP Verb to which the API method maps. For details, please refer to the Hypertext Transfer Protocol specification and the PATCH Method RFC.

Safe methods, such as HTTP GET and HEAD, should not represent an action other than retrieval. Specifically, HTTP GET ought to be considered safe and should not have any client-visible side-effects.

Idempotence in HTTP means that the side-effects of multiple identical requests are the same as for a single request. GET, PUT, and DELETE are the idempotent HTTP methods relevant to the style guide. Note that idempotence is only expressed in terms of server side-effects and does not specify anything about the response. In particular DELETE for non-existing resources should return 404 (Not Found).
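To make the distinction between side-effects and responses concrete, here is a small Python sketch of a hypothetical in-memory resource store. The class and its status codes are assumptions for illustration only. Repeating a DELETE leaves the stored state unchanged, even though the second response differs from the first.

```python
class InMemoryStore:
    """Hypothetical resource store used only to illustrate idempotence."""

    def __init__(self):
        self.resources = {}

    def put(self, name, value):
        # PUT replaces the resource, so repeating it yields the same state.
        self.resources[name] = value
        return 200

    def delete(self, name):
        # DELETE on a missing resource changes nothing and reports 404.
        if name not in self.resources:
            return 404
        del self.resources[name]
        return 200


store = InMemoryStore()
store.put("books/1", {"title": "Example"})
first_status = store.delete("books/1")
state_after_first = dict(store.resources)
second_status = store.delete("books/1")
state_after_second = dict(store.resources)
```

Here `first_status` and `second_status` differ, while `state_after_first` and `state_after_second` are equal, which is what idempotence requires.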

HTTP POST and PATCH are neither safe nor idempotent. (PATCH was introduced in RFC 5789.)

HTTP Verb	Safe	Idempotent
`GET`	Yes	Yes
`PUT`	No	Yes
`DELETE`	No	Yes
`POST`	No	No
`PATCH`	No	No

Payload Formats

Request and response should share the same Content-Type, except when the request is a GET or a POST with an "application/x-www-form-urlencoded" body.
JSON is supported under the application/json MIME type. The mapping from proto3 to JSON is formally specified in JSON Mapping.
Form parameters (POST) may be used in place of URL query parameters (GET), following the same REST-style mapping rule for mapping request fields to query parameters. The supported Content-Type is application/x-www-form-urlencoded.

Streaming

Half-duplex versus full-duplex

HTTP is a request-response protocol that allows its request or response body to be delivered over different stream-oriented transports such as TCP (HTTP/1.x) or its multiplexed variants (SPDY, HTTP/2, QUIC).

As a client developer, your application can produce the request body in a streaming mode, that is, client-streaming. Likewise the application may also consume the response body in a streaming mode, that is, server-streaming.

However, the HTTP spec doesn't specify whether a server is allowed to stream back the response body (except for error responses) while the request body is still pending. This semantic is known as full-duplex streaming. Although much HTTP client, server, and proxy software does allow full-duplex streaming, even for HTTP/1.1, HTTP-based Cloud APIs are restricted to half-duplex streaming to avoid any interoperability issues.

By default, bidi streaming methods in Cloud APIs assume the full-duplex semantics. That is, it is not safe to use HTTP to invoke such a method. If a streaming method is half-duplex only (as enforced by the server), the API document should clearly specify the half-duplex behavior.

For browser clients, the standard HTTP semantics are further constrained by the browser network APIs. Currently browsers support only server-streaming (that generally respects transport-level framing) via XHR or Fetch. The Fetch API makes use of WHATWG streams.

Because of browser restrictions, Cloud APIs that require browser support must avoid client-streaming as well as full-duplex streaming, or provide a separate API specifically for browser clients.

Generally speaking, client-streaming over the Internet is less useful than server-streaming. This is because client-streaming often leads to a stateful service, which adversely affects load-balancing and makes the system more vulnerable to failures or attacks. Server-streaming, on the other hand, can be useful as it can significantly reduce latency over networks with long RTT delays.

Message Encoding

JSON messages when streaming are encoded as an array of JSON messages. The request or response body will remain as a valid JSON MIME type.
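Because the body is a single JSON array, a client that wants to process messages as they arrive needs to parse the array incrementally. The following Python sketch shows one way to do that with the standard `json` module. The function name is hypothetical, and it assumes every array element is a JSON object; it does not interpret an error-status element, whose shape is defined by the API.

```python
import json


def iter_json_array(chunks):
    """Yield each object of a top-level JSON array as soon as it is complete.

    `chunks` is any iterable of text fragments in arrival order.
    """
    decoder = json.JSONDecoder()
    buffer = ""
    started = False
    for chunk in chunks:
        buffer += chunk
        while True:
            buffer = buffer.lstrip()
            if not buffer:
                break  # wait for more data
            if not started:
                if buffer[0] != "[":
                    raise ValueError("expected a JSON array")
                buffer = buffer[1:]
                started = True
                continue
            if buffer[0] == ",":
                buffer = buffer[1:]
                continue
            if buffer[0] == "]":
                return  # end of the array
            if buffer[0] != "{":
                raise ValueError("expected a JSON object element")
            try:
                value, end = decoder.raw_decode(buffer)
            except json.JSONDecodeError:
                break  # element is incomplete; wait for more data
            yield value
            buffer = buffer[end:]
```

For example, `list(iter_json_array(['[{"a":', ' 1}, {"b": 2}]']))` produces the two messages in order. The sketch re-parses the pending element on every chunk, which is simple but not optimal for very large messages.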

Example client stream encoding:

1 <length> <message-bytes> 1 <length> <message-bytes>  … EOF

Example server stream encoding:

1 <length> <message-bytes>  … 2 <length> <status-bytes> EOF

Wire-level encoding: the definition of StreamBody is significant only in its allocation of tag IDs for the fields "message" and "status". Each tag fits in one byte, and the length is varint-encoded in 1-2 bytes for normal messages, so the total encoding overhead is 2-3 bytes per message.

An optional padding field is needed to support base64 encoded streams:

message StreamBody {
  repeated bytes message = 1;
  google.rpc.Status status = 2;
  repeated bytes padding = 15;   // max one-byte tag-id: 01111xxx
}
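The framing described above can be reproduced by hand to see where the 2-3 bytes of overhead come from. This Python sketch assumes standard protobuf wire encoding for length-delimited fields (tag = field number shifted left by 3, combined with wire type 2) and the field numbers from StreamBody. The function names are hypothetical, and the status payload is passed as already-serialized bytes.

```python
LENGTH_DELIMITED = 2  # protobuf wire type for bytes and embedded messages
MESSAGE_FIELD = 1     # StreamBody.message
STATUS_FIELD = 2      # StreamBody.status


def encode_varint(value):
    """Encode a non-negative integer as a protobuf varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_field(field_number, payload):
    """Encode one length-delimited field: tag, length, then the payload."""
    tag = encode_varint((field_number << 3) | LENGTH_DELIMITED)
    return tag + encode_varint(len(payload)) + payload


def encode_stream(messages, status=None):
    """Concatenate message frames, optionally followed by a status frame."""
    body = b"".join(encode_field(MESSAGE_FIELD, m) for m in messages)
    if status is not None:
        body += encode_field(STATUS_FIELD, status)
    return body
```

A 2-byte message is framed with 2 bytes of overhead (one tag byte and one length byte), while a 200-byte message needs a 2-byte length and therefore 3 bytes of overhead.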

Error messages should be appended as the last element of the JSON or protobuf array, under the same format as regular messages.

State Management

Half-close behavior is well defined in any HTTP version for a client or server to signal to the other end that the body is completed.

In particular, client code is free to complete the request while still waiting for the response. Similarly, a client may see a completed response while the request body is still being written to the server. The HTTP standard expects the client to abort or complete the request when a response is completed in an unexpected way, normally with an error status. That is to say, under normal conditions the server should not complete a response while the client is still sending the request.

Cancellation

Cancellation support allows a client to abort a request when the request or response is still pending.

There is no reliable cancellation support for HTTP/1.* clients, as a client is free to close a TCP connection after the request has been completed without aborting the request/response transaction. A TCP FIN, under HTTP/1.1, should not be interpreted as a cancellation, even when the connection is marked as a keep-alive one (Connection: Keep-Alive).

However, after the client closes the TCP connection, if the server tries to write any data to the client, an RST will be generated, which can trigger a cancellation.

Also note that cancellation is an issue for non-streaming APIs too. This is especially the case when the response involves a long polling and hence the connection may stay idle for an extended period.

Explicit cancellation is supported with SPDY, HTTP/2 and QUIC, notably with the go-away message.

Keep-alive

Keep-alive support allows a client or server to detect a failed peer, even in the event of packet loss or network failures.

There is no keep-alive support in HTTP/1.1 as TCP keep-alive is not a viable approach.

QUIC and HTTP/2 offer special control messages that applications, including browsers, can use to implement keep-alive support.

However, reliable keep-alive and failure detection will likely require a client library with necessary server-side support: doing long-lived streaming over the internet is often error-prone when relying on basic HTTP as the communication protocol.

Flow Control

Flow control support requires the client to propagate transport-level flow-control events to the client application. The actual mechanism depends on the style of the HTTP client API that your client application uses. For example, you need blocking writes and reads, or non-blocking reads and writes with explicit flow control support for applications to handle and respect flow-control events, in order to prevent either the client or server from being overloaded.
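The blocking-write style can be illustrated with an in-process analogy. In this Python sketch, a bounded queue stands in for the transport's flow-control window: the producer's write blocks whenever the window is full, so a slow consumer automatically slows the producer down. The function name, window size, and delay are assumptions for illustration; no real network transport is involved.

```python
import queue
import threading
import time


def stream_with_backpressure(items, window=2, consume_delay=0.001):
    """Send items through a bounded buffer; writes block when it is full."""
    buffer = queue.Queue(maxsize=window)
    received = []
    max_depth = 0
    sentinel = object()  # marks the end of the stream

    def consumer():
        while True:
            item = buffer.get()
            if item is sentinel:
                return
            time.sleep(consume_delay)  # simulate a slow reader
            received.append(item)

    worker = threading.Thread(target=consumer)
    worker.start()
    for item in items:
        buffer.put(item)  # blocks while the buffer holds `window` items
        max_depth = max(max_depth, buffer.qsize())
    buffer.put(sentinel)
    worker.join()
    return received, max_depth
```

The returned `max_depth` never exceeds `window`, which shows that the producer cannot run ahead of the consumer by more than the allowed amount.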

HTTP/1.1 relies on TCP flow control.

SPDY and HTTP/2 have their own flow-control at the stream level, which is further subject to TCP flow control at the connection level as requests are multiplexed over a single TCP connection.

QUIC runs on UDP and therefore manages flow-control completely on its own.