API versioning best practices: When you need versioning and when you don't

May 15, 2017

Martin Nally

Software Developer and API designer, Apigee

When versioning makes sense—and when it doesn’t

API versioning is often misunderstood, in part because the term is used to describe more than one basic concept. One of the misconceptions about versioning is that it’s something you need to bake into your APIs from the start. Consider the following examples.

Suppose you’ve written an API that enables people to create and access data for an application. Following advice that you read on the internet, you prefixed all the HTTP paths of your API with /v1. All your URLs look like http://acme.com/v1/thingumy/12345.

A couple of years later, you do a major rewrite of your API. The URLs of the new API all look like http://acme.com/v2/thingumy/12345 (note the /v2 in place of /v1). This second-generation API has much more advanced features. To accommodate those features, the API data model and the underlying database are richer and more complex.

As a result, data created with V1 is not visible in V2 and vice versa. You thought about trying to implement V2 in a way that was backwards-compatible with V1, but it proved impossible to provide the new functions and concepts in a way that still allowed V1 clients to update the data.

That's probably okay—many clients will be fine with keeping old data in the old system and new data in the new system, and you could also offer a data migration API to upgrade from V1 to V2.

Are those APIs really versions?

The story above describes a classic API-versioning scenario, and proves the value of the conventional guidance on versioning in APIs, right? Not really.

Your V1 and V2 APIs are actually two independent APIs with no relationship to one another. To you, they’re related, because they address the same problem domain, and you wrote one of them after you wrote the other. But those relationships are only in your head—these APIs are not tied together in any concrete way.

You could as easily have made the URLs of the second API look like http://v2.acme.com/thingumy/12345 or http://not-acme.com/thingumy/12345. These are two independent APIs that share some family lineage or came from a common foundry.

The concept of “lineage of independent APIs” is one of the ideas that people associate with the word “versioning.”

The power of “V”

Let's look at another example. Suppose you had a URL like this in your API: http://acme.com/v1/editor-options. A GET on this resource returns this:

[
  {"name": "tabWidth", "value": "4"},
  {"name": "defaultFontSize", "value": "10"},
  {"name": "defaultFont", "value": "Arial"},
  {"name": "backgroundColor", "value": "White"}
]

You get complaints from your users that this is inconvenient—they have to iterate through the whole array to find the edit option they are looking for. Versioning to the rescue! We can simply introduce a V2, like this:

http://acme.com/v2/editor-options (note again the /v2 in place of /v1):
{
  "tabWidth": "4",
  "defaultFontSize": "10",
  "defaultFont": "Arial",
  "backgroundColor": "White"
}

We can also enable the information to be updated via either format through the appropriate URL.
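As a rough illustration of how one stored set of options could be read and written in either format, the two shapes can be converted into each other. This is only a sketch: the function names v1_to_v2 and v2_to_v1 are hypothetical, and it assumes option names are unique.

```python
def v1_to_v2(options_v1):
    """Convert the V1 list of {"name": ..., "value": ...} items to the V2 object."""
    return {item["name"]: item["value"] for item in options_v1}


def v2_to_v1(options_v2):
    """Convert the V2 object back to the V1 list of name/value items."""
    return [{"name": name, "value": value} for name, value in options_v2.items()]


# Sample data taken from the editor-options example.
v1 = [
    {"name": "tabWidth", "value": "4"},
    {"name": "defaultFontSize", "value": "10"},
]
v2 = v1_to_v2(v1)
```

With conversions like these, a server could keep a single copy of the data and translate at the edge, whichever format a client sends or requests.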

This is a completely different meaning for the word “versioning.” In this case http://acme.com/v1/editor-options and http://acme.com/v2/editor-options are not independent resources in independent APIs; they each read and write the same data.

To add version identifiers, or not to?

Does this illustrate the wisdom of the conventional advice to put a version identifier in URLs? Not really. Suppose I have another resource whose URL is http://acme.com/v1/preferences.

It looks like this:

{"networkOptions": "http://acme.com/v1/network-options",
"dataOptions": "http://acme.com/v1/date-options",
"editorOptions": "http://acme.com/??/editor-options"}

How is the server to decide whether to put /v1 or /v2 on the last line?

One option: if the user asked for /v1 of preferences, she will get links to /v1 of all the other resources. This is not necessarily what the user wants. And what happens if the user asks for V2 of preferences, but there is no corresponding V2 of the networkOptions?

Another option is to return simply http://acme.com/editor-options (no version identifier) and let the client construct a suitable URL by parsing the URL and inserting /v1 or /v2. A variant of this idea is to return a URI template instead of a URL, like this: http://acme.com/{version}/editor-options.
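To make the client's side of the template variant concrete, here is a minimal sketch of the expansion step. The helper name expand_version is hypothetical, and the code handles only this single simple variable, not the full URI Template syntax defined in RFC 6570.

```python
def expand_version(template, version):
    """Fill the {version} variable of a URI template with a value such as "v2"."""
    return template.replace("{version}", version)


template = "http://acme.com/{version}/editor-options"
url = expand_version(template, "v2")
```

Even in this small form, the client has to know which versions exist and which one it wants for each link, which is part of the extra complexity being discussed.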

Cleaning up with content negotiation

This is looking a bit complex both in practice and conceptually—try to write a convincing paragraph or two to describe the two different resources identified by the URLs http://acme.com/v1/editor-options and http://acme.com/v2/editor-options and their relationship.

By introducing a second URL, we are introducing a second entity into our model, and the rest of our problems are a consequence of juggling two entities with overlapping states. Our original intent was not to introduce a new API entity—that was a side-effect of our solution. The intent was simply to give users a choice in how the data of the original entity should be formatted.

HTTP offers a cleaner solution to the problem of offering users multiple formats for the same resource. It’s called content negotiation. You’re probably already familiar with the idea: when making an HTTP request, the client includes a number of headers that describe what format (media type in the jargon) they want the information back in, or what format they are providing data in.

The two most commonly used headers for this are Accept to specify the desired format in the response and Content-Type for the provided format in the request. Accept-Language is also commonly used by browsers requesting HTML, and less commonly by API clients.

There are two obvious ways we can use content negotiation to solve our problem without introducing new URLs and their consequent headaches. In the example above, the V2 format uses simple JSON name-value pairs. It would be fair to consider application/json as being the correct media type for V2.

The original V1 format isn't simply JSON; it defines its own peculiar grammar on top of JSON. V1 is JSON in the same sense that JSON itself is text (it’s JSON, but it's not just JSON): it has a special format of its own.

If we had been purists, we might have invented our own media type for V1 (something like application/convoluted+json), but it’s unlikely that we did this. At this point, our choices are to invent a new media type for the new format to use in the standard Accept and Content-Type headers (for example, application/just+json) or use a different header entirely.

A popular though not standardized header for this purpose is Accept-Version. Requests that include the header Accept-Version: V2 will get the V2 format. Requests that omit the header or use it to ask for V1 will get the V1 format.
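A server-side sketch of that rule might look like the following. It is an illustration under assumptions, not a reference implementation: the function names, the in-memory OPTIONS data, and the choice to raise an error for unknown versions are all hypothetical, and a real service would do this inside its web framework.

```python
# Hypothetical stored data for the single editor-options resource.
OPTIONS = {"tabWidth": "4", "defaultFontSize": "10"}


def requested_version(headers):
    """Return the version named in Accept-Version, defaulting to V1 if absent."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive, so compare in lower case.
        if name.lower() == "accept-version":
            return value.strip().upper()
    return "V1"


def render_editor_options(headers, options=OPTIONS):
    """Format the same underlying data as V1 or V2, depending on the request."""
    version = requested_version(headers)
    if version == "V2":
        return dict(options)
    if version == "V1":
        return [{"name": n, "value": v} for n, v in options.items()]
    raise ValueError(f"unsupported version: {version}")
```

The URL stays the same in both cases; only the representation changes.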

If we use content-negotiation to communicate the format information, our preferences example becomes simple:

{"networkOptions": "http://acme.com/network-options",
"dataOptions": "http://acme.com/date-options",
"editorOptions": "http://acme.com/editor-options"}

Removing version identifiers from the URLs solves the problem.

Why doesn't everyone agree that content-negotiation is a better solution?

Many people decide to use version identifiers in URLs instead of headers because of the convenience of using URLs without headers, especially in the browser. (If you are using curl to access an API that uses content-negotiation headers, you will have to add -H "Accept-Version: V2" to the command, which isn’t too onerous.)

In the browser, you’ll have to use a plugin like Postman to set the header, which is a bit more of a burden. Despite this, I think it is short-sighted to use version identifiers in URLs—in the end the price you pay for creating a more complex conceptual model will be higher.
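For a programmatic client, the extra work is similarly small. The sketch below uses Python's standard urllib.request to attach the header; the URL is the example one from this post, and the request is only constructed, not sent.

```python
import urllib.request

# Build a request for the single, unversioned URL and ask for the V2 format.
request = urllib.request.Request(
    "http://acme.com/editor-options",
    headers={"Accept-Version": "V2"},
)

# Sending it would look like this (left commented out, since acme.com is
# only an example host):
# with urllib.request.urlopen(request) as response:
#     body = response.read()
```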

Ignorance can be bliss

You’ll often see advice on the internet saying that not only should you put version identifiers in URLs, but you should do it right from the beginning, to allow for future evolution. Regardless of the strategy you use for versioning, we think it works perfectly well to ignore versioning initially and only add it if and when you need it.

This offers a significant advantage: if it turns out that you don't really need versioning—which has been the case for most of our own APIs—then you didn't add unnecessary complexity to the initial API.

There’s an old story of a farmer who goes to great lengths to avoid mowing over the fairy rings that grow spontaneously in the fields of his farm. When asked why he does this, he replies, "because I'd be a damned fool if I didn't."

If he were a developer, the farmer would probably also put version identifiers in all of his APIs, right from the start.