The false dichotomy of stability vs human-centric URL design in web APIs

May 31, 2017

Martin Nally

Software Developer and API designer, Apigee

There are two schools of thought on how to design URLs in web APIs, and their adherents often turn into warring factions. In the past, we've seen very heated discussions on this topic, with the final decision often being made arbitrarily by whoever controlled the particular API under discussion.

Recently, we’ve arrived at a common view on the topic, which we believe has resulted in better APIs. We thought we'd share our secret.

URLs for stability

One of the historic schools of thought (typically the minority one) was populated by people who were steeped in the web. In that tradition, URLs are stable identifiers of web resources. The format of the URL is controlled by the server and should be opaque to clients. In other words, the URL format of a resource is not part of the API.

Because there’s a strong requirement for stability over time, the guidance is to try not to encode any information in the URL. For a library inventory system, URLs in this style might look like this:

https://libraries.gov/books/85ab9db7-abfd-46c4-823b-ad6997b96509

There is a very famous document entitled "Cool URIs don't change" by Tim Berners-Lee, the inventor of the web, explaining the rationale for URLs like this. It contains this quote:

"After the creation date, putting any information in the name is asking for trouble one way or another."

Even putting the word “book” in the URL above is questionable—if we switch to digital media, is it still a book or is it now a CD?

URLs for humans

URLs designed for stability like this one are sometimes called permalinks. They might appeal to your inner engineer: they are like well-designed primary key values at the scale of the world wide web.

However, they aren’t friendly to humans, and the majority of API designers have decided that being friendly to humans is more important than web theory. So they’ve invented URLs for books that look like this:

https://libraries.gov/library/sunnyvale/shelf/american-classics/book/moby-dick

URLs like this work well because they align with two very powerful techniques that humans use to think and talk about things: we give them names and we organize them in hierarchies. These techniques are older than recorded history.

URLs like this are everywhere in API design, and they are effective. However, systems based on them typically face the problem that renaming entities and reorganizing hierarchies is difficult or impossible.

Google, like most other prominent web companies, offers products today that display these limitations. Unfortunately, it often turns out that the ability to rename entities and reorganize hierarchies is more important and more frequently needed than the product designers initially envisioned.

A unified approach to URL design

After a fairly extended period of debate and polarization within our organization, we came to understand that picking between these two URL design approaches is a false choice. It is not only reasonable to do both, but for a high-quality API, it is usually necessary. We also discovered that there is a very simple idea that neatly unifies the two.

The insight that helped us unify these approaches is that the URL https://libraries.gov/library/sunnyvale/shelf/american-classics/book/moby-dick isn’t actually the URL of a book—it's the URL of a query or, more precisely, a query result.

The meaning of the query is "that book whose title is [currently] 'Moby Dick' that is [currently] on the shelf named 'American Classics' at the Sunnyvale City Library." The same query could have been expressed using a URL that included a query string. The difference is one of URL style, not meaning.
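To make the equivalence concrete, here is a minimal Python sketch that reads the hierarchical path as alternating field/value segments and rewrites it as a query-string URL. The /search path and the function names are assumptions made for illustration; they are not part of any real libraries.gov API.

```python
from urllib.parse import urlencode, urlsplit


def path_to_query(url: str) -> dict[str, str]:
    """Read a hierarchical path as alternating field/value segments."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if len(segments) % 2 != 0:
        raise ValueError("expected alternating field/value segments")
    return dict(zip(segments[0::2], segments[1::2]))


def to_query_string_url(url: str, search_path: str = "/search") -> str:
    """Express the same query with a query string.

    The "/search" path is a hypothetical endpoint chosen for this sketch.
    """
    parts = urlsplit(url)
    query = urlencode(path_to_query(url))
    return f"{parts.scheme}://{parts.netloc}{search_path}?{query}"


hierarchical = (
    "https://libraries.gov/library/sunnyvale/shelf/american-classics/book/moby-dick"
)
criteria = path_to_query(hierarchical)
flat = to_query_string_url(hierarchical)
```

Both forms carry the same three criteria (library, shelf, book). Only the spelling of the URL differs.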

Queries are very useful for locating things, but they are not guaranteed to return the same thing twice. The URL above may today return the book whose URL is:

https://libraries.gov/books/85ab9db7-abfd-46c4-823b-ad6997b96509

But tomorrow it might not return anything, or it may return a different thing altogether.

Recognizing that these two URLs are actually the URLs of two different things—a book and a query result—rather than two different URLs for the same thing made it obvious that the right solution was to implement both. This enables us to simultaneously have cool, stable permalink URLs for identifying entities, and human-friendly query URLs for finding them based on information that humans typically know. This in turn enables us to rename entities and reorganize hierarchies without breakage.
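The following Python sketch illustrates the idea with a toy in-memory catalog. The Book and Catalog classes, their field names, and the "sea-stories" shelf are all hypothetical; a real system would sit on a database. The point it tries to show is that the permalink depends only on an immutable ID, while the query is evaluated against names that can change.

```python
from dataclasses import dataclass
from typing import Optional

BASE = "https://libraries.gov"


@dataclass
class Book:
    id: str       # immutable identifier, the only thing in the permalink
    library: str  # mutable: the book may move to another library
    shelf: str    # mutable: shelves get reorganized
    slug: str     # mutable: titles get corrected or renamed


class Catalog:
    """Toy in-memory store (an assumption for this sketch)."""

    def __init__(self) -> None:
        self._books: dict[str, Book] = {}

    def add(self, book: Book) -> None:
        self._books[book.id] = book

    def permalink(self, book: Book) -> str:
        return f"{BASE}/books/{book.id}"

    def get(self, book_id: str) -> Optional[Book]:
        """Permalink lookup: stable for the lifetime of the entity."""
        return self._books.get(book_id)

    def find(self, library: str, shelf: str, slug: str) -> Optional[Book]:
        """Query lookup: evaluated against the current names."""
        for book in self._books.values():
            if (book.library, book.shelf, book.slug) == (library, shelf, slug):
                return book
        return None


catalog = Catalog()
moby = Book("85ab9db7-abfd-46c4-823b-ad6997b96509",
            "sunnyvale", "american-classics", "moby-dick")
catalog.add(moby)

stored_reference = catalog.permalink(moby)
found_before = catalog.find("sunnyvale", "american-classics", "moby-dick")

moby.shelf = "sea-stories"  # reorganize the hierarchy

found_after = catalog.find("sunnyvale", "american-classics", "moby-dick")
still_there = catalog.get(stored_reference.rsplit("/", 1)[-1])
```

After the move, the old query no longer matches anything, but the stored permalink still resolves to the same book.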

API clients have to be thoughtful about which URLs to use. For example, if a client wants to store a persistent reference to an entity, then that client should use the permalink URL of the entity, not the query URL that today happens to return the same result.

This guidance matters as much for internal services, such as a permissions service that stores access control rules for entities, as it does for external clients. Following it throughout a system is necessary to ensure that entities can be renamed and hierarchies reorganized without breakage.

Connecting queries and entities is also important. Whenever a query URL is used in an HTTP request, our newer APIs always return the permalink URL in the Content-Location header of the response (and in the response body too) to ensure that clients always have access to the stable permalink URL when they need it.
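As a rough illustration, a handler for query URLs might look like the Python sketch below. The lookup tables, the handle_get function, and the "self" field in the body are assumptions made for this example, not a description of Apigee's actual implementation. Only the Content-Location header itself comes from the HTTP specification.

```python
import json

BASE = "https://libraries.gov"

# Hypothetical data: query paths (current names) mapped to stable IDs.
QUERY_INDEX = {
    "/library/sunnyvale/shelf/american-classics/book/moby-dick":
        "85ab9db7-abfd-46c4-823b-ad6997b96509",
}
BOOKS = {
    "85ab9db7-abfd-46c4-823b-ad6997b96509": {"title": "Moby Dick"},
}


def handle_get(path: str) -> tuple[int, dict[str, str], str]:
    """Answer a query URL and tell the client where the entity really lives."""
    book_id = QUERY_INDEX.get(path)
    if book_id is None:
        return 404, {}, ""
    permalink = f"{BASE}/books/{book_id}"
    headers = {
        "Content-Type": "application/json",
        # The stable URL of the entity that this query currently returns.
        "Content-Location": permalink,
    }
    # "self" is an assumed field name for carrying the permalink in the body.
    body = json.dumps({"self": permalink, **BOOKS[book_id]})
    return 200, headers, body


status, headers, body = handle_get(
    "/library/sunnyvale/shelf/american-classics/book/moby-dick"
)
reference_to_store = headers["Content-Location"]
```

A client that needs a persistent reference would keep reference_to_store rather than the query URL it originally requested.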

Happy teams and better APIs

That’s the story of how we restored peace and harmony to API design teams at Apigee. Part of the reason this worked well is that neither of the original schools of thought had to admit to being wrong. Each just had to acknowledge that its view had been incomplete, and that it had something to learn from the other school.

We think that the result is not only happier teams, but better APIs.