Why your APIs should be entity-oriented

March 15, 2017

Martin Nally

Software Developer and API designer, Apigee

The dominant model for APIs in distributed computing for decades has been Remote Procedure Call (RPC). This isn't surprising—ever since Fortran II introduced functions in 1958, the function (or procedure) has been the primary organizing construct that programmers use to write code.

Most distributed APIs are defined by programmers, and the simplest way for them to think about a distributed API is that it allows some of the procedures of their program to be invoked from outside the program. Historically, systems like DCE and CORBA provided system software and tools to help implement RPC.

When people started implementing APIs on the World Wide Web, they naturally carried over the familiar RPC concepts from previous environments. This led initially to standards like WSDL and the so-called WS-* standards, which were heavyweight and complex. Most web APIs now use HTTP in a much more lightweight way, often called RESTful, that retains concepts from the RPC model while blending in some of the native concepts of HTTP/REST.

HTTP itself is purely entity-oriented, not procedural—it defines a small number of standard “methods” for manipulating entities, but otherwise does not model procedures. A minority of web API designers have abandoned the traditional RPC model completely and design web APIs that are based entirely on HTTP's entity-oriented model; they are using HTTP simply and directly, without layering RPC concepts on top of it. In a moment, I'll explain why they are doing this.

At this point, you might be confused, because much of the available information on the web would lead you to think that the current crop of popular web APIs follows the entity-oriented model called REST. For example, the introduction to the OpenAPI Specification (formerly known as Swagger) says "The goal of the OpenAPI Specification is to define a standard, language-agnostic interface to REST APIs."

In fact, OpenAPI is a fairly traditional RPC Interface Definition Language (IDL), and describing an entity-oriented API with OpenAPI is awkward and imprecise. The fact that OpenAPI can fairly easily and accurately describe the majority of the APIs currently on the web is a reliable indication that those APIs are procedural in nature. There have been some attempts to define IDLs for entity-oriented APIs; Rapier is an example. One way to understand the challenges of representing an entity-oriented API in OpenAPI is to look at the output of Rapier's OpenAPI generator.

Why entity-oriented APIs matter

So why would you care about this? Why are some people interested in entity-oriented rather than procedure-oriented APIs? Imagine an API for the classic students and classes problem. In a procedural API, I might need the following procedures:

add a student record
retrieve a student record
add a class record
add a student to a class
list classes a student is enrolled in
list students enrolled in a class
assign an instructor to a class
transfer a student between classes

In practice there would be dozens of these procedures, even for a simple problem domain like this one. If I looked at the API for another problem domain, I would have to start over again: nothing I had learned about students and classes would help me learn the next API, whose procedures would all be different and specialized to its own problem domain.

What’s wrong with that?

Most programmers are not surprised or dismayed by the proliferation of APIs with little commonality between them—learning all this detail and diversity is simply part of the life of a programmer. However, even programmers have a different expectation when they program to a database. Database management systems (DBMSs) offer a standard API for interacting with data regardless of the details of the data itself.

If you are programming to MySQL, PostgreSQL, or any other DBMS, you have to learn the schema of the data you are accessing. But once you have done that, all the mechanisms for accessing that data—the API—are standardized by the DBMS itself. This means that when you have to program to a different database, the learning burden is much lower, because you already know the API of the DBMS; you only have to learn the schema of the new data. If each database, rather than each DBMS, had its own API, even the most tolerant programmers would balk.
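As a rough illustration of that point, the sketch below uses Python's built-in sqlite3 module against two unrelated tables. The table names, the columns, and the helpers insert_row and select_all are invented for this example. Only the schema differs between the two cases; the access mechanism is the same.

```python
import sqlite3

def insert_row(conn, table, row):
    """Insert a dict as a row into any table, using one generic SQL shape.

    Table and column names are interpolated directly, which is acceptable
    for a demo with hard-coded names but unsafe for untrusted input.
    """
    columns = ", ".join(row)
    placeholders = ", ".join("?" for _ in row)
    conn.execute(
        f"INSERT INTO {table} ({columns}) VALUES ({placeholders})",
        tuple(row.values()),
    )

def select_all(conn, table):
    """Read every row of any table back as a list of dicts."""
    cursor = conn.execute(f"SELECT * FROM {table}")
    names = [column[0] for column in cursor.description]
    return [dict(zip(names, values)) for values in cursor.fetchall()]

conn = sqlite3.connect(":memory:")

# Two hypothetical schemas from unrelated problem domains.
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE invoices (id INTEGER PRIMARY KEY, amount_cents INTEGER, paid INTEGER)"
)

# The same two helpers serve both tables; nothing is specific to either one.
insert_row(conn, "students", {"id": 1, "name": "Ada"})
insert_row(conn, "invoices", {"id": 7, "amount_cents": 1250, "paid": 0})

students = select_all(conn, "students")
invoices = select_all(conn, "invoices")
```

Moving from the students table to the invoices table required learning new column names, but no new functions.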

Implementing a purely entity-oriented API on the web enables HTTP to function as the standardized API for all web APIs, in the same way that the API provided by a DBMS functions as the standardized API for all the databases it hosts. HTTP becomes the universal API for all web APIs, and only the schema of the data of a specific API needs to be specified and learned.
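To make this concrete, here is a minimal sketch of how a few of the student and class procedures listed earlier might look when expressed only through a uniform interface. EntityStore is a hypothetical in-memory stand-in for an HTTP server, not a real library, and the URL layout and field names are assumptions made for illustration.

```python
import itertools

class EntityStore:
    """In-memory stand-in for a server that offers only uniform methods."""

    def __init__(self):
        self._entities = {}            # URL -> entity data
        self._ids = itertools.count(1)

    def post(self, collection_url, body):
        """Create an entity in a collection and return its new URL."""
        url = f"{collection_url}/{next(self._ids)}"
        self._entities[url] = dict(body)
        return url

    def get(self, url):
        """Retrieve a copy of the entity at a URL."""
        return dict(self._entities[url])

    def patch(self, url, changes):
        """Update some fields of the entity at a URL."""
        self._entities[url].update(changes)
        return dict(self._entities[url])

    def delete(self, url):
        """Remove the entity at a URL."""
        del self._entities[url]

api = EntityStore()

# "Add a student record" and "add a class record" are both just POST.
student = api.post("/students", {"name": "Ada"})
algebra = api.post("/classes", {"title": "Algebra"})
geometry = api.post("/classes", {"title": "Geometry"})

# "Add a student to a class" becomes creating an enrollment entity.
enrollment = api.post("/enrollments", {"student": student, "class": algebra})

# "Transfer a student between classes" becomes updating that entity.
api.patch(enrollment, {"class": geometry})

# "Assign an instructor to a class" becomes updating the class entity.
api.patch(geometry, {"instructor": "/instructors/42"})
```

The client needs to know the four methods once. Everything specific to this domain lives in the data: the collections, the fields, and the links between entities.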

Separation of interface from implementation

Separating an entity-oriented API model from the procedural implementation model has another major advantage—it makes it easier for each of them to evolve independently. This can be done in the procedural model too, by having one set of procedures for the external model and a separate set for the implementation. However, maintaining this separation when they are both expressed as programming-language procedures requires a lot of discipline and design oversight, and is rarely done well.

Why the extra effort is worthwhile

If entity-oriented web APIs are better, why is only a minority of web APIs designed this way? There are multiple reasons. One is that many programmers don't yet know how to do this, or they don't know why it's better. A second reason is that entity-oriented APIs require a bit more work to produce, because implementing an entity-oriented API requires programmers, whose code is in the form of procedures, to implement a mapping from the exposed entity model to the procedural implementation.

The mapping isn't inherently difficult—it consists mostly of implementing create, retrieve, update, and delete (CRUD) procedures for each entity, plus a set of procedures that implement queries on the entities. Many API developers start from this point but stray by exposing procedures that do not correspond clearly to a standard operation on a well-defined entity.
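The mapping described above could be sketched roughly as follows for a single entity type. The enrollment entity, the procedure names, the status codes chosen, and the handle function are all assumptions for illustration; a real service would sit behind an HTTP framework and a persistent store.

```python
import itertools

# Procedural implementation: ordinary functions over an in-memory table.
ENROLLMENTS = {}                  # id -> {"student": ..., "class": ...}
_next_id = itertools.count(1)

def create_enrollment(body):
    eid = str(next(_next_id))
    ENROLLMENTS[eid] = dict(body)
    return eid

def retrieve_enrollment(eid):
    return dict(ENROLLMENTS[eid])

def update_enrollment(eid, changes):
    ENROLLMENTS[eid].update(changes)
    return dict(ENROLLMENTS[eid])

def delete_enrollment(eid):
    del ENROLLMENTS[eid]

def query_enrollments(filters):
    """Return enrollments whose fields match every filter."""
    return [
        dict(e) for e in ENROLLMENTS.values()
        if all(e.get(key) == value for key, value in filters.items())
    ]

def handle(method, path, body=None, query=None):
    """Map an HTTP-style request on /enrollments to the procedures above."""
    parts = path.strip("/").split("/")
    if parts[0] != "enrollments" or len(parts) > 2:
        return 404, None
    if len(parts) == 1:           # the collection itself
        if method == "POST":
            return 201, f"/enrollments/{create_enrollment(body)}"
        if method == "GET":
            return 200, query_enrollments(query or {})
        return 405, None
    eid = parts[1]                # a single enrollment
    if eid not in ENROLLMENTS:
        return 404, None
    if method == "GET":
        return 200, retrieve_enrollment(eid)
    if method == "PATCH":
        return 200, update_enrollment(eid, body)
    if method == "DELETE":
        delete_enrollment(eid)
        return 204, None
    return 405, None

status, url = handle("POST", "/enrollments",
                     {"student": "/students/1", "class": "/classes/2"})
handle("POST", "/enrollments",
       {"student": "/students/1", "class": "/classes/3"})

# "List classes a student is enrolled in" is a query on the collection.
_, for_student = handle("GET", "/enrollments", query={"student": "/students/1"})
```

Under this sketch, "list students enrolled in a class" is the same query with a different filter, so no extra procedure needs to be exposed.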

A third reason is that most of the popular API programming education, tools, frameworks, and examples illustrate the procedural style or a hybrid style—not a purely entity-oriented style. Staying true to the entity-oriented model requires a little more effort and mental acuity, but most programmers are neither lazy nor stupid; what is usually lacking is an understanding of why a little extra effort is worthwhile. In short, although it is not terribly hard, you have to have some vision as motivation to implement entity-oriented APIs.

The popularity of entity-oriented web APIs is increasing slowly. Some widely used APIs, like the Google Drive API and the GitHub API, are almost completely entity-oriented. Other API designers have come to understand that entity-oriented interfaces can be constructed for almost any problem domain. I believe the industry will continue to move in this direction.