8 more rules of designing program interfaces

Sergey Konstantinov
Jan 4, 2021

1. Use globally unique identifiers

It’s considered good form to use globally unique strings as entity identifiers, either semantic (e.g. “lungo” for beverage types) or random (e.g. UUID-4). This might turn out to be extremely useful if you need to merge data from several sources under a single identifier.

In general, we tend to advise using urn-like identifiers, e.g. urn:order:<uuid> (or just order:<uuid>). That helps a lot when dealing with legacy systems that attach different identifiers to the same entity. Namespaces in urns make it easy to tell which identifier is being used and whether it is being used in the wrong place.
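A minimal sketch of this idea, assuming Python and its standard uuid module. The helper names make_id and parse_id are hypothetical and only illustrate how a namespace makes a misplaced identifier easy to detect:

```python
import uuid


def make_id(namespace: str) -> str:
    """Build a urn-like identifier such as 'urn:order:<uuid>'."""
    return f"urn:{namespace}:{uuid.uuid4()}"


def parse_id(identifier: str, expected_namespace: str) -> uuid.UUID:
    """Return the UUID part; raise ValueError if the namespace does not match."""
    scheme, namespace, value = identifier.split(":", 2)
    if scheme != "urn" or namespace != expected_namespace:
        raise ValueError(
            f"expected urn:{expected_namespace}:<uuid>, got {identifier!r}"
        )
    return uuid.UUID(value)
```

Passing an order identifier where a user identifier is expected then fails immediately, instead of silently looking up the wrong entity.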

One important implication: never use increasing numbers as external identifiers. Apart from the reasons mentioned above, they allow anyone to count how many entities of each type there are in the system. Your competitors will be able to calculate the precise number of orders you have each day, for example.

NB: this book often uses short identifiers like “123” in code examples; that’s for the convenience of reading the book on small screens. Do not replicate this practice in a real-world API.

2. Clients must always know full system state

This rule could be reformulated as ‘don’t make clients guess’.

Bad:

// Creates an order and returns its id
POST /v1/orders
{ … }

{ "order_id" }
// Returns an order by its id
GET /v1/orders/{id}
// The order isn't confirmed
// and awaits checking
→ 404 Not Found

Though the operation appears to have succeeded, the client must store the order id and periodically check the GET /v1/orders/{id} state. This pattern is bad per se, but gets even worse when we consider two cases:

  • clients might lose the id if a system failure happens between sending the request and getting the response, or if the app data storage is damaged or cleared;
  • customers can’t use another device; in fact, the knowledge that orders have been created is bound to a specific user agent.

In both cases customers might consider the order creation failed and make a duplicate order, with all the consequences to be blamed on you.

Better:

// Creates an order and returns it
POST /v1/orders
{ <order parameters> }

{
"order_id",
// The order is created in explicit
// «checking» status
"status": "checking",

}
// Returns an order by its id
GET /v1/orders/{id}

{ "order_id", "status" … }
// Returns all customer's orders
// in all statuses
GET /v1/users/{id}/orders

3. Avoid double negations

Bad: "dont_call_me": false
People are bad at perceiving double negations and tend to make mistakes.

Better: "prohibit_calling": true or "avoid_calling": true
This is easier to read, though you shouldn't deceive yourself. Avoid semantic double negations, even if you've found a ‘negative’ word without a ‘negative’ prefix.

It is also worth mentioning that mistakes in applying de Morgan’s laws are even easier to make. For example, if you have two flags:

GET /coffee-machines/{id}/stocks

{
"has_beans": true,
"has_cup": true
}

The ‘coffee might be prepared’ condition would look like has_beans && has_cup: both flags must be true. However, if you provide the negations of both flags:

{
"beans_absence": false,
"cup_absence": false
}

Then developers will have to evaluate one of the equivalent conditions !beans_absence && !cup_absence or !(beans_absence || cup_absence), and in this transition people tend to make mistakes. Avoiding double negations helps little here, and regretfully only general advice can be given: avoid situations where developers have to evaluate such flags.
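To illustrate how easily the transition goes wrong, here is a small sketch in Python (the function names are hypothetical). It compares the correct negated condition with a typical slip and lists every flag combination where the two disagree:

```python
from itertools import product


def can_prepare(beans_absence: bool, cup_absence: bool) -> bool:
    # Correct: equivalent to (not beans_absence) and (not cup_absence)
    return not (beans_absence or cup_absence)


def can_prepare_wrong(beans_absence: bool, cup_absence: bool) -> bool:
    # Typical slip: the negation was moved outside the parentheses
    # without flipping "and" to "or"
    return not (beans_absence and cup_absence)


# All flag combinations where the two versions disagree
mismatches = [
    (beans_absence, cup_absence)
    for beans_absence, cup_absence in product([False, True], repeat=2)
    if can_prepare(beans_absence, cup_absence)
    != can_prepare_wrong(beans_absence, cup_absence)
]
```

The wrong version reports that coffee can be prepared whenever exactly one of the two ingredients is missing, which is precisely the kind of bug that passes a quick check where both flags have the same value.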

4. Avoid implicit type conversion

This advice is opposite to the previous one, ironically. When developing APIs you frequently need to add a new optional field with a non-empty default value. For example:

POST /v1/orders
{}

{
"contactless_delivery": true
}

The new contactless_delivery option isn't required, but its default value is true. A question arises: how should developers distinguish an explicit intention to turn the option off (false) from not knowing that it exists (the field isn't set)? They have to write something like:

if (Type(order.contactless_delivery) == 'Boolean' &&
order.contactless_delivery == false) { … }

This practice makes the code more complicated, and it’s quite easy to make mistakes that effectively treat the field in the opposite manner. The same can happen if special values (e.g. null or -1) are used to denote the absence of a value.

The universal rule for dealing with such situations is to make all new Boolean flags false by default.

Better:

POST /v1/orders
{}

{
"force_contact_delivery": false
}
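A short sketch of why the false-by-default flag is safer, assuming the order arrives as a plain Python dict (the function names are hypothetical). A simple truthiness check silently contradicts a true-by-default flag, but matches a false-by-default one:

```python
def contactless_naive(order: dict) -> bool:
    # Naive read of a true-by-default flag: a missing field yields False,
    # which contradicts the documented default of true
    return bool(order.get("contactless_delivery"))


def contact_required(order: dict) -> bool:
    # False-by-default flag: a missing field and an explicit false
    # mean the same thing, so no type check is needed
    return bool(order.get("force_contact_delivery"))
```

With the second naming, client code that has never heard of the field behaves exactly like client code that explicitly sends false.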

If a non-Boolean field with specially treated value absence is to be introduced, then introduce two fields.

Bad:

// Creates a user
POST /users
{ … }

// Users are created with a monthly
// spending limit set by default
{

"spending_monthly_limit_usd": "100"
}
// To cancel the limit null value is used
POST /users
{

"spending_monthly_limit_usd": null
}

Better:

POST /users
{
// true — user explicitly cancels
// monthly spending limit
// false — limit isn't canceled
// (default value)
"abolish_spending_limit": false,
// Non-required field
// Only present if the previous flag
// is set to false
"spending_monthly_limit_usd": "100",

}

NB: the contradiction with the previous rule lies in the necessity of introducing ‘negative’ flags (the ‘no limit’ flag), which we had to rename to abolish_spending_limit. Though it's a decent name for a negative flag, its semantics are still not obvious, and developers will have to read the docs. That's the way.
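A sketch of how a server might interpret the two fields, assuming Python and a default limit of 100 USD taken from the example above. The function name and the DEFAULT_LIMIT_USD constant are hypothetical:

```python
from decimal import Decimal
from typing import Optional

# Assumed default, mirroring the "100" used in the example above
DEFAULT_LIMIT_USD = "100"


def effective_limit(user: dict) -> Optional[Decimal]:
    """Return the monthly limit in USD, or None if the limit is abolished."""
    if user.get("abolish_spending_limit", False):
        return None
    # The limit field is optional; fall back to the default when it is absent
    return Decimal(user.get("spending_monthly_limit_usd", DEFAULT_LIMIT_USD))
```

Here a missing spending_monthly_limit_usd never has to carry the meaning ‘no limit’; that intention is only expressed through the explicit flag.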

5. Avoid partial updates

Bad:

// Return the order state
// by its id
GET /v1/orders/123

{
"order_id",
"delivery_address",
"client_phone_number",
"client_phone_number_ext",
"updated_at"
}
// Partially rewrites the order
PATCH /v1/orders/123
{ "delivery_address" }

{ "delivery_address" }

This approach is usually chosen to reduce request and response body sizes, and it also allows implementing collaborative editing cheaply. Both of these advantages are imaginary.

First, saving bytes on semantic data is seldom needed in modern apps. Network packet sizes (MTU, Maximum Transmission Unit) are more than a kilobyte right now; shortening responses is useless while they’re less than a kilobyte.

Excessive network traffic usually occurs if:

  • no data pagination is provided;
  • no limits on field values are set;
  • binary data is transmitted (graphics, audio, video, etc.)

Transferring only a subset of fields solves none of these problems; in the best case it just masks them. A more viable approach comprises:

  • making separate endpoints for ‘heavy’ data;
  • introducing pagination and field value length limits;
  • stopping saving bytes in all other cases.

Second, shortening response sizes will backfire precisely by spoiling collaborative editing: one client won’t see the changes the other client has made. Generally speaking, in 9 cases out of 10 it is better to return the full entity state from any modifying operation, sharing the format with the read access endpoint. Actually, you should always do this unless response size affects performance.

Third, this approach might work if you need to rewrite a field’s value. But how do you unset the field, i.e. return its value to the default state? For example, how do you remove client_phone_number_ext?

In such cases special values, like null, are often used. But as we discussed above, this is a defective practice. Another variant is prohibiting non-required fields, but that would pose considerable obstacles to expanding the API.

Better: one of the following two strategies might be used.

Option #1: splitting the endpoints. Editable fields are grouped and moved to separate endpoints. This approach also fits well with the decomposition principle we discussed in the previous chapter.

// Return the order state
// by its id
GET /v1/orders/123

{
"order_id",
"delivery_details": {
"address"
},
"client_details": {
"phone_number",
"phone_number_ext"
},
"updated_at"
}
// Fully rewrite order delivery options
PUT /v1/orders/123/delivery-details
{ "address" }
// Fully rewrite order customer data
PUT /v1/orders/123/client-details
{ "phone_number" }

Omitting phone_number_ext (the former client_phone_number_ext) in the PUT client-details request is sufficient to remove it. This approach also helps to separate constant and calculated fields (order_id and updated_at) from editable ones, thus getting rid of ambiguous situations (what happens if a client tries to rewrite the updated_at field?). You may also return the entire order entity from PUT endpoints (however, there should be some naming convention for that).

Option #2: design a format for atomic changes.

POST /v1/order/changes
X-Idempotency-Token: <token>
{
"changes": [{
"type": "set",
"field": "delivery_address",
"value": <new value>
}, {
"type": "unset",
"field": "client_phone_number_ext"
}]
}

This approach is much harder to implement, but it’s the only viable method to implement collaborative editing, since it explicitly reflects what a user was actually doing with the entity representation. With data exposed in such a format you might actually implement offline editing, when user changes are accumulated and then sent at once, while the server automatically resolves conflicts by ‘rebasing’ the changes.
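As a rough illustration of the change-list format, here is a sketch of a function that applies ‘set’ and ‘unset’ changes to an entity held as a Python dict. The function name is hypothetical, and conflict resolution (‘rebasing’) is deliberately left out:

```python
import copy


def apply_changes(entity: dict, changes: list) -> dict:
    """Return a new entity with all changes applied; the input is untouched."""
    result = copy.deepcopy(entity)
    for change in changes:
        kind, field = change["type"], change["field"]
        if kind == "set":
            result[field] = change["value"]
        elif kind == "unset":
            # Removing an absent field is treated as a no-op
            result.pop(field, None)
        else:
            raise ValueError(f"unknown change type: {kind!r}")
    return result
```

Note that unsetting a field needs no special value such as null: the intention is carried by the change type itself. Because the function works on a copy, an invalid change leaves the original entity unmodified.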

6. Avoid non-atomic operations

There is a common problem with implementing the changes list approach: what should be done if some changes were successfully applied while others were not? The rule is simple: if you can ensure atomicity (i.e. either apply all changes or none of them), do it.
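One simple way to get all-or-nothing behavior is to validate every change before applying any of them. The sketch below assumes recipes are stored as a Python dict of id to volume string and that a valid volume looks like "300ml"; both the storage model and the names are hypothetical:

```python
import re

# Assumed volume format: a non-negative integer followed by "ml"
VOLUME_RE = re.compile(r"^\d+ml$")


def apply_atomically(recipes: dict, changes: list) -> dict:
    """Apply all changes or none; return the new recipes mapping."""
    errors = []
    # Phase 1: validate everything without touching the data
    for change in changes:
        if change["id"] not in recipes:
            errors.append((change["id"], "recipe_unknown"))
        elif not VOLUME_RE.match(change["volume"]):
            errors.append((change["id"], "invalid_volume"))
    if errors:
        raise ValueError(errors)
    # Phase 2: apply to a copy, so the caller swaps the state in one step
    updated = dict(recipes)
    for change in changes:
        updated[change["id"]] = change["volume"]
    return updated
```

In a real system the same effect is usually achieved with a database transaction; the two-phase structure here only shows the principle.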

Bad:

// Returns a list of recipes
GET /v1/recipes

{
"recipes": [{
"id": "lungo",
"volume": "200ml"
}, {
"id": "latte",
"volume": "300ml"
}]
}
// Changes recipes' parameters
PATCH /v1/recipes
{
"changes": [{
"id": "lungo",
"volume": "300ml"
}, {
"id": "latte",
"volume": "-1ml"
}]
}
→ 400 Bad Request
// Re-reading the list
GET /v1/recipes

{
"recipes": [{
"id": "lungo",
// This value changed
"volume": "300ml"
}, {
"id": "latte",
// and this did not
"volume": "300ml"
}]
}

There is no way for the client to learn that the failed operation was actually partially applied. Even if there is an indication of this fact in the response, the client still cannot tell whether the lungo volume changed because of the request or because some other client changed it.

If you can’t guarantee the atomicity of an operation, you should describe in detail how to deal with it. There must be a separate status for each individual change.

Better:

PATCH /v1/recipes
{
"changes": [{
"recipe_id": "lungo",
"volume": "300ml"
}, {
"recipe_id": "latte",
"volume": "-1ml"
}]
}
// You may actually return
// a ‘partial success’ status
// if the protocol allows it
→ 200 OK
{
"changes": [{
"change_id",
"occurred_at",
"recipe_id": "lungo",
"status": "success"
}, {
"change_id",
"occurred_at",
"recipe_id": "latte",
"status": "fail",
"error"
}]
}

Here:

  • change_id is a unique identifier of each atomic change;
  • occurred_at is a moment of time when the change was actually applied;
  • error field contains the error data related to the specific change.

Might be of use:

  • introducing sequence_id parameters in the request to guarantee execution order and to align item order in the response with the requested one;
  • exposing a separate /changes-history endpoint for clients to get the history of applied changes even if the app crashed while getting a partial success response or there was a network timeout.

Non-atomic changes are undesirable because they erode the idempotency concept. Let’s take a look at the example:

PATCH /v1/recipes
{
"idempotency_token",
"changes": [{
"recipe_id": "lungo",
"volume": "300ml"
}, {
"recipe_id": "latte",
"volume": "400ml"
}]
}
→ 200 OK
{
"changes": [{

"status": "success"
}, {

"status": "fail",
"error": {
"reason": "too_many_requests"
}
}]
}

Imagine the client failed to get a response because of a network error, and it repeats the request:

PATCH /v1/recipes
{
"idempotency_token",
"changes": [{
"recipe_id": "lungo",
"volume": "300ml"
}, {
"recipe_id": "latte",
"volume": "400ml"
}]
}
→ 200 OK
{
"changes": [{

"status": "success"
}, {

"status": "success"
}]
}

To the client, everything looks normal: the changes were applied, and the last response received is always the actual one. But the resource state after the first request was inherently different from the resource state after the second one, which contradicts the very definition of ‘idempotency’.

It would be more correct if the server did nothing upon getting the second request with the same idempotency token and returned the same status breakdown. But this implies that the server must store these breakdowns.
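A minimal sketch of storing the breakdowns, assuming an in-memory dict keyed by the idempotency token. The class name and the apply_one callback are hypothetical; a production service would need persistent storage and an expiry policy:

```python
from typing import Callable


class IdempotentExecutor:
    """Runs a batch of changes once per idempotency token."""

    def __init__(self) -> None:
        # idempotency token -> stored per-change status breakdown
        self._stored: dict = {}

    def execute(
        self,
        token: str,
        changes: list,
        apply_one: Callable[[dict], dict],
    ) -> list:
        if token in self._stored:
            # Repeated request: do nothing, return the original breakdown
            return self._stored[token]
        breakdown = [apply_one(change) for change in changes]
        self._stored[token] = breakdown
        return breakdown
```

With this in place, the retried request from the example above would again report "fail" for the second change instead of quietly applying it.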

Just in case: nested operations must be idempotent themselves. If they are not, separate idempotency tokens must be generated for each nested operation.

7. Maintain a proper error sequence

First, always return unresolvable errors before the resolvable ones:

POST /v1/orders
{
"recipe": "lngo",
"offer"
}
→ 409 Conflict
{
"reason": "offer_expired"
}
// Request repeats
// with the renewed offer
POST /v1/orders
{
"recipe": "lngo",
"offer"
}
→ 400 Bad Request
{
"reason": "recipe_unknown"
}

— what was the point of renewing the offer if the order cannot be created anyway?
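In code, this ordering simply means that checks for unresolvable errors come first. A sketch in Python, with a hypothetical set of known recipes and a hypothetical offer_is_valid parameter standing in for the real offer check:

```python
from typing import Optional, Tuple

# Hypothetical recipe catalogue for the example
KNOWN_RECIPES = {"lungo", "latte"}


def validate_order(order: dict, offer_is_valid: bool) -> Optional[Tuple[int, str]]:
    """Return (status, reason) for the first error found, or None if valid."""
    # Unresolvable: retrying the same request cannot fix an unknown recipe
    if order["recipe"] not in KNOWN_RECIPES:
        return (400, "recipe_unknown")
    # Resolvable: the client can renew the offer and repeat the request
    if not offer_is_valid:
        return (409, "offer_expired")
    return None
```

With this order of checks, the misspelled "lngo" is rejected immediately, and the client never wastes a round trip renewing the offer.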

Second, maintain a sequence of unresolvable errors that causes the least irritation for customers and developers.

Bad:

POST /v1/orders
{
"items": [{ "item_id": "123", "price": "0.10" }]
}

→ 409 Conflict
{
"reason": "price_changed",
"details": [{ "item_id": "123", "actual_price": "0.20" }]
}
// Request repeats
// with an actual price
POST /v1/orders
{
"items": [{ "item_id": "123", "price": "0.20" }]
}

→ 409 Conflict
{
"reason": "order_limit_exceeded",
"localized_message": "Order limit exceeded"
}

What was the point of showing the price changed dialog if the user still can’t make an order, even if the price is right? When one of the concurrent orders finishes and the user is able to place another one, prices, item availability, and other order parameters will likely need another correction.

Third, draw a chart of which error resolutions might lead to the emergence of other errors. Otherwise you might eventually return the same error several times, or worse, make a cycle of errors.

// Create an order
// with a paid delivery
POST /v1/orders
{
"items": 3,
"item_price": "3000.00",
"currency_code": "MNT",
"delivery_fee": "1000.00",
"total": "10000.00"
}
→ 409 Conflict
// Error: if the order sum
// is more than 9000 tögrögs,
// delivery must be free
{
"reason": "delivery_is_free"
}
// Create an order
// with a free delivery
POST /v1/orders
{
"items": 3,
"item_price": "3000.00",
"currency_code": "MNT",
"delivery_fee": "0.00",
"total": "9000.00"
}
→ 409 Conflict
// Error: minimal order sum
// is 10000 tögrögs
{
"reason": "below_minimal_sum",
"currency_code": "MNT",
"minimal_sum": "10000.00"
}

You may note that in this setup the error can’t be resolved in one step: this situation must be worked through, and either the order calculation parameters must be changed (discounts should not be counted against the minimal order sum), or a special type of error must be introduced.
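The ‘chart’ of error resolutions can be checked mechanically. The sketch below assumes the chart is encoded as a Python dict mapping each error reason to the errors its resolution may trigger; this encoding and the function name are hypothetical. It uses a depth-first search to find a cycle:

```python
def find_cycle(resolutions: dict) -> list:
    """Return a list of error reasons forming a cycle, or [] if there is none."""
    visiting = []  # current depth-first path
    done = set()   # nodes already proven cycle-free

    def visit(node: str) -> list:
        if node in done:
            return []
        if node in visiting:
            # Found a back edge: slice out the loop and close it
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for nxt in resolutions.get(node, ()):
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return []

    for start in resolutions:
        cycle = visit(start)
        if cycle:
            return cycle
    return []
```

For the example above, a chart in which resolving delivery_is_free may lead to below_minimal_sum and vice versa is reported as a cycle, which signals that the rules themselves need to be changed.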

8. No results is a result

If a server processed a request correctly and no exceptional situation occurred, there must be no error. Regretfully, the antipattern of throwing errors when zero results are found is widespread.

Bad:

POST /search
{
"query": "lungo",
"location": <customer's location>
}
→ 404 Not Found
{
"localized_message":
"No one makes lungo nearby"
}

4xx statuses imply that a client made a mistake. But no mistakes were made by either a customer or a developer: a client cannot know whether the lungo is served in this location beforehand.

Better:

POST /search
{
"query": "lungo",
"location": <customer's location>
}
→ 200 OK
{
"results": []
}

This rule might be reduced to: if an array is the result of the operation, then the emptiness of that array is not a mistake, but a correct response. (Of course, if an empty array is semantically acceptable; an empty coordinates array is a mistake for sure.)
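On the server side, the rule amounts to not special-casing the empty list. A sketch with a hypothetical in-memory list of offers and a hypothetical search handler that returns a status code and a body:

```python
from typing import Tuple


def search(offers: list, query: str) -> Tuple[int, dict]:
    """Return (status, body); an empty result list is still a success."""
    results = [offer for offer in offers if offer["recipe"] == query]
    # No branch for "nothing found": 200 with an empty array is the answer
    return 200, {"results": results}
```

Clients can then iterate over results unconditionally, without a separate error-handling path for the ‘nothing nearby’ case.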

This is an addition to Chapter 11 of the book; the work continues on GitHub. I’d appreciate it if you shared it on Reddit, for I personally can’t do that.
