API Sorrows

Adam Nathaniel Davis - Jul 18 '20 - Dev Community

The longer I write code, the more it feels that my life can essentially be boiled down to the act of connecting APIs. If I'm not connecting to our own internal APIs, I'm fetching data from some sort of external/public API.

For the most part, this process seems to get progressively better every single year. I have horrible war stories to tell about SOAP and deep battle scars from XMLHttpRequest. But I survived those trials and the API world today feels so much more... accommodating.

But that doesn't mean that everything in API Land is Rainbows & Lollipops. There are still headaches lurking out there. And because this blog is my own, personal, unpaid, self-administered therapy, I'm going to spend a few minutes venting my spleen.

If you ever find yourself in a position to actually write endpoints, then I want you to listen up. Because I'm going to lay out a step-by-step guide whereby you can make all the consumers of those endpoints hate your guts.



Show Me The Dang Swagger

If you are publishing REST endpoints, I have one thing, and only one thing, to ask of you:

Where's the Swagger file???


If your answer starts to tail off into some diatribe about the detailed documentation site that your team spent six months building, Ima stop you right there and ask again:

Where's the Swagger file???


If you're bold enough to think that you can mollify me with some promise of Swagger docs that might be available after the endpoints are "officially" released, I'll make sure to show you the rising anger in my face before I say:

I'm sorry. Maybe we're speaking different languages. Because you definitely don't seem to be processing the words that are coming outta my mouth. So I'm going to ask one more time with all the civility I can muster: Where is the got-dang Swagger file???


In the last couple of years, I can't even tell you how many times I've been asked to integrate with some set of vendor/partner REST endpoints. And as soon as I ask, "Where's the Swagger file?" everyone looks at me like I showed up at a civil rights protest in blackface.

I understand that Swagger files are not the "end-all / be-all" of API documentation. But if you're publishing REST endpoints, they should be the starting point and the ending point of all documentation. If you wanna throw up some detailed "How to use our API" website... great! But don't you dare tell me that your DIY site is meant to be a replacement for good ol' fashioned Swagger files.

For REST endpoints, Swagger files are not a "nice to have". They are a basic requirement.
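For anyone who hasn't lived in one, a Swagger (OpenAPI) file is just a machine-readable contract for your endpoints. A minimal sketch of what one looks like (the endpoint, fields, and descriptions here are invented for illustration):

```yaml
openapi: "3.0.3"
info:
  title: Example Users API
  version: "1.0.0"
paths:
  /v1/users/{userId}:
    get:
      summary: Fetch a single user by GUID
      parameters:
        - name: userId
          in: path
          required: true
          schema:
            type: string
            format: uuid
      responses:
        "200":
          description: The matching user
          content:
            application/json:
              schema:
                type: object
                properties:
                  firstName:
                    type: string
                  lastName:
                    type: string
        "404":
          description: No user with that GUID
```

From a file like this, a consumer can generate client code, mock servers, and tests before your API is even deployed. That's why it's the starting point.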



Leave the Writing To Stephen King

As much as I cherish a good Swagger file, this can also lead to a terribly-false sense of security. (And epic headaches down-the-road.) Some of my most painful programming experiences in the last several years occurred when I was given a Swagger file - and the Swagger file was an aspirational work of fiction.

I'm frequently required to integrate with vendors/endpoints that haven't yet been deployed. And you know what?? That's OK. Or at least, it should be OK. Because, as long as I have a detailed Swagger file at my disposal, I can crank out vast amounts of functionality designed to interact with your future endpoints.

Here's where that "OK" has fallen apart - badly.

Some vendor/partner sends me a Swagger file and I proceed to write vast quantities of code designed to interact with that API. Then the API goes live, and... my code doesn't work. NONE of my code works.

My bosses, and my clients, and anyone else monitoring my work immediately believe that I've "dropped the ball" and I start scrambling to figure out what went wrong. That's when I realize that the Swagger files I've been given were as accurate as a Trump speech.

But how can this be? Shouldn't Swagger files be a real-time reflection of the actual code?? Well... they are - if your Swagger files are automatically generated.

In other words, I've had too many experiences where someone on the API team was literally writing the Swagger files, by hand, to reflect how their API would presumably behave. Of course, by the time that I could finally hit their endpoints, in real time, the Swagger files I'd been given (and against which I was coding) were a complete fiction. The behavior of the live endpoints bore no resemblance to the responses that were defined in the Swagger files.

This is the 2020s, people. No one should be manually writing API documentation anymore. There are plenty of packages out there that will document your endpoints dynamically, and in real time, based on the actual code that you've promoted to production.

I'd bet good money that the documentation someone manually wrote for your application five years ago is, today, practically useless. Similarly, the act of manually writing API documentation is a complete-and-utter waste of time.

Don't write API documentation. Generate it. Automatically. I fully understand that this can still cause last-minute issues if you make last-minute changes to your code. But at a minimum, I can feel secure in the knowledge that, anytime I hit your dynamic API documentation site, I'm at least seeing an accurate representation of how your endpoints behave at this moment in time. If you're manually writing your API documentation, it may as well be scribbled, on schoolyard pavement, in chalk for all I care.
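As a sketch of what "generate, don't write" means: derive the spec from the same data structure that registers your routes, so the docs physically cannot drift from the code. (The route table and helper below are simplified stand-ins — in real projects, tools like swagger-jsdoc, springdoc, or FastAPI's built-in docs do this from your actual route definitions.)

```javascript
// Hypothetical route table -- in a real app this would be your actual
// Express/Fastify/etc. route registrations, not a hand-maintained list.
const routes = [
  { method: "get", path: "/v1/users/{userId}", summary: "Fetch one user" },
  { method: "get", path: "/v1/users", summary: "Search users" },
];

// Build an OpenAPI "paths" object straight from the code that defines
// the routes. If a route changes, the generated docs change with it.
function buildOpenApiPaths(routeTable) {
  const paths = {};
  for (const { method, path, summary } of routeTable) {
    paths[path] = paths[path] || {};
    paths[path][method] = { summary };
  }
  return paths;
}

const spec = {
  openapi: "3.0.3",
  info: { title: "Example API", version: "1.0.0" },
  paths: buildOpenApiPaths(routes),
};

console.log(JSON.stringify(spec.paths, null, 2));
```

Serve `spec` from a `/docs` endpoint and it is, by construction, a snapshot of the routes that actually exist right now.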



Manufacturing Bottlenecks

Those who write APIs spend a lot of time thinking about issues surrounding performance and usage. After all, an API is, basically, an open invitation for the whole world to bombard your servers. So it makes sense to be hyper-vigilant, right???

Well...

I've seen too many scenarios where the API architects were content to create rote, mindless limitations. These limitations often make sense when you look at individual API calls - in a vacuum. But when you widen the lens - just a little bit - it quickly becomes apparent that these limitations are actually harmful.

Lemme give you a real-world example from Spotify's API. [Note: If you've noticed that I'm picking on Spotify a lot lately, it's only because I've been building some new tools around their API. So it brings all of their shortcomings into sharp focus in my mind.]

All of the Spotify endpoints have limits on the total number of records that can be returned. For example, if you want to retrieve all of the tracks in a given playlist, there's an endpoint for that. But... that endpoint will retrieve no more than 100 tracks at a time.

Maybe that makes sense to you. After all, you don't want someone launching a single API call that returns, say, 10,000 records, right?? Well... think about the alternative.

You see, most people who use Spotify frequently have their music sorted into playlists. And those playlists frequently hold well over 100 tracks each.

But if I'm writing a feature that's designed to interact with a given user's playlist, what are the odds that my feature will only need to grab the first 100 tracks from that playlist? Or the last 100 tracks? In fact, if I'm trying to do playlist management, what are the odds that any operation I execute can get by with knowing only 100 of that playlist's tracks??

The far more likely scenario is that, if I'm building functionality that's designed to help you manage a playlist, then I probably need to know about all the tracks in that playlist. If I need to know all the tracks in a 500-track playlist, and the endpoint will only ever let me return 100 tracks at a time, then my application will have no choice but to make five consecutive calls to the same endpoint - probably in fairly-rapid succession.

So in this scenario, is the 100-track limit doing anything to aid the performance of your server? If one request must be converted into five rapid-fire requests, are your record limits actually serving their purpose???
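To make the cost concrete, here's roughly the loop every consumer of a capped endpoint ends up writing. (The `fetchPage` callback stands in for the real HTTP round trip — e.g. Spotify's playlist-tracks endpoint, whose responses do carry `items` and `total` fields — so this is a sketch of the pattern, not Spotify's client library.)

```javascript
// Collect ALL records from a capped endpoint by looping over pages.
// fetchPage(offset, limit) stands in for one real HTTP round trip and
// must return { items: [...], total: <count of all records> }.
function fetchAllPages(fetchPage, limit = 100) {
  const all = [];
  let total = Infinity;
  while (all.length < total) {
    const page = fetchPage(all.length, limit);
    total = page.total;
    if (page.items.length === 0) break; // defensive: avoid an infinite loop
    all.push(...page.items);
  }
  return all;
}

// Simulate a 500-track playlist served 100 tracks at a time.
const tracks = Array.from({ length: 500 }, (_, i) => `track-${i}`);
let requests = 0;
const mockPage = (offset, limit) => {
  requests += 1;
  return { items: tracks.slice(offset, offset + limit), total: tracks.length };
};

const result = fetchAllPages(mockPage);
console.log(result.length, requests); // 500 tracks, 5 round trips
```

Five round trips to answer one logical question. The server did five times the HTTP handshaking, and the record limit "protected" it from nothing.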

To be clear, I understand that there are times when record limits make perfect sense. If, say, you have an API that allows people to search all songs recorded over the last century (which would be millions of records), then maybe it makes perfect sense that you would limit any particular search to a given number of records.

But when you're dealing with something like a user's playlist, you typically need to get all of the tracks in that playlist. And limiting the return set only forces the consumer to spawn more (expensive) round-trip calls to your endpoint.



Don't Be A REST Cultist

Look... I could write another long diatribe about what's great - and what sucks - about REST. And I assume that I probably will at some point in the future. But for now, suffice it to say that slavishly following every minute detail of the REST Purists' Bible can make life hell for your consumers.

A perfect example of this is what I call the REST 404 Paradox.

In theory, when a resource isn't available, you're supposed to return a 404. But, depending upon how you read the REST standards, this also means that you should return a 404 when a search returns no results. Quite frankly, that's annoying AF.

I thoroughly understand that this URL:

GET https://myapi.com/v1/users/e91781a4-21e7-427a-b970-d92fca15c556/

will return a 404 if there is no user with that GUID.

But this URL gets extremely confusing if you return a 404:

GET https://myapi.com/v1/users?state=FL&lastName=Davis

Does the 404 from this URL happen because there are no users, in Florida, with the last name of Davis? Or does the 404 happen because there is no endpoint at this address??? There's no way to be absolutely sure.
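One way out of the paradox — a sketch of the convention I'd argue for, not any official standard: reserve 404 for "this specific resource doesn't exist," and return 200 with an empty list when a search simply matched nothing.

```javascript
// Single-resource lookup: the URL names one specific thing, so "not
// found" genuinely means 404.
function getUserById(usersById, id) {
  const user = usersById[id];
  return user
    ? { status: 200, body: user }
    : { status: 404, body: { error: `no user ${id}` } };
}

// Search: the endpoint exists and answered successfully -- an empty
// result set IS the answer, so return 200 with [].
function searchUsers(users, { state, lastName }) {
  const matches = users.filter(
    (u) => u.state === state && u.lastName === lastName
  );
  return { status: 200, body: matches };
}

const users = [{ id: "1", state: "FL", lastName: "Davis" }];
const usersById = { "1": users[0] };

console.log(getUserById(usersById, "nope").status); // 404: resource missing
console.log(searchUsers(users, { state: "TX", lastName: "Davis" })); // 200 with []
```

With that split, a 404 from the search URL can only mean one thing: the endpoint itself doesn't exist. No more guessing.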

REST also becomes a nightmare when the designers are super-duper anal retentive about ensuring that every single entity can only be returned under its own endpoint.

For example, consider this possible return from the /v1/users endpoint:

{
  "user": {
    "firstName": "Adam",
    "lastName": "Davis",
    "addresses": [
      {
        "street": "101 Main Street",
        "city": "Palookaville",
        "state": "Idaho",
        "postalCode": 32211
      },
      {
        "street": "102 State Street",
        "city": "Mainville",
        "state": "Illinois",
        "postalCode": 42218
      },
      {
        "street": "103 Baluga Street",
        "city": "Fishville",
        "state": "Maine",
        "postalCode": 53319
      }
    ]
  }
}

There are plenty of "REST Acolytes" who would swear that this data model is "wrong" for a REST endpoint. They'll yell you down with the idea that the addresses should absolutely be their own endpoint.

And I'm not telling you that it's "wrong" to create a standalone address endpoint. But if the only reason you're creating that endpoint is to satisfy some nagging inner REST purity-check, then... you should think carefully about what you're doing.

Let me put this another way:

If the addresses only make sense in the context of their users, then creating a separate address endpoint may be an unnecessary headache - and it will force your users to make more calls against your endpoints and further stress your web servers.

Additionally, endpoint designers too often assume that data can only live under one endpoint. For example, in the data set shown above, they'll demand that addresses are only returned under a standalone address endpoint OR they are only returned under the users endpoint.

But life doesn't always have to be so rigid. There's nothing theoretically wrong with the idea that there may be an address endpoint and addresses may be returned whenever you query a particular user.
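One common compromise is to let the consumer decide whether nested entities come back embedded, via an `include`-style query parameter (a pattern several real APIs use, e.g. JSON:API-style expansion). A minimal sketch, with a hypothetical handler and in-memory data:

```javascript
// A user's core record and their addresses, kept in separate stores --
// the way they'd likely live in separate tables or services.
const usersById = {
  "42": { id: "42", firstName: "Adam", lastName: "Davis" },
};
const addressesByUserId = {
  "42": [{ street: "101 Main Street", city: "Palookaville", state: "Idaho" }],
};

// GET /v1/users/42                    -> just the user
// GET /v1/users/42?include=addresses  -> the user with addresses embedded
// The same addresses could ALSO be served from a standalone /v1/addresses
// endpoint -- nothing says the data can only live in one place.
function getUser(id, { include = [] } = {}) {
  const body = { ...usersById[id] };
  if (include.includes("addresses")) {
    body.addresses = addressesByUserId[id] || [];
  }
  return body;
}

console.log(getUser("42"));
console.log(getUser("42", { include: ["addresses"] }));
```

The purists get their standalone endpoint; the rest of us get the user's addresses in one round trip.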


Conclusion

APIs should be a service to your users. They should help technically savvy users to leverage and extend your functionality. If you make them jump through an inordinate number of hoops, you undermine the whole reason for the API to exist in the first place.
