Rate limiting works like a traffic light that regulates the flow of cars. It helps guarantee each client a fair share of resources and protects systems from abuse and overload. This article explains what rate limiting is, why it matters, the main strategies, and how to implement it.

APIs are one of the most effective ready-made building blocks for improving the performance and usability of your web apps. They bring advantages, but they also come with weaknesses. For instance, consider one user who sends around one hundred requests per second while other users send none. This can result in an unfair allocation of resources and can even bring your server to a halt under the load. To avoid this, you should build rate limiting into the systems you create.

What is Rate Limiting?

Rate limiting is the technique APIs use to limit the number of requests a client can make to the server within a given period. The limits define how much of an API a client can consume, which reduces the chance of a server being flooded with many requests at once. There are several types of rate limiting, each with its own advantages and disadvantages. The three most commonly used are IP-based, server-based, and geography-based rate limits. Let's take a closer look at them:

IP-Based

IP-based rate limiting restricts the number of requests an IP address can make: each address is allowed up to a certain number of requests in a given period. If an address exceeds that number within the time frame, its further requests are refused until the period ends. The benefit of IP-based rate limiting is that it is relatively easy to set up. The flip side is that a user who controls, say, 20 IP addresses can spread requests across all of them and still overload your server.

Server-Based

Server-based rate limiting controls the number of requests a server will handle within a certain time frame. Its direct advantage is that it prevents abuse and ensures fair usage of server resources.

Geography-Based

Geography-based rate limiting restricts the number of requests that can come from a geographical area within a given period. It helps protect against attacks from regions that send large volumes of malicious requests.

Each of the above has its advantages, but developers are advised to combine the types of rate limiting for the maximum effect.

Why is it Important to Implement Rate Limiting?

Aside from reducing server overload, rate limiting is important for several other reasons. It mitigates classic attacks, such as DDoS and brute force, and it reduces API misuse. Let's look at each of these in more detail:

API abuse: This occurs when too many requests are made to an API, either intentionally or by accident. Rate limiting combats this by capping the number of requests a particular client can make. That keeps resources shared sensibly among several clients rather than consumed by just one.

Prevention of DDoS attacks: This type of attack floods a system with requests until it fails or shuts down. Rate limiting makes it harder for an attacker to flood your server, because requests from an IP address are rejected once it reaches a specific limit.

Brute force attacks: These attacks use automated tools to guess sensitive information, such as passwords, by trial and error. Rate limiting lets you cap the number of login attempts allowed in a given period. This blunts brute force attacks because the attacker can no longer make many attempts in quick succession.
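The login-attempt cap described above can be sketched in a few lines of plain JavaScript. This is only an illustration under stated assumptions: the class name LoginAttemptLimiter, its methods, and the injectable clock are hypothetical and do not come from any library, and state is kept in memory for a single process.

```javascript
// Hypothetical sketch: cap failed login attempts per key (e.g. username or IP)
// within a time window. Not part of any library.
class LoginAttemptLimiter {
  constructor(maxFailures, windowMs, now = () => Date.now()) {
    this.maxFailures = maxFailures;
    this.windowMs = windowMs;
    this.now = now; // injectable clock, useful for testing
    this.failures = new Map(); // key -> { count, firstFailureAt }
  }

  // True while the key has used up its failed attempts for the current window.
  isBlocked(key) {
    const entry = this.failures.get(key);
    if (!entry) return false;
    if (this.now() - entry.firstFailureAt >= this.windowMs) {
      this.failures.delete(key); // window expired, forget old failures
      return false;
    }
    return entry.count >= this.maxFailures;
  }

  // Call after a failed login attempt.
  recordFailure(key) {
    const t = this.now();
    const entry = this.failures.get(key);
    if (!entry || t - entry.firstFailureAt >= this.windowMs) {
      this.failures.set(key, { count: 1, firstFailureAt: t });
    } else {
      entry.count += 1;
    }
  }

  // Call after a successful login to clear the failure history.
  recordSuccess(key) {
    this.failures.delete(key);
  }
}
```

A login handler would check isBlocked before verifying credentials, then call recordFailure or recordSuccess depending on the outcome.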

With the above, you should now appreciate that rate limiting is a critical tool that not only assists in managing your APIs but also safeguards your system from several attacks.

Strategies for Rate Limiting

Several strategies can be used to apply rate limiting, including the fixed window, sliding window, token bucket, and leaky bucket algorithms. Let's take a detailed look at each of them:

Fixed Window Algorithm

This strategy counts requests in fixed time windows. For example, the API limit could be set at 50 requests per minute; once a client has used up all 50 requests within the current window, they must wait until the next window opens to make another request.
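As a rough illustration, a fixed window counter might look like the following in plain JavaScript. The class name FixedWindowLimiter, the allow method, and the injectable clock are assumptions made for this sketch, and the counters live in memory for a single process.

```javascript
// Hypothetical sketch of a fixed-window counter. Not part of any library.
class FixedWindowLimiter {
  constructor(limit, windowMs, now = () => Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now; // injectable clock, useful for testing
    this.counters = new Map(); // clientId -> { windowStart, count }
  }

  // Returns true if the request is allowed, false if the client is over the limit.
  allow(clientId) {
    // Windows are aligned to multiples of windowMs.
    const windowStart = Math.floor(this.now() / this.windowMs) * this.windowMs;
    let entry = this.counters.get(clientId);
    if (!entry || entry.windowStart !== windowStart) {
      // A new window has opened for this client: reset the counter.
      entry = { windowStart, count: 0 };
      this.counters.set(clientId, entry);
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}

// Example from the text: 50 requests per minute.
const perMinute = new FixedWindowLimiter(50, 60 * 1000);
console.log(perMinute.allow('client-1')); // true
```

One known trade-off of this approach is that a client can send a full quota at the end of one window and another full quota at the start of the next.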

Sliding Window Algorithm

This strategy tracks the requests made within a rolling time frame, measured in seconds or minutes, that moves forward with time. It is more flexible than a fixed window because it only counts the requests made in the most recent period. As a result, clients cannot squeeze extra requests in around a window boundary; they are denied only while the most recent period is already full.
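One common way to realize a sliding window is to keep a log of recent request timestamps per client. The sketch below assumes that variant; the names SlidingWindowLimiter and allow, and the injectable clock, are hypothetical and chosen only for illustration.

```javascript
// Hypothetical sketch of a sliding-window log. Not part of any library.
class SlidingWindowLimiter {
  constructor(limit, windowMs, now = () => Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now; // injectable clock, useful for testing
    this.log = new Map(); // clientId -> request timestamps, oldest first
  }

  // Returns true if the request is allowed, false otherwise.
  allow(clientId) {
    const t = this.now();
    const cutoff = t - this.windowMs;
    const timestamps = this.log.get(clientId) || [];
    // Drop timestamps that have slid out of the window.
    while (timestamps.length > 0 && timestamps[0] <= cutoff) {
      timestamps.shift();
    }
    const allowed = timestamps.length < this.limit;
    if (allowed) timestamps.push(t); // only accepted requests are counted
    this.log.set(clientId, timestamps);
    return allowed;
  }
}
```

Storing one timestamp per accepted request is precise but costs memory proportional to the limit, which is why some systems approximate the window with counters instead.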

Token Bucket Algorithm

In this strategy, each client has a bucket of tokens. The bucket has a maximum capacity, which caps the number of requests a client can send in a burst, and it is refilled with tokens at a steady rate. Every time a client makes a request, one token is taken from the bucket; when all the tokens are used, the client's requests are rejected until the bucket is refilled with more tokens.
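A minimal sketch of a token bucket for a single client could look like this. The class name TokenBucket, the method tryRemoveToken, the continuous refill based on elapsed time, and the injectable clock are all assumptions made for the example rather than a reference implementation.

```javascript
// Hypothetical sketch of a token bucket for one client. Not part of any library.
class TokenBucket {
  constructor(capacity, refillPerSecond, now = () => Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now; // injectable clock, useful for testing
    this.tokens = capacity; // the bucket starts full
    this.lastRefill = now();
  }

  // Add the tokens earned since the last call, never exceeding capacity.
  refill() {
    const t = this.now();
    const elapsedSeconds = (t - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond
    );
    this.lastRefill = t;
  }

  // Spend one token for a request; returns false when the bucket is empty.
  tryRemoveToken() {
    this.refill();
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

In a real service you would keep one bucket per client, for example in a Map keyed by client ID. The capacity controls how large a burst is tolerated, while the refill rate sets the sustained request rate.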

Leaky Bucket Algorithm

In this strategy, incoming requests fill a bucket like water, and the bucket drains (leaks) at a constant rate as requests are processed. The bucket stores the requests from clients, and if it becomes full, new requests are rejected until enough has drained.
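The sketch below illustrates the idea with a simplification: instead of queueing the requests themselves, it only tracks the bucket's fill level and rejects requests that would overflow it. The names LeakyBucket and tryAdd, and the injectable clock, are hypothetical.

```javascript
// Hypothetical sketch of a leaky bucket used as a meter: it tracks the fill
// level only and does not queue requests. Not part of any library.
class LeakyBucket {
  constructor(capacity, leakPerSecond, now = () => Date.now()) {
    this.capacity = capacity;
    this.leakPerSecond = leakPerSecond;
    this.now = now; // injectable clock, useful for testing
    this.level = 0; // the bucket starts empty
    this.lastLeak = now();
  }

  // Drain the bucket according to the time elapsed since the last call.
  leak() {
    const t = this.now();
    const elapsedSeconds = (t - this.lastLeak) / 1000;
    this.level = Math.max(0, this.level - elapsedSeconds * this.leakPerSecond);
    this.lastLeak = t;
  }

  // Try to add one request; returns false if the bucket would overflow.
  tryAdd() {
    this.leak();
    if (this.level + 1 <= this.capacity) {
      this.level += 1;
      return true;
    }
    return false;
  }
}
```

A queue-based variant would hold accepted requests and release them at the leak rate, which smooths traffic at the cost of added latency.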

When choosing a strategy, it is wise to look at the requirements and characteristics of the system where rate limiting will be applied. A clear understanding of your system and of the strategies above helps you strike the right balance between protection and the system's responsiveness.

Tools and Services for Rate Limiting

Some tools include rate-limiting functionality to simplify the implementation process. Cloud platforms such as AWS, Azure, and Google Cloud offer built-in rate-limiting features as part of their services. Content Delivery Network (CDN) providers also play a role by providing rate limiting, performance enhancements, and security for your applications. By using these tools, you can effectively manage your APIs, prevent abuse, and ensure fair distribution of resources among users.

Most cloud services come with built-in rate-limiting features that are straightforward to integrate. While these features help manage and optimize your APIs, they have a drawback: limited flexibility and customization. Their effectiveness largely depends on the configurations preset by the cloud service, which limits your ability to make adjustments beyond what is provided.

External tools, by contrast, give you a more flexible and open platform for regulating request rates in your application. Many external tools are available, and NGINX is among the most widely used. Some service platforms, such as Akamai and Cloudflare, also offer more detailed rate-limiting configuration than the built-in features described above. The main disadvantage of external tools is that they can be challenging and time-consuming to set up. Aside from this, external implementation comes with benefits such as:

Improving efficiency and capacity when handling large volumes of traffic.
Providing security and monitoring tools.
Giving developers the flexibility to work across several ecosystems of their choice, such as the cloud provider ecosystems.

Once you select the right tool and apply the most suitable strategy, you should be in a good position to implement rate limiting in your web apps or systems as a means of defense.

Implementing Rate Limiting

Choosing the right rate-limiting strategy is critical if you want to share resources fairly while keeping the system performing well. When choosing a strategy, keep in mind your traffic pattern, the nature of your API, and the constraints of your infrastructure.

Middleware libraries can help restrict the rate at which requests are made, and a commonly used one for Express.js (on Node.js) is express-rate-limit. You can check out the express-rate-limit GitHub repository for further documentation. This library lets you customize rate limits, control how exceeded limits are handled, and set different options for different endpoints. Let's use Express.js to implement a simple rate limit with the express-rate-limit middleware:

Installation

The first step is to install Express.js and the express-rate-limit middleware with the Node Package Manager (npm):

npm install express express-rate-limit

Creating a Simple Application

Next, create a new file (e.g., app.js) and write some code to define your API. For example:

const express = require('express');
const app = express();
const port = 3000;

app.get('/', (req, res) => {
 res.send('Welcome to the API!');
});

app.listen(port, () => {
 console.log(`API server running at http://localhost:${port}`);
});

The above code creates the Express application, defines the root route (/) as the only route, which returns the welcome message, and starts the server on port 3000.

Apply Rate-limiting Middleware

Next, configure and apply express-rate-limit to put the rate limits in place. You will need to configure a few options, such as the time window and the number of requests to allow. For example:

const rateLimit = require('express-rate-limit');

// Define rate limiting rules
const limiter = rateLimit({
 windowMs: 15 * 60 * 1000, // 15 minutes
 max: 100, // Limit each IP to 100 requests per windowMs
 message: 'Too many requests from this IP, please try again after 15 minutes'
});

// Apply rate limiting to all requests
// (register this before the route handlers it should cover)
app.use(limiter);

The above defines a rate-limiting rule that allows each IP address one hundred requests within fifteen minutes. If a client exceeds the limit, the middleware returns status 429 (Too Many Requests) along with your custom message.

Start the Server

Finally, start the server so it can enforce the rate limit:

node app.js

While the application is running, the middleware inspects every request. When a specific IP address makes more than 100 requests in 15 minutes, the API responds with status code 429 and the message "Too many requests from this IP, please try again after 15 minutes".

Selective Rate Limiting

In your API, there are cases where you want to rate-limit only specific routes and leave the rest unrestricted. For example, you may want to restrict only the routes that expose sensitive information or consume a lot of resources. In Express.js, you can do this by applying the rate-limiting middleware to a particular route. Here's a code example:

// Apply rate limiting only to the login route
app.use('/login', limiter);

// No rate limiting on this route
app.get('/public-info', (req, res) => {
    res.send('This route is not rate limited.');
});

In the above example, the rate-limiting middleware (limiter) is applied only to the /login route to prevent abuse. Other routes, such as /public-info, have no limits and can be accessed freely. This approach lets you protect the more vulnerable endpoints without restricting the other routes.

Conclusion

Throughout this article, we have explained what rate limiting is, why it is important, and a few things to remember while implementing it. Select the strategy that best suits your API and the rest of your infrastructure. Choosing the proper tool together with the right strategy is key to implementing rate limiting in your APIs and getting the maximum benefit from it.

Implementing rate limiting to protect APIs from abuse