Integrating with rate-limited APIs

Melia Capital
4 min read · Feb 12, 2021

Web services often enforce rate limits on requests to their APIs. Let’s talk about a few use cases for API rate limits.

Cloud providers like AWS (Amazon Web Services) and GCP (Google Cloud Platform) often bill for API usage and enforce request quotas based on your service plan.

When building a web service, depending on what types of resources you serve, you may want to ensure that your API provides fair access to your clients.

From a security perspective, rate limiting your service ensures that no single client can spam your endpoints and monopolize access to your resources. The rate limiting logic may track each client’s IP address; when the frequency of requests from that address exceeds a threshold, subsequent requests are denied for some period of time.
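To make the IP tracking idea concrete, here is a minimal sketch of a fixed-window counter, which is just one common way to implement a threshold like this. The module name `IpLimiterSketch`, the limit of 3 requests, and the 1 second window are all assumptions picked for illustration, not values from any particular service.

```elixir
defmodule IpLimiterSketch do
  # Hypothetical limits, chosen only for illustration.
  @max_requests 3
  @window_ms 1_000

  # `counters` maps an IP string to {window_start_ms, request_count}.
  # Returns {:allow | :deny, updated_counters}.
  def check(counters, ip, now_ms) do
    case Map.get(counters, ip) do
      {start, count} when now_ms - start < @window_ms ->
        # Still inside the current window for this IP.
        if count < @max_requests do
          {:allow, Map.put(counters, ip, {start, count + 1})}
        else
          {:deny, counters}
        end

      _ ->
        # First request from this IP, or the previous window has expired.
        {:allow, Map.put(counters, ip, {now_ms, 1})}
    end
  end
end
```

The function is pure: the caller passes in the current time and the counters map, so the same logic could live behind whatever process or store holds the state.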

Malicious users may attempt a denial of service (DoS) attack on your servers. Against a single attacker, the IP tracking strategy above is a reasonable defense. To circumvent it, attackers may orchestrate a distributed group of machines to spam your servers, which is called a distributed denial of service attack, or DDoS for short. Rather than build your own defenses against that, you can use a service like Cloudflare.

I recently ran into an issue with my program hitting the rate limit on the public SEC EDGAR filings API. Here’s the notice from https://www.sec.gov/developer:

Fair Access

To ensure that everyone has equitable access to SEC EDGAR content, please use efficient scripting, downloading only what you need and please moderate requests to minimize server load. Current guidelines limit each user to a total of no more than 10 requests per second, regardless of the number of machines used to submit requests.

To ensure that SEC.gov remains available to all users, we reserve the right to block IP addresses that submit excessive requests. See the SEC’s Web Site Privacy and Security Policy.

Here’s the function I wrote:

import_filing_details

I take all the filings from my context and filter the list to just the filings between the dates I’m interested in. Then, in a loop, I fetch the details and save them.
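The original listing is not reproduced here, so the following is only a rough sketch of a loop with that shape. It assumes each filing is a map with a `:date` field, and it takes the fetch and save steps as function arguments; `ImportSketch`, `fetch`, and `save` are hypothetical names standing in for the real SEC client call and the persistence step.

```elixir
defmodule ImportSketch do
  # `fetch` and `save` are injected stand-ins (assumptions) for the real
  # "fetch filing detail" call and the code that stores the result.
  def import_filing_details(filings, from, to, fetch, save) do
    filings
    |> Enum.filter(fn filing ->
      # Keep filings dated within [from, to], inclusive on both ends.
      Date.compare(filing.date, from) != :lt and
        Date.compare(filing.date, to) != :gt
    end)
    |> Enum.each(fn filing ->
      # One request per filing, issued sequentially.
      save.(fetch.(filing))
    end)
  end
end
```

Because the loop issues one request after another with nothing in between, its speed is bounded only by how fast each fetch returns.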

Initially, my Sec module was stateless and fetch_filing_detail had no rate limiting logic, so the SEC servers quickly began to deny my requests.

I decided to make my Sec module a stateful gen server responsible for limiting the rate of requests made to the SEC API.

I decided that a synchronous server would be sufficient because this module is only used by internal processes and does not receive a large amount of traffic. I’d rely on the gen server mailbox to queue requests, and the synchronous handler to apply back pressure to the clients.

Here’s what the Sec module looks like:

Sec module

The only state I needed to keep track of is the time the last request was made. I use UTC and initialize the server state with the current time. I define a rate limit of 100 ms, the minimum amount of time I must wait between requests; this follows from the SEC guideline of at most 10 requests per second (1,000 ms / 10 = 100 ms).

fetch_filing_detail

I provide clients with an API to call the server process. This is standard gen server practice.

handle_call

In my server handler, I call rate_limit with two arguments: the function I’d like to execute and the last request time, taken from the server state. Once the request is made, I pass back the result and update the server state with the current time.
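As a hedged sketch of that call flow (not the original Sec module), a GenServer along these lines would keep the last request time as its state, expose a synchronous client function, and refresh the state after each request. `SecSketch` and `do_fetch/1` are hypothetical names, and `rate_limit/2` is deliberately a placeholder that just runs the function, so that the focus stays on the handler.

```elixir
defmodule SecSketch do
  use GenServer

  # Client API

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, opts)
  end

  def fetch_filing_detail(server, filing_id) do
    # Synchronous call: the caller blocks until the server replies.
    GenServer.call(server, {:fetch_filing_detail, filing_id})
  end

  # Server callbacks

  @impl true
  def init(:ok) do
    # The only state is the UTC time of the last request.
    {:ok, DateTime.utc_now()}
  end

  @impl true
  def handle_call({:fetch_filing_detail, filing_id}, _from, last_request_at) do
    result = rate_limit(fn -> do_fetch(filing_id) end, last_request_at)
    # Reply with the result and record "now" as the new last request time.
    {:reply, result, DateTime.utc_now()}
  end

  # Placeholder (assumption): runs the function immediately, with no delay.
  defp rate_limit(fun, _last_request_at), do: fun.()

  # Hypothetical stand-in for the real HTTP request to the SEC API.
  defp do_fetch(filing_id), do: {:ok, %{id: filing_id}}
end
```

One thing to keep in mind with this design: `GenServer.call/3` has a default timeout of 5,000 ms, so a caller stuck behind a long queue of delayed requests can time out unless a longer timeout is passed.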

Here’s how I broke down the rate limiting problem. I use the current time to calculate a threshold: the moment 100 ms ago. Compared to that threshold, the last request time can only be in one of three states: less than (earlier), equal, or greater than (more recent). Only when the last request time is greater than the threshold, meaning the last request happened less than 100 ms ago, do I need to postpone the execution of the request.
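The three cases map directly onto the result of `DateTime.compare/2`. The sketch below is an illustration under assumed names (`ThresholdSketch`, `decide/2`, and the `:go` / `:wait` return values are all hypothetical); it takes the current time as an argument so the comparison is easy to check by hand.

```elixir
defmodule ThresholdSketch do
  @rate_limit_ms 100

  # Returns :wait when the last request is more recent than the threshold
  # (now minus 100 ms), and :go otherwise.
  def decide(last_request_at, now) do
    threshold = DateTime.add(now, -1 * @rate_limit_ms, :millisecond)

    case DateTime.compare(last_request_at, threshold) do
      :lt -> :go    # last request is older than 100 ms
      :eq -> :go    # exactly 100 ms have passed
      :gt -> :wait  # last request was less than 100 ms ago
    end
  end
end
```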

rate limiting problem break down

Here’s what the implementation of my rate_limit function looks like:

rate_limit

When the last request time falls within the rate limiting threshold, I calculate how much time the function call should be delayed. I sleep the process and execute the function.
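A minimal sketch of that delay calculation, assuming the 100 ms limit from earlier, could look like this. It is not the original function; `RateLimitSketch` and `delay_ms/2` are hypothetical names, and the delay is split into its own pure function so it can be checked without actually sleeping.

```elixir
defmodule RateLimitSketch do
  @rate_limit_ms 100

  # Milliseconds still to wait before the next request may go out.
  # Returns 0 once at least 100 ms have elapsed since the last request.
  def delay_ms(last_request_at, now) do
    elapsed = DateTime.diff(now, last_request_at, :millisecond)
    max(@rate_limit_ms - elapsed, 0)
  end

  # Sleeps for the remaining delay (if any), then runs `fun`.
  def rate_limit(fun, last_request_at) do
    ms = delay_ms(last_request_at, DateTime.utc_now())
    if ms > 0, do: Process.sleep(ms)
    fun.()
  end
end
```

For example, if the last request went out 30 ms ago, `delay_ms/2` returns 70, so the process sleeps for 70 ms before making the next request.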

Cheers and happy coding!


Melia Capital

I’m on a mission to make finance and technology more accessible. I’m a computer programmer, engineer, data and distributed systems enthusiast.