If you’ve ever implemented a public facing API, you will know how important rate limiting is. If you don’t someone, somewhere will abuse your API’s sending millions of requests a second, either because they want to pull your your data (just ask FFS), because they want to take your service down or just because they messed up the coding and put the request in an infinite loop (… no… I’ve never done that… honest… Sorry @Jack). Whatever the reason, such a huge volume of requests will, like the “Tragedy of the commons“, cause at least a reduced service for other users and increased costs for you and, if left unchecked, eventually cause your service to start falling over.
The answer to this is of course to limit the number of requests a user can make. Their are two main approaches to this, limit by IP address or limit by account, for the purposes of this post, it doesn’t matter too much which you choose, their are good arguments for either (or both) schema’s, the one you decided to implement will be dependent on your use, threat model and willingness to manage accounts.
Regardless of which approach you choose the logic you need to follow is the same. For the purposes of the examples below I’ll assume we’re managing by a key and allowing no more than 100 requests every 10 minutes.
- When a request is made, and before you perform whatever function the API performs, log the key/IP address in a table. The details you store here should be just the key/IP address, the endpoint being accessed and a DateTime stamp of the transaction
- Perform an SQL query something like SELECT count(*) FROM table WHERE key=@key AND timestamp < DATE_ADD(CURRENT_TIMESTAMP(), INTERVAL -10 MINUTE);
- This will return the total number of requests in the last 10 minutes, if that’s less then 101 (101 because we’ve already logged the current request) then we can allow the call otherwise we return a failed message.
The problem with this is that this table will grow huge in a very short amount of time, therefore we should add an event to reduce the size of this table every hour or so (of courses if you want to the know exactly who has accessed what and when, you could you this as as way to log that information).
A second problem arises with this if we want to allow different limits for different users. For example you may want to allow FaceBook to make slightly more calls to your API’s then you would allow this blog to make. This can easily be implemented with a step 0.
- Validate the users key and retrieve their settings.
This allows you to revoke keys, set limits at the key level and do fancy things like return different data for different users (who pay different amounts) based on the program they have signed up to.