At Rebased we have an internal time tracking project called Harmonogram. Its backend is built in Ruby on Rails. It started as a playground for new hires to get used to company culture and conventions, but over time grew into a fully functional tool. When I joined, the application was already being used internally and we were hitting interesting performance problems. One of them was reaching the rate limit of Toggl API calls. And since this was my first week at the company, I was tasked with solving this issue.

The problem

The Toggl documentation recommended sending requests at most once per second in order to avoid hitting the limit – which we were already doing in a naive way, so the solution wouldn’t be as easy as adding sleep 1 to every request.

page = 1
loop do
  response = client.get(payload.merge(page: page)) # fetch the current page of the report
  report_data.concat(response['data'])
  page += 1

  break if report_data.size >= response['total_count']
  sleep 1 # not enough :(
end

After looking around, I realised the code above was called from many places and processes: the web server, manual Rake tasks, background jobs and cron. This made the sleep 1 solution ineffective. I needed something that would synchronise all of these callers in a way that:

  • doesn’t introduce too many changes to the existing code,
  • doesn’t add any new technologies to the current stack,
  • guarantees that there’s at least a 1 second delay between calls.

In order to pull this off I’d need to set up a semaphore shared by multiple Ruby processes – and a mechanism that would block API calls until at least a second has passed since the last one. I needed a place to store and share information about those calls. Redis seemed like a perfect fit, mainly because it lets us set keys that expire after a given number of milliseconds – we could store a key that expires after one second and have other callers wait until it does. Luckily we were already using Redis for background jobs, so I wouldn’t be adding any new dependencies.
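The core primitive looks roughly like this – a quick sketch using the redis-rb gem (the key name is made up):

require 'redis'

redis = Redis.new

# nx: only set the key if it doesn't exist yet; px: expire it after 1000 milliseconds.
redis.set('example_lock', 'locked', px: 1000, nx: true) # => true, slot claimed
redis.set('example_lock', 'locked', px: 1000, nx: true) # => false, slot already taken
sleep 1.1 # wait for the key to expire
redis.set('example_lock', 'locked', px: 1000, nx: true) # => true, claimed again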

The solution

To keep the changes to a minimum I decided to create a new class. It should allow us to execute a block of code with a globally limited rate, without having to know how that happens. Ideally we’d initialize that “limiter” and inject it where needed, keeping the rate limiting separate from other logic.

page = 1
loop do
  # now using rate limiter instead of sleeping
  response = rate_limiter.with_limited_rate { client.get(payload.merge(page: page)) }
  report_data.concat(response['data'])
  page += 1

  break if report_data.size >= response['total_count']
end

If we assume that the rate limiter works as needed, this should solve our problem. But something doesn’t look right: rate limiting should be the client’s responsibility. The client is there to shield us from the low-level details of communicating with the API – and that includes rate limiting.

class Client
  def initialize(token, rate_limiter: TogglApi::RateLimiter.new)
    # ...
    @rate_limiter = rate_limiter
  end

  def get(path)
    response = rate_limiter.with_limited_rate { @client.get(path) }
    Response.parse(response)
  end

  # ...

  private

  attr_reader :rate_limiter
end

This looks better. Now the API client deals with rate limiting and we can simplify our example from earlier:

def some_method
  page = 1
  loop do
    # no need to worry about rate limiting now
    response = client.get(payload.merge(page: page))
    report_data.concat(response['data'])
    page += 1

    break if report_data.size >= response['total_count']
  end
end

def client
  @client ||= Client.new(toggl_api_token, rate_limiter: RateLimiter.new)
end

The rate limiter

Now that we have an idea of what the limiter interface should look like, we can talk about the implementation details.

class RateLimiter
  TimedOut = ::Class.new(::StandardError)

  REDIS_KEY = "harmonogram_#{Rails.env}_rate_limiter_lock".freeze

  def initialize(redis = Redis.current)
    @redis = redis
    @interval = 1 # seconds between subsequent calls
    @timeout = 15 # amount of time to wait for a time slot
  end

  def with_limited_rate
    started_at = Time.now
    retries = 0

    until claim_time_slot!
      if Time.now - timeout > started_at
        raise TimedOut, "Started at: #{started_at}, timeout: #{timeout}, retries: #{retries}"
      end

      sleep seconds_until_next_slot(retries += 1)
    end

    yield
  end

  private

  attr_reader :redis, :interval, :timeout

The main element is the with_limited_rate method, which ends with a yield call. It calls the private claim_time_slot! in a loop until it either succeeds or runs out of time. We give it a limited amount of time because we don’t want it to hang forever, causing timeouts in other places. In case of a timeout we raise a custom error with data for debugging. Inside the loop, there’s a sleep call with a calculated delay in seconds.
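Callers that can afford to skip a call can rescue that error – a hypothetical example:

begin
  rate_limiter.with_limited_rate { client.get(path) }
rescue RateLimiter::TimedOut => e
  # No free slot appeared within the 15 second timeout - log and re-raise.
  Rails.logger.warn("Toggl rate limiter timed out: #{e.message}")
  raise
end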

  def claim_time_slot!
    redis.set(REDIS_KEY, 'locked', px: (interval * 1000).round, nx: true)
  end

The claim_time_slot! method is straightforward. It calls the Redis instance to set the value 'locked' on the REDIS_KEY key. The value will expire after px milliseconds, and nx: true means it will only be set if the key doesn’t exist yet. The return value of redis.set is truthy when the key was successfully created and falsy otherwise. In other words, if no other instance of the rate limiter called redis.set in the last second, then claim_time_slot! will return true and the block passed to with_limited_rate will be called.

  def seconds_until_next_slot(retries)
    ttl = redis.pttl(REDIS_KEY)
    ttl = ttl.negative? ? interval * 1000 : ttl
    ttl += calculate_next_slot_offset(retries)
    ttl / 1000.0
  end

  # Calculates an offset between 10ms and 50ms to avoid hitting the key right before it expires.
  # As the number of retries grows, the offset gets smaller to prioritize earlier requests.
  def calculate_next_slot_offset(retries)
    [10, 50 - [retries, 50].min].max
  end
end

The seconds_until_next_slot method is more interesting. It uses the pttl method, which returns the current TTL (time to live) of a given key in milliseconds. At this point we know that another instance of the limiter has claimed a time slot, and we want to figure out how long we have to wait. It is possible, though, that the key no longer exists because it has just expired. In that situation the returned value is negative and we replace it with a full interval, to avoid unexpected race conditions or going through retries without waiting. Then we add a small offset to the TTL, convert it to seconds and return the calculated value to be used in sleep.

Why not ask for seconds in the first place if we’re converting them anyway? Asking Redis for the TTL in seconds returns a rounded value. By asking for milliseconds we can convert them to fractions of a second.
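For comparison, here’s roughly what the two commands return for a key with a one-second TTL (the values are illustrative):

redis.set('example_lock', 'locked', px: 1000, nx: true)
redis.ttl('example_lock')           # => 1 (whole seconds, rounded)
redis.pttl('example_lock')          # => 997 (milliseconds)
redis.pttl('example_lock') / 1000.0 # => 0.997 (usable as a fractional sleep)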

What’s the point of calculate_next_slot_offset(retries)? This is a trick that will allow us to prioritize calls that have been waiting for a slot longer. A kind of scheduler, if you will.

calculate_next_slot_offset(0) # => 50
calculate_next_slot_offset(5) # => 45
calculate_next_slot_offset(10) # => 40
calculate_next_slot_offset(15) # => 35
calculate_next_slot_offset(50) # => 10
calculate_next_slot_offset(100) # => 10

Calls with a higher retry count will get a smaller offset and therefore have a better chance of “claiming” a time slot before others. The offset shrinks by 1ms per retry, and retries happen roughly once per second, so it takes about 40 retries to fall from the 50ms maximum to the 10ms minimum – given a 1s interval, this lets us prioritize callers with timeouts of up to ~40s.

The 10ms minimum is there to make sure we don’t wake up just before the key expires and miss a slot that’s about to become free.

Closing thoughts

With the rate limiter implemented and the client configured to use it, we can stop worrying about receiving errors from the Toggl API. But it’s not ready for release just yet. We still have to write unit tests and make sure they don’t take too long to run. Stay tuned for part 2, where we’ll do just that.

Meanwhile – a working demo of the rate limiter!
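Something along these lines should work – a minimal sketch, assuming the RateLimiter class above (with the Rails.env reference removed, since this runs outside Rails) and a Redis server on localhost:

require 'redis'

limiter = RateLimiter.new(Redis.new)

# Five concurrent callers compete for time slots;
# the lines should print roughly one second apart.
threads = 5.times.map do |i|
  Thread.new do
    limiter.with_limited_rate do
      puts "caller #{i} ran at #{Time.now.strftime('%H:%M:%S.%L')}"
    end
  end
end

threads.each(&:join)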

This solution is a variation on a distributed lock – you can read more about that on the Redis website.