Introduction

In the Rails community Redis is known to be the most famous tool for implementing background jobs. As we know, fame has its bad sides - think Michael Jackson, Axl Rose and Rodion Romanovich Raskolnikov. Let’s look at the unexplored, brighter side of Redis to see what it has to offer!

Background

In my personal project CodeQuack I had to coordinate remote workers distributed over the network. There were a few constraints I had to impose on them:

  1. In the whole network of workers, there must be only one worker per user.
  2. The worker could not be requested by user if too little time elapsed from last request.
  3. The lifetime of a worker is limited.

The second point is often known as API throttling. When one calls an API too often, it just stops responding. A perfect defensive measure against DoS attacks!

About Redis

The name Redis means REmote DIctionary Server. Redis is as simple as that - there is not much to understand. Why it is so easy? Because at its core Redis commands are executed sequentially and we get expected results all the time.

Another cool thing is that commands can be grouped and executed as one atomic operation - a ‘transaction’.

Making Sure That a Command is Executed Once

Let’s assume that you have one special API call that is very restricted, but still available to any external service. Issuing this call twice can destroy your infrastructere, so you want it to be called exactly once.

From an API consumer’s perspective, some token is requested and then used to consume the resource. For example:

GET /spawn_vm?token=0an3dks7000xcj201ls HTTP/1.1
Host: api.example.com

The above call should succeed the first time, but fail on every subsequent attempt - even if the first and subsequent calls are being handled at the same time.

I will be using redis-rb to demonstrate the calls. First, we create a token which will be later used by the API consumer.

key = "user_token:#{token}"

# execute a set of commands atomically
res = redis.multi do
  redis.mapped_hmset(key, {
    'some_metadata' => metadata
    'used_by' => 0 # reference counter
  })
  # let's make the token valid only for some short time
  redis.expire(key, expiration_time_in_seconds)
end

It is really important to create the key and the set expiration time in one operation - if the service breaks, there will be tokens which never expire!

The other end of the API should increase the counter and let user consume it only once:

key = "user_token:#{token}"
res = redis.multi do
  redis.hgetall(key)
  redis.hincrby(key, "used_by", 1)
end

token_data = res[0] # hash containing token metadata
used_by = res[1] # counter value

if used_by == 1
  success # nobody used that token before
else
  failure # token not set or used more than once
end

Throttling API Calls

Let’s assume that we want to request a token, but also make the additional assumption that requesting a token costs some CPU time and memory.

Knowing that, we can easily attack the token provider by doing token requests simultaneously. It’s not even DDoS, it’s just a DoS…

Ladies and gentleman, let me introduce API throttling!

At the beginning of the critical call implementation add:

res = redis.multi do
  redis.get(key)
  redis.incr(key)
  redis.pexpire(key, expiration_time_in_miliseconds)
end

failure unless res[0].nil?

How does it work? If the key is set it means one thing - somebody accessed the API and the expiration time did not elapse, so access is temporarily forbidden.

More Sophisticated Approach to Rate Limiting

Now, let’s assume that we changed our mind and instead of 1/interval we shift to 2000/interval. One simply cannot do that with the pexpire command, which has a precision of 1 milisecond (and internal expiration accuracy is between 0-1 ms).

It’s tricky to do this using standard commands without introducing race conditions. Fortunately, Redis authors introduced Lua scripting capabilities. Time to use them!

max_calls = 2000 # call/interval rate

script = <<-LUA
  local call_counter
  call_counter = redis.call("incr", KEYS[1])
  if tonumber(call_counter) == 1 then
    redis.call("pexpire", KEYS[1], ARGV[1])
  end
  return call_counter
LUA
call_counter = redis.eval(script, [key], [expiration_time_in_miliseconds])

failure if call_counter > max_calls

Thanks to Lua code, we were able to increase the counter and compare its value in one operation.

Conclusion

Redis might be a powerful tool to coordinate work of distributed services. If you want to learn more, consult the official command reference. Whenever you implement distributed algorithms, always make sure that you are not introducing any race conditions. Think forward!