Introduction

In this episode of “Doing terrible things in the name of science” we will look into how we can write C extensions for Ruby. This would be pretty harmless… so… we will look into sharing memory between two Ruby processes and two-way communication. Of course, this could be separate processes, separate forks, you name it!

You may ask – why, oh why? Why would you do such terrible things? For the science of course. You monster.

(Disclaimer: A lot of code presented today is based on APIs that could change, and you shouldn’t do it in production. Never. And by never, I mean really never. Terrible things will happen. Cute little puppies will die. You have been warned.)

First things first – let’s start with an example

When using C code inside Ruby, we have quite a few options. You can use FFI, Rice or the simplest one – Ruby Inline which will do the trick today.

You can install it like any other gem: gem install RubyInline. So, let’s write the first program: 2 + 2 using C! (in Ruby).

require 'inline'
class CHello
  inline do |builder|
    builder.include '<stdio.h>'
    builder.c 'int sumThem() {
      return 2 + 2;
    }'
  end
end

Let’s dissect our simple program – but focusing on real CMeat (great name for some steakhouse!) here. We define all of our C code inside an .rb file, putting it into a block called inline. Each include will have its own builder.include call (this one is unnecessary here, but it’s included for the sake of completeness), each part of the code (like a function) will have its own builder.c call.

Result?

>> CHello.new.sumThem #=> 4

Yay! It’s working! So, how about passing parameters then?

require 'inline'
class CHello
  inline do |builder|
    builder.include '<stdio.h>'
    builder.c 'int sumThem(int a, int b) {
      return a + b;
    }'
  end
end

How about now?

CHello.new.sumThem(100, 200) #=> 300

Oh my god, it’s working, it’s so awesome, I don’t want to use Ruby ever again!!!111 But wait, what will happen if I use some bigger numbers? For example, I want to sum, I don’t know, 2^30 with 2^30? Let’s check it…

CHello.new.sumThem(2**30, 2**30) #=> -2147483648

And here comes our good C-friend: integer overflow. When walking out of Alice Ruby wonderland, you need to keep all these things in mind.
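To see where that odd number comes from, here is a minimal pure-Ruby sketch of two's-complement wraparound. The helper name wrap_int32 is made up for this illustration, and a 32-bit C int is assumed. (Strictly speaking, signed overflow is undefined behaviour in C; wrapping around is simply what usually happens on common hardware.)

```ruby
# Hypothetical helper, not part of RubyInline: squeezes an arbitrary
# Ruby Integer into the range of a 32-bit signed int, wrapping around
# the way two's-complement arithmetic does.
def wrap_int32(n)
  ((n + 2**31) % 2**32) - 2**31
end

puts wrap_int32(100 + 200)      # 300, small values pass through untouched
puts wrap_int32(2**30 + 2**30)  # -2147483648, the result seen above
puts wrap_int32(2**31 - 1)      # 2147483647, the largest value that still fits
```

Ruby itself never needs such a helper, because its Integer quietly grows as large as required.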

Then why would I use it?

One word.

Speed

Let’s benchmark isomorphic code that sums all integers from 0 to N-1 in pure Ruby and in a C extension.

We set N to some arbitrarily large number, for example 2^16 (the result has to stay below 2^31-1 to fit in a C int). Our benchmark weapon of choice will be benchmark-ips.
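A quick sanity check of that choice, sketched in plain Ruby: the loop adds up 0 through N-1, which has the closed form N(N-1)/2. The names INT_MAX and sum_below are made up for this illustration, and a 32-bit C int is assumed.

```ruby
INT_MAX = 2**31 - 1 # assumed range of a 32-bit signed C int

# Closed form of 0 + 1 + ... + (n - 1), the value both counters compute.
def sum_below(n)
  n * (n - 1) / 2
end

puts sum_below(2**16)                 # 2147450880
puts sum_below(2**16) <= INT_MAX      # true
puts sum_below(2**16 + 1) <= INT_MAX  # false
```

So 2^16 is the largest N for which the C version returns the right answer without overflowing.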

require 'inline'
require 'benchmark/ips'
class Counter
  inline do |builder|
    builder.include '<stdio.h>'
    builder.c 'int cCount(int n) {
      int counter = 0;
      for(int i = 0; i < n; i++) {
        counter += i;
      }
      return counter;
    }'
  end

  def rCount(n)
    counter = 0
    i = 0
    while i < n
      counter += i
      i += 1
    end
    counter
  end
end

tested_class = Counter.new

Benchmark.ips do |x|
  x.report("C Method") { tested_class.cCount(2**16) }
  x.report("Ruby Method") { tested_class.rCount(2**16) }
  x.compare!
end

And the results? Mindblowing:

$ ruby sumthem.rb
Warming up --------------------------------------
            C Method   215.194k i/100ms
         Ruby Method    45.000  i/100ms
Calculating -------------------------------------
            C Method      6.306M (±10.3%) i/s -     31.203M in   5.003825s
         Ruby Method    473.080  (± 5.3%) i/s -      2.385k in   5.055964s

Comparison:
            C Method:  6305730.9 i/s
         Ruby Method:      473.1 i/s - 13329.09x  slower

…and that’s precisely why you shouldn’t use Ruby for performance-critical numeric calculations. Still, it’s very convenient not to be worried about integer overflow, converting to bigger data types when needed, etc.

The godforsaken part

And now time for the real meat. The part where children cry, sysadmins run away screaming. Sharing memory.

First of all, we will need some C-tools to do it. There are a few different ways to share memory between processes on a typical Unix system (POSIX standard). I chose the shmget and shmat approach – it was the simplest one and a good fit for this use case.

Next, we will need a way to name our shared memory segment. We will use the key 123. We need to provide it to the shmget call, so our call should look like this: shmget(key, 1024, 0644 | IPC_CREAT). The first argument is the key that identifies the shared memory segment, the second is its size in bytes (I chose one kilobyte – why not?), and the third one specifies access permissions and creation flags. You can read more about it in the shmget man page.

What we get from shmget is another identifier, which this time we need to provide to shmat – which attaches shared memory to our process at the specified (or new) address. The final call should always be shmdt which will detach our memory.
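The same get, attach, detach sequence can be tried without compiling anything, by calling libc through Fiddle. This is only a sketch under several assumptions: the numeric constants are the values found in Linux and macOS headers, the helper shared_round_trip is made up for this illustration, and it uses IPC_PRIVATE instead of the named key 123 so that it cleans up after itself and cannot collide with another segment. Instead of a second process, it attaches the same segment twice and checks that a write through one attachment is visible through the other.

```ruby
require 'fiddle'

# Constant values as found in Linux and macOS headers (an assumption;
# check <sys/ipc.h> on other systems).
IPC_PRIVATE = 0      # let the kernel pick a fresh, unnamed key
IPC_CREAT   = 0o1000
IPC_RMID    = 0

LIBC = Fiddle::Handle::DEFAULT
INT  = Fiddle::TYPE_INT
PTR  = Fiddle::TYPE_VOIDP

SHMGET = Fiddle::Function.new(LIBC['shmget'], [INT, Fiddle::TYPE_SIZE_T, INT], INT)
SHMAT  = Fiddle::Function.new(LIBC['shmat'],  [INT, PTR, INT], PTR)
SHMDT  = Fiddle::Function.new(LIBC['shmdt'],  [PTR], INT)
SHMCTL = Fiddle::Function.new(LIBC['shmctl'], [INT, INT, PTR], INT)

# Hypothetical helper: writes one byte (0..127) through one attachment
# and reads it back through a second attachment of the same segment.
def shared_round_trip(value)
  id = SHMGET.call(IPC_PRIVATE, 1024, 0o644 | IPC_CREAT)
  raise 'shmget failed' if id < 0

  writer = SHMAT.call(id, nil, 0) # nil = let the kernel choose the address
  reader = SHMAT.call(id, nil, 0) # same segment, different address
  writer[0] = value
  seen = reader[0]

  SHMDT.call(writer)
  SHMDT.call(reader)
  SHMCTL.call(id, IPC_RMID, nil)  # mark the segment for removal
  seen
end

puts shared_round_trip(44)
```

On a system where System V shared memory is available this should print 44. A named key such as 123 is what lets two unrelated processes find the same segment, which is why the rest of this note sticks with it.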

So let’s write some code, shall we?

require 'inline'
class Sharer
  inline do |builder|
    builder.include '<sys/types.h>'
    builder.include '<sys/ipc.h>'
    builder.include '<sys/shm.h>'
    builder.c 'int getShmid(int key) {
      return shmget(key, 1024, 0644 | IPC_CREAT);
    }'
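    # shmat returns a pointer; pass it to Ruby as a long so that
    # 64-bit addresses are not truncated
    builder.c 'long getMem(int id) {
      return (long)shmat(id, NULL, 0);
    }'
    builder.c 'int removeMem(long address) {
      return shmdt((void *)address);
    }'
  end
end

(The casts are there on purpose: the address returned by shmat travels through Ruby as a plain integer, and squeezing it into an int would truncate it on 64-bit systems.)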

Ok, so now we can create shared memory, attach it, and what?

Here’s Johnny Fiddle::Pointer

One of the most powerful, yet little-known knives in our Ruby toolbox is Fiddle. Why did I say knife? Well, it’s very pointy, very sharp and you can hurt yourself severely with it. But that’s part of the fun!

Fiddle is a neat library that allows you to interact with Ruby on the level of C pointers. Which sounds kind of cool, because this is our use case!
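To get a feel for it, here is a small sketch that uses Fiddle::Pointer on ordinary malloc'ed memory, nothing shared yet. The helper name fiddle_round_trip is made up for this illustration.

```ruby
require 'fiddle'

# Hypothetical helper: copies a string into raw C memory and reads it
# back in three different ways.
def fiddle_round_trip(message)
  # Plain C memory, released together with the Ruby pointer object.
  buffer = Fiddle::Pointer.malloc(message.bytesize + 1, Fiddle::RUBY_FREE)

  buffer[0, message.bytesize] = message # copy the raw bytes
  buffer[message.bytesize] = 0          # C-style NUL terminator

  [buffer[0], buffer[0, 5], buffer.to_s]
end

first_byte, prefix, whole = fiddle_round_trip('Hello Rebased')
puts first_byte # 72, the byte value of "H"
puts prefix     # Hello
puts whole      # Hello Rebased
```

A single index reads or writes one byte, an offset with a length works on a whole range, and to_s reads up to the first NUL byte.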

As an aside – Fiddle::Pointer also allows you to unfreeze a Ruby object, which is a bad idea, but since this is a dirty hack we can show it to you:

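require 'fiddle'

str = 'water'.freeze
str.frozen? # true

# the C pointer to a Ruby object is equal to
# its object_id shifted one bit left (doubled);
# this holds for MRI before 2.7, where object_id
# was still derived from the address
memory_address = str.object_id << 1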

Fiddle::Pointer.new(memory_address)[1] &= ~8
str.frozen? # false

Why does this work? It flips the memory bit which carries the frozen flag in a Ruby object. Please don’t use it.
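Why byte 1 and the mask 8? Here is a sketch of the arithmetic, under two assumptions that go beyond what is stated above: that the frozen flag is bit 11 of the object's flags word (FL_FREEZE in the MRI headers of that era) and that the machine is little-endian. Bit 11 of the word is then bit 3 of the second byte, hence [1] and 8. The example works on a plain integer, not on a live object, and the helper name clear_frozen_bit is made up.

```ruby
FL_FREEZE = 1 << 11 # assumed position of the frozen flag

# Hypothetical helper: applies the same byte-level operation as the hack
# above to an integer that stands in for an object's flags word.
def clear_frozen_bit(flags)
  bytes = [flags].pack('Q<').bytes # the word as 8 little-endian bytes
  bytes[1] &= ~8                   # same operation as in the hack
  bytes.pack('C*').unpack1('Q<')
end

flags = FL_FREEZE | 0x05 # "frozen" plus a few unrelated low bits
puts (flags & FL_FREEZE).zero?                   # false: the flag is set
puts (clear_frozen_bit(flags) & FL_FREEZE).zero? # true: the flag is gone
puts clear_frozen_bit(flags)                     # 5, other bits untouched
```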

I will demonstrate the memory changes using forks; it’s the easiest way to show them in code. You can move the code from the fork block to another Ruby process and it will still work the same.

basic_id = Sharer.new.getShmid(123)
address = Sharer.new.getMem(basic_id)
pointer = Fiddle::Pointer.new(address, 1024)

Now the pointer contains the reference to our brand new shared memory. Let’s set the first byte of it to some value. I like the number 4, so:

pointer[0] = 4

And now, time for some magic. We will fork the process, attach the segment a second time in the fork’s memory space (why can’t we use the original one?), change the value of pointer[0] there and see if the original has changed.

pid = fork do
  basic_id = Sharer.new.getShmid(123)
  address = Sharer.new.getMem(basic_id)
  pointer = Fiddle::Pointer.new(address, 1024)

  pointer[0] = 44

  Sharer.new.removeMem(address)
end
Process.wait(pid)
puts pointer[0]

And the result is…

puts pointer[0] # => 44

Great! Now we can communicate with our fork both ways.

Can we pass an object?

And now it’s time for the trickiest part. Can we pass a Ruby object from a fork to its parent’s memory? Yes and no. We can do this with a simple object, but if the object is, for example, a hash, it contains a lot of references inside (keys, values, etc.). Therefore passing it would require a lot of hacks. But for one small, simple object it’s possible.

Notice that we can’t choose where a newly created object lives in memory. However, Fiddle allows copying a whole block of memory from one place to another. So we will create a new object, get its pointer, and copy the entire memory of this object to shared memory.

Let’s change our code at fork:

pid = fork do
  basic_id = Sharer.new.getShmid(123)
  address = Sharer.new.getMem(basic_id)
  pointer = Fiddle::Pointer.new(address, 13) # 13 bytes is enough, tested

  obj = Object.new
  second_pointer = Fiddle::Pointer.new(obj.object_id << 1)
  pointer[0, 13] = second_pointer

  Sharer.new.removeMem(address)
end

Process.wait(pid)
puts pointer.to_value

Sharer.new.removeMem(address)

And the result is…

puts pointer.to_value # => #<Object:0x0000010d2c5000>

Awesome!

Is it useful?

I can think of only one useful application. If we can write any arbitrary Ruby object to shared memory and we can access this memory from another process, we can quickly look into shared memory in some C program and dump carefully chosen objects from Ruby.

Let’s look at an example of such C code:

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdio.h>

int main(void) {
  int basicId = shmget(123, 1024, 0644 | IPC_CREAT);
  char *address = shmat(basicId, 0, 0);
  getchar();
  char *s;
  int i;
  for (s = address+15, i = 0; i < 14; i++) {
  // in a typical situation you would write this as
  // (s = address, i = 0; i < 1024; i++)
  // however I tweaked these parameters to only display
  // the contents of a string, without flags etc.
  // This is a topic for another blog note ;)
    putchar(*s);
    s++;
  }
  putchar('\n');

  shmdt(address);
}

For the sake of simplicity, let’s just write a string to our shared memory…

basic_id = Sharer.new.getShmid(123)
address = Sharer.new.getMem(basic_id)
pointer = Fiddle::Pointer.new(address, 1024)

string = "Hello Rebased"
second_pointer = Fiddle::Pointer.new(string.object_id << 1)
pointer[0, 1024] = second_pointer

gets

Sharer.new.removeMem(address)

…compile the C code…

gcc --std=c11 dump_me.c -o dump_me.o

…run the Ruby code, run our dumper program, and voilà:

$ ./dump_me.o

Hello Rebased

Conclusion

In this note, we found an excellent way to embed C in Ruby and discovered the speed of this approach. And – last but not least – we dove deep into the forbidden land of shared memory.

Complete code for the shared memory example

require 'inline'
require 'fiddle'

class Sharer
  inline do |builder|
    builder.include '<sys/types.h>'
    builder.include '<sys/ipc.h>'
    builder.include '<sys/shm.h>'
    builder.c 'int getShmid(int key) {
      return shmget(key, 1024, 0644 | IPC_CREAT);
    }'
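    # shmat returns a pointer; pass it to Ruby as a long so that
    # 64-bit addresses are not truncated
    builder.c 'long getMem(int id) {
      return (long)shmat(id, NULL, 0);
    }'
    builder.c 'int removeMem(long address) {
      return shmdt((void *)address);
    }'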
  end
end

basic_id = Sharer.new.getShmid(123)
address = Sharer.new.getMem(basic_id)
pointer = Fiddle::Pointer.new(address, 1024)

pid = fork do
  basic_id = Sharer.new.getShmid(123)
  address = Sharer.new.getMem(basic_id)
  pointer = Fiddle::Pointer.new(address, 13) # 13 bytes is enough, tested

  obj = Object.new
  second_pointer = Fiddle::Pointer.new(obj.object_id << 1)

  pointer[0, 13] = second_pointer

  Sharer.new.removeMem(address)
end

Process.wait(pid)
puts pointer.to_value

Sharer.new.removeMem(address)