A thing that came up some weeks ago which confused me is whether languages like Python and Ruby are multithreaded. This is my attempt to explain to myself how it works in Ruby, and I hope it helps you too.

Firstly, we need to distinguish concurrency and parallelism, which are often conflated with multithreading but are not the same thing. Concurrency can be thought of as interleaving: if two jobs are switched back and forth very quickly, there is a sense that both are being done ‘at the same time’, but they are merely being done concurrently. For example, you may be eating food and drinking beer. Take a bite, then a sip, then a bite, then a sip, so you are concurrently drinking beer and eating food. But you are not literally drinking beer and eating food at the same time; that would require you to have both the cup and the fork at your lips at the same time (in parallel), which is not possible. To do so you’d need two mouths.
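As a rough illustration (my own sketch, not meant as anything more), here are two threads whose output gets interleaved by the scheduler, which is concurrency, even though only one of them is executing Ruby code at any instant:

```ruby
# Two 'jobs' interleaved by the thread scheduler: bites and sips.
eat   = Thread.new { 3.times { puts 'bite'; sleep 0.1 } }
drink = Thread.new { 3.times { puts 'sip';  sleep 0.1 } }

[eat, drink].each(&:join)
# The output is interleaved (bite/sip/bite/sip/...), but on MRI only one
# thread is running Ruby code at any given moment.
```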

Second, Ruby has several implementations, the most popular of which is MRI (Matz’s Ruby Interpreter), named after Ruby’s creator Yukihiro Matsumoto. This is the canonical ‘Ruby’ that everyone refers to when they say ‘Ruby, the programming language’. In MRI there is something called the GIL (Global Interpreter Lock), which ensures that only one thread is ever running Ruby code at once. Why the GIL is there in the first place is a rabbit hole for another time. This means that when you call Thread.new in Ruby and schedule a job on it, it isn’t really running in parallel, because the GIL is locking Ruby code. There are other Ruby implementations, like JRuby, that do not have a GIL, and on those implementations true parallelism is possible. On the Python side the story is the same, with CPython (the default implementation) having a GIL.
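To make that concrete, here is a small sketch of my own (not from the original post) that times a CPU-bound task run sequentially versus on two threads. On MRI the threaded version takes about as long as the sequential one, because the GIL keeps the threads from computing in parallel; on an implementation without a GIL, like JRuby, it can be roughly twice as fast.

```ruby
require 'benchmark'

# A deliberately CPU-bound task: naive Fibonacci (illustrative workload).
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

Benchmark.bm(10) do |x|
  x.report('sequential') { 2.times { fib(30) } }
  x.report('threaded')   { 2.times.map { Thread.new { fib(30) } }.each(&:join) }
end
# On MRI both reports show similar wall-clock times: the GIL serialises the
# Ruby-level computation. On a GIL-less implementation like JRuby, the
# threaded version can run the two calls in parallel.
```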

However, Ruby threads are also native threads (only true as of Ruby 1.9). This means that every Ruby thread is backed by an OS thread. When a Ruby application blocks on I/O, the Ruby runtime can actually switch to allow another thread to continue running, because this ‘blocking’ happens outside of the GIL. For example, if your Ruby application makes a network request and is waiting for the network to respond, it can release its hold on the GIL and allow another thread to serve an incoming request. When the network contents are fetched, the OS wakes the blocked thread and allows it to resume. And so it can be said that some amount of parallelism is happening here! However, this only happens for I/O operations. By contrast, if you had two threads handling incoming web requests and two requests came in at the same time, you can bet that whether the first thread handles both requests or each is handled by a different thread, the GIL will ensure that only one thread is handling a request at any given time. In short: no parallelism during compute-only operations.

This is good news for Ruby and web applications, since web applications by nature are I/O bound. This means that most of the time, Ruby applications are blocked waiting on the database or the network. As discussed above, this blocking happens outside of the GIL, and so if a request arrives while the current thread is blocked on I/O, Ruby can execute the thread that serves that request.
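Here is a small sketch of the I/O side of that story, using sleep as a stand-in for a blocking network or database call (sleep, like real I/O, releases the GIL while the thread waits). Again, this is my own illustrative example, with numbers picked for clarity.

```ruby
require 'benchmark'

# Pretend each call is a slow network/database round trip.
slow_io = -> { sleep 1 }

Benchmark.bm(10) do |x|
  x.report('sequential') { 3.times { slow_io.call } }
  x.report('threaded')   { 3.times.map { Thread.new(&slow_io) }.each(&:join) }
end
# Sequential: roughly 3 seconds. Threaded: roughly 1 second, even on MRI,
# because each thread gives up the GIL while it is blocked waiting.
```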

Now, what about Puma? Doesn’t that enable parallelism? Yes, parallelism happens with Puma, but through a different mechanism. Puma forks multiple OS processes, creating multiple copies of your app in memory (multiprocessing). As a reminder, a process provides the resources needed to execute a program and is isolated from other processes by the OS: it has its own virtual address space, executable code, environment variables, process identifier, and at least one thread of execution (the main thread). A thread, on the other hand, is an entity within a process that can be scheduled for execution, but it shares the process’s virtual address space and system resources.
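A quick sketch of that difference in plain Ruby (my own example; fork is only available on implementations and platforms that support it, such as MRI on Linux or macOS): a thread shares its parent’s memory, while a forked process gets its own copy, so writes in the child are invisible to the parent.

```ruby
counter = 0

# A thread shares the process's address space, so this write is visible here.
Thread.new { counter += 1 }.join
puts counter # => 1

# A forked child gets its own copy of the address space, so its write
# does not affect the parent's counter.
pid = fork { counter += 1 }
Process.wait(pid)
puts counter # => still 1
```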

So when Puma starts 5 worker processes, there are 5 copies (processes) of the Rails app running, isolated from each other by the OS. These copies live in memory (and thus take up RAM), have their own db connection pools, and so on. However, if there are fewer than 5 CPU cores on the machine it is running on, our 5 workers will not be able to achieve full (compute) parallelism under peak load. Each worker then transparently schedules additional threads to serve the Ruby application, so that the application itself doesn’t need to think about it.
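For reference, a worker/thread setup like the one described above would be expressed in Puma’s configuration roughly like this (a hypothetical config/puma.rb, with numbers picked purely for illustration):

```ruby
# config/puma.rb (illustrative values)
workers 5        # fork 5 worker processes, i.e. 5 copies of the app
threads 1, 5     # each worker runs a thread pool of 1 to 5 threads
preload_app!     # boot the app once before forking so workers can share
                 # memory via copy-on-write
```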

To come back to the original question: does Ruby support multithreading? If we simply define multithreading as having a thread primitive, then Ruby and Python are definitely multithreaded. But that doesn’t mean those threads are running in parallel. Even so, multithreading in Ruby speeds up web applications, because they are I/O heavy!

Thanks to Tom Clark for reviewing drafts of this.