Modeling concurrency in Ruby and beyond
DESCRIPTION
The world of concurrent computation is a complicated one. We have to think about the hardware, the runtime, and even choose between half a dozen different models and primitives: fork/wait, threads, shared memory, message passing, semaphores, and transactions, just to name a few. And that's only the beginning.
What's the state of the art for dealing with concurrency & parallelism in Ruby? We'll take a quick look at the available runtimes, what they offer, and their limitations. Then, we'll dive into the concurrency models and ask: are threads really the best we can do to design, model, and test our software? What are the alternatives, and is Ruby the right language to tackle these problems?
Spoiler: out with the threads. Seriously.
TRANSCRIPT
Modeling Concurrency @igrigorik
Modeling concurrency in Ruby and beyond
Ilya Grigorik @igrigorik
what is an advanced concurrency model?
“Concurrency is a property of systems in which several computations are executing simultaneously, and potentially interacting with each other.”
Threads!
No. Events!
Neither. You need them both, and neither is enough…
Hardware Parallelism
maximizing resource utilization
2 GHz CPU = 0.5 ns cycle
• L1 cache: ~0.5 ns
• L2 cache: ~7 ns
• RAM: ~100 ns, i.e. ~200 wasted cycles!
• Prefetching
• Branch prediction
• Instruction pipelining
• Hyperthreading
• Speculative execution
• …
http://bit.ly/cSKKVb
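As a rough check on the latency figures above, dividing an access latency by the cycle time gives the number of cycles a stalled core could otherwise have used. The sketch below assumes a fixed 2 GHz clock and ignores everything the hardware does to hide the stall; `cycles_lost` is a helper name made up for this illustration.

```ruby
# Back-of-the-envelope arithmetic for the latency figures on the slide.
# cycles_lost is a hypothetical helper, used only for illustration.
CLOCK_HZ = 2_000_000_000      # 2 GHz
CYCLE_NS = 1e9 / CLOCK_HZ     # nanoseconds per cycle (0.5)

def cycles_lost(latency_ns)
  (latency_ns / CYCLE_NS).round
end

[0.5, 7, 100].each do |latency_ns|
  puts "#{latency_ns} ns is about #{cycles_lost(latency_ns)} cycles"
end
```

At this clock rate a 100 ns stall corresponds to about 200 cycles in which the pipeline does nothing useful unless the hardware finds other work to overlap with it.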
A quick poll
which is faster?
1. if (cond1 && cond2) { System.err.println("Am I faster yet?"); }
2. if (cond1 || cond2) { System.err.println("Am I fast yet?"); }
Turns out, we don’t know.
Hardware Parallelism
Software Parallelism (Processes, Threads, Events)
The “concurrency API”: a bolt-on systems component for any language
pthreads, lwkt, epoll, kqueue, …
C / C++, Java, Ruby, …
More advanced concurrency features?
Bruce: if you could go back in time, what is the one thing you would change?
Matz: “I would remove the thread and add actors or some other more advanced concurrency features”
Hardware Parallelism
Software Parallelism (Processes, Threads, Events)
C / C++, Java, Ruby, …
pthreads, lwkt, epoll, kqueue, …
“Advanced concurrency model” (New!)
Dataflow
Transactional Memory
Actor Model
Pi-calculus / CSP
Petri-nets
…
http://bit.ly/fMLJR8
The value of a tool / model is in:
(a) what it enables you to do
• Provide a way to express a behavior
• Dictate a structure
• Dictate a style
(b) the constraints it imposes
• Disallow unwanted behavior
• Implicitly “make the right choice”
• Eliminate a class of errors
The history: actor model
Let’s rewind back to 1973…
“A Universal Modular Actor Formalism for Artificial Intelligence”, Carl Hewitt, Peter Bishop, and Richard Steiger (1973)
“Semantics of Communicating Parallel Processes”, Irene Greif (MIT EECS Doctoral Dissertation, August 1975)
…
Erlang (1986), Scala (2003), Kilim, …
Actor Model
The 50k foot view…
1. Give every process a name
2. Give every process a “mailbox”
3. Communicate via messages
• A --> B
Enables:
• Message centric view
• Communication between threads, processes, machines
• Distributed programming
Constraints:
• No side-effects
• No race conditions
• No mutexes, no semaphores
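The three rules above can be sketched with nothing but Ruby's standard library. This is a minimal illustration, not how any actor library is implemented: `TinyActor`, `REGISTRY`, `send_to`, and the `:stop` marker are hypothetical names, a plain `Queue` plays the mailbox, and one thread per actor stands in for a process.

```ruby
require "thread"

# Hypothetical toy actor: a name, a mailbox, and a loop that handles
# one message at a time. Not the API of any real actor library.
class TinyActor
  REGISTRY = {}  # name => actor ("give every process a name")

  def initialize(name, &behavior)
    @mailbox = Queue.new            # "give every process a mailbox"
    REGISTRY[name] = self           # registered from the main thread only
    @thread = Thread.new do
      # Messages are processed sequentially, in arrival order.
      while (msg = @mailbox.pop) != :stop
        behavior.call(msg)
      end
    end
  end

  # "Communicate via messages": look the receiver up by name.
  def self.send_to(name, msg)
    REGISTRY.fetch(name).deliver(msg)
  end

  def deliver(msg)
    @mailbox << msg
    self
  end

  def stop
    @mailbox << :stop
    @thread.join
  end
end

log = []
b = TinyActor.new(:b) { |msg| log << "b got #{msg}" }
a = TinyActor.new(:a) { |msg| TinyActor.send_to(:b, msg.upcase) }

TinyActor.send_to(:a, "hello")  # A --> B
a.stop                          # a drains its mailbox, then exits
b.stop
p log  # => ["b got HELLO"]
```

Only the thread owned by `b` ever touches `log` while the actors run, which is the point of the model: state stays private to one actor and everything else travels as a message.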
The history: CSP model
Let’s rewind back to 1978…
“Communicating Sequential Processes”, C. A. R. Hoare (1978)
CCS, pi-calculus, …
…
Limbo (1995), Go (2007), CSP++, PyCSP…
CSP / Pi-calculus
The 50k foot view…
1. Processes are anonymous
2. Give every channel a name
3. Processes communicate over named channels
• Think UNIX pipes…
Enables:
• Message centric view
• Communication between threads, processes, machines
• Distributed programming
Constraints:
• No side-effects
• No race conditions
• No mutexes, no semaphores
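The “think UNIX pipes” picture can be sketched as a two-stage pipeline in plain Ruby. This assumes `SizedQueue` as a stand-in for a channel: CSP channels are unbuffered rendezvous points, and a one-slot queue only approximates that. The `DONE` marker and the variable names are made up for the example.

```ruby
require "thread"

# The channels carry the names; the threads that use them are anonymous.
numbers = SizedQueue.new(1)  # a full channel blocks the sender
squares = SizedQueue.new(1)

DONE = :done  # hypothetical end-of-stream marker

# Stage 1: write 1..5 to `numbers`.
producer = Thread.new do
  (1..5).each { |n| numbers << n }
  numbers << DONE
end

# Stage 2: read `numbers`, write each square to `squares`.
squarer = Thread.new do
  while (n = numbers.pop) != DONE
    squares << n * n
  end
  squares << DONE
end

# The main thread is the last stage of the pipe.
results = []
while (sq = squares.pop) != DONE
  results << sq
end
[producer, squarer].each(&:join)

p results  # => [1, 4, 9, 16, 25]
```

Neither stage knows who is on the other end of its channel, so a stage can be swapped or duplicated without touching its neighbours.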
• Multiple workers can share a channel
• Send a “response” channel to another process!
• Workers are mobile! Delegate the channel to someone else!
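Because a channel is just a value, it can travel inside a message. The sketch below, again using stdlib `Queue` objects as stand-in channels, shows a request that carries its own reply channel and a worker that delegates the whole request to someone else. `Job`, `inbox_b`, and `inbox_c` are hypothetical names.

```ruby
require "thread"

# Hypothetical message shape: the work plus the channel to answer on.
Job = Struct.new(:payload, :reply_to)

inbox_b = Queue.new
inbox_c = Queue.new

# B does not do the work itself: it hands the job, reply channel
# included, over to C.
b = Thread.new do
  job = inbox_b.pop
  inbox_c << job
end

# C answers directly on the channel it received; it never learns
# who the original sender was.
c = Thread.new do
  job = inbox_c.pop
  job.reply_to << "C handled #{job.payload}"
end

reply = Queue.new
inbox_b << Job.new("task-1", reply)
answer = reply.pop
[b, c].each(&:join)

puts answer  # => C handled task-1
```

The sender only waits on its own reply channel, so the routing between B and C can change without the sender noticing.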
gem install agent
let’s get hands on…
Producer / Consumer
look, no threads!
# Named, typed channel
c = Agent::Channel.new(name: 'incr', type: Integer)
# Spawn the worker
go(c) do |c, i=0|
  loop { c << i += 1 }
end
# Consume the results
p c.receive # => 1
p c.receive # => 2
A “multi-threaded” server!
where’s the synchronization?
# The “Request” type: arguments plus a channel to send the result back on
Request = Struct.new(:args, :resultChan)
clientRequests = Agent::Channel.new(name: :clientRequests, type: Request, size: 2)
# Each worker waits for work, then sleeps, increments, and adds a timestamp
worker = Proc.new do |reqs|
  loop do
    req = reqs.receive
    sleep 1.0
    req.resultChan << [Time.now, req.args + 1].join(' : ')
  end
end
# Start two workers; both listen on the same channel
go(clientRequests, &worker)
go(clientRequests, &worker)
# Create two requests, each with a return channel of type String
req1 = Request.new(1, Agent::Channel.new(:name => "resultChan-1", :type => String))
req2 = Request.new(2, Agent::Channel.new(:name => "resultChan-2", :type => String))
# Dispatch both requests
clientRequests << req1
clientRequests << req2
# Collect the results!
puts req1.resultChan.receive # => 2010-11-28 23:31:08 -0500 : 2
puts req2.resultChan.receive # => 2010-11-28 23:31:08 -0500 : 3
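One way to answer “where’s the synchronization?” is that it lives inside the channel. The sketch below is a deliberately simplified, unbounded channel built from a `Mutex` and a `ConditionVariable`; `TinyChannel` is a hypothetical class for illustration and is not how the agent gem is implemented.

```ruby
require "thread"

# Hypothetical minimal channel: all locking is hidden behind << and receive.
class TinyChannel
  def initialize
    @items = []
    @lock  = Mutex.new
    @ready = ConditionVariable.new
  end

  # Append an item and wake one waiting receiver.
  def <<(item)
    @lock.synchronize do
      @items << item
      @ready.signal
    end
    self
  end

  # Block until an item is available, then return it.
  def receive
    @lock.synchronize do
      @ready.wait(@lock) while @items.empty?
      @items.shift
    end
  end
end

requests = TinyChannel.new
results  = TinyChannel.new

# Two workers share one request channel; neither takes a lock itself.
workers = 2.times.map do
  Thread.new do
    while (n = requests.receive) != :stop
      results << n + 1
    end
  end
end

[1, 2].each { |n| requests << n }
answers = 2.times.map { results.receive }.sort
2.times { requests << :stop }  # one stop marker per worker
workers.each(&:join)

p answers  # => [2, 3]
```

The worker code contains no mutexes or condition variables; those exist once, in the channel, which is the “under the hood” placement the talk argues for.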
So, Ruby?
JRuby, RBX, MacRuby, MRI, …
The many Rubies…
for your concurrency experiments
JRuby:
• No GIL
• JVM threads
• Existing libraries & frameworks: Akka, Kilim, etc.
• Great platform for experiments
MacRuby:
• Grand Central Dispatch
• MacRuby + iOS?
• GCD + higher level API?
Rubinius:
• Hydra branch: no GIL
• Built-in Channel / Actor primitives
• Great platform to experiment with new language features
MRI:
• GIL
• Research work on MVM
• ... agent?
Pick up & experiment with other runtimes!
learn what works, find what resonates…
Io:
• Small, compact, easy to learn
• Actor based concurrency
• http://iolanguage.com/
Go:
• Designed at Google starting in ’07, open-sourced in ’09
• CSP + channels
• http://golang.org/
Clojure:
• JVM + Functional programming
• Transactional memory
• http://clojure.org/
Scala:
• JVM
• Actor based concurrency
• http://www.scala-lang.org/
… and many others …
In Summary:
• We need threads; we need events; we need locks; we need shared memory; …
• Are threads, events, etc., the right API for modeling concurrency? Likely not.
• Threads, events, etc., should belong under the hood.
Hardware Parallelism
Software Parallelism (Processes, Threads, Events)
pthreads, lwkt, epoll, kqueue, …
CSP / Actor / Dataflow / Transactional Memory
Phew, time for questions?
hope this convinced you to explore the area further…
Concurrency with Actors, Goroutines & Ruby: http://www.igvita.com/2010/12/02/concurrency-with-actors-goroutines-ruby/
Multi-core, Threads & Message Passing: http://www.igvita.com/2010/08/18/multi-core-threads-message-passing/
gem install agent
https://github.com/igrigorik/agent/
https://github.com/igrigorik/agent/tree/master/spec/