University of Massachusetts Amherst • Department of Computer Science
Computer Systems Principles: Concurrency Patterns
Emery Berger and Mark Corner
University of Massachusetts Amherst

Web Server
Client (browser)
  – Requests HTML, images
Server
  – Caches requests
  – Sends to client
[Figure: a client requests http://server/Easter-bunny/200x100/75.jpg from the web server; diagram label: "not found"]

Possible Implementation
    while (true) {
        wait for connection;
        read from socket & parse URL;
        look up URL contents in cache;
        if (!in cache) {
            fetch from disk / execute CGI;
            put in cache;
        }
        send data to client;
    }

Possible Implementation
    while (true) {
        wait for connection;                  // net
        read from socket & parse URL;         // cpu
        look up URL contents in cache;        // cpu
        if (!in cache) {
            fetch from disk / execute CGI;    // disk
            put in cache;                     // cpu
        }
        send data to client;                  // net
    }

Problem: Concurrency
Sequential is fine until:
  – More clients
  – Bigger server
    • Multicores, multiprocessors
Goals:
  – Hide latency of I/O
    • Don’t keep clients waiting
  – Improve throughput
    • Serve up more pages
[Figure: many clients connecting to one web server]

Building Concurrent Apps
Patterns / Architectures
  – Thread pools
  – Producer-consumer
  – “Bag of tasks”
  – Worker threads (work stealing)
Goals:
  – Minimize latency
  – Maximize parallelism
  – Keep programs simple to program & maintain

Thread Pools
Thread creation is relatively expensive
Instead: use a pool of threads
  – When a new task arrives, get a thread from the pool to work on it; block if the pool is empty
  – Faster with many tasks
  – Limits max threads (thus resources)
  – (ThreadPoolExecutor class in Java)
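
A minimal sketch of the thread-pool idea in Java, not the course's own code. It uses Executors.newFixedThreadPool, which builds a ThreadPoolExecutor; the class name ThreadPoolSketch and the methods handle and serve are hypothetical names chosen for illustration. Note one difference from the slide: with this factory, a task submitted while all threads are busy waits in an internal queue rather than blocking the submitter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ThreadPoolSketch {
    // Hypothetical stand-in for handling one request.
    static String handle(int requestId) {
        return "response-" + requestId;
    }

    // Runs `requests` tasks on `poolSize` reusable threads and returns the
    // responses in submission order.
    static List<String> serve(int poolSize, int requests) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final int id = i;
                Callable<String> task = () -> handle(id);
                pending.add(pool.submit(task)); // reuses a pooled thread
            }
            List<String> responses = new ArrayList<>();
            for (Future<String> f : pending) {
                responses.add(f.get()); // waits for that task to finish
            }
            return responses;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown(); // let the pooled threads exit
        }
    }

    public static void main(String[] args) {
        System.out.println(serve(4, 10));
    }
}
```

The pool size caps the number of threads no matter how many requests arrive, which is the resource limit the slide mentions.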

Producer-Consumer
Can get pipeline parallelism:
  – One thread (producer) does work
    • E.g., I/O
  – and hands it off to other thread (consumer)
[Figure: a producer thread passing items through a queue to a consumer thread]
LinkedBlockingQueue: put() blocks if the queue is full; take() blocks if it is empty (poll() with no timeout returns null instead of blocking)

Producer-Consumer Web Server
Use 2 threads: producer & consumer
  – queue.put(x) and x = queue.take();
Producer:
    while (true) {
        do something…
        queue.put (x);
    }
Consumer:
    while (true) {
        x = queue.take();   // blocks until an item is available
        do something…
    }
Sequential loop to split across the two threads:
    while (true) {
        wait for connection;
        read from socket & parse URL;
        look up URL contents in cache;
        if (!in cache) {
            fetch from disk / execute CGI;
            put in cache;
        }
        send data to client;
    }
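
The producer and consumer loops could be sketched in Java roughly as follows. This is an illustration under assumptions, not the slides' implementation: the class ProducerConsumerSketch, the method run, and the DONE sentinel are hypothetical (the slide's loops never terminate, so a stop marker is added here), and "serving" a URL is reduced to building a string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class ProducerConsumerSketch {
    // Sentinel telling the consumer to stop (an assumption of this sketch).
    static final String DONE = "__done__";

    static List<String> run(List<String> urls, int capacity) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(capacity);
        List<String> served = new ArrayList<>(); // touched only by the consumer

        Thread producer = new Thread(() -> {
            try {
                for (String url : urls) {
                    queue.put(url);          // blocks while the queue is full
                }
                queue.put(DONE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    String url = queue.take();   // blocks while the queue is empty
                    if (url.equals(DONE)) break;
                    served.add("served " + url);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();   // after join, `served` is safe to read here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return served;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("/a", "/b", "/c"), 2));
    }
}
```

The bounded capacity is what makes put() block; a LinkedBlockingQueue created without a capacity is effectively unbounded, so the producer would never wait.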

Producer-Consumer Web Server
Pair of threads: one reads, one writes
Producer:
    while (true) {
        wait for connection;
        read from socket & parse URL;
        queue.put (URL);
    }
Consumer:
    while (true) {
        URL = queue.take();   // blocks until a URL is available
        look up URL contents in cache;
        if (!in cache) {
            fetch from disk / execute CGI;
            put in cache;
        }
        send data to client;
    }

Producer-Consumer Web Server
More parallelism: optimizes common case (cache hit)
Thread 1 (reads requests, feeds queue 1):
    while (true) {
        wait for connection;
        read from socket & parse URL;
        queue1.put (URL);
    }
Thread 2 (serves cache hits, feeds queue 2):
    while (true) {
        URL = queue1.take();
        look up URL contents in cache;
        if (!in cache) {
            queue2.put (URL);
            continue;   // miss: thread 3 sends the reply
        }
        send data to client;
    }
Thread 3 (handles cache misses):
    while (true) {
        URL = queue2.take();
        fetch from disk / execute CGI;
        put in cache;
        send data to client;
    }

When to Use Producer-Consumer
Works well for pairs of threads
  – Best if producer & consumer are symmetric
    • Proceed roughly at same rate
  – Order of operations matters
Not as good for
  – Many threads
  – Order doesn’t matter
  – Different rates of progress

Producer-Consumer Web Server
Should balance load across threads
  – The three-thread pipeline above gives each thread a fixed stage (network read, cache lookup, disk fetch), so a slow stage backs up its queue while the other threads wait

Bag of Tasks
Collection of mostly independent tasks
[Figure: an addWork thread adds tasks to a shared bag; worker threads remove and run them]
Bag could also be a LinkedBlockingQueue (put, take)

Exercise: Restructure into BOT
Re-structure this into bag of tasks:
  – addWork & worker threads
  – t = bag.take() or bag.put(t)
    while (true) {
        wait for connection;
        read from socket & parse URL;
        look up URL contents in cache;
        if (!in cache) {
            fetch from disk / execute CGI;
            put in cache;
        }
        send data to client;
    }

Exercise: Restructure into BOT
Re-structure this into bag of tasks:
  – addWork & worker
  – t = bag.take() or bag.put(t)
addWork:
    while (true) {
        wait for connection;
        read from socket & parse URL;
        t.URL = URL;
        t.sock = socket;
        bag.put (t);
    }
Worker:
    while (true) {
        t = bag.take();   // blocks until a task is available
        look up t.URL contents in cache;
        if (!in cache) {
            fetch from disk / execute CGI;
            put in cache;
        }
        send data to client via t.sock;
    }
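
One possible Java rendering of the addWork/worker split, offered as a sketch rather than the exercise's official solution. The names BagOfTasksSketch, Task, STOP, and run are hypothetical; the "socket" is modeled as an integer id, fetching is faked with a string, and a stop marker per worker is added so the example terminates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

class BagOfTasksSketch {
    // Mirrors the slide's t.URL and t.sock; sock is a made-up integer id.
    static final class Task {
        final String url;
        final int sock;
        Task(String url, int sock) { this.url = url; this.sock = sock; }
    }

    static final Task STOP = new Task("", -1); // stop marker (assumption)

    // Returns what was "sent" on each socket id.
    static Map<Integer, String> run(List<String> urls, int workers) {
        BlockingQueue<Task> bag = new LinkedBlockingQueue<>();
        Map<String, String> cache = new ConcurrentHashMap<>();
        Map<Integer, String> sent = new ConcurrentHashMap<>();

        List<Thread> pool = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        Task task = bag.take();          // blocks if bag is empty
                        if (task == STOP) break;
                        // Look up in cache; on a miss, "fetch" and cache it.
                        String contents = cache.computeIfAbsent(
                                task.url, u -> "contents of " + u);
                        sent.put(task.sock, contents);   // send via t.sock
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            pool.add(t);
        }

        // addWork: the calling thread plays this role here.
        try {
            int sock = 0;
            for (String url : urls) {
                bag.put(new Task(url, sock++));
            }
            for (int w = 0; w < workers; w++) bag.put(STOP);
            for (Thread t : pool) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sent;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("/a", "/b", "/a"), 3));
    }
}
```

Every worker runs the same loop, so adding workers needs no change to the task code; the shared bag is the only coordination point.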

Bag of Tasks Web Server
Re-structure this into bag of tasks:
  – t = bag.take() or bag.put(t)
addWork:
    while (true) {
        wait for connection;
        read from socket & parse URL;
        bag.put (URL);
    }
worker:
    while (true) {
        URL = bag.take();   // blocks until a task is available
        look up URL contents in cache;
        if (!in cache) {
            fetch from disk / execute CGI;
            put in cache;
        }
        send data to client;
    }
[Figure: one addWork thread and several worker threads sharing the bag]

Bag of Tasks vs. Prod/Consumer
Exploits more parallelism
Even with coarse-grained threads
  – Don’t have to break up tasks too finely
What does task size affect?
  – possibly latency… smaller might be better
Easy to change or add new functionality
But: one major performance problem…

What’s the Problem?
Contention: single lock on structure
  – Bottleneck to scalability

Work Queues
Each thread has own work queue (deque)
  – No single point of contention
Threads now generic “executors”
  – Tasks (balls): blue = parse, yellow = connect…
[Figure: four executor threads, each with its own deque of colored tasks]
Now what?

Work Stealing
When thread runs out of work, steal work from top of random deque
Optimal load balancing algorithm (provably within a constant factor of optimal, in expectation, for fork-join computations)
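
The stealing rule can be sketched in Java as below. This is a simplified illustration, not a production scheduler: WorkStealingSketch and run are hypothetical names, ConcurrentLinkedDeque stands in for a purpose-built work-stealing deque, tasks are just integer ids that spawn no further work, and idle workers spin. All tasks start on worker 0's deque, so the other workers make progress only by stealing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicIntegerArray;

class WorkStealingSketch {
    // Returns, for each task id, how many times it was executed.
    static int[] run(int workers, int tasks) {
        List<ConcurrentLinkedDeque<Integer>> deques = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            deques.add(new ConcurrentLinkedDeque<>());
        }
        for (int t = 0; t < tasks; t++) {
            deques.get(0).addLast(t);   // deliberately unbalanced start
        }

        AtomicInteger remaining = new AtomicInteger(tasks);
        AtomicIntegerArray runs = new AtomicIntegerArray(tasks);

        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            final int me = w;
            threads[w] = new Thread(() -> {
                while (remaining.get() > 0) {
                    // Own work comes from the bottom of this worker's deque.
                    Integer task = deques.get(me).pollLast();
                    if (task == null && workers > 1) {
                        // Out of work: pick a random other worker...
                        int victim = ThreadLocalRandom.current().nextInt(workers - 1);
                        if (victim >= me) victim++;   // skip ourselves
                        // ...and steal from the top of its deque.
                        task = deques.get(victim).pollFirst();
                    }
                    if (task != null) {
                        runs.incrementAndGet(task);   // the "work"
                        remaining.decrementAndGet();
                    } else {
                        Thread.yield();               // nothing found; retry
                    }
                }
            });
            threads[w].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        int[] result = new int[tasks];
        for (int i = 0; i < tasks; i++) result[i] = runs.get(i);
        return result;
    }

    public static void main(String[] args) {
        int[] r = run(4, 1000);
        System.out.println("tasks executed: " + r.length);
    }
}
```

Owners and thieves work at opposite ends of a deque, which keeps them out of each other's way most of the time. In practice, Java's ForkJoinPool provides a work-stealing scheduler, so hand-rolling one is rarely needed.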

Work Stealing Web Server
Re-structure: readURL, lookUp, addToCache, output
  – myQueue.put(new readURL (url))
    while (true) {
        wait for connection;
        read from socket & parse URL;
        look up URL contents in cache;
        if (!in cache) {
            fetch from disk / execute CGI;
            put in cache;
        }
        send data to client;
    }

readURL, lookUp, addToCache, output
    class Work {
    public:
        virtual void run() = 0;   // each task type overrides run()
        virtual ~Work() {}
    };

    class readURL : public Work {
    public:
        void run() {…}
        readURL (socket s) {…}
    };

[Figure: a worker executing readURL, lookUp, addToCache, and output tasks]

    class readURL : public Work {
    public:
        readURL (socket s) { _s = s; }
        void run() {
            read from socket, f = get file
            myQueue.put (new lookUp(_s, f));
        }
    private:
        socket _s;   // connection this task reads from
    };
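
    class lookUp : public Work {
    public:
        lookUp (socket s, string f) { _s = s; _f = f; }
        void run() {
            look in cache for file “f”
            if (!found)
                myQueue.put (new addToCache(_s, _f));   // miss: fetch, then output
            else
                myQueue.put (new output(_s, cont));     // hit: cont = cached contents
        }
    private:
        socket _s;
        string _f;
    };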
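
    class addToCache : public Work {
    public:
        // constructor not shown on the slide; assumed to mirror lookUp
        addToCache (socket s, string f) { _s = s; _f = f; }
        void run() {
            fetch file f from disk into cont
            add file to cache (hashmap)
            myQueue.put (new output(_s, cont));
        }
    private:
        socket _s;
        string _f;
    };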

Work Stealing Web Server
Re-structure: readURL, lookUp, addToCache, output
  – myQueue.put(new readURL (url))
    readURL(url) {
        wait for connection;
        read from socket & parse URL;
        myQueue.put (new lookUp (URL));
    }
    lookUp(url) {
        look up URL contents in cache;
        if (!in cache) {
            myQueue.put (new addToCache (URL));
        } else {
            myQueue.put (new output(contents));
        }
    }
    addToCache(URL) {
        fetch from disk / execute CGI;
        put in cache;
        myQueue.put (new output(contents));
    }
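
As an illustration of how the four task types chain together, here is a single-executor sketch in Java. It is not the slides' code: TaskChainSketch, serve, and log are hypothetical, the class names ReadURL, LookUp, AddToCache, and Output mirror the slide's readURL, lookUp, addToCache, and output, sockets are integer ids, and "reading" a request just trims a string. A real work-stealing server would run one such loop per executor, each on its own deque.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TaskChainSketch {
    interface Work { void run(); }

    final Deque<Work> myQueue = new ArrayDeque<>();
    final Map<String, String> cache = new HashMap<>();
    final List<String> log = new ArrayList<>();

    class ReadURL implements Work {
        final int sock; final String raw;
        ReadURL(int sock, String raw) { this.sock = sock; this.raw = raw; }
        public void run() {
            String url = raw.trim();              // stand-in for parsing
            log.add("readURL " + url);
            myQueue.addLast(new LookUp(sock, url));
        }
    }

    class LookUp implements Work {
        final int sock; final String url;
        LookUp(int sock, String url) { this.sock = sock; this.url = url; }
        public void run() {
            String contents = cache.get(url);
            log.add("lookUp " + url + (contents == null ? " miss" : " hit"));
            if (contents == null) myQueue.addLast(new AddToCache(sock, url));
            else myQueue.addLast(new Output(sock, contents));
        }
    }

    class AddToCache implements Work {
        final int sock; final String url;
        AddToCache(int sock, String url) { this.sock = sock; this.url = url; }
        public void run() {
            String contents = "contents of " + url;   // stand-in for disk/CGI
            cache.put(url, contents);
            log.add("addToCache " + url);
            myQueue.addLast(new Output(sock, contents));
        }
    }

    class Output implements Work {
        final int sock; final String contents;
        Output(int sock, String contents) { this.sock = sock; this.contents = contents; }
        public void run() { log.add("output " + sock + ": " + contents); }
    }

    // Enqueue one ReadURL per request, then drain the queue in FIFO order.
    List<String> serve(String... rawRequests) {
        int sock = 0;
        for (String raw : rawRequests) myQueue.addLast(new ReadURL(sock++, raw));
        Work w;
        while ((w = myQueue.pollFirst()) != null) w.run();
        return log;
    }

    public static void main(String[] args) {
        TaskChainSketch s = new TaskChainSketch();
        s.serve(" /a ");
        s.serve("/a");
        for (String line : s.log) System.out.println(line);
    }
}
```

Each task does one step and enqueues its successor, so the executor loop never needs to know what kind of work it is running.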

Work Stealing
Works great for heterogeneous tasks
  – Convert addWork and worker into units of work (different colors)
Flexible: can easily re-define tasks
  – Coarse, fine-grained, anything in-between
Automatic load balancing
Separates thread logic from functionality
Popular model for structuring servers
The End