[greach 17] make concurrency groovy again

Post on 11-Apr-2017


CONCURRENCY GROOVY

Groovy Developer

Functional fan

NOT a Trump supporter

Alonso Torres @alotor

KALEIDOS http://kaleidos.net/

Actors

Data Structures

Go channels

GPU parallelism

Event loop

???

Leverages JVM ecosystem

Static & dynamic typing

Metaprogramming & DSL’s

GPars

CONCURRENCY TESTS

Design

Steal

Building

Testing

Design

1. Performance

Your problems:

Parallelization


1. Performance

2. Feedback

Your problems:

Asynchronous process

1. Slow

2. Feedback

3. Resources

Your problems:

Distributed processing


Performance

Feedback

Resources

Parallel

Asynchronous

Distributed

1. Non-determinism

2. Correctness

3. Coordination

4. Performance

5. Access

6. Scoping

7. Communication

8. System resources

Performance

Non-determinism

Correctness

Coordination

Locking

Visibility

Communication

System resources

Performance

Feedback

Resources

Some people, when confronted with a problem, think "I know, I'll use regular expressions."

Now they have two problems.

- Jamie Zawinski

CONCURRENCY

8 concurrently

Alonso Torres

Building

● Threads

● Functional style programming

● Parallel collections

● Fork/join

● Actors

Programming models

n! = 1 * 2 * 3 * … * n

n     n!
5     120
10    3,628,800
15    1,307,674,368,000
20    2,432,902,008,176,640,000
25    15,511,210,043,330,985,984,000,000

402,387,260,077,093,773,543,702,433,923,003,985,719,374,864,210,714,632,543,799,910,429,938,512,398,629,020,592,044,208,486,969,404,800,479,988,610,197,196,058,631,666,872,994,808,558,901,323,829,669,944,590,997,424,504,087,073,759,918,823,627,727,188,732,519,779,505,950,995,276,120,874,975,462,497,043,601,418,278,094,646,496,291,056,393,887,437,886,487,337,119,181,045,825,783,647,849,977,012,476,632,889,835,955,735,432,513,185,323,958,463,075,557,409,114,262,417,474,349,347,553,428,646,576,611,667,797,396,668,820,291,207,379,143,853,719,588,249,808,126,867,838,374,559,731,746,136,085,379,534,524,221,586,593,201,928,090,878,297,308,431,392,844,403,281,231,558,611,036,976,801,357,304,216,168,747,609,675,871,348,312,025,478,589,320,767,169,132,448,426,236,131,412,508,780,208,000,261,683,151,027,341,827,977,704,784,635,868,170,164,365,024,153,691,398,281,264,810,213,092,761,244,896,359,928,705,114,964,975,419,909,342,221,566,832,572,080,821,333,186,116,811,553,615,836,546,984,046,708,975,602,900,950,537,616,475,847,728,421,889,679,646,244,945,160,765,353,408,198,901,385,442,487,984,959,953,319,101,723,355,556,602,139,450,399,736,280,750,137,837,615,307,127,761,926,849,034,352,625,200,015,888,535,147,331,611,702,103,968,175,921,510,907,788,019,393,178,114,194,545,257,223,865,541,461,062,892,187,960,223,838,971,476,088,506,276,862,967,146,674,697,562,911,234,082,439,208,160,153,780,889,893,964,518,263,243,671,616,762,179,168,909,779,911,903,754,031,274,622,289,988,005,195,444,414,282,012,187,361,745,992,642,956,581,746,628,302,955,570,299,024,324,153,181,617,210,465,832,036,786,906,117,260,158,783,520,751,516,284,225,540,265,170,483,304,226,143,974,286,933,061,690,897,968,482,590,125,458,327,168,226,458,066,526,769,958,652,682,272,807,075,781,391,858,178,889,652,208,164,348,344,825,993,266,043,367,660,176,999,612,831,860,788,386,150,279,465,955,131,156,552,036,093,988,180,612,138,558,600,301,435,694,527,224,206,344,631,797,460,594,682,573,103,790,084,024,432,438,465,657,
245,014,402,821,885,252,470,935,190,620,929,023,136,493,273,497,565,513,958,720,559,654,228,749,774,011,413,346,962,715,422,845,862,377,387,538,230,483,865,688,976,461,927,383,814,900,140,767,310,446,640,259,899,490,222,221,765,904,339,901,886,018,566,526,485,061,799,702,356,193,897,017,860,040,811,889,729,918,311,021,171,229,845,901,641,921,068,884,387,121,855,646,124,960,798,722,908,519,296,819,372,388,642,614,839,657,382,291,123,125,024,186,649,353,143,970,137,428,531,926,649,875,337,218,940,694,281,434,118,520,158,014,123,344,828,015,051,399,694,290,153,483,077,644,569,099,073,152,433,278,288,269,864,602,789,864,321,139,083,506,217,095,002,597,389,863,554,277,196,742,822,248,757,586,765,752,344,220,207,573,630,569,498,825,087,968,928,162,753,848,863,396,909,959,826,280,956,121,450,994,871,701,244,516,461,260,379,029,309,120,889,086,942,028,510,640,182,154,399,457,156,805,941,872,748,998,094,254,742,173,582,401,063,677,404,595,741,785,160,829,230,135,358,081,840,096,996,372,524,230,560,855,903,700,624,271,243,416,909,004,153,690,105,933,983,835,777,939,410,970,027,753,472,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000

1000! =

Commutative: we don’t care about the order

Associative: split and combine the tasks

a * b * c * d =

(a * b) * (c * d) =

d * c * b * a

Binary operation: (Int, Int) → Int
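As a rough sketch of what these two properties buy us (the split point below is arbitrary and only for illustration), a product can be broken into independent sub-products that are recombined in either order:

```groovy
// Serial product of a closed range; 1G is a BigInteger literal, so nothing overflows.
BigInteger product(int from, int to) {
    (from..to).inject(1G) { acc, n -> acc * n }
}

def whole = product(1, 10)
def left  = product(1, 5)     // could run on one thread
def right = product(6, 10)    // could run on another

// Associative: grouping does not matter. Commutative: order does not matter.
assert left * right == whole
assert right * left == whole
```

Any of the parallel approaches in the rest of the talk rely on this: the partial results can be computed anywhere and merged in whatever order they arrive.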

def factorial(def num) {
    def result = 1g
    (1..num).each {
        result = result * it
    }
    return result
}

BigInteger literals

Method 10,000! 25,000! 100,000! 1,000,000!

Serial 0.06 0.25 3.44 423.51

Parallel

Asynchronous

Distributed

Performance

Non-determinism

Correctness

Coordination

Locking

Visibility

Communication

System resources

● Threads

● Functional style programming

● Parallel collections

● Fork/join

● Actors

Programming models

Show me the code!! https://goo.gl/Jbx03X

● Threads

● Functional style programming

● Parallel collections

● Fork/join

● Actors

Programming models

○ Basic unit of concurrency for the JVM

○ Managed by the OS

○ There is always at least 1 thread (main thread)

Threads

def factorial(def num) {
    def result = 1g
    def ts = []
    (1 .. num).each { n ->
        ts << Thread.start {
            result = result * n
        }
    }
    ts*.join()
    return result
}

Thread creation

Wait to finish

Store the threads

[Diagram: Main thread spawning T1, T2, T3, T4, ...]

Parallel

Asynchronous

Distributed

Performance

Non-determinism

Correctness

Coordination

Locking

Visibility

Communication

System resources


Shared memory

Coordination

Communication

Access to shared memory

factorial(10) =

3628800
3628800
3628800
172800
3628800
3628800
103680

Parallel

Asynchronous

Distributed

Performance

Non-determinism

Correctness

Coordination

Locking

Visibility

Communication

[Diagram: Main thread spawning T1, T2, T3, T4, ...]

def factorial(def num) {
    def result = 1g
    def ts = []
    (1 .. num).each { n ->
        ts << Thread.start {
            synchronized(this) {
                result = result * n
            }
        }
    }
    ts*.join()
    return result
}

Mutex: Explicit locking

Method 10,000! 25,000! 100,000! 1,000,000!

Serial 0.06 0.25 3.44 423.51

Threads 0.67 1.65 5.28 ERROR

java.util.concurrent.*

○ Locks

○ Executors

○ Thread Pools

○ Thread-safe Collections

○ Atomic variables


def factorial(def num) {
    def result = new AtomicReference(1g)
    def ts = []
    (1 .. num).each { n ->
        ts << Thread.start {
            result = result * n
        }
    }
    ts*.join()
    return result.get()
}

No locking!! Just a bit of Groovy...

Concurrent util

AtomicReference.metaClass.multiply = { val ->
    def old = delegate.get()
    while(!delegate.compareAndSet(old, old * val)) {
        old = delegate.get()
    }
    return delegate
}

Add multiplication to metaclass

Retries until it’s allowed to change
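Since Java 8, `AtomicReference` also ships `accumulateAndGet`, which runs the same compare-and-set retry loop internally. The following is a minimal sketch of the threaded factorial written with it instead of the metaclass extension; it is an alternative, not the version from the talk, and the `multiply` name is made up here:

```groovy
import java.util.concurrent.atomic.AtomicReference
import java.util.function.BinaryOperator

BigInteger factorial(int num) {
    def result = new AtomicReference<BigInteger>(1G)
    // Must be side-effect free: it may be re-applied when another thread wins the race.
    def multiply = { BigInteger a, BigInteger b -> a * b } as BinaryOperator<BigInteger>

    def threads = (1..num).collect { n ->
        Thread.start {
            // Retries the compare-and-set internally until the update is applied.
            result.accumulateAndGet(n as BigInteger, multiply)
        }
    }
    threads*.join()
    return result.get()
}

assert factorial(10) == 3628800G
```

This still starts one thread per number, so it shares the resource problem discussed next; it only removes the need to patch the metaclass.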


Safe write to atomic variable

Parallel

Asynchronous

Distributed

Performance

Non-determinism

Correctness

Coordination

Locking

Visibility

Communication

System resources


Potentially thousands of threads

def factorial(def num) {
    def result = new AtomicReference(1g)
    def threadPool = Executors.newFixedThreadPool(10)
    def fs = []
    (1 .. num).each { n ->
        fs << threadPool.submit {
            result = result * n
        }
    }
    fs*.get()
    threadPool.shutdown()   // release the pool's threads once every task is done
    return result.get()
}

The thread pool will reuse system threads when we’re done

We’re sending a task that will be completed eventually

Method 10,000! 25,000! 100,000! 1,000,000!

Serial 0.06 0.25 3.44 423.51

Threads 0.67 1.65 5.28 ERROR

Thread + pool 0.14 0.65 10.45 439.29


1-1 thread to task ratio

Thread-1: 1 | Thread-2: 2 | Thread-3: 3 | Thread-4: 4 | Thread-1: 5 | Thread-2: 6 | Thread-3: 7 | Thread-4: 8 | Thread-1: 9 | Thread-2: 10

Thread-1: 1 2 3 4 5

Thread-2: 6 7 8 9 10

def product(from, to) {
    def result = 1g
    (from .. to).each { n ->
        result = result * n
    }
    return result
}

int batches = batchesForNum(number)
(0 ..< batches).each { batch ->
    fs << pool.submit {
        def from = batchFrom(number, batch)
        def to = batchTo(number, batch)
        def current = product(from, to)
        result = result * current
    }
}

Divide into batches

Each batch will be a Thread

“product” is serial
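The helpers `batchesForNum`, `batchFrom` and `batchTo` are not shown on the slides. A possible sketch of the batching idea, assuming fixed-size batches (the `batchSize` and `poolSize` defaults are invented for the example) and having each task return its partial product rather than write to a shared variable:

```groovy
import java.util.concurrent.Callable
import java.util.concurrent.Executors

// Serial product of one batch.
BigInteger product(int from, int to) {
    (from..to).inject(1G) { acc, n -> acc * n }
}

// batchSize and poolSize are illustrative defaults, not values from the talk.
BigInteger batchedFactorial(int number, int batchSize = 1000, int poolSize = 4) {
    def pool = Executors.newFixedThreadPool(poolSize)
    try {
        // Round up so the last, possibly shorter, batch is included.
        int batches = (number + batchSize - 1).intdiv(batchSize)
        def futures = (0..<batches).collect { batch ->
            int from = batch * batchSize + 1
            int to = Math.min(from + batchSize - 1, number)
            pool.submit({ product(from, to) } as Callable<BigInteger>)
        }
        // Combine the partial products on the calling thread: no shared mutable state.
        return futures.inject(1G) { acc, f -> acc * f.get() }
    } finally {
        pool.shutdown()
    }
}

assert batchedFactorial(10, 3, 2) == 3628800G
```

Returning values through futures sidesteps the atomic variable altogether; whether that is faster than the slide's version would need to be measured.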

Thread-1: 1 2 3 4 5

Thread-2: 6 7 8 9 10

Method 10,000! 25,000! 100,000! 1,000,000!

Serial 0.06 0.25 3.44 423.51

Threads 0.67 1.65 5.28 ERROR

Thread + pool 0.14 0.65 10.45 439.29

Thread + batch 0.04 0.14 0.70 69.78

Shared mutable state

○ The threads are competing to write/read

○ Mutex = “safe zone” for accessing

○ Non-determinism, performance problems

○ Memory-wise it is good

● Threads

● Functional style programming

● Parallel collections

● Fork/join

● Actors

Programming models

Functional style programming

○ Pass functions to data

○ Describe the task to do

○ Implementation will manage parallelization

○ Immutable data

java.util.stream.*

def factorial(number) {
    (1g .. number)
        .stream()
        .parallel()
        .reduce { a, b -> a * b }
        .get()
}

Create a stream from a range

Do it in parallel

We can reduce because it’s an associative function
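A variation on the same idea, offered as a sketch rather than the talk's code: generating the numbers with `LongStream.rangeClosed` and passing an identity value to `reduce`, so the result is a plain `BigInteger` instead of an `Optional`. The explicit `as` coercions are there only to make the target functional interfaces obvious.

```groovy
import java.util.function.BinaryOperator
import java.util.function.LongFunction
import java.util.stream.LongStream

BigInteger factorial(long number) {
    LongStream.rangeClosed(1, number)
        .parallel()
        // Box each long into a BigInteger before multiplying.
        .mapToObj({ long n -> BigInteger.valueOf(n) } as LongFunction<BigInteger>)
        // ONE is the identity of multiplication, so an empty range yields 1.
        .reduce(BigInteger.ONE, { BigInteger a, BigInteger b -> a * b } as BinaryOperator<BigInteger>)
}

assert factorial(10) == 3628800G
```

The reduction is only safe to parallelize because multiplication is associative and `BigInteger.ONE` is a true identity for it.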

Functional style programming

○ You have to get used to it

○ Harder to distribute (wait for it…)

○ Higher level means less flexibility

Method 10,000! 25,000! 100,000! 1,000,000!

Serial 0.06 0.25 3.44 423.51

Threads 0.67 1.65 5.28 ERROR

Thread + pool 0.14 0.65 10.45 439.29

Thread + batch 0.04 0.14 0.70 69.78

Streams 0.18 0.09 1.05 47.22

How are you doing?

○ Parallel Collections

○ Asynchronous Processing

○ Fork/Join

○ Actors

○ Communicating Sequential Processes (CSP)

○ Dataflow

● Threads

● Functional style programming

● Parallel collections

● Fork/join

● Actors

Programming models

def factorial(def num) {
    def result = new AtomicReference(1g)
    GParsPool.withPool(10) {
        (1..num).eachParallel {
            result = result * it
        }
    }
    return result.get()
}

Parallel collections

Shared state :(((

Abstraction over thread pool

Method 10,000! 25,000! 100,000! 1,000,000!

Serial 0.06 0.25 3.44 423.51

Threads 0.67 1.65 5.28 ERROR

Thread + pool 0.14 0.65 10.45 439.29

Thread + batch 0.04 0.14 0.70 69.78

Streams 0.18 0.09 1.05 47.22

GPars - Collections 0.10 0.16 0.72 161.17

● Threads

● Functional style programming

● Parallel collections

● Fork/join

● Actors

Programming models

Fork / Join

○ Divide and conquer approach

○ Java’s ForkJoinTask wrapper

○ No shared state: functional-ish approach

○ Divide and conquer

[Diagram: the range 1..100 is halved into 1..49 and 50..100, then into 1 - 24, 25 - 49, 50 - 74 and 75 - 100; the partial products are multiplied together to give 100!]

def factorialForkJoin(def num) {
    withPool(10) {
        runForkJoin(1, num) { from, to ->
            if (to - from < 1000) {
                return product(from, to)
            } else {
                def half = from + ((to - from) / 2) as BigInteger
                forkOffChild(from, half)
                forkOffChild(half + 1, to)
                def (a, b) = getChildrenResults()
                return a * b
            }
        }
    }
}

Recursive declaration

Base case: serial product

Concurrent child processing
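GPars' `runForkJoin` builds on the JDK's fork/join framework. For comparison, here is a sketch of the same recursion written directly against `java.util.concurrent.RecursiveTask`, without GPars; the `ProductTask` class name and the pool size of 4 are assumptions made for the example:

```groovy
import java.util.concurrent.ForkJoinPool
import java.util.concurrent.RecursiveTask

class ProductTask extends RecursiveTask<BigInteger> {
    static final int THRESHOLD = 1000   // same cut-off as on the slide
    final int from
    final int to

    ProductTask(int from, int to) {
        this.from = from
        this.to = to
    }

    @Override
    protected BigInteger compute() {
        if (to - from < THRESHOLD) {
            // Base case: small enough to multiply serially.
            return (from..to).inject(1G) { acc, n -> acc * n }
        }
        int half = from + (to - from).intdiv(2)
        def left = new ProductTask(from, half)
        def right = new ProductTask(half + 1, to)
        left.fork()                            // run the left half asynchronously
        return right.compute() * left.join()   // do the right half here, then combine
    }
}

BigInteger factorialForkJoin(int num) {
    def pool = new ForkJoinPool(4)
    try {
        return pool.invoke(new ProductTask(1, num))
    } finally {
        pool.shutdown()
    }
}

assert factorialForkJoin(10) == 3628800G
```

Like the slide's version, this assumes `num` is at least 1.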


Method 10,000! 25,000! 100,000! 1,000,000!

Serial 0.06 0.25 3.44 423.51

Threads 0.67 1.65 5.28 ERROR

Thread + pool 0.14 0.65 10.45 439.29

Thread + batch 0.04 0.14 0.70 69.78

Streams 0.18 0.09 1.05 47.22

GPars - Collections 0.10 0.16 0.72 161.17

GPars - Fork / Join 0.03 0.04 0.21 5.17

Fork / Join

○ Very fast!

○ Running over Java’s API

○ Not suited for every problem

○ We need more flexible approach

● Threads

● Functional style programming

● Parallel collections

● Fork/join

● Actors

Programming models

Actors

○ Asynchronous processes

○ Encapsulates internal state

○ Coordination by immutable messages

○ Model easily distributable
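To make the mailbox idea concrete before the GPars version, here is a hand-rolled stand-in built only on the JDK: one worker thread, one queue acting as its mailbox, and state that nobody else writes to. It is a sketch of the concept, not how GPars implements actors, and the `STOP` sentinel is an invention of this example.

```groovy
import java.util.concurrent.LinkedBlockingQueue

def STOP = new Object()                  // hypothetical "poison pill" message
def mailbox = new LinkedBlockingQueue()  // the actor's mailbox
BigInteger total = 1G                    // written only by the worker thread

def worker = Thread.start {
    while (true) {
        def msg = mailbox.take()         // blocks until a message arrives
        if (msg.is(STOP)) break
        total *= msg                     // messages are immutable BigIntegers
    }
}

// Senders never touch `total`; they only post messages.
(1..10).each { mailbox.put(it as BigInteger) }
mailbox.put(STOP)
worker.join()                            // after join, reading `total` is safe

assert total == 3628800G
```

Because messages are processed one at a time, the multiplication needs no lock or atomic variable.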

[Diagram: Main and actors C1 to C6, each with its own mailbox]

def factorial(def num) {
    def result = 0
    def coordinator = actor {
        spawnCalculator().send([1g, num])
        react { msg ->
            result = msg
        }
    }
    coordinator.join()
    return result
}

Owned by the “coordinator”

Starts a calculator with the whole calculation

Waits for the answer

Synchronizes with the main thread

def spawnCalculator() {
    actor {
        react { msg ->
            def (from, to) = msg
            def origin = sender
            if (to - from < 1000) {
                reply product(from, to)
            } else {
                ....
            }
        }
    }
}

Waits for requests

Serial “base” case

Next slide

def half = from + ((to - from) / 2) as BigInteger
def child1 = spawnCalculator()
def child2 = spawnCalculator()
child1.send([from, half])
child2.send([half+1, to])
react { a ->
    react { b ->
        origin.send(a * b)
    }
}

Splits in half and “delegates” to its children

Waits for the children response

Method 10,000! 25,000! 100,000! 1,000,000!

Serial 0.06 0.25 3.44 423.51

Threads 0.67 1.65 5.28 ERROR

Thread + pool 0.14 0.65 10.45 439.29

Thread + batch 0.04 0.14 0.70 69.78

Streams 0.18 0.09 1.05 47.22

GPars - Collections 0.10 0.16 0.72 161.17

GPars - Fork / Join 0.03 0.04 0.21 5.17

GPars - Actors 0.01 0.03 0.14 3.65

“Stealing”

Clojure

Clojure

○ Functional language for the JVM

○ Great data structures with concurrency in mind

○ But… at the end of the day it is still bytecode

def factorial(def num) {
    def result = new Atom(1g)
    GParsPool.withPool(10) {
        (1..num).eachParallel { current ->
            result.swap({
                value -> value * current
            } as IFn)
        }
    }
    return result.deref()
}

The atom “owns” the value

We send the operation we want to use as a function

Coerce Clojure’s function

Method 10,000! 25,000! 100,000! 1,000,000!

Serial 0.06 0.25 3.44 423.51

Threads 0.67 1.65 5.28 ERROR

Thread + pool 0.14 0.65 10.45 439.29

Thread + batch 0.04 0.14 0.70 69.78

Streams 0.18 0.09 1.05 47.22

GPars - Collections 0.10 0.16 0.72 161.17

GPars - Fork / Join 0.03 0.04 0.21 5.17

GPars - Actors 0.01 0.03 0.14 3.65

Threads + Atom 0.03 0.05 0.64 175.81

Akka

○ Concurrency toolkit for Scala

○ Emphasis on actor-based model

○ Distribution through actor remoting

○ Also has Java bindings

def calculate(def number) {
    def system = ActorSystem.create("Factorial")
    def inbox = Inbox.create(system)
    def calculator = system.actorOf(Props.create(CalculatorActor))
    inbox.send(calculator, new StartCalculation(1, number))
    def result = inbox.receive(Duration.create("1 minute"))
    system.shutdown()
    return result.value
}

Starts a calculator with the whole calculation

Waits for the answer

class CalculatorActor extends AbstractActor {
    CalculatorActor() {
        receive(
            match(StartCalculation) { msg ->
                ...
            }
            .match(Result, { this.firstResult == -1 }) {
                ...
            }
            .match(Result, { this.firstResult != -1 }) {
                ...
            }
            .build()
        )
    }
}

match(StartCalculation) {
    if (to - from < 1000) {
        def resultMsg = new Result(product(from, to))
        this.origin.tell(resultMsg, self())
        context.stop(self())
    } else {
        def half = from + ((to - from) / 2)
        def child1 = context.actorOf(Props.create(CalculatorActor))
        child1.tell(new StartCalculation(from, half), self())
        def child2 = context.actorOf(Props.create(CalculatorActor))
        child2.tell(new StartCalculation(half+1, to), self())
    }
}

Send message

Terminates execution

Spawns new actor

Method 10,000! 25,000! 100,000! 1,000,000!

Serial 0.06 0.25 3.44 423.51

Threads 0.67 1.65 5.28 ERROR

Thread + pool 0.14 0.65 10.45 439.29

Thread + batch 0.04 0.14 0.70 69.78

Streams 0.18 0.09 1.05 47.22

GPars - Collections 0.10 0.16 0.72 161.17

GPars - Fork / Join 0.03 0.04 0.21 5.17

GPars - Actors 0.01 0.03 0.14 3.65

Threads + Atom 0.03 0.05 0.64 175.81

Akka actors 0.20 0.17 1.48 179.67

○ Very big community

○ Better error handling support

○ More abstractions: routing, schedulers, ….

○ Persistence

○ Fault tolerance

○ Distributed actors

Akka’s strong suit

○ Distributed functional-style programming

○ Java & Scala

○ We can use the Java bindings to access from Groovy!

Spark

○ Resilient Distributed Dataset

○ Abstraction to work with distributed collections

○ “Spark’s streams”

RDD’s

def conf = new SparkConf()
    .setMaster("local[8]")
    .setAppName("FactorialSpark")

this.sparkContext = new JavaSparkContext(conf)

Set the master node (here local)

BigInteger calculate(BigInteger number) {
    def result = this.sparkContext
        .parallelize(1g .. number)
        .reduce({ a, b -> a * b }.dehydrate())
    return result
}

Create an RDD with our set

Same as with streams

“dehydrate” needed to serialize the closure

Method 10,000! 25,000! 100,000! 1,000,000!

Serial 0.06 0.25 3.44 423.51

Threads 0.67 1.65 5.28 ERROR

Thread + pool 0.14 0.65 10.45 439.29

Thread + batch 0.04 0.14 0.70 69.78

Streams 0.18 0.09 1.05 47.22

GPars - Collections 0.10 0.16 0.72 161.17

GPars - Fork / Join 0.03 0.04 0.21 5.17

GPars - Actors 0.01 0.03 0.14 3.65

Threads + Atom 0.03 0.05 0.64 217.59

Akka actors 0.20 0.17 1.48 220.4

Spark 0.18 0.24 0.84 34.50

Tests

def uploader = new UploaderService('...')
uploader.start { msg ->
    println ">>> ${msg.progress}"
}

void "Failing Test"() {
    given:
    def uploader = new UploaderService("file")

    when:
    uploader.start()

    then:
    uploader.isDone
}

Asynchronous testing

○ BlockingVariable

void "BlockingVariable Sample"() {
    given:
    def isDone = new BlockingVariable()

    and:
    def uploader = new UploaderService("file") {
        void setIsDone(boolean v) {
            isDone.set(v)
        }
    }

    when:
    uploader.start()

    then:
    isDone.get()
}

Will set the blocking variable when it is done

Will block until finished

Asynchronous testing

○ BlockingVariable

○ AsyncConditions

void "AsyncConditions Sample"() {
    given:
    def conditions = new AsyncConditions()

    and:
    def uploader = new UploaderService("file")

    when:
    uploader.start { msg ->
        conditions.evaluate {
            assert msg.type == NotificationType.START || msg.progress != 0
        }
    }

    then:
    conditions.await()
}

Assertion in asynchronous block

Blocks until the evaluate is resolved

Asynchronous testing

○ BlockingVariable

○ AsyncConditions

○ PollingConditions

void "PollingConditions Sample"() {
    given:
    def conds = new PollingConditions()

    and:
    def uploader = new UploaderService("file")

    when:
    uploader.start()

    then:
    conds.eventually {
        assert uploader.isDone && uploader.progress == 100
    }
}

Will try the condition until it passes or times out
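The retry behaviour can be pictured with a few lines of plain Groovy. This is a hypothetical helper written for illustration, not Spock's implementation; the name `eventually` is borrowed from the slide, and the millisecond parameters are assumptions of this sketch.

```groovy
import java.util.concurrent.atomic.AtomicBoolean

// Re-run `condition` until it stops failing or `timeoutMillis` has elapsed.
void eventually(long timeoutMillis, long delayMillis, Closure condition) {
    long deadline = System.currentTimeMillis() + timeoutMillis
    while (true) {
        try {
            condition()
            return                                   // condition passed
        } catch (AssertionError e) {
            if (System.currentTimeMillis() >= deadline) throw e
            Thread.sleep(delayMillis)                // wait a bit, then try again
        }
    }
}

// Usage: a flag flipped by a background thread after a short delay.
def done = new AtomicBoolean(false)
Thread.start {
    Thread.sleep(200)
    done.set(true)
}
eventually(2000, 50) { assert done.get() }
```

The test thread keeps checking from the outside, so the code under test does not need a callback or a hook the way the `BlockingVariable` and `AsyncConditions` samples do.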

Leverages JVM ecosystem

Static & dynamic typing

Metaprogramming & DSL’s

GPars

Parallel

Asynchronous

Distributed

Performance

Non-determinism

Correctness

Coordination

Locking

Visibility

Communication

System resources

● Threads

● Functional style programming

● Parallel collections

● Fork/join

● Actors

Programming models

Clojure

Actors

Core.async

Go channels

GPU parallelism

Event loop

Plain awesome!

Alonso Torres @alotor

Show me the code!! https://goo.gl/Jbx03X

Attributions

Pencil icon: Created by Souvik Bhattacharjee from the Noun Project

Thief icon: Created by Gregor Cresnar from the Noun Project

Block icon: Created by mikicon from the Noun Project

Checklist icon: Created by Delwar Hossain from the Noun Project

Gui icon: Created by Ralf Schmitzer from the Noun Project
