Concurrency scalability

Mårten Rånge WCOM AB @marten_range

Post on 11-Jun-2015

DESCRIPTION

Herb Sutter (GotW.ca) says that the concept of concurrency is easier to understand if it is split into three sub-concepts: scalability, responsiveness and consistency. This presentation is the first of three covering these concepts, starting off with everyone’s favorite: scalability, i.e. splitting a CPU-bound problem across several cores in order to solve it faster. I will show what tools .NET offers, but also the performance pitfalls that arise from an escalating problem that has plagued computer architecture for the last 20 years.

TRANSCRIPT

Page 1: Concurrency scalability

Mårten Rånge
WCOM AB

@marten_range

Page 2: Concurrency scalability

Concurrency
Examples for .NET

Page 3: Concurrency scalability
Page 4: Concurrency scalability

Responsive

Page 5: Concurrency scalability

Performance
Scalable algorithms

Page 6: Concurrency scalability

Three pillars of Concurrency

Scalability (CPU): Parallel.For

Responsiveness: Task/Future, async/await

Consistency: lock/synchronized, Interlocked.*, Mutex/Event/Semaphore, Monitor

Page 7: Concurrency scalability

Scalability

Page 8: Concurrency scalability
Page 9: Concurrency scalability

Which is fastest?

var ints = new int[InnerLoop];
var random = new Random();
for (var inner = 0; inner < InnerLoop; ++inner)
{
    ints[inner] = random.Next();
}
// ------------------------------------------------
var ints = new int[InnerLoop];
var random = new Random();
Parallel.For(
    0,
    InnerLoop,
    i => ints[i] = random.Next()
    );

Page 10: Concurrency scalability

SHARED STATE: Race condition

(Same code as on page 9. In the Parallel.For version every iteration calls random.Next() on one shared Random instance, which is not thread-safe.)

Page 11: Concurrency scalability

SHARED STATE: Poor performance

(Same code as on page 9.)

Page 12: Concurrency scalability
Page 13: Concurrency scalability

Then and now

Metric          VAX-11/750 (’80)   Today                Improvement
MHz             6                  3300                 550x
Memory MB       2                  16384                8192x
Memory MB/s     13                 R ~10000, W ~2500    770x (R), 190x (W)

Page 14: Concurrency scalability

Then and now

Metric          VAX-11/750 (’80)   Today                Improvement
MHz             6                  3300                 550x
Memory MB       2                  16384                8192x
Memory MB/s     13                 R ~10000, W ~2500    770x (R), 190x (W)
Memory nsec     225                70                   3x

Page 15: Concurrency scalability

Then and now

Metric          VAX-11/750 (’80)   Today                Improvement
MHz             6                  3300                 550x
Memory MB       2                  16384                8192x
Memory MB/s     13                 R ~10000, W ~2500    770x (R), 190x (W)
Memory nsec     225                70                   3x
Memory cycles   1.4                210                  -150x
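
The slide does not show how the "Memory cycles" row is obtained. A plausible reading, stated here as an assumption, is that it is the memory latency multiplied by the clock frequency, i.e. how many cycles the CPU waits for one memory access:

```latex
\[
N_{\text{cycles}} = t_{\text{latency}} \cdot f_{\text{clock}}
\]
\[
N_{\text{VAX}} = 225 \times 10^{-9}\,\text{s} \cdot 6 \times 10^{6}\,\text{Hz} = 1.35 \approx 1.4
\]
\[
N_{\text{today}} = 70 \times 10^{-9}\,\text{s} \cdot 3.0 \times 10^{9}\,\text{Hz} = 210
\qquad
\frac{210}{1.4} = 150
\]
```

Under this reading the slide's 210 corresponds to a clock of about 3000 MHz; with the 3300 MHz from the MHz row the same formula gives 231 cycles. Either way, a memory access costs roughly 150 times more cycles than it did on the VAX.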

Page 16: Concurrency scalability

299,792,458 m/s

Page 17: Concurrency scalability
Page 18: Concurrency scalability

Speed of light is too slow

Page 19: Concurrency scalability

0.09 m per clock cycle
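
The figure follows from dividing the speed of light by the clock frequency. The sketch below assumes the 3300 MHz "Today" clock from the earlier table; the slide itself does not say which frequency was used:

```latex
\[
d = \frac{c}{f_{\text{clock}}}
  = \frac{299\,792\,458\ \text{m/s}}{3.3 \times 10^{9}\ \text{s}^{-1}}
  \approx 0.0908\ \text{m}
  \approx 0.09\ \text{m per cycle}
\]
```

In other words, a signal travelling at the speed of light covers about 9 cm in one clock cycle, and a round trip halves that distance.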

Page 20: Concurrency scalability
Page 21: Concurrency scalability

99% - latency mitigation

1% - computation

Page 22: Concurrency scalability

2 Core CPU

RAM
    L3
        L2 - L1 - CPU1
        L2 - L1 - CPU2

Page 23: Concurrency scalability

2 Core CPU – L1 Cache

L1 - CPU1
L1 - CPU2

new Random ()
new int[InnerLoop]

Page 24: Concurrency scalability

2 Core CPU – L1 Cache

L1 - CPU1
L1 - CPU2

Random object    Random object

Page 25: Concurrency scalability

Page 26: Concurrency scalability

Page 27: Concurrency scalability

Page 28: Concurrency scalability

Page 29: Concurrency scalability

Page 30: Concurrency scalability

4 Core CPU – L1 Cache

L1 - CPU1
L1 - CPU2
L1 - CPU3
L1 - CPU4

new Random ()
new int[InnerLoop]

Page 31: Concurrency scalability

2x4 Core CPU

RAM
    L3
        L2 - L1 - CPU1
        L2 - L1 - CPU2
        L2 - L1 - CPU3
        L2 - L1 - CPU4
    L3
        L2 - L1 - CPU5
        L2 - L1 - CPU6
        L2 - L1 - CPU7
        L2 - L1 - CPU8

Page 32: Concurrency scalability

Solution 1 – Locks

var ints = new int[InnerLoop];
var random = new Random();
Parallel.For(
    0,
    InnerLoop,
    i => { lock (ints) { ints[i] = random.Next(); } }
    );

Page 33: Concurrency scalability

Solution 2 – No sharing

var ints = new int[InnerLoop];
Parallel.For(
    0,
    InnerLoop,
    () => new Random(),
    (i, pls, random) => { ints[i] = random.Next(); return random; },
    random => {}
    );

Page 34: Concurrency scalability

Parallel.For adds overhead

Level0
    Level1
        Level2
            ints[0]
            ints[1]
        Level2
            ints[2]
            ints[3]
    Level1
        Level2
            ints[4]
            ints[5]
        Level2
            ints[6]
            ints[7]

Page 35: Concurrency scalability

Solution 3 – Less overhead

var ints = new int[InnerLoop];
Parallel.For(
    0,
    InnerLoop / Modulus,
    () => new Random(),
    (i, pls, random) =>
    {
        var begin = i * Modulus;
        var end = begin + Modulus;
        for (var iter = begin; iter < end; ++iter)
        {
            ints[iter] = random.Next();
        }
        return random;
    },
    random => {}
    );
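
Solution 3 chunks the index range by hand, and because InnerLoop / Modulus is an integer division it only covers the whole array when InnerLoop is a multiple of Modulus. A minimal sketch of the same idea using the framework's range partitioner, Partitioner.Create, which hands each worker a (from, to) range covering the full array. The names ChunkedFill and Fill and the per-worker seeding scheme are assumptions made for illustration and are not part of the talk:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class ChunkedFill
{
    // Hypothetical helper: fills an array with random numbers,
    // one Random per worker so no state is shared between cores.
    // length must be greater than 0 (Partitioner.Create rejects empty ranges).
    public static int[] Fill(int length)
    {
        var ints = new int[length];
        var seed = Environment.TickCount;   // assumed seeding scheme: base seed + counter

        Parallel.ForEach(
            Partitioner.Create(0, length),                      // yields [Item1, Item2) ranges
            () => new Random(Interlocked.Increment(ref seed)),  // worker-local Random
            (range, loopState, random) =>
            {
                // Plain sequential loop inside each range keeps the per-item overhead low
                for (var i = range.Item1; i < range.Item2; ++i)
                {
                    ints[i] = random.Next();
                }
                return random;
            },
            random => { });

        return ints;
    }
}
```

Calling ChunkedFill.Fill(InnerLoop) would then replace the hand-written begin/end arithmetic. Whether it is faster than the hand-chunked version has to be measured.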

Page 36: Concurrency scalability

var ints = new int[InnerLoop];
var random = new Random();
for (var inner = 0; inner < InnerLoop; ++inner)
{
    ints[inner] = random.Next();
}

Page 37: Concurrency scalability

Solution 4 – Independent runs

var tasks = Enumerable.Range (0, 8).Select (
    i => Task.Factory.StartNew (
        () =>
        {
            var ints = new int[InnerLoop];
            var random = new Random ();
            while (counter.CountDown ())
            {
                for (var inner = 0; inner < InnerLoop; ++inner)
                {
                    ints[inner] = random.Next();
                }
            }
        },
        TaskCreationOptions.LongRunning))
    .ToArray ();
Task.WaitAll (tasks);

Page 38: Concurrency scalability

Parallel.For

Only for CPU bound problems

Page 39: Concurrency scalability

Sharing is bad

Kills performance

Race conditions

Deadlocks

Page 40: Concurrency scalability

Cache locality

RAM is a misnomer

Class design

Avoid GC
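
One way to see why access order matters is to sum the same two-dimensional array in two different orders. The sketch below is an illustration only: the class Locality and its methods are hypothetical names, not code from the talk. Both methods return the same sum, but the first walks memory sequentially (C# stores int[,] row by row) while the second jumps a whole row ahead on every access:

```csharp
using System;

public static class Locality
{
    // Visits elements in the order they are laid out in memory.
    public static long SumRowMajor(int[,] grid)
    {
        long sum = 0;
        for (var row = 0; row < grid.GetLength(0); ++row)
        {
            for (var col = 0; col < grid.GetLength(1); ++col)
            {
                sum += grid[row, col];
            }
        }
        return sum;
    }

    // Same result, but consecutive accesses are one full row apart.
    public static long SumColumnMajor(int[,] grid)
    {
        long sum = 0;
        for (var col = 0; col < grid.GetLength(1); ++col)
        {
            for (var row = 0; row < grid.GetLength(0); ++row)
            {
                sum += grid[row, col];
            }
        }
        return sum;
    }
}
```

On large arrays the second variant is expected to cause more cache misses. How much slower it runs depends on the machine, so it should be timed rather than assumed.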

Page 41: Concurrency scalability

Natural concurrency

Avoid Parallel.For

Page 42: Concurrency scalability

Act like an engineer

Measure before and after
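
A minimal sketch of what "measure before and after" can look like with System.Diagnostics.Stopwatch. The helper Measure.Time is a hypothetical name introduced here; the talk does not show its own measurement harness:

```csharp
using System;
using System.Diagnostics;

public static class Measure
{
    // Hypothetical helper: runs the action once untimed (so JIT compilation
    // is not counted), then returns the total time of `runs` repetitions.
    public static TimeSpan Time(int runs, Action action)
    {
        action();   // warm-up run, not timed

        var stopwatch = Stopwatch.StartNew();
        for (var run = 0; run < runs; ++run)
        {
            action();
        }
        stopwatch.Stop();

        return stopwatch.Elapsed;
    }
}
```

Timing the sequential loop and each parallel variant with the same number of runs, for example Measure.Time(100, () => ChunkOfWork()) where ChunkOfWork stands for the code under test, gives numbers that can be compared directly. A release build run without the debugger attached gives the most representative results.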

Page 43: Concurrency scalability

One more thing…

Page 45: Concurrency scalability

Mårten Rånge
WCOM AB

@marten_range