Counter Wars (JEEConf 2016)
TRANSCRIPT
Aleksey Fedorov, Odnoklassniki
Counter Wars
Why are you here?
3. There are many interesting talks in the other rooms
4. Counter
public interface Counter {
    long get();
    void increment();
}
5. Simple Counter
class SimpleCounter implements Counter {
    private long value = 0;

    public long get() {
        return value;
    }

    public void increment() {
        value++; // not thread-safe: a plain read-modify-write
    }
}
6. Volatile Counter
class VolatileCounter implements Counter {
    volatile long value = 0;

    public long get() {
        return value;
    }

    public void increment() {
        value++; // volatile makes the value visible, but ++ is still not atomic
    }
}
7. Volatile Counter
class VolatileCounter implements Counter {
    volatile long value = 0;

    public long get() {
        return value;
    }

    public void increment() {
        long oldValue = value;        // read
        long newValue = oldValue + 1; // modify
        value = newValue;             // write
    }
}
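The read / modify / write split is exactly where updates get lost. The sketch below is not from the talk: it replays one bad interleaving by hand in a single thread, so the outcome is deterministic. The class name LostUpdateDemo and the "thread A / thread B" labels are made up for illustration.

```java
// Deterministic replay of the read / modify / write race on a volatile field.
// The two "threads" are simulated by hand so the bad interleaving always occurs.
class LostUpdateDemo {
    static volatile long value = 0;

    static long interleavedResult() {
        value = 0;
        long oldA = value;     // thread A: read   (sees 0)
        long oldB = value;     // thread B: read   (also sees 0)
        long newA = oldA + 1;  // thread A: modify
        long newB = oldB + 1;  // thread B: modify
        value = newA;          // thread A: write  (value = 1)
        value = newB;          // thread B: write  (value = 1, A's update is lost)
        return value;
    }

    public static void main(String[] args) {
        System.out.println("two increments, final value = " + interleavedResult());
    }
}
```

Two increments were requested, but the counter ends at 1. With real threads the same interleaving happens only some of the time, which is what makes the bug hard to spot.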
8. Synchronized Counter
class SynchronizedCounter implements Counter {
    volatile long value = 0; // volatile is redundant once every access is synchronized

    public synchronized long get() {
        return value;
    }

    public synchronized void increment() {
        value++;
    }
}
9. Synchronized Counter
class SynchronizedCounter implements Counter {
    long value = 0;

    public synchronized long get() {
        return value;
    }

    public synchronized void increment() {
        value++;
    }
}
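A quick sanity check that the synchronized version does not lose updates, as a minimal sketch. The harness (SyncCounterCheck, run, the thread and iteration counts) is an assumption added for illustration and is not the benchmark used in the talk.

```java
// Several threads hammer a synchronized counter; the final value must equal
// the total number of increments, because each increment holds the monitor.
class SyncCounterCheck {
    static class SyncCounter {
        private long value = 0;
        synchronized long get() { return value; }
        synchronized void increment() { value++; }
    }

    static long run(int threads, int perThread) {
        SyncCounter counter = new SyncCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join(); // wait for every worker before reading the result
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 100_000));
    }
}
```

This only checks correctness. It says nothing about throughput, which is what the benchmark slides measure.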
10. Test bench
Core(TM) i7-4770, 4 x 2 x 2.0 GHz (downscaled)
Linux Ubuntu 14.04.4, 3.13.0-86-generic x86_64
taskset -c 0,1,2,3 (thread affinity)
11. Benchmarks, ops/µs
Thanks to Nitsan Wakart: http://psy-lob-saw.blogspot.ru/2014/06/jdk8-update-on-scalable-counters.html
                1 thread  2 threads  2 threads   2 threads   4 threads  8 threads
                Core 0    Core 0     Cores 0,4   Cores 0,1   Cores 0-3  Cores 0-7
LONG            308       267        182         220         180        86
VOLATILE_LONG   77        77         79          22          21         35
SYNCHRONIZED    26        43         27          12          12         13
Conclusion 1: synchronization comes at a cost
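13. Lock Counter
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class LockCounter implements Counter {
    long value;
    final Lock lock = new ReentrantLock();

    public long get() {
        lock.lock(); // acquire outside try: finally must not unlock a lock we never got
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }

    // named add() on the slide; renamed so the class actually implements Counter
    public void increment() {
        lock.lock();
        try {
            value += 1;
        } finally {
            lock.unlock();
        }
    }
}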
15. Benchmarks, ops/µs
                1 thread  2 threads  2 threads   2 threads   4 threads  8 threads
                Core 0    Core 0     Cores 0,4   Cores 0,1   Cores 0-3  Cores 0-7
SYNCHRONIZED    26        43         27          12          12         13
REENTRANTLOCK   32        32         18          5           20         20
http://mechanical-sympathy.blogspot.com/2011/11/java-lock-implementations.html
http://mechanical-sympathy.blogspot.com/2011/11/biased-locking-osr-and-benchmarking-fun.html
http://www.javaspecialist.ru/2011/11/synchronized-vs-reentrantlock.html
http://dev.cheremin.info/2011/11/synchronized-vs-reentrantlock.html
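17. Lock Counter (fair)
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class LockCounter implements Counter {
    long value;
    final Lock lock = new ReentrantLock(true); // true = fair ordering

    public long get() {
        lock.lock(); // acquire outside try: finally must not unlock a lock we never got
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }

    // named add() on the slide; renamed so the class actually implements Counter
    public void increment() {
        lock.lock();
        try {
            value += 1;
        } finally {
            lock.unlock();
        }
    }
}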
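The only difference between the unfair and the fair counter is the boolean passed to the ReentrantLock constructor. Under contention a fair lock favors the longest-waiting thread, while the default lock lets a newly arriving thread barge ahead of the queue. A minimal sketch that just inspects the flag (FairnessFlagDemo is an illustrative name, not from the talk):

```java
import java.util.concurrent.locks.ReentrantLock;

// Shows which constructor produces which fairness policy.
class FairnessFlagDemo {
    static boolean defaultIsFair() {
        return new ReentrantLock().isFair();     // no-arg constructor: unfair
    }

    static boolean explicitFairIsFair() {
        return new ReentrantLock(true).isFair(); // true: fair ordering policy
    }

    public static void main(String[] args) {
        System.out.println("new ReentrantLock()     fair? " + defaultIsFair());
        System.out.println("new ReentrantLock(true) fair? " + explicitFairIsFair());
    }
}
```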
18. Benchmarks, ops/µs
                1 thread  2 threads  2 threads   2 threads   4 threads  8 threads
                Core 0    Core 0     Cores 0,4   Cores 0,1   Cores 0-3  Cores 0-7
SYNCHRONIZED    26        43         27          12          12         13
UNFAIR_LOCK     32        32         18          5           20         20
FAIR_LOCK
How much slower than the unfair lock?
19. Benchmarks, ops/µs
                1 thread  2 threads  2 threads   2 threads   4 threads  8 threads
                Core 0    Core 0     Cores 0,4   Cores 0,1   Cores 0-3  Cores 0-7
SYNCHRONIZED    26        43         27          12          12         13
UNFAIR_LOCK     32        32         18          5           20         20
FAIR_LOCK       31        5 ± 9      0.5 ± 0.3   0.26        0.24       0.23
How much slower than the unfair lock? By two orders of magnitude!
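The "two orders of magnitude" figure can be checked against the table above by dividing the unfair-lock throughput by the fair-lock throughput, column by column. A small sketch of that arithmetic; the arrays copy the mean values from the table, and the class and method names are illustrative.

```java
// Ratio of unfair-lock to fair-lock throughput for each benchmark column.
class FairLockSlowdown {
    static final String[] COLUMNS = {
        "1 thread, core 0", "2 threads, core 0", "2 threads, cores 0,4",
        "2 threads, cores 0,1", "4 threads, cores 0-3", "8 threads, cores 0-7"
    };
    // Throughput in ops/µs, mean values copied from the benchmark table.
    static final double[] UNFAIR = {32, 32, 18, 5, 20, 20};
    static final double[] FAIR   = {31, 5, 0.5, 0.26, 0.24, 0.23};

    static double slowdown(int column) {
        return UNFAIR[column] / FAIR[column];
    }

    public static void main(String[] args) {
        for (int i = 0; i < COLUMNS.length; i++) {
            System.out.printf("%-22s %.1fx%n", COLUMNS[i], slowdown(i));
        }
    }
}
```

With one thread the two locks are practically equal. The gap grows with contention and reaches roughly 80x to 90x at 4 and 8 threads.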
20. Benchmarks, ops/µs
How much slower than the unfair lock? By two orders of magnitude! Is that scary?
                1 thread  2 threads  2 threads   2 threads   4 threads  8 threads
                Core 0    Core 0     Cores 0,4   Cores 0,1   Cores 0-3  Cores 0-7
SYNCHRONIZED    26        43         27          12          12         13
UNFAIR_LOCK     32        32         18          5           20         20
FAIR_LOCK       31        5 ± 9      0.5 ± 0.3   0.26        0.24       0.23
21. How a typical Core i7 is laid out
[Diagram: four physical cores, each running two hardware threads (CPU0/CPU4, CPU1/CPU5, CPU2/CPU6, CPU3/CPU7). Each core has its own L1 and L2 cache; all four cores share one L3 cache.]
Conclusion 2: fairness comes at a cost
Atomics and CAS
24. Compare and Swap: Hardware Support
compare-and-swap (CAS): cmpxchg (x86)
load-link / store-conditional (LL/SC): ldrex/strex (ARM), lwarx/stwcx (PowerPC)
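At the Java level the hardware primitive is reachable through AtomicLong.compareAndSet(expected, update): the write happens only if the current value still equals the expected one. A minimal sketch of those semantics (CasSemanticsDemo and the concrete numbers are illustrative):

```java
import java.util.concurrent.atomic.AtomicLong;

// compareAndSet succeeds only when the expected value matches the current one.
class CasSemanticsDemo {
    // Returns {first CAS succeeded, second CAS succeeded, final value is 11}.
    static boolean[] demo() {
        AtomicLong cell = new AtomicLong(10);
        boolean first = cell.compareAndSet(10, 11);  // expected matches: succeeds, cell = 11
        boolean second = cell.compareAndSet(10, 12); // expected is stale: fails, cell stays 11
        return new boolean[] {first, second, cell.get() == 11};
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("first=" + r[0] + " second=" + r[1] + " valueIs11=" + r[2]);
    }
}
```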
25. CAS Counter
import java.util.concurrent.atomic.AtomicLong;

public class CasLoopCounter implements Counter {
    private AtomicLong value = new AtomicLong();

    public long get() {
        return value.get();
    }

    public void increment() {
        for (;;) { // retry until our CAS wins
            long oldValue = value.get();
            long newValue = oldValue + 1;
            if (value.compareAndSet(oldValue, newValue))
                return;
        }
    }
}
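Every failed compareAndSet in the loop means another thread got in first and the whole iteration is repeated. The sketch below counts those failed attempts under contention. It is an illustration under assumed names (CasRetryDemo, run), not the benchmark from the talk, and the retry count varies from run to run.

```java
import java.util.concurrent.atomic.AtomicLong;

// A CAS-loop increment that also counts how often the CAS had to be retried.
class CasRetryDemo {
    // Returns {final counter value, number of failed CAS attempts}.
    static long[] run(int threads, int perThread) {
        AtomicLong value = new AtomicLong();
        AtomicLong failed = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    for (;;) {
                        long old = value.get();
                        if (value.compareAndSet(old, old + 1)) {
                            break;                // our update won
                        }
                        failed.incrementAndGet(); // someone else won, try again
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new long[] {value.get(), failed.get()};
    }

    public static void main(String[] args) {
        long[] r = run(4, 100_000);
        System.out.println("value=" + r[0] + ", failed CAS attempts=" + r[1]);
    }
}
```

The final value is always exact. Only the amount of wasted work changes with contention, which is one plausible reading of why CAS_LOOP throughput drops as threads are added.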
26. Benchmarks, ops/µs
                1 thread  2 threads  2 threads   2 threads   4 threads  8 threads
                Core 0    Core 0     Cores 0,4   Cores 0,1   Cores 0-3  Cores 0-7
SYNCHRONIZED    26        43         27          12          12         13
UNFAIR_LOCK     32        32         18          5           20         20
CAS_LOOP
27. Benchmarks, ops/µs
                1 thread  2 threads  2 threads   2 threads   4 threads  8 threads
                Core 0    Core 0     Cores 0,4   Cores 0,1   Cores 0-3  Cores 0-7
SYNCHRONIZED    26        43         27          12          12         13
UNFAIR_LOCK     32        32         18          5           20         20
CAS_LOOP        62        62         45          10          6          5
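28. Get-and-Add Counter
import java.util.concurrent.atomic.AtomicLong;

// The slide reuses the name CasLoopCounter; renamed here to match the title.
public class GetAndAddCounter implements Counter {
    private AtomicLong value = new AtomicLong();

    public long get() {
        return value.get();
    }

    public void increment() {
        value.getAndAdd(1);
    }
}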
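For reference, getAndAdd returns the value from before the addition, while addAndGet returns the value after it. The counter above ignores the return value, so either would do. A minimal sketch (GetAndAddDemo is an illustrative name):

```java
import java.util.concurrent.atomic.AtomicLong;

// getAndAdd returns the previous value; addAndGet returns the updated value.
class GetAndAddDemo {
    // Returns {result of getAndAdd, result of addAndGet, final value}.
    static long[] demo() {
        AtomicLong value = new AtomicLong(0);
        long before = value.getAndAdd(1); // returns 0, value becomes 1
        long after = value.addAndGet(1);  // value becomes 2, returns 2
        return new long[] {before, after, value.get()};
    }

    public static void main(String[] args) {
        long[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```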
29. AtomicLong.getAndAdd()
30. Benchmarks, ops/µs
                1 thread  2 threads  2 threads   2 threads   4 threads  8 threads
                Core 0    Core 0     Cores 0,4   Cores 0,1   Cores 0-3  Cores 0-7
SYNCHRONIZED    26        43         27          12          12         13
UNFAIR_LOCK     32        32         18          5           20         20
CAS_LOOP        62        62         45          10          6          5
GET_AND_ADD     100       100        97          27          28         28
31. True Sharing
[Diagram: the Core i7 layout again: CPU0-CPU7, a private L1 and L2 cache per core, one shared L3 cache.]
32. atomicLong.getAndAdd(5), ops/µs by thread count
threads   1     2     3     4
JDK7u95   60    9     7     6
JDK8u72   100   27    27    27
35. atomicLong.getAndAdd(5): JDK7u95, -XX:+PrintAssembly
loop:
    mov    0x10(%rbx),%rax
    mov    %rax,%r11
    add    $0x5,%r11
    lock cmpxchg %r11,0x10(%rbx)
    sete   %r11b
    movzbl %r11b,%r11d
    test   %r10d,%r10d
    je     loop
36. atomicLong.getAndAdd(5)
JDK7u95, -XX:+PrintAssembly:
loop:
    mov    0x10(%rbx),%rax
    mov    %rax,%r11
    add    $0x5,%r11
    lock cmpxchg %r11,0x10(%rbx)
    sete   %r11b
    movzbl %r11b,%r11d
    test   %r10d,%r10d
    je     loop
JDK8u72, -XX:+PrintAssembly:
    lock addq $0x5,0x10(%rbp)
ops/µs by thread count:
threads   1     2     3     4
JDK7u95   60    9     7     6
JDK8u72   100   27    27    27
37-38. AtomicLong.getAndAdd() on JDK7: cmpxchg
39-40. AtomicLong.getAndAdd() on JDK8: lock addq (JVM intrinsic)
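To make the contrast concrete, here are the two shapes side by side as a hedged sketch. The first method mirrors, from memory, the retry loop that the JDK7 Java source of getAndAdd is built on. The second simply calls the library method, which the JDK8 JIT on x86-64 replaces with a single lock-prefixed instruction, according to the assembly shown in the talk. GetAndAddShapes and both method names are illustrative, not JDK code.

```java
import java.util.concurrent.atomic.AtomicLong;

// Two functionally equivalent ways to add a delta and return the old value.
class GetAndAddShapes {
    // Roughly the shape of the JDK7 source: read, compute, CAS, retry on failure.
    static long getAndAddCasLoop(AtomicLong a, long delta) {
        for (;;) {
            long current = a.get();
            long next = current + delta;
            if (a.compareAndSet(current, next)) {
                return current;
            }
        }
    }

    // The library call; what it compiles to depends on the JVM and the CPU.
    static long getAndAddLibrary(AtomicLong a, long delta) {
        return a.getAndAdd(delta);
    }

    public static void main(String[] args) {
        AtomicLong x = new AtomicLong();
        AtomicLong y = new AtomicLong();
        System.out.println(getAndAddCasLoop(x, 5) + " -> " + x.get());
        System.out.println(getAndAddLibrary(y, 5) + " -> " + y.get());
    }
}
```

Both produce the same result. The difference shows up only in the generated machine code and therefore in throughput under contention.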
Conclusion 3: don't believe everything that is written in the OpenJDK sources
JDK8
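43. StampedLock Counter
import java.util.concurrent.locks.StampedLock;

public class StampedLockCounter implements Counter {
    private long value = 0;
    private StampedLock lock = new StampedLock();

    public long get() { ... }

    // named add() on the slide; renamed so the class actually implements Counter
    public void increment() {
        long stamp = lock.writeLock();
        try {
            value++;
        } finally {
            lock.unlock(stamp);
        }
    }
}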
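The slide leaves get() elided, so the following is only a guess at one reasonable shape, not the author's code. It uses the optimistic-read pattern that StampedLock is designed for: read without locking, validate the stamp, and fall back to a real read lock if a writer got in between. The class name OptimisticReadSketch is made up.

```java
import java.util.concurrent.locks.StampedLock;

// One possible read path for a StampedLock-based counter (an assumption).
class OptimisticReadSketch {
    private long value = 0;
    private final StampedLock lock = new StampedLock();

    long get() {
        long stamp = lock.tryOptimisticRead(); // non-blocking; 0 if write-locked right now
        long snapshot = value;
        if (!lock.validate(stamp)) {           // a writer intervened: take a real read lock
            stamp = lock.readLock();
            try {
                snapshot = value;
            } finally {
                lock.unlockRead(stamp);
            }
        }
        return snapshot;
    }

    void increment() {
        long stamp = lock.writeLock();
        try {
            value++;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    public static void main(String[] args) {
        OptimisticReadSketch c = new OptimisticReadSketch();
        c.increment();
        c.increment();
        c.increment();
        System.out.println(c.get());
    }
}
```

The benchmark in the talk measures increments, which always take the write lock, so the optimistic path would not change the STAMPED_LOCK numbers.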
44. Benchmarks, ops/µs
                1 thread  2 threads  2 threads   2 threads   4 threads  8 threads
                Core 0    Core 0     Cores 0,4   Cores 0,1   Cores 0-3  Cores 0-7
SYNCHRONIZED    26        43         27          12          12         13
UNFAIR_LOCK     32        32         18          5           20         20
CAS_LOOP        62        62         45          10          6          5
GET_AND_ADD     100       100        97          27          28         28
STAMPED_LOCK    31        31         24          5           22         21
45. Long Adder Counter
import java.util.concurrent.atomic.LongAdder;

public class LongAdderCounter implements Counter {
    private LongAdder value = new LongAdder();

    public long get() {
        return value.longValue(); // sums all internal cells
    }

    public void increment() {
        value.add(1);
    }
}
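LongAdder scales because it does not keep one hot variable: under contention it spreads updates over several internal cells and sums them on read. The sketch below is a deliberately simplified stand-in for that idea, not the JDK implementation. It uses a fixed number of stripes picked by thread hash; StripedCounterSketch and its methods are illustrative names.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// A toy striped counter: each thread mostly updates "its own" cell.
// Unlike LongAdder, the cells here are not padded, so neighbouring cells
// can still share a cache line; this shows the idea, not the performance.
class StripedCounterSketch {
    private final AtomicLongArray cells;

    StripedCounterSketch(int stripes) {
        cells = new AtomicLongArray(stripes);
    }

    void increment() {
        // Pick a cell from the thread's identity hash (stable for the thread's lifetime).
        int i = Math.floorMod(Thread.currentThread().hashCode(), cells.length());
        cells.getAndIncrement(i);
    }

    long get() {
        long sum = 0;
        for (int i = 0; i < cells.length(); i++) {
            sum += cells.get(i); // not an atomic snapshot while writers are active
        }
        return sum;
    }

    static long run(int threads, int perThread) {
        StripedCounterSketch counter = new StripedCounterSketch(8);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 100_000));
    }
}
```

The trade-off is on the read side: get() has to walk all cells and, like LongAdder.sum(), does not return an atomic snapshot while updates are in flight.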
46. Benchmarks, ops/µs
                1 thread  2 threads  2 threads   2 threads   4 threads  8 threads
                Core 0    Core 0     Cores 0,4   Cores 0,1   Cores 0-3  Cores 0-7
SYNCHRONIZED    26        43         27          12          12         13
UNFAIR_LOCK     32        32         18          5           20         20
CAS_LOOP        62        62         45          10          6          5
GET_AND_ADD     100       100        97          27          28         28
STAMPED_LOCK    31        31         24          5           22         21
LONG_ADDER      62        62         85          124         248        340
47. Literature
48. Materials
• Everyone and everything: bit.ly/concurrency-interest
• Nitsan Wakart: psy-lob-saw.blogspot.com
• Aleksey Shipilev: shipilev.net
Questions and answers