lock free algorithm for multi-core architecture · 2016-07-06 · sdy 1.introduction background...

21
SDY Lock free Algorithm for Multi-core architecture Lock free Algorithm for Multi-core architecture SDY Corporation Hiromasa Kanda

Upload: others

Post on 01-Aug-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Lock free Algorithm for Multi-core architecture · 2016-07-06 · SDY 1.Introduction Background needed Multi-Thread Problem of Multi-Thread program What's Lock free? 2.Lock free Algorithm

SDY

Lock free Algorithm for Multi-core architectureLock free Algorithm for Multi-core architecture

SDY CorporationHiromasa Kanda

Page 2: Lock free Algorithm for Multi-core architecture · 2016-07-06 · SDY 1.Introduction Background needed Multi-Thread Problem of Multi-Thread program What's Lock free? 2.Lock free Algorithm

SDY

1.IntroductionBackground needed Multi-ThreadProblem of Multi-Thread programWhat's Lock free?

2.Lock free AlgorithmAtomic operationLock-free queueLock-free hash-map

3.Performance

4.Summary

ContentsContents

Page 3: Lock free Algorithm for Multi-core architecture · 2016-07-06 · SDY 1.Introduction Background needed Multi-Thread Problem of Multi-Thread program What's Lock free? 2.Lock free Algorithm

SDY

11.Introduction.IntroductionBackground needed Multi-ThreadBackground needed Multi-Thread

・Scheduling

・・Shared resource

・The speedup of a program using multiple processors in parallel computing is limited

・Parallelization Multi-threadMulti-thread

・Manage memory access and CPU clock・Limited CMOS scaling

What is needed in application?What is needed in application?

Amdahl's lawAmdahl's law

Problem of Multi-threadProblem of Multi-thread

Multi-core and SMT(HT)Multi-core and SMT(HT)

Page 4: Lock free Algorithm for Multi-core architecture · 2016-07-06 · SDY 1.Introduction Background needed Multi-Thread Problem of Multi-Thread program What's Lock free? 2.Lock free Algorithm

SDY

11.Introduction.Introduction

There was resources problem that share it when concurrent access multi-thread.

Problem of Multi-Thread program 1/2Problem of Multi-Thread program 1/2

Thread 1q.dequeue;

Thread 2q.dequeue();

queue q

① ②10

Criticalsession!!

dequeuedequeue

20

30

10 10

Page 5: Lock free Algorithm for Multi-core architecture · 2016-07-06 · SDY 1.Introduction Background needed Multi-Thread Problem of Multi-Thread program What's Lock free? 2.Lock free Algorithm

SDY

The traditional approach to multi-thread programming is using locks synchronize access to shared resources.Mutex,Semaphore

11.Introduction.IntroductionProblem of Multi-Thread program 2/2Problem of Multi-Thread program 2/2

Thread 1lock q;

q.dequeue();

unlock i

Thread 2

lock q;

q.dequeue();

unlock i

queue q

② ②'10

dequeue20

30④

①lock

③unlock

lock

lock falssewait

⑤unlock

dequeue10

20

Page 6: Lock free Algorithm for Multi-core architecture · 2016-07-06 · SDY 1.Introduction Background needed Multi-Thread Problem of Multi-Thread program What's Lock free? 2.Lock free Algorithm

SDY

Lock-free is "non-blocking" algorithm that is not broken value when access each thread at the same time.

11.Introduction.IntroductionWhat's Lock free?What's Lock free?

Thread 1q.dequeue();

q.enqueue(40);

Thread 2q.euqueue();

queue q

①②

10

Thread Safe!

enqueuedequeue

enqueue 20

30

Lock-free algorithm

40

10 20

Page 7: Lock free Algorithm for Multi-core architecture · 2016-07-06 · SDY 1.Introduction Background needed Multi-Thread Problem of Multi-Thread program What's Lock free? 2.Lock free Algorithm

SDY

2.Lock free Algorithm2.Lock free AlgorithmAtomic operationAtomic operation

"atomic operation" is built-ins that is no memory operand will be moved across the operation,either forward or backward.

・Test-and-Set operation TAS __sync_test_and_set(&p , a)

・Fetch-and-Add(Sub) operation __sync_fetch_and_add(&p , i )

・Compare-and-swap operation CAS __sync_compare_and_swap(&p , a , b )

Page 8: Lock free Algorithm for Multi-core architecture · 2016-07-06 · SDY 1.Introduction Background needed Multi-Thread Problem of Multi-Thread program What's Lock free? 2.Lock free Algorithm

SDY

2.Lock free Algorithm2.Lock free AlgorithmLock-free queue 1/10Lock-free queue 1/10

head pointer

Lock-free queue algorithm(Java Concurrent queue)

dummytail pointer

dummy

dummy nodenext

NULLNULL

node Anext

data ANULL

head pointerdummy

tail pointerA

dummy nodenext

NULLA

node Anext

data ANULL

enqueue

Head pointerA

tail pointerA

node Anext

data ANULL

dequeueHead pointer

dummytail pointer

A

dummy nodenext

NULLA

node Anext

data ANULL

① ←dummy

data A

Page 9: Lock free Algorithm for Multi-core architecture · 2016-07-06 · SDY 1.Introduction Background needed Multi-Thread Problem of Multi-Thread program What's Lock free? 2.Lock free Algorithm

SDY

2.Lock free Algorithm2.Lock free AlgorithmLock-free queue 2/10Lock-free queue 2/10Lock-free queue algorithm(Java Concurrent queue) enqueue(value)E01: node = new node ;E02: node–>value = value ;E03: node–>next.ptr = NULL ;E04: While(true) {E05: tail = TailPointer;E06: next = tail.ptr–>next;E07: if ( tail == TailPointer ) {E08: if (next.ptr == NULL ){E09: if ( CAS(&tail.ptr–>next, next, node) ) {E10: break ;E11: }E12: } else {E13: CAS(&TailPointer, tail, next.ptr) ; E14: }E15: }E16: }E17: CAS(&TailPointer, tail, node) ;

Page 10: Lock free Algorithm for Multi-core architecture · 2016-07-06 · SDY 1.Introduction Background needed Multi-Thread Problem of Multi-Thread program What's Lock free? 2.Lock free Algorithm

SDY

2.Lock free Algorithm2.Lock free AlgorithmLock-free queue 3/10Lock-free queue 3/10Lock-free queue using Java algorithmdequeue(value)D01: while(true){D02: head = HeadPointer ; D03: tail = TailPointer ;D04: next = head–>next ;D05: if ( head == HeadPointer ) { D06: if ( head.ptr == tail.ptr ) {D07: if ( next.ptr == NULL ) {D08: return FALSE ;D09: }D10: CAS(&TailPointer, tail, next.ptr) ;D11: } else {D12: value = next.ptr–>value ;D13: if ( CAS(&HeadPointer, head, next) ) { D14: break ;D15: }D16: }D17: }D18: }D19: delete head ;D20: return true ;

Critical SessionCritical Session

Page 11: Lock free Algorithm for Multi-core architecture · 2016-07-06 · SDY 1.Introduction Background needed Multi-Thread Problem of Multi-Thread program What's Lock free? 2.Lock free Algorithm

SDY

2.Lock free Algorithm2.Lock free AlgorithmLock-free queue 4/10Lock-free queue 4/10Non liner lock-free queue (using arrangement)

head pointer0

tail pointer0

0NULL

1NULL

2NULL

3NULL

4NULL

5NULL

6NULL

MNULL・・・

head pointer0

tail pointer2

0A

1B

2NULL

3NULL

4NULL

5NULL

6NULL

MNULL・・・

Page 12: Lock free Algorithm for Multi-core architecture · 2016-07-06 · SDY 1.Introduction Background needed Multi-Thread Problem of Multi-Thread program What's Lock free? 2.Lock free Algorithm

SDY

Lock-free queue 5/10Lock-free queue 5/10

head pointer

Lock-free queue new algorithm

dummytail pointer

dummy

dummy nodenext

NULLdummy

node Anext

data Adummy

head pointerdummy

tail pointerA

dummy nodenext

NULLA

node Anext

data Adummy

enqueue pattan1

Head pointerdummy

tail pointerA

dummy nodenext

NULLdummy

dequeue pattan1Head pointer

dummytail pointer

A

dummy nodenext

NULLA

node Anext

data Adummy

Node Anext

data Adummy

2.Lock free Algorithm2.Lock free Algorithm

Page 13: Lock free Algorithm for Multi-core architecture · 2016-07-06 · SDY 1.Introduction Background needed Multi-Thread Problem of Multi-Thread program What's Lock free? 2.Lock free Algorithm

SDY

Lock-free queue 6/10Lock-free queue 6/10Lock-free queue new algorithm

Head pointerB

tail pointerB

dummy nodenext

NULLB

dequeue pattan2Head pointer

dummytail pointer

B

dummy nodenext

NULLA

node Anext

data AB

Node Anext

data AB ①

2.Lock free Algorithm2.Lock free Algorithm

Node Bnext

data Bdummy

Node Bnext

data Bdummy

Page 14: Lock free Algorithm for Multi-core architecture · 2016-07-06 · SDY 1.Introduction Background needed Multi-Thread Problem of Multi-Thread program What's Lock free? 2.Lock free Algorithm

SDY

Lock-free queue 7/10Lock-free queue 7/10Lock-free queue new algorithm

Head pointerB

tail pointerB

dummy nodenext

NULLB

dequeue pattan3 (case of dequeue2)

2.Lock free Algorithm2.Lock free Algorithm

Node Bnext

data Bdummy

Head pointerdummy

tail pointerB

dummy nodenext

NULLdummy

Node Bnext

data Bdummy

Same result dequeue pattan1

②③

Page 15: Lock free Algorithm for Multi-core architecture · 2016-07-06 · SDY 1.Introduction Background needed Multi-Thread Problem of Multi-Thread program What's Lock free? 2.Lock free Algorithm

SDY

head pointer

dummytail pointer

A

dummy nodenext

NULLdummy

node Bnext

data Bdummy

head pointerdummy

tail pointerB

dummy nodenext

NULLB

node Bnext

data Bdummy

enqueue pattan2 (case of dequeue1)

Lock-free queue 8/10Lock-free queue 8/10Lock-free queue new algorithm

2.Lock free Algorithm2.Lock free Algorithm

node Cnext

data Cdummy

head pointerB

tail pointerC

dummy nodenext

NULLB

node Bnext

data BC

enqueue pattan3 (case of dequeue2)

Head pointerB

tail pointerB

dummy nodenext

NULLB

Node Bnext

data Bdummy

node Cnext

data Cdummy

Same result dequeue pattan1

Same result dequeue pattan2

Page 16: Lock free Algorithm for Multi-core architecture · 2016-07-06 · SDY 1.Introduction Background needed Multi-Thread Problem of Multi-Thread program What's Lock free? 2.Lock free Algorithm

SDY

2.Lock free Algorithm2.Lock free AlgorithmLock-free queue 9/10Lock-free queue 9/10Lock-free queue using new algorithmenqueue(value)E01: node = new node ;E02: node–>value = value ;E03: node–>next.ptr = dummy ;E04: while( true ) {E05: tail = TailPointer;E06: if ( CAS(&tail–>next,dummy,node) ) {E07: CAS(&tail,TailPointer,node) ;E08: break ;E09: } else {E10: next = tail->next;E11: if ( next != dummy ) {E12: CAS(&TailPointer, tail ,next ) ;E13: }E14: }E15: }

Page 17: Lock free Algorithm for Multi-core architecture · 2016-07-06 · SDY 1.Introduction Background needed Multi-Thread Problem of Multi-Thread program What's Lock free? 2.Lock free Algorithm

SDY

2.Lock free Algorithm2.Lock free AlgorithmLock-free queue 10/10Lock-free queue 10/10Lock-free queue using new algorithmdequeue(value)D01: while ( true ) {D02: head = HeadPointer ;D04: next = HeadPointer –>next ;D05: if ( head == HeadPointer ) { D06: if ( head == dummy ) {D07: if ( next == dummy ) return false ;D09: if ( CAS(HeadPointer, head, next–>next) ){D10: TAS(&dummy–>next,dummy) ;D11: value = next–>value ;D12: delete next ;D13: return true ;D14: }D15: } else {D16: if ( CAS(HeadPointer, head, next–>next) ) {D17: CAS(TailPointer, head , dummy) ;D18: value = head->value ;D19: delete head ;D20: return true;D21: }D22: }D23: }D24:}

Page 18: Lock free Algorithm for Multi-core architecture · 2016-07-06 · SDY 1.Introduction Background needed Multi-Thread Problem of Multi-Thread program What's Lock free? 2.Lock free Algorithm

SDY

Lock-free hash-mapLock-free hash-mapLock-free hash-map algorithm

2.Lock free Algorithm2.Lock free Algorithm

0NULL

1NULL

2NULL

3NULL

4NULL

5NULL

6NULL

MNULL・・・

NULL NULL NULL NULL NULL NULL NULL NULLkey

value

hash-map

0NULL

1NULL

2A

3NULL

4NULL

5NULL

6NULL

MNULL・・・

NULL NULL A NULL NULL NULL NULL NULLkey

value

hash-map

Key AValue AHash function

Page 19: Lock free Algorithm for Multi-core architecture · 2016-07-06 · SDY 1.Introduction Background needed Multi-Thread Problem of Multi-Thread program What's Lock free? 2.Lock free Algorithm

SDY

3.Performance3.Performance

1 5 10 50 100 200 300 500 700 10000

5000000000

10000000000

15000000000

20000000000

25000000000

30000000000

35000000000

40000000000

45000000000

0

5000000000

10000000000

15000000000

20000000000

25000000000

30000000000

35000000000

40000000000

45000000000

std queue(use mutex)lock-free queue(liner)lock-free queue(non-liner)

queue's performance

Thread num

RTDSC /thread num

Page 20: Lock free Algorithm for Multi-core architecture · 2016-07-06 · SDY 1.Introduction Background needed Multi-Thread Problem of Multi-Thread program What's Lock free? 2.Lock free Algorithm

SDY

3.Performance3.Performance

1 5 10 500

500000000

1000000000

1500000000

2000000000

2500000000

3000000000

0

500000000

1000000000

1500000000

2000000000

2500000000

3000000000

std queue(use mutex)lock-free queue(liner)lock-free queue(non-liner)

queue's performanceRTDSC /thread num

Thread num

Page 21: Lock free Algorithm for Multi-core architecture · 2016-07-06 · SDY 1.Introduction Background needed Multi-Thread Problem of Multi-Thread program What's Lock free? 2.Lock free Algorithm

SDY

4.Summary4.Summary・Possible simple coding on application

・Possible access faster than use mutex

・I will plans to create “list” and “liner hash-map”. Present,considering some method that is find(),delete().. base on liner queue.

See source forge website:

http://sourceforge.jp/projects/c-lockfree/