parallel string sorting with super scalar string sample sort · scanning an array p 1 p 2 p 3 p 4 n...

Post on 08-Oct-2020

2 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

INSTITUTE OF THEORETICAL INFORMATICS – ALGORITHMICS

Parallel String Sorting withSuper Scalar String Sample SortTimo Bingmann and Peter Sanders | September 4th, 2013 @ ESA’13

KIT – University of the State of Baden-Wuerttemberg andNational Laboratory of the Helmholtz Association www.kit.edu

Abstract

We present the currently fastest parallel string sorting algorithmfor modern multi-core shared memory architectures.

First, we describe the challenges posed by these newarchitectures, and discuss key points to achieving highperformance gains. Then we give an overview of existingsequential and parallel string sorting algorithms andimplementations. Thereafter, we continue by developing superscalar string sample sort (S5), which is easily parallelizable andyields higher parallel speedups than all previously knownalgorithms.

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 2 / 29

Overview

1 Introduction and Motivation

Parallel Memory Bandwidth Test

2 String Sorting Algorithms

Radix Sort

Multikey Quicksort

Super Scalar String Sample Sort

3 Experimental Results

4 Conclusion

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 3 / 29

String Sorting AlgorithmsTheoretical Parallel Algorithms

“Optimal Parallel String Algorithms: . . .” [Hagerup ’94]

Existing Basic Sequential Algorithms

Radix Sort [McIlroy et al. ’95]Multikey Quicksort [Bentley, Sedgewick ’97]Burstsort [Sinha, Zobel ’04]LCP-Mergesort [Ng, Kakehi ’08]

Existing Algorithm Library

by Tommi Rantala (for Radix Sort [Kärkkäinen, Rantala ’09])http://github.com/rantala/string-sorting

Our Contribution: Practical Parallel Algorithms

Parallel Super Scalar String Sample Sort (pS5) [This Work]

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 4 / 29

String Sorting AlgorithmsTheoretical Parallel Algorithms

“Optimal Parallel String Algorithms: . . .” [Hagerup ’94]

Existing Basic Sequential Algorithms

Radix Sort [McIlroy et al. ’95]Multikey Quicksort [Bentley, Sedgewick ’97]Burstsort [Sinha, Zobel ’04]LCP-Mergesort [Ng, Kakehi ’08]

Existing Algorithm Library

by Tommi Rantala (for Radix Sort [Kärkkäinen, Rantala ’09])http://github.com/rantala/string-sorting

Our Contribution: Practical Parallel Algorithms

Parallel Super Scalar String Sample Sort (pS5) [This Work]

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 4 / 29

Modern Multi-Core Architecture

L3 Cache

L2 L2 L2 L2

L2 L2 L2 L2

L3 Cache

L2 L2 L2 L2

L2 L2 L2 L2

L3 Cache

L2 L2 L2 L2

L2 L2 L2 L2

L3 Cache

L2 L2 L2 L2

L2 L2 L2 L2

RA

MR

AM

RA

MR

AM

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 5 / 29

Parallel Memory Bandwidth Test

// ScanRead64IndexUnrollLoopfor (size_t i=0; i<n; i+=16) {

uint64_t x0 = array[i+0];// ... 14 timesuint64_t x15 = array[i+15];

}

// PermRead64SimpleLoopuint64_t p = *array;while((uint64_t*)p != array)

p = *(uint64_t*)p;

p1 p2 p3 p4

n

p1 p2

n

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 6 / 29

ScanRead64IndexUnrollLoop

210 215 220 225 230 235

0

200

400

600

800

1,000

1,200

area size n [B]

band

wid

th[G

iB/s

]

p=1p=2p=4p=8p=16p=32p=64

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 7 / 29

ScanRead64IndexUnrollLoop

210 215 220 225 230 235

0

200

400

600

800

1,000

1,200

1,137 GiB/s

16 GiB/s

area size n [B]

band

wid

th[G

iB/s

]

p=1p=2p=4p=8p=16p=32p=64

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 7 / 29

ScanRead64IndexUnrollLoop

210 215 220 225 230 2350

10

20

30

40

50

area size n [B]

band

wid

th[G

iB/s

]

p=1p=2p=4p=8p=16p=32p=64

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 8 / 29

ScanRead64IndexUnrollLoop

210 215 220 225 230 2350

10

20

30

40

50

35 GiB/s

2.6 GiB/s

area size n [B]

band

wid

th[G

iB/s

]

p=1p=2p=4p=8p=16p=32p=64

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 8 / 29

PermRead64SimpleLoop

210 215 220 225 230 235

0

100

200

300

area size n [B]

band

wid

th[G

iB/s

]

p=1p=2p=4p=8p=16p=32p=64

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 9 / 29

PermRead64SimpleLoop

210 215 220 225 230 235

0

200

400

600

area size n [B]

acce

ssla

tenc

y[n

s]

p=1p=2p=4p=8p=16p=32p=64

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 10 / 29

PermRead64SimpleLoop

210 215 220 225 230 235

0

200

400

600

0.028 ns

6 ns

area size n [B]

acce

ssla

tenc

y[n

s]

p=1p=2p=4p=8p=16p=32p=64

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 10 / 29

Log-Log Access Latency

210 215 220 225 230 235

10−11

10−10

10−9

10−8

10−7

10−6

area size n [B]

acce

ssla

tenc

y[s

]

Scan p=1Scan p=64Perm p=1Perm p=64

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 11 / 29

Parallel Memory Bandwidth Test

Scanning an Array

p1 p2 p3 p4

n

Random Access in an Array

p1 p2

n

Cache (n = 16 KiB)p Scan Random1 35 GiB/s 4.4 GiB/s

16 567 GiB/s 69 GiB/s32 1131 GiB/s 137 GiB/s

RAM (n = 4 GiB)p Scan Random1 2.6 GiB/s 19 MiB/s16 10.3 GiB/s 403 MiB/s32 10.6 GiB/s 763 MiB/s

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 12 / 29

Sorting Strings

a r r a y

k i t

a r r a n g e

k a y a k

k e r n e l

k i t c h e n

k i t t e n

a r c a d e

k i t e

a b a c u s

k r y p t o n

a l p h a

a r c a i c

a b a c u s

a l p h a

a r c a d e

a r c a i c

a r r a n g e

a r r a y

k a y a k

k e r n e l

k i t

k i t c h e n

k i t e

k i t t e n

k r y p t o n

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 13 / 29

Sorting Strings

a r r a y

k i t

a r r a n g e

k a y a k

k e r n e l

k i t c h e n

k i t t e n

a r c a d e

k i t e

a b a c u s

k r y p t o n

a l p h a

a r c a i c

a b a c u s

a l p h a

a r c a d e

a r c a i c

a r r a n g e

a r r a y

k a y a k

k e r n e l

k i t

k i t c h e n

k i t e

k i t t e n

k r y p t o n

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 13 / 29

Sorting Strings: Radix Sort

a r r a y

k i t

a r r a n g e

k a y a k

k e r n e l

k i t c h e n

k i t t e n

a r c a d e

k i t e

a b a c u s

k r y p t o n

a l p h a

a r c a i c

0 σ−1a k

0 6 7 0

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 14 / 29

Sorting Strings: Radix Sort

a r r a y

k i t

a r r a n g e

k a y a k

k e r n e l

k i t c h e n

k i t t e n

a r c a d e

k i t e

a b a c u s

k r y p t o n

a l p h a

a r c a i c

0 σ−1a k

0 6 7 0

0 0 6 6 13 13 13

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 14 / 29

Sorting Strings: Radix Sort

a r r a y

k i t

a r r a n g e

k a y a k

k e r n e l

k i t c h e n

k i t t e n

a r c a d e

k i t e

a b a c u s

k r y p t o n

a l p h a

a r c a i c

0 σ−1a k

0 6 7 0

0 0 6 6 13 13 13

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 14 / 29

Sorting Strings: Radix Sort

a r r a y

a r r a n g e

a r c a d e

a b a c u s

a l p h a

a r c a i c

k i t

k a y a k

k e r n e l

k i t c h e n

k i t t e n

k i t e

k r y p t o n

0 σ−1a k

0 6 7 0

0 0 6 6 13 13 13

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 14 / 29

Sorting Strings: Radix Sort

a r r a y

k i t

a r r a n g e

k a y a k

k e r n e l

k i t c h e n

k i t t e n

a r c a d e

k i t e

a b a c u s

k r y p t o n

a l p h a

a r c a i c

0 σ−1a k

0 6 7 0

0 0 6 6 13 13 13

p1

p2

p3

0 2 3 00 2 3 00 2 1 0

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 14 / 29

Sorting Strings: Radix Sort

a r r a y

k i t

a r r a n g e

k a y a k

k e r n e l

k i t c h e n

k i t t e n

a r c a d e

k i t e

a b a c u s

k r y p t o n

a l p h a

a r c a i c

0 σ−1a k

0 6 7 0

0 0 6 6 13 13 13

p1

p2

p3

0 2 3 00 2 3 00 2 1 0

p1

p2

p3

0 0 6 6 13 13 130 2 6 9 13 130 4 6 12 13 13

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 14 / 29

Sorting Strings: Multikey Quicksort

a r r a y

k i t

a r r a n g e

k a y a k

k e r n e l

k i t c h e n

k i t t e n

a r c a d e

k i t e

a b a c u s

k r y p t o n

a l p h a

a r c a i c

[Bentley, Sedgewick ’97]

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 15 / 29

Sorting Strings: Multikey Quicksort

a r r a y

a r r a n g e

k a y a k

k e r n e l

a r c a d e

a b a c u s

a l p h a

a r c a i c

k i t

k i t c h e n

k i t t e n

k i t e

k r y p t o n

<

=

>

[Bentley, Sedgewick ’97]

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 15 / 29

Sorting Strings: Multikey Quicksort

a r r a y

a r r a n g e

k a y a k

k e r n e l

a r c a d e

a b a c u s

a l p h a

a r c a i c

k i t

k i t c h e n

k i t t e n

k i t e

k r y p t o n

<

=

>

[Bentley, Sedgewick ’97]

= < ? > =

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 15 / 29

Sorting Strings: Multikey Quicksort

a r r a y

a r r a n g e

k a y a k

k e r n e l

a r c a d e

a b a c u s

a l p h a

a r c a i c

k i t

k i t c h e n

k i t t e n

k i t e

k r y p t o n

<

=

>

[Bentley, Sedgewick ’97]

< = >

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 15 / 29

Sorting Strings: Multikey Quicksort

a r r a y

a r r a n g e

k a y a k

k e r n e l

a r c a d e

a b a c u s

a l p h a

a r c a i c

k i t

k i t c h e n

k i t t e n

k i t e

k r y p t o n

<

=

>

[Bentley, Sedgewick ’97]

< = >

[Rantala ’??]

partition by w = 8 characters

cache characters⇒ fewer random accesses

fastest sequential algorithm

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 15 / 29

Sorting Strings: Multikey Quicksort

a r r a y

a r r a n g e

k a y a k

k e r n e l

a r c a d e

a b a c u s

a l p h a

a r c a i c

k i t

k i t c h e n

k i t t e n

k i t e

k r y p t o n

<

=

>

[Bentley, Sedgewick ’97]

< = >

[Rantala ’??]

partition by w = 8 characters

cache characters⇒ fewer random accesses

fastest sequential algorithm

[This Work]

? ? ? ? ? ? ? ? ? ? ? ? ?

parallelize using blocks

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 15 / 29

Sorting Strings: Multikey Quicksort

a r r a y

a r r a n g e

k a y a k

k e r n e l

a r c a d e

a b a c u s

a l p h a

a r c a i c

k i t

k i t c h e n

k i t t e n

k i t e

k r y p t o n

<

=

>

? ? ? ? ? ? ? ? ? ? ? ? ?

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 16 / 29

Sorting Strings: Multikey Quicksort

a r r a y

a r r a n g e

k a y a k

k e r n e l

a r c a d e

a b a c u s

a l p h a

a r c a i c

k i t

k i t c h e n

k i t t e n

k i t e

k r y p t o n

<

=

>

? ? ? ? ? ? ? ? ? ? ? ? ?

p1 ? ? ?

p2 ? ? ?

p3 ? ? ?

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 16 / 29

Sorting Strings: Multikey Quicksort

a r r a y

a r r a n g e

k a y a k

k e r n e l

a r c a d e

a b a c u s

a l p h a

a r c a i c

k i t

k i t c h e n

k i t t e n

k i t e

k r y p t o n

<

=

>

? ? ? ? ? ? ? ? ? ? ? ? ?

p1 < = ? > ?

p2 < ? = ? >

p3 < ? = > ?

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 16 / 29

Sorting Strings: Multikey Quicksort

a r r a y

a r r a n g e

k a y a k

k e r n e l

a r c a d e

a b a c u s

a l p h a

a r c a i c

k i t

k i t c h e n

k i t t e n

k i t e

k r y p t o n

<

=

>

? ? ? ?? ? ? ?? ? ? ? ?

p1 ? = ? > ?

p2 < ? = ? ?

p3 < ? ? > ?

Out

put <

=

>

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 16 / 29

Sorting Strings: Multikey Quicksort

a r r a y

a r r a n g e

k a y a k

k e r n e l

a r c a d e

a b a c u s

a l p h a

a r c a i c

k i t

k i t c h e n

k i t t e n

k i t e

k r y p t o n

<

=

>

? ? ? ?? ? ? ?? ? ? ? ?

p1 < ? = ? >

p2 < = ? > ?

p3 < ? = ? >

Out

put <

=

>

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 16 / 29

Sorting Strings: Multikey Quicksort

a r r a y

a r r a n g e

k a y a k

k e r n e l

a r c a d e

a b a c u s

a l p h a

a r c a i c

k i t

k i t c h e n

k i t t e n

k i t e

k r y p t o n

<

=

>

? ? ? ?? ? ? ?? ? ? ? ?

p1 < ? = ? ? 0

p2 0 = ? > ?

p3 < ? = ? 0

Out

put < < <

=

> >

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 16 / 29

Sorting Strings: Multikey Quicksort

a r r a y

a r r a n g e

k a y a k

k e r n e l

a r c a d e

a b a c u s

a l p h a

a r c a i c

k i t

k i t c h e n

k i t t e n

k i t e

k r y p t o n

<

=

>

? ? ? ?? ? ? ?? ? ? ? ?

p1 < 0 = 0 > 0

p2 < 0 = 0 > 0

p3 < 0 = 0 > 0

More

Output

Out

put < < <

=

> >

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 16 / 29

Super Scalar String Sample Sort (S5)

a r r a y

k i t

a r r a n g e

k a y a k

k e r n e l

k i t c h e n

k i t t e n

a r c a d e

k i t e

a b a c u s

k r y p t o n

a l p h a

a r c a i c

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 17 / 29

Super Scalar String Sample Sort (S5)

a r r a y

k i t

a r r a n g e

k a y a k

k e r n e l

k i t c h e n

k i t t e n

a r c a d e

k i t e

a b a c u s

k r y p t o n

a l p h a

a r c a i c

ar

ab ki

b0 b1 b2 b3 b4 b5 b6

<

=

>

< = > < = >

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 17 / 29

Super Scalar String Sample Sort (S5)

a r r a y

k i t

a r r a n g e

k a y a k

k e r n e l

k i t c h e n

k i t t e n

a r c a d e

k i t e

a b a c u s

k r y p t o n

a l p h a

a r c a i c

ar

ab ki

b0 b1 b2 b3 b4 b5 b6

b0 b1 b2 b3 b4 b5 b6

b0 b1 b2 b3 b4 b5 b6

<

=

>

< = > < = >

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 17 / 29

Super Scalar String Sample Sort (S5)

a r r a y

k i t

a r r a n g e

k a y a k

k e r n e l

k i t c h e n

k i t t e n

a r c a d e

k i t e

a b a c u s

k r y p t o n

a l p h a

a r c a i c

ar

ab ki

0 0 0 2 2 1 00 1 0 1 0 3 00 0 1 1 0 0 1

<

=

>

< = > < = >

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 17 / 29

Super Scalar String Sample Sort (S5)

a r r a y

k i t

a r r a n g e

k a y a k

k e r n e l

k i t c h e n

k i t t e n

a r c a d e

k i t e

a b a c u s

k r y p t o n

a l p h a

a r c a i c

ar

ab ki

0 0 1 4 8 9 120 1 1 5 8 12 120 1 2 6 8 12 13

<

=

>

< = > < = >

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 17 / 29

Super Scalar String Sample Sort (S5)

a b a c u s

a l p h a

a r r a y

a r r a n g e

a r c a d e

a r c a i c

k a y a k

k e r n e l

k i t

k i t c h e n

k i t t e n

k i t e

k r y p t o n

ar

ab ki

0 0 1 4 8 9 120 1 1 5 8 12 120 1 2 6 8 12 13

<

=

>

< = > < = >

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 17 / 29

Super Scalar String Sample Sort (S5)

a b a c u s

a l p h a

a r r a y

a r r a n g e

a r c a d e

a r c a i c

k a y a k

k e r n e l

k i t

k i t c h e n

k i t t e n

k i t e

k r y p t o n

prefix21

2

0

2

0

ar

ab ki

0 0 1 4 8 9 120 1 1 5 8 12 120 1 2 6 8 12 13

<

=

>

< = > < = >

1 0

increase prefix byLCP of splitters or

key size.

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 17 / 29

Super Scalar String Sample Sort (S5)

a b a c u s

a l p h a

a r r a y

a r r a n g e

a r c a d e

a r c a i c

k a y a k

k e r n e l

k i t

k i t c h e n

k i t t e n

k i t e

k r y p t o n

prefix21

2

0

2

0

ar

ab ki

0 0 1 4 8 9 120 1 1 5 8 12 120 1 2 6 8 12 13

<

=

>

< = > < = >

1 0

increase prefix byLCP of splitters or

key size.

partitions by w = 8 chars

easy parallelization

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 17 / 29

Super Scalar String Sample Sort (S5)

ar

ab ki

b0 b1 b2 b3 b4 b5 b6

<

=

>

< = > < = >

partitions by w = 8 chars

easy parallelization

256 KiB L2 cache: 13 levels

predicated instructions

ar ab ki

1 2 3

i := 2i + 0/1

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 18 / 29

Super Scalar String Sample Sort (S5)

ar

ab ki

b0 b1 b2 b3 b4 b5 b6

<

=

>

< = > < = >

partitions by w = 8 chars

easy parallelization

256 KiB L2 cache: 13 levels

predicated instructions

ar ab ki

1 2 3

i := 2i + 0/1

equality checking:1 at each node2 after full descent

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 18 / 29

Super Scalar String Sample Sort (S5)

ar

ab ki

b0 b1 b2 b3 b4 b5 b6

<

=

>

< = > < = >

partitions by w = 8 chars

easy parallelization

256 KiB L2 cache: 13 levels

predicated instructions

ar ab ki

1 2 3

i := 2i + 0/1

equality checking:1 at each node2 after full descent

interleave tree descents:classify 4 strings at once⇒ super scalar parallelism

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 18 / 29

Parallel S5 – Sub-Algorithms

|S| ≥ np fully parallel S5

np > |S| ≥ 216 sequential S5

216 > |S| ≥ 64 caching multikey quicksort

64 > |S| insertion sort

Important: dynamic load balancing with voluntary work freeing

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 19 / 29

URLs – 1.1 G Lines, 70.7 GiBhttp://algo2.iti.kit.edu/index.php

http://algo2.iti.kit.edu/english/index.php

http://algo2.iti.kit.edu/1483.php

http://algo2.iti.kit.edu/1484.php

http://www.kit.edu/

http://algo2.iti.kit.edu/286.php

http://algo2.iti.kit.edu/1294.php

http://algo2.iti.kit.edu/research.php

http://algo2.iti.kit.edu/publications.php

http://algo2.iti.kit.edu/members.php

http://algo2.iti.kit.edu/lehre.php

http://algo2.iti.kit.edu/1844.php

http://algo2.iti.kit.edu/294.php

http://algo2.iti.kit.edu/basic-toolbox-page.php

http://map.project-osrm.org?dest=49.0137004,8.419307&destname=%22Am%20Fa...

http://www.uni-karlsruhe.de/fs/Uni/info/campusplan/index.php?id=50.34

http://www.informatik.kit.edu/1158.php

http://algo2.iti.kit.edu/emailform.php?id=eea49c752e4c1329710cba2efae10511

http://algo2.iti.kit.edu/routeplanning.php

http://algo2.iti.kit.edu/sanders.php

http://www.mwk.baden-wuerttemberg.de/forschung/forschungsfoerderung/land...

http://algo2.iti.kit.edu/1996.php

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 20 / 29

URLs – Speedup on 32-core Intel E5

1 8 16 32 48 64

0

2

4

6

8

10

number of threads

spee

dup pS5

pMultikeyQuicksortpRadixSort

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 21 / 29

Wikipedia – Suffixes of 4 GiB<page>

<title>Karlsruhe Institute of Technology</title>

<ns>0</ns>

<id>5977314</id>

<revision>

<id>491222814</id>

<timestamp>2013-04-24T03:17:12Z</timestamp>

<text xml:space="preserve">{{Infobox university

|name = Karlsruhe Institute of Technology

|image = [[Image:KIT logo.png|270px|]]

|tagline = ’’{{lang|de|Forschungsuniversität}}’’ (Research University)

|established = Fridericiana: 1825 as polytechnical school, 1865 as university;&lt;br...

|president = Horst Hippler, Eberhard Umbach

|students = 23,905 (October 2012)

|staff = 3.423<ref name="kit.edu"/>}}

The ’’’Karlsruhe Institute of Technology’’’ (’’’KIT’’’) is one of the largest research and

educations institution in Germany, resulting from a merger of the university (’’Universität

Karlsruhe (TH)’’) and the research center (’’Forschungszentrum Karlsruhe’’)&lt;ref&gt;[[

Federal Ministry of Education and Research (Germany)]]: http://www.bmbf.de/pub/eck

punktepapier_kit.pdf&lt;/ref&gt; of the city of [[Karlsruhe]]. The university, also known

as ’’’’’Fridericiana’’’’’, was founded in 1825. In 2009, it merged with the former national

nuclear research center founded in 1956 as the ’’Kernforschungszentrum Karlsruhe (KfK)’’.

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 22 / 29

Suffixes – Speedup on 32-core Intel E5

1 8 16 32 48 640

5

10

15

20

number of threads

spee

dup

pS5

pMultikeyQuicksortpRadixSort

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 23 / 29

GOV2 – 3.1 G Lines in 128 GiB<p align="left"><font face="Verdana, Arial, Helvetica, sans-serif"><b>BACKGROUND

INFORMATION. </b>The Forest Ecosystem Dynamics (FED) Project is concerned

with modeling and monitoring ecosystem processes and patterns in response

to natural and anthropogenic effects. The project uses coupled ecosystem

models and remote sensing models and measurements to predict and observe

ecosystem change. The overall objective of the FED project is to link

and use models of forest dynamics, soil processes, and canopy energetics

to understand how ecosystem response to change affects patterns and processes

in northern and boreal forests and to assess the implications for global

change. See <a href="http://fedwww.gsfc.nasa.gov/html/conceptdiagram.html">

Conceptual Diagram</a> for model schematic. </font></p>

<p align="left"><font face="Verdana, Arial, Helvetica, sans-serif">The Forest

Ecosystem Dynamics World-Wide-Web server has been online since July 1994.

The FED server was created for the dissemination of project information,

to archive numerous spatial and scientific data sets, and demonstrate

the linking of ecosystem and remote sensing models. </font></p>

</td>

<td valign="top" width="28%" align="left" height="565">

<p><img src="vertrule.gif" alt="vertical rule" width="2" height="580" hspace="5" ...

<p><font size="2" face="Verdana, Arial, Helvetica, sans-serif"><a href="http://fedwww...

Abstract</a> </font></p>

<p><font face="Verdana, Arial, Helvetica, sans-serif" size="2"><a href="http://fedwww...

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 24 / 29

GOV2 – Speedup on 32-core Intel E5

1 8 16 32 48 64

0

2

4

6

8

10

12

number of threads

spee

dup pS5

pMultikeyQuicksortpRadixSort

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 25 / 29

Words – 31.6 M Lines, 382 MiBstereosto

tocellular

cellularand

andportable

wellas

asdiscussion

ofcomputer

computersecurity

securityissues

issuesin

ingeneral

generalWhile

Whilewe

wewelcome

welcomeall

allmanner

mannerof

ofquestion

questionabout

aboutencrypted

encryptedmessages

messagesHow

doesPGP

PGPencrypt

encrypta

amessage

messageWhat

apublic

publickey

keyencrypted

messagesthemselves

themselvesare

offtopic

topicHost

CamelotMessage

MessageFile

FileBBS

BBSCulver

CulverCity

CityCA

ListGraphics

GraphicsGeneral

GeneralGraphics

Graphics15

15Discussions

aboutall

alltypes

typesof

computergraphics

graphicsand

languageconference

forthe

theexchange

exchangeof

ofideas

ideasin

Learnmore

moreabout

theSpanish

Spanishculture

culturewhile

communicatingin

thelanguage

languageHost

ListFantsySciF

FantsySciFFantasy

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 26 / 29

Words – Speedup on 4-core Intel i7

1 2 3 4 5 6 7 8

1

2

3

4

number of threads

spee

dup

pS5

pMultikeyQuicksortpRadixSort

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 27 / 29

Future Work

Non-uniform memory architecture (NUMA) effects

distributed string sorting algorithms

distributed text index construction and queryhigh-performance middleware?

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 28 / 29

Thank you for your attention!

Questions?

Timo Bingmann and Peter Sanders – Parallel String Sorting with Super Scalar String Sample SortInstitute of Theoretical Informatics – Algorithmics September 4th, 2013 29 / 29

top related