1 tuesday, november 14, 2006 “unix was never designed to keep people from doing stupid things,...

49
1 Tuesday, November 14, 2006 “UNIX was never designed to keep people from doing stupid things, because that policy would also keep them from doing clever things.” - Doug Gwyn

Post on 19-Dec-2015

213 views

Category:

Documents


0 download

TRANSCRIPT

1

Tuesday, November 14, 2006

“UNIX was never designed to keep people from doing stupid things,

because that policy would also keep them from doing clever things.”

- Doug Gwyn

2

Sorting Algorithms

Sorting fundamental part of computer operations.

Most common based method of sorting is comparison-based.

3

The basic operation of comparison-based sorting is compare-exchange.

The lower bound on any sequential comparison-based sort of n numbers is ?

4

The lower bound on any sequential comparison-based sort of n numbers is Θ(nlog n).

5

Sorting: Basics

What is a parallel sorted sequence? Where are the input and output lists stored?

6

Sorting: Basics

Assumption that the input and output lists are distributed.

Sorting can be intermediate step in another algorithm

7

Sorting: Basics

The sorted output list is partitioned with the property that each partitioned list is sorted and each element in processor Pi's list is less than that in Pj's list if i < j.

8

Sorting: Parallel Compare Exchange Operation

One element per process.

ts+tw

Overall runtime dominated by inter-process communication

9

Sorting: Parallel Compare Split Operation

More than one element per process.The time for a compare-split operation is ? (assuming that the two partial lists were initially sorted).

10

Sorting: Parallel Compare Split Operation

More than one element per process.The time for a compare-split operation is (ts+ twn/p) (assuming that the two partial lists were initially sorted).For larger block sizes time to merge two blocks is O(n/p)

11

Sorting Networks Networks of comparators designed specifically for sorting.

12

Sorting Networks

We denote an increasing comparator by and a decreasing comparator by Ө.

13

The speed of the network is proportional to ?

14

The speed of the network is proportional to its depth

15

Sorting Networks: Bitonic Sort

A bitonic sequence has two tones

Increasing - decreasing tone

Any cyclic rotation of such networks is also considered bitonic. (i.e. a bitonic sequence that becomes increasing-decreasing after shifting its elements)

Is 1,2,4,7,6,0 a bitonic sequence?

16

Sorting Networks: Bitonic Sort

Is 8,9,2,1,0,4 a bitonic sequence?

17

Sorting Networks: Bitonic Sort

Is 8,9,2,1,0,4 a bitonic sequence?

Yes, it is a cyclic shift of 0,4,8,9,2,1.

18

Sorting Networks: Bitonic Sort Let s = a0,a1,…,an-1 be a bitonic sequence such that a0 ≤ a1 ≤ ··· ≤ an/2-1 and

an/2 ≥ an/2+1 ≥ ··· ≥ an-1.

Consider the following subsequences of s: s1 = min{a0,an/2},min{a1,an/2+1},…,min{an/2-

1,an-1}

s2 = max{a0,an/2},max{a1,an/2+1},

…,max{an/2-1,an-1}

19

Sorting Networks: Bitonic Sort s1 = min{a0,an/2},min{a1,an/2+1},…,min{an/2-1,an-1}

s2 = max{a0,an/2},max{a1,an/2+1},…,max{an/2-1,an-

1}

In s1 there is an element bi such that all elements before bi

are from increasing part and all elements after bi are from the decreasing part.

In s2 there is an element b’i such that all elements before

b’i are from decreasing part and all elements after b’i are from the increasing part.

20

s1 = min{a0,an/2},min{a1,an/2+1},…,min{an/2-1,an-

1}

s2 = max{a0,an/2},max{a1,an/2+1},…,max{an/2-1,an-

1}

Sorting Networks: Bitonic Sort

s1 and s2 are both bitonic and each element of s1 is less than every element in s2.

21

s1 = min{a0,an/2},min{a1,an/2+1},…,min{an/2-1,an-

1}

s2 = max{a0,an/2},max{a1,an/2+1},…,max{an/2-1,an-

1}

Sorting Networks: Bitonic Sort

s1 and s2 are both bitonic and each element of s1 is less than every element in s2.

Divided a bigger problem into two smaller problems (bitonic split).

We can apply the procedure recursively on s1 and s2 to get the sorted sequence.

22

Bitonic merge:

The procedure of sorting a bitonic sequence using bitonic splits.

Sorting Networks: Bitonic Sort

23

Sorting Networks: Bitonic Sort

A bitonic merging network for n = 16. Note: input is a bitonic sequence.

A BM[16] bitonic merging network: The network takes a bitonic sequence and outputs it in sorted order.

24

Number of splits required are?

25

Number of splits required are log(n).There are log(n) columns in a bitonic

merge network.

26

How do we sort n unordered elements?

27

How do we sort n unordered elements using bitonic merge?

We must first build a single bitonic sequence from the given sequence.

28

A sequence of length 2 is a bitonic sequence.

29

A bitonic sequence of length 4 can be built by sorting the first two elements using BM[2] and next two, using ӨBM[2].

This process can be repeated to generate larger bitonic sequences.

30

Sorting Networks: Bitonic Sort

The last merging network (BM[16]) sorts the input. In this example, n = 16.

31

Sorting Networks: Bitonic Sort

The comparator network that transforms an input sequence of 16

unordered numbers into a bitonic sequence.

32

Sorting Networks: Bitonic Sort

A bitonic merging network for n = 16. Note: input is a bitonic sequence.

A BM[16] bitonic merging network: The network takes a bitonic sequence and outputs it in sorted order.

33

d(n) = d(n/2) +log(n)

2/)log(log2log

1

nnin

i

34

The depth of the network is Θ(log2 n).

A serial implementation of the network would have complexity Θ(nlog2 n).

35

Bitonic algorithm is communication intensive.

Take into account topology of underlying network.

36

Bitonic sorting network for n elements: log n stages

stage i consists of i columns of n/2 comparators.

On a parallel computer the compare exchange operations is performed by a pair of processes.

37

Sorting Networks: Bitonic Sort

A bitonic merging network for n = 16. Note: input is a bitonic sequence.

A BM[16] bitonic merging network: The network takes a bitonic sequence and outputs it in sorted order.

38

One element per processor.

How to map processes? Compare-exchange should ideally be between

neighboring processes.

39

Mapping Bitonic Sort to Hypercubes

The compare-exchange operation is performed between two wires only if their labels differ in exactly one bit!

This implies a direct mapping of wires to processors. All communication is nearest neighbor!

40

41

42

43

44

Mapping Bitonic Sort to Hypercubes

Communication characteristics of bitonic sort on a hypercube. During each stage of the algorithm, processes communicate along the dimensions shown.

45

Mapping Bitonic Sort to Hypercubes

Parallel formulation of bitonic sort on a hypercube with n = 2d processes.

What is the parallel runtime?

46

Mapping Bitonic Sort to Hypercubes

Parallel formulation of bitonic sort on a hypercube with n = 2d processes.

Parallel runtime: Tp=O(log2n)

Cost optimal?

47

Block of Elements Per Processor

Each process is assigned a block of n/p elements.

48

Block of Elements Per Processor

The first step is a local sort of the local block.

49

Block of Elements Per Processor: Hypercube Initially the processes sort their n/p elements

(using merge sort) in time Θ((n/p)log(n/p)) and then perform Θ(log2p) compare-split steps.

The parallel run time of this formulation is

Comparing to an optimal sort, the algorithm can efficiently use up to processes.

)2( lognp