sorting


Upload: abner

Post on 14-Jan-2016


Page 1: Sorting

1

Sorting

• We have actually already seen two efficient ways to sort:

Page 2: Sorting

2

A kind of “insertion” sort

• Insert the elements into a red-black tree one by one

• Traverse the tree in in-order and collect the keys

• Takes O(nlog(n)) time

Page 3: Sorting

3

Heapsort (Williams, Floyd, 1964)

• Put the elements in an array

• Make the array into a heap

• Do a deletemin and put the deleted element at the last position of the array
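The three heapsort steps can be sketched in Python as follows. This is a minimal sketch, not the lecture's own code: the names `sift_down` and `heapsort_descending` are made up for illustration, and it assumes the deletemin step is repeated until the heap is empty. With a min-heap, moving each deleted minimum to the last heap position leaves the array in decreasing order; a max-heap with deletemax would give increasing order.

```python
def sift_down(a, root, size):
    """Restore the min-heap property below index root within a[:size]."""
    while True:
        child = 2 * root + 1
        if child >= size:
            return
        # Pick the smaller of the two children.
        if child + 1 < size and a[child + 1] < a[child]:
            child += 1
        if a[root] <= a[child]:
            return
        a[root], a[child] = a[child], a[root]
        root = child


def heapsort_descending(a):
    """Sort a in place into decreasing order using a min-heap."""
    n = len(a)
    # Make the array into a heap, bottom-up.
    for root in range(n // 2 - 1, -1, -1):
        sift_down(a, root, n)
    # Repeated deletemin: the minimum goes to the last heap position,
    # and the heap shrinks by one.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)
    return a
```

Building the heap takes O(n) time and each of the n deletemin steps takes O(log n), which gives O(n log n) overall.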

Page 4: Sorting

4

Quicksort (Hoare 1961)

Page 5: Sorting

5

quicksort

Input: an array A[p..r]

Quicksort(A, p, r)
  if p < r
    then q = Partition(A, p, r)    // q is the position of the pivot element
         Quicksort(A, p, q-1)
         Quicksort(A, q+1, r)

Page 6: Sorting

6

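Partition on A = 2 8 7 1 3 5 6 4, taking p = 1 and r = 8 (pivot x = A[r] = 4). Each row shows the array together with the current values of i and j:

2 8 7 1 3 5 6 4    i = 0, j = 1
2 8 7 1 3 5 6 4    i = 1, j = 2
2 8 7 1 3 5 6 4    i = 1, j = 3
2 8 7 1 3 5 6 4    i = 1, j = 4
2 1 7 8 3 5 6 4    i = 2, j = 5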

Page 7: Sorting

7

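Partition trace, continued (pivot x = 4). Each row shows the array together with the current values of i and j:

2 1 7 8 3 5 6 4    i = 2, j = 5
2 1 3 8 7 5 6 4    i = 3, j = 6
2 1 3 8 7 5 6 4    i = 3, j = 7
2 1 3 8 7 5 6 4    i = 3, j = 8 (loop ends)
2 1 3 4 7 5 6 8    after the final exchange; the pivot is at position i+1 = 4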

Page 8: Sorting

8

A = 2 8 7 1 3 5 6 4    (p is the first position, r is the last)

Partition(A, p, r)
  x ← A[r]
  i ← p-1
  for j ← p to r-1
    do if A[j] ≤ x
         then i ← i+1
              exchange A[i] ↔ A[j]
  exchange A[i+1] ↔ A[r]
  return i+1
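The Quicksort and Partition pseudocode translates almost line by line into Python. The sketch below assumes 0-based indices (the pseudocode is index-agnostic) and makes the default arguments `p=0, r=None` a convenience of this example only.

```python
def partition(a, p, r):
    """Partition a[p..r] around the pivot a[r]; return the pivot's final index."""
    x = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def quicksort(a, p=0, r=None):
    """Sort a[p..r] in place; with no bounds given, sort the whole list."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)  # q is the position of the pivot element
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)
    return a
```

On the example array 2 8 7 1 3 5 6 4, `partition` leaves 2 1 3 4 7 5 6 8 and returns index 3, which is position 4 when counting from 1.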

Page 9: Sorting

9

Analysis

• Running time is proportional to the number of comparisons

• Each pair is compared at most once, so there are O(n²) comparisons

• In fact for each n there is an input of size n on which quicksort takes Ω(n²) time
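One such bad input is an array that is already sorted: with the last element as pivot, every partition puts all the remaining elements on one side. The sketch below counts comparisons to illustrate this. The function name `count_comparisons` and the explicit stack (used instead of recursion so that long sorted inputs do not hit Python's recursion limit) are choices made for this example.

```python
def count_comparisons(values):
    """Quicksort a copy of values (last element as pivot); return the # of comparisons."""
    a = list(values)
    comparisons = 0
    stack = [(0, len(a) - 1)]  # subarrays still to be sorted
    while stack:
        p, r = stack.pop()
        if p >= r:
            continue
        x, i = a[r], p - 1
        for j in range(p, r):
            comparisons += 1  # one comparison of a[j] with the pivot
            if a[j] <= x:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[i + 1], a[r] = a[r], a[i + 1]
        stack.append((p, i))      # left part: a[p..q-1] with q = i+1
        stack.append((i + 2, r))  # right part: a[q+1..r]
    return comparisons
```

On a sorted input of size n the count is (n-1) + (n-2) + ... + 1 = n(n-1)/2, so `count_comparisons(range(100))` gives 4950.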

Page 10: Sorting

10

But

• Assume that the split is even in each iteration

Page 11: Sorting

11

T(n) = 2T(n/2) + n

How do we solve linear recurrences like this? (Read Chapter 4.)

Page 12: Sorting

12

Recurrence tree

          n
        /   \
   T(n/2)   T(n/2)

Page 13: Sorting

13

Recurrence tree

                 n
           /           \
         n/2           n/2
        /   \         /   \
   T(n/4) T(n/4)  T(n/4) T(n/4)

Page 14: Sorting

14

Recurrence tree

                 n
           /           \
         n/2           n/2
        /   \         /   \
   T(n/4) T(n/4)  T(n/4) T(n/4)

The tree has log n levels.

In every level we do O(n) comparisons.

So the total number of comparisons is O(n log n)
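The same bound can be obtained by unrolling the recurrence T(n) = 2T(n/2) + n directly. The derivation below assumes, for simplicity, that n is a power of 2 and that T(1) is a constant:

$$
\begin{aligned}
T(n) &= 2T(n/2) + n \\
     &= 4T(n/4) + 2n \\
     &= 8T(n/8) + 3n \\
     &\;\;\vdots \\
     &= 2^{k}\,T(n/2^{k}) + kn .
\end{aligned}
$$

Taking k = log2(n) gives T(n) = n T(1) + n log2(n) = O(n log n).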

Page 15: Sorting

17

Observations

• We can’t guarantee good splits

• But intuitively on random inputs we will get good splits

Page 16: Sorting

18

Randomized quicksort

• Use randomized-partition rather than partition

Randomized-partition(A, p, r)
  i ← random(p, r)
  exchange A[r] ↔ A[i]
  return partition(A, p, r)
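A Python sketch of randomized quicksort follows. It assumes 0-based indices and uses `random.randint`, whose bounds are inclusive on both ends, in the role of random(p, r). The function names are illustrative.

```python
import random


def partition(a, p, r):
    """Partition a[p..r] around the pivot a[r]; return the pivot's final index."""
    x = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def randomized_partition(a, p, r):
    """Move a uniformly random element of a[p..r] to position r, then partition."""
    i = random.randint(p, r)  # inclusive on both ends
    a[r], a[i] = a[i], a[r]
    return partition(a, p, r)


def randomized_quicksort(a, p=0, r=None):
    """Sort a[p..r] in place using random pivots."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = randomized_partition(a, p, r)
        randomized_quicksort(a, p, q - 1)
        randomized_quicksort(a, q + 1, r)
    return a
```

The output is always the sorted array; only the sequence of pivots, and therefore the running time, varies from run to run.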

Page 17: Sorting

19

• On the same input we may get a different running time in each run!

• For one particular input, look at the average of all these running times

Page 18: Sorting

20

Expected # of comparisons

Let X be the # of comparisons

This is a random variable

Want to know E(X)

Page 19: Sorting

21

Expected # of comparisons

Let z1, z2, ..., zn be the elements in sorted order

Let Xij = 1 if zi is compared to zj and 0 otherwise

So,

$$X = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}$$

Page 20: Sorting

22

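By linearity of expectation:

$$E(X) = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E(X_{ij}) = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \Pr\{z_i \text{ is compared to } z_j\}$$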

Page 21: Sorting

23

Consider Zij ≡ {zi, zi+1, ..., zj}

Claim: zi and zj are compared ⇔ either zi or zj is the first pivot chosen from Zij

Proof: 3 cases, according to the first pivot chosen from Zij:

– The pivot is zi: zi and zj are compared on this partition, and never again.

– The pivot is zj: the same.

– The pivot is some zk with i < k < j: zi and zj are not compared on this partition. The partition separates them, so no future partition uses both.

Page 22: Sorting

24

Pr{zi is compared to zj}

= Pr{zi or zj is first pivot chosen from Zij}    (just explained)

= Pr{zi is first pivot chosen from Zij} + Pr{zj is first pivot chosen from Zij}    (mutually exclusive possibilities)

= 1/(j-i+1) + 1/(j-i+1) = 2/(j-i+1)

Page 23: Sorting

25

Simplify with a change of variable, k=j-i+1.

Simplify and overestimate, by adding terms.
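The two simplification steps can be written out as below. This is a reconstruction from the probability 2/(j-i+1) derived earlier, not a copy of the original slide's formulas. The symbol H_n, the n-th harmonic number 1 + 1/2 + ... + 1/n = O(log n), is notation introduced here.

$$
\begin{aligned}
E(X) &= \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j-i+1} \\
     &= \sum_{i=1}^{n-1} \sum_{k=2}^{n-i+1} \frac{2}{k}
        && \text{(change of variable } k = j-i+1\text{)} \\
     &\le \sum_{i=1}^{n-1} \sum_{k=1}^{n} \frac{2}{k}
        && \text{(overestimate by adding terms)} \\
     &= 2(n-1)\,H_n = O(n \log n).
\end{aligned}
$$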

Page 24: Sorting

26

Lower bound for sorting in the comparison model

Page 25: Sorting

27

A lower bound

• Comparison model: We assume that the only operations from which we deduce order among keys are comparisons

• Then we prove that we need Ω(n log n) comparisons in the worst case

Page 26: Sorting

Model the algorithm as a decision tree

Page 27: Sorting

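Insertion sort

[Figure: decision tree of insertion sort on three keys x, y, z. Each internal node i:j is a comparison between positions i and j (1:2 at the root, then 2:3 and 1:2), each edge is labeled < or >, and each leaf is one of the possible orderings of x, y, z.]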

Page 28: Sorting

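Quicksort

[Figure: decision tree of quicksort on three keys x, y, z. Each internal node i:j is a comparison between positions i and j (1:3 at the root, then 2:3 and 1:2), each edge is labeled < or >, and each leaf is one of the possible orderings of x, y, z.]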

Page 29: Sorting

31

Important observations

• Every comparison-based sorting algorithm can be represented as a (binary) tree like this

• For every node v there is an input on which the algorithm reaches v

• The # of leaves is n!

Page 30: Sorting

32

Important observations

• Each path corresponds to a run on some input

• The worst case # of comparisons corresponds to the longest path

Page 31: Sorting

33

The lower bound

Let d be the length of the longest path

n! ≤ #leaves ≤ 2^d

so log2(n!) ≤ d

Page 32: Sorting

34

Lower bound for sorting

• Any sorting algorithm based on comparisons between elements requires Ω(n log n) comparisons in the worst case.
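One standard way to see that log2(n!) grows like n log n is to keep only the larger half of the factors of n!. The estimate below is a sketch of that argument:

$$
\log_2(n!) = \sum_{i=1}^{n} \log_2 i
\;\ge\; \sum_{i=\lceil n/2 \rceil}^{n} \log_2 i
\;\ge\; \frac{n}{2} \log_2 \frac{n}{2}
= \Omega(n \log n).
$$

The second inequality holds because the sum has at least n/2 terms, each at least log2(n/2).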

Page 33: Sorting

35

Beating the lower bound

• We can beat the lower bound if we can deduce order relations between keys not by comparisons

Examples:

• Count sort

• Radix sort

Page 34: Sorting

36

Count sort

• Assume that keys are integers between 0 and k

A: 2 3 0 5 3 5 0 2 5

Page 35: Sorting

37

Count sort

• Allocate a temporary array C of size k+1: cell x counts the # of keys = x

A: 2 3 0 5 3 5 0 2 5

C: 0 0 0 0 0 0

Page 36: Sorting

38

Count sort

A: 2 3 0 5 3 5 0 2 5

C: 0 0 1 0 0 0

Page 37: Sorting

39

Count sort

A: 2 3 0 5 3 5 0 2 5

C: 0 0 1 1 0 0

Page 38: Sorting

40

Count sort

A: 2 3 0 5 3 5 0 2 5

C: 1 0 1 1 0 0

Page 39: Sorting

41

Count sort

A: 2 3 0 5 3 5 0 2 5

C: 2 0 2 2 0 3

Page 40: Sorting

42

Count sort

A: 2 3 0 5 3 5 0 2 5

C: 2 0 2 2 0 3

• Compute prefix sums of C: cell x holds the # of keys ≤ x (rather than =x)

Page 41: Sorting

43

Count sort

A: 2 3 0 5 3 5 0 2 5

C: 2 2 4 6 6 9

• Compute prefix sums of C: cell x holds the # of keys ≤ x (rather than =x)

Page 42: Sorting

44

Count sort

A: 2 3 0 5 3 5 0 2 5

C: 2 2 4 6 6 9

• Move items to output array

B: / / / / / / / / /

Page 43: Sorting

45

Count sort

A: 2 3 0 5 3 5 0 2 5

C: 2 2 4 6 6 9

B: / / / / / / / / /

Page 44: Sorting

46

Count sort

A: 2 3 0 5 3 5 0 2 5

C: 2 2 4 6 6 8

B: / / / / / / / / 5

Page 45: Sorting

47

Count sort

A: 2 3 0 5 3 5 0 2 5

C: 2 2 3 6 6 8

B: / / / 2 / / / / 5

Page 46: Sorting

48

Count sort

A: 2 3 0 5 3 5 0 2 5

C: 1 2 3 6 6 8

B: / 0 / 2 / / / / 5

Page 47: Sorting

49

Count sort

A: 2 3 0 5 3 5 0 2 5

C: 1 2 3 6 6 7

B: / 0 / 2 / / / 5 5

Page 48: Sorting

50

Count sort

A: 2 3 0 5 3 5 0 2 5

C: 1 2 3 5 6 7

B: / 0 / 2 / 3 / 5 5

Page 49: Sorting

51

Count sort

A: 2 3 0 5 3 5 0 2 5

C: 0 2 2 4 6 6

B: 0 0 2 2 3 3 5 5 5

Page 50: Sorting

52

Count sort

• Complexity: O(n+k)

• The sort is stable

• Note that count sort does not perform any comparison
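The whole procedure (count, prefix sums, move items from the end of A to B) fits in a few lines of Python. This is a sketch under two assumptions: indices are 0-based, so each counter is decremented before it is used as a position, and the optional `key` parameter is an addition of this example, used only to make stability visible.

```python
def count_sort(a, k, key=lambda item: item):
    """Stable sort of a, whose keys are integers between 0 and k inclusive."""
    c = [0] * (k + 1)
    # Cell x counts the number of keys equal to x.
    for item in a:
        c[key(item)] += 1
    # Prefix sums: cell x now holds the number of keys <= x.
    for x in range(1, k + 1):
        c[x] += c[x - 1]
    # Move items to the output array, scanning the input from the end
    # so that equal keys keep their original relative order.
    b = [None] * len(a)
    for item in reversed(a):
        c[key(item)] -= 1
        b[c[key(item)]] = item
    return b
```

For example, `count_sort([2, 3, 0, 5, 3, 5, 0, 2, 5], 5)` returns `[0, 0, 2, 2, 3, 3, 5, 5, 5]`, matching array B in the trace.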

Page 51: Sorting

53

Radix sort

• Say we have numbers with d digits each between 0 and k

2 8 7 1

4 5 9 1

6 5 7 2

1 3 0 1

2 4 7 2

3 5 5 5

7 0 2 2

8 3 9 4

4 8 4 4

3 5 3 6

Page 52: Sorting

54

Radix sort

• Use a stable sort to sort by the least significant digit (e.g. count sort)

2 8 7 1

4 5 9 1

6 5 7 2

1 3 0 1

2 4 7 2

3 5 5 5

7 0 2 2

8 3 9 4

4 8 4 4

3 5 3 6

Page 53: Sorting

55

Radix sort

Left column: the input. Right column: the list after a stable sort by the least significant digit.

2 8 7 1      2 8 7 1
4 5 9 1      4 5 9 1
6 5 7 2      1 3 0 1
1 3 0 1      6 5 7 2
2 4 7 2      2 4 7 2
3 5 5 5      7 0 2 2
7 0 2 2      8 3 9 4
8 3 9 4      4 8 4 4
4 8 4 4      3 5 5 5
3 5 3 6      3 5 3 6


Page 55: Sorting

57

Radix sort

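Columns, left to right: the input, then the list after a stable sort by each successive digit, least significant first.

2 8 7 1      2 8 7 1      1 3 0 1
4 5 9 1      4 5 9 1      7 0 2 2
6 5 7 2      1 3 0 1      3 5 3 6
1 3 0 1      6 5 7 2      4 8 4 4
2 4 7 2      2 4 7 2      3 5 5 5
3 5 5 5      7 0 2 2      2 8 7 1
7 0 2 2      8 3 9 4      6 5 7 2
8 3 9 4      4 8 4 4      2 4 7 2
4 8 4 4      3 5 5 5      4 5 9 1
3 5 3 6      3 5 3 6      8 3 9 4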


Page 57: Sorting

59

Radix sort

Columns, left to right: the input, then the list after a stable sort by each successive digit, least significant first.

2 8 7 1      2 8 7 1      1 3 0 1      7 0 2 2
4 5 9 1      4 5 9 1      7 0 2 2      1 3 0 1
6 5 7 2      1 3 0 1      3 5 3 6      8 3 9 4
1 3 0 1      6 5 7 2      4 8 4 4      2 4 7 2
2 4 7 2      2 4 7 2      3 5 5 5      3 5 3 6
3 5 5 5      7 0 2 2      2 8 7 1      3 5 5 5
7 0 2 2      8 3 9 4      6 5 7 2      6 5 7 2
8 3 9 4      4 8 4 4      2 4 7 2      4 5 9 1
4 8 4 4      3 5 5 5      4 5 9 1      4 8 4 4
3 5 3 6      3 5 3 6      8 3 9 4      2 8 7 1


Page 59: Sorting

61

Radix sort

Columns, left to right: the input, then the list after a stable sort by each successive digit, least significant first. The last column is sorted.

2 8 7 1      2 8 7 1      1 3 0 1      7 0 2 2      1 3 0 1
4 5 9 1      4 5 9 1      7 0 2 2      1 3 0 1      2 4 7 2
6 5 7 2      1 3 0 1      3 5 3 6      8 3 9 4      2 8 7 1
1 3 0 1      6 5 7 2      4 8 4 4      2 4 7 2      3 5 3 6
2 4 7 2      2 4 7 2      3 5 5 5      3 5 3 6      3 5 5 5
3 5 5 5      7 0 2 2      2 8 7 1      3 5 5 5      4 5 9 1
7 0 2 2      8 3 9 4      6 5 7 2      6 5 7 2      4 8 4 4
8 3 9 4      4 8 4 4      2 4 7 2      4 5 9 1      6 5 7 2
4 8 4 4      3 5 5 5      4 5 9 1      4 8 4 4      7 0 2 2
3 5 3 6      3 5 3 6      8 3 9 4      2 8 7 1      8 3 9 4

Page 60: Sorting

62

Radix sort

• Complexity O(d(n+k)) if we use count sort and have d digits each between 0 and k
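A compact Python sketch of radix sort, with a count sort pass per digit, is given below. It assumes non-negative integers with at most d digits in the given base, so digits lie between 0 and k = base-1. The parameter names `d` and `base` are choices of this example.

```python
def radix_sort(numbers, d, base=10):
    """Sort non-negative integers that have at most d digits in the given base."""
    for digit in range(d):  # least significant digit first
        divisor = base ** digit
        # Count sort on the current digit.
        count = [0] * base
        for x in numbers:
            count[(x // divisor) % base] += 1
        for v in range(1, base):
            count[v] += count[v - 1]
        out = [0] * len(numbers)
        # Scan from the end so the pass is stable.
        for x in reversed(numbers):
            v = (x // divisor) % base
            count[v] -= 1
            out[count[v]] = x
        numbers = out
    return numbers
```

Stability of each pass is what makes this work: when two numbers agree on the current digit, the order established by the earlier, less significant digits is preserved.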

Page 61: Sorting

64

Assume something about the input

• Random, “almost sorted”

• For such inputs we want to sort faster

Page 62: Sorting

65

Sorting an almost sorted input

• Suppose we know that the input is “almost” sorted

• Let I be the number of “inversions” in the input: The number of pairs ai,aj such that i<j and ai>aj

Page 63: Sorting

66

Example

1, 4, 5, 8, 3    I=3

8, 7, 5, 3, 1    I=10
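The two values of I can be checked with a direct count over all pairs. This quadratic-time sketch is meant only to illustrate the definition; the function name `count_inversions` is made up for this example.

```python
def count_inversions(a):
    """Number of pairs (i, j) with i < j and a[i] > a[j]."""
    n = len(a)
    return sum(1 for i in range(n) for j in range(i + 1, n) if a[i] > a[j])
```

Here `count_inversions([1, 4, 5, 8, 3])` is 3 (the pairs 4-3, 5-3 and 8-3) and `count_inversions([8, 7, 5, 3, 1])` is 10, the maximum possible for five elements.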

Page 64: Sorting

67

Insertion sort

• Think of “insertion sort”

• How long does it take to insert ak?

• As long as the number of inversions (ai, ak) with i < k; let's call this Ik

Page 65: Sorting

68

Analysis

The running time is:

$$\sum_{k=1}^{n} (1 + I_k) = O(n + I)$$
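The bound can be observed by counting element shifts in a plain array-based insertion sort: inserting ak shifts exactly Ik elements, so the total number of shifts is I. The sketch below returns both the sorted list and that count; returning a pair is a choice of this example.

```python
def insertion_sort(values):
    """Return (sorted copy of values, total number of element shifts)."""
    a = list(values)
    shifts = 0
    for k in range(1, len(a)):
        item = a[k]
        pos = k
        # Every element to the left that is larger than item forms an
        # inversion with it and is shifted one position to the right.
        while pos > 0 and a[pos - 1] > item:
            a[pos] = a[pos - 1]
            pos -= 1
            shifts += 1
        a[pos] = item
    return a, shifts
```

On 1, 4, 5, 8, 3 this performs 3 shifts, and on 8, 7, 5, 3, 1 it performs 10, matching the inversion counts of those inputs.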

Page 66: Sorting

69

Thoughts

• When I=Ω(n²) the running time is Ω(n²)

• But we would like it to be O(nlog(n)) for any input, and faster when I is small

Page 67: Sorting

70

Finger red black trees

Page 68: Sorting

71

Finger tree

Take a regular search tree and reverse the direction of the pointers on the rightmost spine

We go up from the last leaf until we find the subtree containing the item and we descend into it

Page 69: Sorting

72

Finger trees

Say we search for a position at distance d from the end

Then we go up to height O(1+log(d))

So search for the dth position takes O(1+log(d)) time

Insertions and deletions still take O(log n) worst case time but O(1+log(d)) amortized time

Page 70: Sorting

73

Back to sorting

• Suppose we implement the insertion sort using a finger search tree

• When we insert item k then d = O(Ik + 1) and it takes O(1 + log(Ik + 1)) time

• Total time is bounded by O(n+n log ((I+n)/n))
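The total bound follows from summing the per-insertion costs and using the concavity of the logarithm (Jensen's inequality). The derivation below is a sketch of that step and uses the fact that the Ik sum to I:

$$
\sum_{k=1}^{n} \bigl(1 + \log(I_k + 1)\bigr)
= n + n \cdot \frac{1}{n} \sum_{k=1}^{n} \log(I_k + 1)
\;\le\; n + n \log\!\left( \frac{1}{n} \sum_{k=1}^{n} (I_k + 1) \right)
= n + n \log \frac{I + n}{n}.
$$

When I = O(n) this is O(n), and when I = Θ(n²) it is O(n log n), so the sort is never worse than O(n log n) and is faster when I is small.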