1 sorting we have actually seen already two efficient ways to sort:

Post on 22-Dec-2015

TRANSCRIPT

Page 1: 1 Sorting We have actually seen already two efficient ways to sort:

Sorting

• We have already seen two efficient ways to sort:

Page 2:

A kind of “insertion” sort

• Insert the elements into a red-black tree one by one

• Traverse the tree in-order and collect the keys

• Takes O(n log n) time

Page 3:

Heapsort (Williams, Floyd, 1964)

• Put the elements in an array

• Make the array into a heap

• Repeat: do a deletemin and put the deleted element in the array position that the shrinking heap just freed (the last position of the heap)
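
The three steps above can be sketched in Python as follows. This is a minimal sketch, not the slides' own code: the names `sift_down` and `heapsort` are made up for illustration, and the final reversal is an added assumption, needed because repeated deletemin fills the array from the back with the smallest keys first.

```python
def sift_down(a, i, size):
    """Restore the min-heap property for the subtree rooted at index i."""
    while True:
        smallest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < size and a[left] < a[smallest]:
            smallest = left
        if right < size and a[right] < a[smallest]:
            smallest = right
        if smallest == i:
            return
        a[i], a[smallest] = a[smallest], a[i]
        i = smallest


def heapsort(a):
    """Sort the list a in place into increasing order and return it."""
    n = len(a)
    # Make the array into a min-heap (bottom-up construction).
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)
    # Deletemin: move the minimum into the slot the heap just freed.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)
    # Minima were written from the back, so the array is decreasing; flip it.
    a.reverse()
    return a
```

A max-heap would give increasing order directly and make the reversal unnecessary; the min-heap is used here only to follow the deletemin wording of the slide.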

Page 4:

Quicksort (Hoare 1961)

Page 5:

Quicksort

Input: an array A[p..r]

Quicksort(A, p, r)
    if p < r then
        q = Partition(A, p, r)    // q is the position of the pivot element
        Quicksort(A, p, q-1)
        Quicksort(A, q+1, r)

Page 6:

Partition example. The pivot is the last element (4); p marks the first position and r the last.

2 8 7 1 3 5 6 4    start
2 8 7 1 3 5 6 4    2 ≤ 4: i advances (2 is exchanged with itself)
2 8 7 1 3 5 6 4    8 > 4: only j advances
2 8 7 1 3 5 6 4    7 > 4: only j advances
2 1 7 8 3 5 6 4    1 ≤ 4: i advances, exchange 8 and 1

• At i and to its left: elements smaller than the pivot
• Between i and j: elements greater than the pivot
• j explores to the right

Page 7:

2 1 7 8 3 5 6 4    (continued)
2 1 3 8 7 5 6 4    3 ≤ 4: i advances, exchange 7 and 3
2 1 3 8 7 5 6 4    5 > 4: only j advances
2 1 3 8 7 5 6 4    6 > 4: only j advances
2 1 3 4 7 5 6 8    exchange the pivot with 8

Result: elements ≤ pivot, then the pivot, then elements > pivot

Page 8:

2 8 7 1 3 5 6 4    (p = first position, r = last position)

Partition(A, p, r)
    x ← A[r]
    i ← p-1
    for j ← p to r-1 do
        if A[j] ≤ x then
            i ← i+1
            exchange A[i] ↔ A[j]
    exchange A[i+1] ↔ A[r]
    return i+1    // partition point
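
The Quicksort and Partition pseudocode can be transcribed almost line for line into Python. This is a sketch under one assumption: indices are 0-based here, whereas the slides count positions from p to r without fixing a base.

```python
def partition(a, p, r):
    """Partition a[p..r] around the pivot a[r]; return the pivot's final index."""
    x = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def quicksort(a, p=0, r=None):
    """Sort a[p..r] in place (the whole list by default) and return it."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)  # q is the position of the pivot element
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)
    return a
```

On the slides' example array, `partition` leaves the pivot 4 at index 3 with the smaller keys to its left, matching the last frame of the trace.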

Page 9:

Analysis

• Running time is proportional to the number of comparisons

• Each pair is compared at most once ⇒ O(n^2)

• In fact, for each n there is an input of size n on which quicksort takes cn^2 time ⇒ Ω(n^2)
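
One such bad input is an already sorted array when the pivot is the last element. The sketch below counts key comparisons to illustrate this; the function name and the explicit stack (used instead of recursion so that deep recursion is not a problem) are choices made for this example only.

```python
def quicksort_comparisons(a):
    """Run last-element-pivot quicksort on a copy of a; return the number of key comparisons."""
    a = list(a)
    count = 0
    stack = [(0, len(a) - 1)]  # subarrays still to be sorted
    while stack:
        p, r = stack.pop()
        if p >= r:
            continue
        x, i = a[r], p - 1
        for j in range(p, r):
            count += 1  # one comparison of a[j] with the pivot
            if a[j] <= x:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[i + 1], a[r] = a[r], a[i + 1]
        stack.append((p, i))       # left part: a[p..q-1] with q = i + 1
        stack.append((i + 2, r))   # right part: a[q+1..r]
    return count
```

For a sorted input of size n the pivot is always the maximum, so the subarray shrinks by one element per partition and the count is (n-1) + (n-2) + ... + 1 = n(n-1)/2.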

Page 10:

But

• Assume that the split is even in each iteration

Page 11:

T(n) = 2T(n/2) + n

How do we solve linear recurrences like this? (read Chapter 4)

Page 12:

Recurrence tree

Root: n, with two subproblems T(n/2) and T(n/2)

Page 13:

Recurrence tree

Level 0: n
Level 1: n/2, n/2
Below: T(n/4), T(n/4), T(n/4), T(n/4)

Page 14:

Recurrence tree

Level 0: n
Level 1: n/2, n/2
Below: T(n/4), T(n/4), T(n/4), T(n/4)
Height of the tree: log n

In every level we do bn comparisons, so the total number of comparisons is O(n log n)
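
The level-by-level count can be checked numerically. The sketch below evaluates the recurrence T(n) = 2T(n/2) + n directly; the base case T(1) = 0 and the restriction to powers of two are assumptions made for this example, since the slides do not state them.

```python
def T(n):
    """T(n) = 2 T(n/2) + n for n a power of two, with the assumed base case T(1) = 0."""
    if n == 1:
        return 0
    return 2 * T(n // 2) + n


# With this base case the solution is exactly n * log2(n):
# log2(n) levels, each contributing n.
for exponent in range(11):
    n = 2 ** exponent
    assert T(n) == n * exponent
```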

Page 15:

Analysis of 1:9 split
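
The figure for this slide is not part of the transcript. The derivation below is the standard textbook argument for a 1:9 split, given here as an assumption about what the slide showed; the constant c is hypothetical.

```latex
% Every partition splits the subarray in ratio 1:9 (assumed):
T(n) = T\!\left(\tfrac{n}{10}\right) + T\!\left(\tfrac{9n}{10}\right) + cn

% Each level of the recursion tree costs at most cn, and the deepest
% branch shrinks by a factor 9/10 per level, so the depth is
\log_{10/9} n = \Theta(\log n)

% Summing the per-level cost over all levels:
T(n) \le cn \cdot \log_{10/9} n + O(n) = O(n \log n)
```

So even a consistently lopsided but constant-ratio split keeps the running time at O(n log n); only the constant factor grows.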

Page 17:

Observations

• We can’t guarantee good splits

• But intuitively on random inputs we will get good splits

Page 18:

Randomized quicksort

• Use randomized-partition rather than partition

Randomized-partition(A, p, r)
    i ← random(p, r)
    exchange A[r] ↔ A[i]
    return partition(A, p, r)
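
A Python sketch of randomized quicksort along these lines follows. It assumes 0-based indices and uses `random.randint`, which returns an integer in the inclusive range [p, r], to play the role of random(p, r); the `rng` parameter is an addition for this example so that a seeded generator can be passed in.

```python
import random


def partition(a, p, r):
    """Partition a[p..r] around the pivot a[r]; return the pivot's final index."""
    x = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def randomized_partition(a, p, r, rng=random):
    """Move a uniformly random element of a[p..r] to the end, then partition."""
    i = rng.randint(p, r)
    a[r], a[i] = a[i], a[r]
    return partition(a, p, r)


def randomized_quicksort(a, p=0, r=None, rng=random):
    """Sort a[p..r] in place (the whole list by default) and return it."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = randomized_partition(a, p, r, rng)
        randomized_quicksort(a, p, q - 1, rng)
        randomized_quicksort(a, q + 1, r, rng)
    return a
```

Passing `rng=random.Random(0)` makes a run reproducible, which is convenient when comparing running times across runs on the same input.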

Page 19:

• On the same input, the running time can be different in each run!

• Look at the average of all these running times for one particular input

Page 20:

Expected # of comparisons

Let X be the # of comparisons

This is a random variable

Want to know E(X)

Page 21:

Expected # of comparisons

Let z_1, z_2, ..., z_n be the elements in sorted order

Let X_ij = 1 if z_i is compared to z_j and 0 otherwise

So,

$X = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}$

In each phase all elements of the subarray are compared to the pivot. At the end of the phase the partition has put them on their proper sides, so they are never compared with that pivot again. Hence each pair is compared at most once.

Page 22:

$E(X) = E\left(\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}\right) = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E(X_{ij})$    by linearity of expectation

$= \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \Pr\{z_i \text{ is compared to } z_j\}$

Page 24:

Consider Z_ij ≡ {z_i, z_(i+1), ..., z_j}

Claim: z_i and z_j are compared ⇔ either z_i or z_j is the first element of Z_ij chosen as a pivot

Proof: 3 cases, according to which element of Z_ij is chosen first:
– z_i is chosen first: z_i and z_j are compared on this partition, and never again.
– z_j is chosen first: the same.
– Some z_k with i < k < j is chosen first: z_i and z_j are not compared on this partition. The partition separates them, so no future partition uses both.

Page 25:

Pr{z_i is compared to z_j}

= Pr{z_i or z_j is the first pivot chosen from Z_ij}    (just explained)

= Pr{z_i is the first pivot chosen from Z_ij} + Pr{z_j is the first pivot chosen from Z_ij}    (mutually exclusive possibilities)

= 1/(j-i+1) + 1/(j-i+1) = 2/(j-i+1)    (each of the j-i+1 elements of Z_ij is equally likely to be chosen first)

Page 26:

$E(X) = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j-i+1}$

$= \sum_{i=1}^{n-1} \sum_{k=2}^{n-i+1} \frac{2}{k}$    Simplify with a change of variable, k = j-i+1.

$\le \sum_{i=1}^{n-1} \sum_{k=1}^{n} \frac{2}{k}$    Simplify and overestimate, by adding terms.

$= \sum_{i=1}^{n-1} O(\lg n)$

$= O(n \lg n)$
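
The double sum can be evaluated exactly for small n, which gives a quick sanity check of the derivation. This is an illustrative sketch; the function names are made up for the example, and `fractions.Fraction` is used only to avoid rounding.

```python
from fractions import Fraction


def expected_comparisons(n):
    """Exact value of the sum over 1 <= i < j <= n of 2 / (j - i + 1)."""
    return sum(
        Fraction(2, j - i + 1)
        for i in range(1, n)
        for j in range(i + 1, n + 1)
    )


def overestimate(n):
    """The bound after adding terms: sum over i = 1..n-1 and k = 1..n of 2 / k."""
    harmonic = sum(Fraction(1, k) for k in range(1, n + 1))
    return 2 * (n - 1) * harmonic
```

For n = 3 the exact value is 1 + 1 + 2/3 = 8/3: the two adjacent pairs are always compared, and the outer pair is compared with probability 2/3.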

Page 27:

Sum 1/k

$\sum_{k=1}^{n} \frac{1}{k} \ge \ln n$

$\sum_{k=1}^{n} \frac{1}{k} \le 1 + \ln n$

Page 28:

Lower bound for sorting in the comparison model

• Cannot deal with one particular algorithm
• Must deal with the PROBLEM

Page 29:

A lower bound

• Comparison model: we assume that the only operations from which we deduce the order among keys are comparisons

• Then we prove that we need Ω(n log n) comparisons in the worst case

Page 30:

Model the algorithm as a decision tree

Example: insertion sort

In iteration i we make sure that the elements A[1],…,A[i] are in the correct relative order (by swaps)

Page 31:

Insertion sort

Decision tree for three elements. A node i:j compares A[i] with A[j]; each leaf is the order that was found.

1:2
  <: 2:3                           (A[1] < A[2])
       <: A[1] < A[2] < A[3]
       >: 1:3
            <: A[1] < A[3] < A[2]
            >: A[3] < A[1] < A[2]
  >: 1:3                           (A[2] < A[1])
       <: A[2] < A[1] < A[3]
       >: 2:3
            <: A[2] < A[3] < A[1]
            >: A[3] < A[2] < A[1]

Every leaf finds the right order

Page 32:

Quicksort

Decision tree for three elements, with A[3] as the pivot. A node i:j compares A[i] with A[j].

1:3
  <: 2:3
       >: A[1] < A[3] < A[2]
       <: 1:2
            <: A[1] < A[2] < A[3]
            >: A[2] < A[1] < A[3]
  >: 2:3
       <: A[2] < A[3] < A[1]
       >: 1:2
            <: A[3] < A[1] < A[2]
            >: A[3] < A[2] < A[1]

Page 33:

Important Observations

• Every comparison algorithm can be represented as a (binary) tree like this

• Assume that for every node v there is an input on which the algorithm reaches v

• Then the # of leaves is n!

Page 34:

Important Observations

• Each path corresponds to a run on some input

• The worst case # of comparisons corresponds to the longest path

Page 35:

The lower bound

Let d be the length of the longest path

n! ≤ #leaves ≤ 2^d    (the first inequality may be strict: perhaps some orders are represented more than once)

log2(n!) ≤ d

Since n! ≥ (n/2)^(n/2), we get d ≥ (n/2) log2(n/2) = Ω(n log n)
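
The bound log2(n!) ≤ d can be computed exactly for small n. The sketch below returns the smallest integer d with 2^d ≥ n!; the function name is hypothetical, and the use of `int.bit_length` is just a way to take the ceiling of log2 without floating point.

```python
import math


def comparison_lower_bound(n):
    """Smallest d with 2**d >= n!, i.e. ceil(log2(n!))."""
    # For m >= 1, (m - 1).bit_length() equals ceil(log2(m)).
    return (math.factorial(n) - 1).bit_length()
```

For n = 3 this gives 3, which matches the depth of the two three-element decision trees shown earlier: no comparison sort can always finish three keys in two comparisons.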

Page 36:

Lower Bound for Sorting

• Any sorting algorithm based on comparisons between elements requires Ω(N log N) comparisons.

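Page 37:

Lower bound on the average time

• It can be shown that the average number of comparisons is also Ω(n log n)

Outline of the proof: show that in a binary tree with k leaves, the average depth of a leaf is at least log k.

Proof by contradiction: let T be the smallest tree for which this does not hold (k leaves, average depth smaller than log k).

Then the root n of T has one child or two:

(a) If a single child n1: the subtree rooted at n1 has the same k leaves and a smaller average depth, which contradicts T being the smallest.

(b) If two children n1 and n2: let the numbers of leaves in their subtrees be k1 and k2, where k1 < k and k2 = k - k1.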

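Page 38:

Both subtrees are smaller than T, so the claim holds for them, and the average length of the paths to the leaves of T is at least

$\frac{k_1}{k_1+k_2}\log(k_1) + \frac{k_2}{k_1+k_2}\log(k_2) + 1$

We lower-bound this expression by finding its minimum (under the constraint k1 + k2 = k).

Solving gives the minimum at k1 = k2 = k/2, so

$\text{average depth of } T \ge \frac{1}{2}\log(k/2) + \frac{1}{2}\log(k/2) + 1 = \log k$

which contradicts the choice of T.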

Page 39:

Beating the lower bound

• We can beat the lower bound if we can deduce order relations between keys by means other than comparisons

Examples:
• Count sort
• Radix sort

Page 40:

BIN/RADIX SORTING

Sorts we have seen so far: O(n log n)

Can we do it in less than O(n log n)?

• We saw: if we know nothing about the numbers, Ω(n log n)
• At this stage: we can go below that if we know something about them

Example 1: If we know that the array A[1,…,n] holds the keys 1,…,n, then sorting into B takes O(n): B[A[i].key] = A[i]

Example 2: Count Sort. An array A[1],…,A[n] with elements from 1,…,k; each element may appear several times:

3 2 3 1 4 2 2 5

Sort the array by:
1. Counting the elements of each kind
2. Writing them into a result array

Details: Cormen 175-177

O(n + k)
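
The two steps of Count Sort (count the elements of each kind, then write them into a result array) can be sketched as below. This simplified version sorts bare integer keys in 1..k; it is not the version from the cited textbook pages, which, as far as I recall, uses prefix sums so that whole records can be placed stably.

```python
def counting_sort(a, k):
    """Sort a list of integers from the range 1..k in O(n + k) time."""
    counts = [0] * (k + 1)  # counts[v] = number of occurrences of v
    for x in a:
        counts[x] += 1
    result = []
    for value in range(1, k + 1):
        result.extend([value] * counts[value])
    return result
```

On the slide's example, `counting_sort([3, 2, 3, 1, 4, 2, 2, 5], 5)` counts one 1, three 2s, two 3s, one 4 and one 5, and writes them out in that order.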

Page 41:

Under the conditions of Example 1, sorting within A (in place):

If A[i].key = j, swap A[i] with A[j]

for i = 1 to n do
    while A[i].key <> i do
        swap(A[i], A[A[i].key])

Operations: O(n) steps, O(n) swaps (an element that has landed in its place is never swapped again!)
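
The in-place loop can be sketched in Python as follows. Assumptions for this example: the list holds bare keys rather than records with a .key field, and Python lists are 0-based, so key v belongs at index v - 1. The swap counter is added only to make the O(n) claim visible.

```python
def sort_permutation_in_place(a):
    """a holds each of the keys 1..n exactly once; sort it in place, return the swap count."""
    swaps = 0
    for i in range(len(a)):
        while a[i] != i + 1:
            target = a[i] - 1  # the index where the key a[i] belongs
            a[i], a[target] = a[target], a[i]
            swaps += 1
    return swaps
```

Every swap puts at least one key into its final position, and a key in its final position is never moved again, so the total number of swaps is at most n - 1.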

BIN SORTING means sorting the elements into bins (BINS) and finally concatenating the bins

• Example 1 is a simple BIN-SORT: the bin size is constant (1)

• In the general case the bin size varies

Operations we want:
(a) Insert an element into a bin
(b) Concatenate two bins

Example 3

Page 42:

Solution:

1) Each bin is a linked list
2) Headers point to the start of each list

(Figure labels: H1, E1, H2, E2)

Insertion: O(1)

Concatenation: O(1)

Now an arbitrary number of lists can be concatenated into n bins

Page 43:

Analysis:

m = number of possible values (the number of bins)
n = number of keys

Cost of insertions = O(n)
Cost of concatenations = O(m)
Total: O(m + n)

If the number of keys is larger than the number of bins (n > m): O(n)

If the number of keys is smaller than the number of bins (n < m), for example m = n^2: O(n^2)

Example:

Sort the numbers i^2 for i = 0, 1, ..., 9, that is, sort 0, 1, 4, ..., 81

Hanoch: Sort exams of class with grades xx.yy

Page 44:

Solution:
• Prepare n bins (here 10, one per decimal digit)
• Sort by the least significant digit
• Sort by the most significant digit

First pass (least significant digit):

Bin   Elements
0     0
1     1, 81
2     -
3     -
4     64, 4
5     25
6     36, 16
7     -
8     -
9     9, 49

Concatenation: 0, 1, 81, 64, 4, 25, 36, 16, 9, 49

Second pass (most significant digit):

Bin   Elements
0     0, 1, 4, 9
1     16
2     25
3     36
4     49
5     -
6     64
7     -
8     81
9     -
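
The two passes can be reproduced with a short Python sketch. Plain Python lists stand in for the linked-list bins, and the input order used in the usage note is an assumption chosen so that the first pass yields exactly the concatenation shown above; the slides do not give the original input order.

```python
def bin_pass(items, digit):
    """One stable pass: put each item into one of 10 bins by a decimal digit, then concatenate."""
    bins = [[] for _ in range(10)]
    for x in items:
        bins[(x // 10 ** digit) % 10].append(x)  # digit 0 = least significant
    return [x for b in bins for x in b]


def radix_sort_two_digits(items):
    """Sort integers in 0..99: first by the low digit, then by the high digit."""
    after_low = bin_pass(items, 0)
    return bin_pass(after_low, 1)
```

With the assumed input `[36, 9, 0, 25, 1, 49, 64, 16, 81, 4]`, the first pass gives `[0, 1, 81, 64, 4, 25, 36, 16, 9, 49]` and the second pass gives the squares in increasing order. The second pass only works because each pass keeps equal-digit items in their arrival order.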

Page 45:

Why does it work?

Let i = 10a + b and j = 10c + d, and assume i < j.

Clearly a ≤ c.

• If a < c, then the second pass puts i and j into the appropriate bins, and the order is correct.

• If a = c, then b < d, and therefore:
  the first pass orders i before j (b before d),
  so in the second pass i enters the bin before j.

What is BIN SORT good for?

• Domains about which information is known, such as 1,…,n^k (k constant)

• Strings of length k

Page 46:

Is it always good?

Not if k is very large!

Example: n = 100, k = 100

BIN SORT: nk (100 rounds of 100 operations)

Another sort: n log n

and nk > n log n

But… be careful with comparisons!

In an ordinary sort, one comparison costs O(k)

So the computational model matters!

Page 47:

RADIX SORT

• Given k keys f1,…,fk

• We want to sort in lexicographic order

That is: (a1,…,ak) < (b1,…,bk) if and only if:

1) a1 < b1, or
2) a1 = b1 and a2 < b2, or
...
k) a1 = b1, …, a(k-1) = b(k-1) and ak < bk

Similar to a generalized BIN SORT; we only need, for each kind of key, its own range of bins.

Page 48:

Linear time sorting

• Or assume something about the input: random, “almost sorted”

Page 49:

Sorting an almost sorted input

• Suppose we know that the input is “almost” sorted

• Let I be the number of “inversions” in the input: the number of pairs a_i, a_j such that i < j and a_i > a_j

Page 50:

Example

1, 4, 5, 8, 3    I = 3

8, 7, 5, 3, 1    I = 10
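
The two values of I can be checked by counting inversions straight from the definition. The sketch below is deliberately the naive quadratic count, meant only to illustrate the definition; the function name is made up for the example.

```python
def count_inversions(a):
    """Number of pairs (i, j) with i < j and a[i] > a[j]."""
    n = len(a)
    return sum(1 for i in range(n) for j in range(i + 1, n) if a[i] > a[j])
```

In the first example the only inversions are (4, 3), (5, 3) and (8, 3); in the second, every one of the 5·4/2 = 10 pairs is inverted.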

Page 51:

• Think of “insertion sort” using a list

• When we insert the next item a_k, how deep does it get into the list?

• As deep as the number of inversions (a_i, a_k) with i < k; let's call this I_k

Page 52:

Analysis

The running time is:

$\sum_{j=1}^{n} I_j + n = I + n$
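
The I + n running time can be observed by counting how far each item travels during insertion sort. The sketch below uses an array rather than the linked list mentioned on the previous slide (an assumption made to keep the example short); the number of moves is the same either way.

```python
def insertion_sort_with_moves(a):
    """Insertion sort on a copy of a; return (sorted list, number of one-step moves)."""
    a = list(a)
    moves = 0
    for k in range(1, len(a)):
        item = a[k]
        pos = k
        # The item passes exactly the earlier elements that are larger than it,
        # i.e. I_k of them.
        while pos > 0 and a[pos - 1] > item:
            a[pos] = a[pos - 1]
            pos -= 1
            moves += 1
        a[pos] = item
    return a, moves
```

The total number of moves equals I, the number of inversions, and the outer loop adds the n term.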

Page 53:

Thoughts

• When I = Ω(n^2) the running time is Ω(n^2)

• But we would like it to be O(n log n) for any input, and faster when I is small

Slides got updated (Cs, 15/12/2009)

Page 54:

Finger red black trees

Page 55:

Finger tree

Take a regular search tree and reverse the direction of the pointers on the rightmost spine

We go up from the last leaf until we find the subtree containing the item and we descend into it

Page 56:

Finger trees

Say we search for a position at distance d from the end

Then we go up to height O(log(d))

So the search for the dth position takes O(log(d)) time

• Insertions and deletions still take O(log n) worst case time

• But: Amortized time:

• Tree modification = O(1)

• Search = O(log d) (the contribution of this transaction)

Page 57:

Back to sorting

• Suppose we implement the insertion sort using a finger search tree

• Insert the elements one by one from the input

• If most elements are sorted, then elements enter at the right corner.

• When we insert item k, d = O(I_k) and it takes O(log(I_k)) time to search

Page 58:

Overall cost

• d = O(I_k) and it takes O(log(I_k)) time to search

$O\left(N + \sum_{k=1}^{N} \log(I_k)\right)$

The N term is the cost of the modifications; the sum is the cost of the searches.

Page 59:

Analysis

The running time is:

$O\left(n + \sum_{j=1}^{n} \log(I_j)\right)$

Since ∑ I_j = I this is at most

$O\left(n + n \log\frac{I}{n}\right)$
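
The last step is not spelled out on the slide. One way to justify it, offered here as an assumption about the intended argument, is concavity of the logarithm (Jensen's inequality). To keep every logarithm defined when some I_j = 0, the sketch below uses log(1 + I_j) in place of log(I_j).

```latex
% Concavity of log: the average of logs is at most the log of the average.
\sum_{j=1}^{n} \log(1 + I_j)
  = n \cdot \frac{1}{n} \sum_{j=1}^{n} \log(1 + I_j)
  \le n \log\!\left( \frac{1}{n} \sum_{j=1}^{n} (1 + I_j) \right)
  = n \log\!\left( 1 + \frac{I}{n} \right)

% Hence the total running time is
O\!\left( n + n \log\!\left( 1 + \frac{I}{n} \right) \right)
```

When I = O(n) this is O(n), and when I = Θ(n^2) it is O(n log n), which is the behaviour asked for on the "Thoughts" slide.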