
Randomized Algorithms

Pasi Fränti, 1.10.2014

Treasure island

Treasure worth 20.000 awaits

[Figure: the DAA expedition and two candidate islands (marked ?); each leg of the journey costs 5000]

Map for sale: 3000

To buy or not to buy

Buy the map:
20000 – 5000 – 3000 = 12.000

Take a chance:
20000 – 5000 = 15.000
20000 – 5000 – 5000 = 10.000

Expected result: 0.5 ∙ 15000 + 0.5 ∙ 10000 = 12.500
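The comparison can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the 50/50 chance of guessing the right island is the assumption used in the expected-result line.

```python
# Amounts from the slide: treasure 20000, one trip 5000, map 3000.
TREASURE = 20000
TRIP = 5000
MAP = 3000

def payoff_with_map():
    # One trip straight to the right island, plus the price of the map.
    return TREASURE - TRIP - MAP

def expected_payoff_without_map(p_right=0.5):
    # Guess an island: either right on the first trip, or one extra trip.
    right_first = TREASURE - TRIP
    wrong_first = TREASURE - 2 * TRIP
    return p_right * right_first + (1 - p_right) * wrong_first

print(payoff_with_map())              # 12000
print(expected_payoff_without_map())  # 12500.0
```

On average, taking the chance pays 500 more than buying the map, although the worst case (10.000) is worse.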

Three types of randomization

1. Las Vegas
- Output is always a correct result
- Result is not always found
- Probability of success p

2. Monte Carlo
- Result is always found
- Result can be inaccurate (or even false!)
- Probability of success p

3. Sherwood
- Balancing the worst-case behavior

Las Vegas

Dining philosophers

Who eats?

Las Vegas

Input: Binary vector A[1, n]
Output: Index of any 1-bit from A

LV(A, n)
  REPEAT
    k ← RAND(1, n);
  UNTIL A[k]=1;
  RETURN k
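A minimal Python sketch of the same Las Vegas search. The function name and the 0-based indexing are choices made here, and the guard against an all-zero vector is an added assumption (the pseudocode would loop forever in that case).

```python
import random

def las_vegas_find_one(a, rng=random):
    """Return an index k with a[k] == 1; repeats until one is found."""
    if 1 not in a:
        # Added guard (not in the slide): without a 1-bit the loop never ends.
        raise ValueError("vector contains no 1-bit")
    while True:
        k = rng.randrange(len(a))   # 0-based counterpart of RAND(1, n)
        if a[k] == 1:
            return k                # always a correct answer
```

The answer is always correct; only the running time is random.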

Revise

8-Queens puzzle

INPUT: Eight chess queens and an 8×8 chessboard
OUTPUT: Setup where no two queens attack each other

8-Queens brute force

Brute force
• Try all positions
• Mark illegal squares
• Backtrack if dead-end
• 114 setups in total

Random
• Select positions randomly
• If dead-end, start over

Randomized
• Select k rows randomly
• Remaining rows by brute force

[Figure: partially filled chessboard annotated 8, 5, 4, …. Where next…?]

Pseudo code

8-Queens(k)
  FOR i=1 TO k DO                        // k queens randomly
    r ← Random[1,8];
    IF Board[i,r]=TAKEN THEN RETURN Fail;
    ELSE ConquerSquare(i,r);

  FOR i=k+1 TO 8 DO                      // Rest by brute force
    r ← 1; found ← NO;
    WHILE (r≤8) AND (NOT found) DO
      IF Board[i,r] NOT TAKEN THEN ConquerSquare(i,r); found ← YES;
      ELSE r ← r+1;
    IF NOT found THEN RETURN Fail;

ConquerSquare(i,j)
  Board[i,j] ← QUEEN;
  FOR z=i+1 TO 8 DO                      // skip columns outside 1..8
    Board[z,j] ← TAKEN;
    Board[z,j-(z-i)] ← TAKEN;
    Board[z,j+(z-i)] ← TAKEN;
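The randomized strategy can be sketched in Python as below. This is an illustration under stated assumptions, not the slide's exact procedure: the helper names are hypothetical, attacks are tested directly instead of marking squares as TAKEN, and the remaining rows are completed with backtracking (as in the brute-force bullet list) rather than by taking the first free square.

```python
import random

N = 8

def safe(cols, row, col):
    # cols[r] is the column of the queen already placed on row r (r < row).
    return all(c != col and abs(c - col) != row - r
               for r, c in enumerate(cols))

def backtrack(cols):
    # Deterministic completion: try columns left to right, undo on dead end.
    row = len(cols)
    if row == N:
        return True
    for col in range(N):
        if safe(cols, row, col):
            cols.append(col)
            if backtrack(cols):
                return True
            cols.pop()
    return False

def queens_once(k, rng=random):
    """One attempt: returns a list of 8 columns, or None on failure."""
    cols = []
    for row in range(k):                 # first k queens at random
        col = rng.randrange(N)
        if not safe(cols, row, col):
            return None                  # dead end: the caller starts over
        cols.append(col)
    return cols if backtrack(cols) else None

def queens(k, rng=random):
    # Las Vegas loop: repeat independent attempts until one succeeds.
    while True:
        result = queens_once(k, rng)
        if result is not None:
            return result
```

With `k = 0` this is plain backtracking and always succeeds; with `k = 8` every queen is placed at random and most attempts fail quickly.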

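Probability of success

s = processing time in case of success
e = processing time in case of failure
p = probability of success
q = 1-p = probability of failure
t = expected processing time until success

$t = ps + q(e + t) = ps + qe + qt$

$t - qt = ps + qe$

$pt = ps + qe$

$t = s + \frac{q}{p} e$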

Example:

s=e=1, p=1/6

$t = 1 + \frac{5/6}{1/6} \cdot 1 = 1 + 5 \cdot 1 = 6$
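The expected time follows from the recurrence t = p∙s + q∙(e + t), which solves to t = s + (q/p)∙e. A small sketch that evaluates it for the example above (the function name is hypothetical; `Fraction` is used only to keep the arithmetic exact):

```python
from fractions import Fraction

def expected_time(s, e, p):
    """Expected time until success: t = s + (q/p) * e, with q = 1 - p."""
    q = 1 - p
    return s + q / p * e

print(expected_time(1, 1, Fraction(1, 6)))  # 6
```

With p = 1/6 there are on average five failed attempts before the successful one, hence 1 + 5 = 6.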

Experiments with varying k

k    s      e      t      p
0    114    -      114    100%
1    39.6   -      39.6   100%
2    22.5   36.7   25.2   88%
3    13.5   15.1   29.0   49%
4    10.3   8.8    35.1   26%
5    9.3    7.3    46.9   16%
6    9.1    7      53.5   14%
7    9      7      56.0   13%
8    9      7      56.0   13%

Fastest expected time: k=2 (t=25.2)

Two centroids, but only one cluster.

One centroid, but two clusters.

Swap-based clustering

Clustering by Random Swap

RandomSwap(X) → C, P
  C ← SelectRandomRepresentatives(X);
  P ← OptimalPartition(X, C);
  REPEAT T times
    (Cnew, j) ← RandomSwap(X, C);
    Pnew ← LocalRepartition(X, Cnew, P, j);
    Cnew, Pnew ← Kmeans(X, Cnew, Pnew);
    IF f(Cnew, Pnew) < f(C, P) THEN
      (C, P) ← Cnew, Pnew;
  RETURN (C, P);

P. Fränti and J. Kivijärvi, "Randomised local search algorithm for the clustering problem", Pattern Analysis and Applications, 3 (4), 358-369, 2000.

Select random neighbor

Accept only if it improves
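A compact Python sketch of the random swap loop, for illustration only. It simplifies the pseudocode in one labeled way: instead of LocalRepartition it recomputes the full partition after each swap, which gives the same kind of result but is slower. The helper names, the `kmeans_iters` parameter and the choice of the sum of squared distances as f are assumptions made here.

```python
import random

def dist2(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def partition(X, C):
    # Optimal partition: index of the nearest centroid for every point.
    return [min(range(len(C)), key=lambda k: dist2(x, C[k])) for x in X]

def centroids(X, P, C):
    # Mean of each cluster; an empty cluster keeps its old centroid.
    new = []
    for k, old in enumerate(C):
        members = [x for x, p in zip(X, P) if p == k]
        new.append(tuple(sum(v) / len(members) for v in zip(*members))
                   if members else old)
    return new

def cost(X, C, P):
    # f(C, P): sum of squared distances to the assigned centroid.
    return sum(dist2(x, C[p]) for x, p in zip(X, P))

def random_swap(X, M, T, kmeans_iters=2, rng=random):
    C = rng.sample(X, M)                            # random representatives
    P = partition(X, C)
    for _ in range(T):
        C_new = list(C)
        C_new[rng.randrange(M)] = rng.choice(X)     # select random neighbor
        P_new = partition(X, C_new)                 # simplification: full repartition
        for _ in range(kmeans_iters):               # fine-tune by k-means
            C_new = centroids(X, P_new, C_new)
            P_new = partition(X, C_new)
        if cost(X, C_new, P_new) < cost(X, C, P):   # accept only if it improves
            C, P = C_new, P_new
    return C, P
```

For example, `random_swap(points, M=2, T=50)` on a list of 2-D tuples returns the centroids and the cluster index of each point.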

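1. Random swap:

$c_j \leftarrow x_i, \quad j = \mathrm{random}(1, M), \; i = \mathrm{random}(1, N)$

2. Re-partition vectors from old cluster:

$p_i \leftarrow \arg\min_{1 \le k \le M} d(x_i, c_k)^2 \quad \forall i : p_i = j$

3. Create new cluster:

$p_i \leftarrow \arg\min_{k = j \,\lor\, k = p_i} d(x_i, c_k)^2 \quad \forall i \in [1, N]$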


Choices for swap

Swap is made from a centroid-rich area to a centroid-poor area.

O(M) clusters to be removed × O(M) clusters where to add = O(M²) different choices in total

Select a proper centroid for removal:
– M clusters in total: $p_{\text{removal}} = 1/M$.

Select a proper new location:
– N choices: $p_{\text{add}} = 1/N$
– M of them significantly different: $p_{\text{add}} = 1/M$

In total:
– M² significantly different swaps.
– Probability of each is $p_{\text{swap}} = 1/M^2$
– Open question: how many of these are good?
– Theorem: α are good for add and removal.

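Probability for successful Swap

Probability of not finding a good swap when iterated T times:

$q = \left(1 - \frac{\alpha^2}{M^2}\right)^T$

$\log q = T \cdot \log\left(1 - \frac{\alpha^2}{M^2}\right)$

Estimated number of iterations:

$T = \frac{\log q}{\log\left(1 - \frac{\alpha^2}{M^2}\right)}$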

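Bounds for the iterations

Upper limit:

$T = \frac{\ln q}{\ln\left(1 - \frac{\alpha^2}{M^2}\right)} \le \frac{\ln q}{-\alpha^2 / M^2} = -\ln q \cdot \frac{M^2}{\alpha^2}$

Lower limit similarly; resulting in:

$T \approx -\ln q \cdot \frac{M^2}{\alpha^2}$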

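Total time complexity

Number of iterations needed (T):

$T \approx -\ln q \cdot \frac{M^2}{\alpha^2}$

Time complexity of single step (t):

t = O(αN)

Total time:

$T(N, M) \approx -\ln q \cdot \frac{M^2}{\alpha^2} \cdot \alpha N = -\ln q \cdot \frac{N M^2}{\alpha}$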
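To get a feeling for the numbers, the sketch below evaluates the exact iteration count T = ln q / ln(1 − α²/M²) and the simpler upper limit −ln q ∙ M²/α². The function names and the values M = 15, α = 2, q = 0.01 are arbitrary assumptions chosen only for illustration.

```python
import math

def iterations_exact(q, M, alpha):
    # T = ln q / ln(1 - alpha^2 / M^2)
    return math.log(q) / math.log(1 - alpha**2 / M**2)

def iterations_upper(q, M, alpha):
    # T <= -ln q * M^2 / alpha^2, because ln(1 - x) <= -x
    return -math.log(q) * M**2 / alpha**2

M, alpha, q = 15, 2, 0.01   # hypothetical values
print(math.ceil(iterations_exact(q, M, alpha)))
print(math.ceil(iterations_upper(q, M, alpha)))
```

The two values stay close whenever α²/M² is small, which is why the simpler expression is used as the estimate of T.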

Monte Carlo

Monte Carlo

Input: A bit vector A[1, n], iterations I
Output: An index of any 1 bit from A (may be wrong if none is found within I iterations)

MC(A, n, I)
  i ← 0;
  DO
    k ← RAND(1, n);
    i ← i + 1;
  WHILE (A[k]≠1 AND i < I)
  RETURN k
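The Monte Carlo variant in Python, as a sketch. The function name and 0-based indexing are choices made here, and `max_iter` is assumed to be at least 1. Unlike the Las Vegas version, the loop is bounded, so the returned index may point at a 0-bit.

```python
import random

def monte_carlo_find_one(a, max_iter, rng=random):
    """Probe at most max_iter random positions and return the last one."""
    k = None
    for _ in range(max_iter):
        k = rng.randrange(len(a))   # 0-based counterpart of RAND(1, n)
        if a[k] == 1:
            break                   # found a 1-bit, stop early
    return k                        # not guaranteed to satisfy a[k] == 1
```

If a fraction p of the bits are 1, the answer is wrong with probability (1 − p)^max_iter.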

Revise

Monte Carlo

Potential problems to be considered:
• Detecting prime numbers
• Calculating integral of a function
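As an illustration of the second problem, a minimal Monte Carlo integration sketch. The function name, the integrand x² on [0, 1], the sample count and the seed are all assumptions chosen for the example.

```python
import random

def mc_integrate(f, a, b, n, rng=random):
    """Estimate the integral of f over [a, b] from n uniform samples."""
    total = sum(f(rng.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n

estimate = mc_integrate(lambda x: x * x, 0.0, 1.0, 100_000, random.Random(42))
print(estimate)   # close to 1/3, but not exact
```

A result is always produced, but it is only approximately correct; the error shrinks roughly as 1/√n.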

To appear in 2014… maybe…

Sherwood

Selection of pivot element

Something about Quicksort and Selection:
• Practical example of re-sorting
• Median selection

Add material for 2014

[Figure: worst-case pivot selection splits off one element at a time (N-1 | 1, N-2 | 1, N-3 | 1, …), giving O(N²)]
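A sketch of how random pivot selection balances the worst case in Quicksort. The function name is hypothetical, and the list-building style is chosen for clarity, not for in-place efficiency.

```python
import random

def sherwood_quicksort(items, rng=random):
    """Quicksort with a uniformly random pivot; returns a new sorted list."""
    if len(items) <= 1:
        return list(items)
    pivot = items[rng.randrange(len(items))]     # random pivot element
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return (sherwood_quicksort(smaller, rng) + equal
            + sherwood_quicksort(larger, rng))
```

With a fixed pivot rule (for example, always the first element) an already sorted input triggers the O(N²) behavior. With a random pivot no particular input is bad; the worst case can still occur, but only with small probability.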

Simulated dynamic linked list

1. Sorted array
- Search efficient: O(logN)
- Insert and Delete slow: O(N)

2. Dynamically linked list
- Insert and Delete fast: O(1)
- Search inefficient: O(N)

Simulated dynamic linked list: Example

Linked list:

Head → 1 → 2 → 4 → 5 → 7 → 15 → 21

Simulated by array (Head=4):

i      1   2   3   4   5   6   7
Value  2   4   15  1   5   21  7
Next   2   5   6   1   7   0   3

SEARCH (A, x)
  i := A.HEAD;
  max := A[i].VALUE;
  FOR k:=1 TO √N DO
    j:=RANDOM(1, N);
    y:=A[j].VALUE;
    IF (max<y) AND (y≤x) THEN
      i:=j; max:=y;
  RETURN LinearSearch(A, x, i);
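The search can be tried on the 7-element example array. The sketch below stores the array 0-based (so Head=4 becomes index 3 and the end marker 0 becomes -1), probes about √N random positions, and writes out the linear search explicitly; these conventions and the function name are assumptions made for the illustration.

```python
import math
import random

# Example array from the slide, converted to 0-based indices; -1 marks the end.
VALUE = [2, 4, 15, 1, 5, 21, 7]
NEXT = [1, 4, 5, 0, 6, -1, 2]
HEAD = 3

def search(x, rng=random):
    """Return the index of x in VALUE, or -1 if x is absent."""
    n = len(VALUE)
    i, best = HEAD, VALUE[HEAD]
    for _ in range(math.isqrt(n)):          # about sqrt(N) random probes
        j = rng.randrange(n)
        if best < VALUE[j] <= x:            # biggest breakpoint not above x
            i, best = j, VALUE[j]
    while i != -1 and VALUE[i] < x:         # linear search along the links
        i = NEXT[i]
    return i if i != -1 and VALUE[i] == x else -1
```

The random probes only choose where the linear search starts, so the result is always correct; randomization affects only the running time.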

Simulated dynamic linked list: Divide-and-conquer with randomization

√N random breakpoints
Biggest breakpoint ≤ x
Value searched
Full search from breakpoint i

Analysis of the search

[Figure: finding the max breakpoint takes √N steps; the search for x takes √N steps (on average)]

• Divide into √N segments
• Each segment has N/√N = √N elements
• Linear search within one segment.
• Expected time complexity = √N + √N = O(√N)

Experiment with students

Data (N=100) consists of numbers from 1..100:

1 2 3 4 … 99 100

Select √N = 10 breaking points:

Searching for… 42

Empty space for notes
