Duality
Duality is a powerful algorithm design tool that allows us to explore different algorithmic alternatives based on the inherent constraints of the underlying problem.
In our case the constraints on the solution (the position of the decision surface) are the observations close to the class boundaries where the classes meet.
We find that the dual solution is dominated by the constraints represented by the points close to the class boundaries.
Perceptron Learning
...
repeat
    for i = 1 to l
        if sign(w • xi − b) ≠ yi then
            w ← w + η yi xi
            b ← b − η yi r²
        end if
    end for
until sign(w • xj − b) = yj for all j = 1, . . . , l
...
Observation: w will contain many copies of the scaled position vector xi for “difficult” points, that is, points that are difficult to classify.
Idea: Instead of updating the normal vector of the decision surface, count the number of times a particular point is misclassified.
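To make the primal loop concrete, here is a minimal NumPy sketch of the algorithm above. The function name primal_perceptron and the max_epochs safeguard are illustrative additions, since the loop as stated only terminates on separable data.

import numpy as np

def primal_perceptron(X, y, eta=0.1, max_epochs=1000):
    """Primal perceptron: learn the normal vector w and offset b directly.

    X: (l, n) array of points, y: (l,) array of labels in {+1, -1}.
    """
    l, n = X.shape
    w, b = np.zeros(n), 0.0
    r = np.max(np.linalg.norm(X, axis=1))   # r = max |x| over the data
    for _ in range(max_epochs):             # safeguard, not in the original
        mistakes = 0
        for i in range(l):
            if np.sign(w @ X[i] - b) != y[i]:
                w = w + eta * y[i] * X[i]   # w <- w + eta yi xi
                b = b - eta * y[i] * r**2   # b <- b - eta yi r^2
                mistakes += 1
        if mistakes == 0:                   # D perfectly classified
            return w, b
    raise RuntimeError("data may not be linearly separable")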
Duality
Introduce a new variable that keeps track of the misclassification counts:
α = (α1, . . . , αl)
Then we can write our learning algorithm as,
Initialize α and b to 0.
repeat
    for each (xi, yi) ∈ D do
        if f̂(xi) ≠ yi then
            increment αi by 1
            update b
        end if
    end for
until D is perfectly classified
return α and b
Problem: How do we represent our free parameter w using α? Or f̂ for that matter?
Duality
Each time the primal algorithm misclassifies point xi it adds η yi xi to w. If αi counts these updates, then after training

w = ∑_{i=1}^{l} η αi yi xi = η ∑_{i=1}^{l} αi yi xi.
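A quick way to see this identity in code: run the primal update loop while counting how often each point triggers an update, then compare the accumulated w against the sum. The toy data here is hypothetical.

import numpy as np

# Hypothetical toy data: two linearly separable classes.
X = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -1.5], [-2.0, -0.5]])
y = np.array([1, 1, -1, -1])
eta = 0.1
r2 = np.max(np.sum(X**2, axis=1))     # r^2 = max |x|^2 over the data

w, b = np.zeros(2), 0.0
alpha = np.zeros(len(X))              # misclassification counts
converged = False
while not converged:                  # assumes separable data
    converged = True
    for i in range(len(X)):
        if np.sign(w @ X[i] - b) != y[i]:
            w = w + eta * y[i] * X[i]
            b = b - eta * y[i] * r2
            alpha[i] += 1             # count this update
            converged = False

# w accumulated update by update equals eta * sum_i alpha_i yi xi
assert np.allclose(w, eta * (alpha * y) @ X)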
Duality
This means that we can rewrite the normal vector w as the following sum,
w = ∑_{i=1}^{l} αi yi xi

where

αi ≈ 0 for “easy” points,
αi ≫ 1 for “difficult” points.
Note: The learning rate η is simply a positive scaling constant of the resulting normal vector w,
and since we are primarily interested in the orientation of the decision surface it is customary to drop this constant.
Duality
Now our perceptron decision function looks like

f̂(x) = sign(w • x − b) = sign(∑_{i=1}^{l} αi yi (xi • x) − b)

where l is the number of instances in D.
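As a sketch, this dual decision function translates directly into NumPy; the name f_hat and its argument names are illustrative.

import numpy as np

def f_hat(x, alpha, b, X, y):
    """Dual decision function:
    f^(x) = sign(sum_j alpha_j yj (xj . x) - b),
    where (X, y) holds the training set D."""
    return np.sign((alpha * y) @ (X @ x) - b)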
Duality
let D = {(x1, y1), (x2, y2), . . . , (xl, yl)} ⊂ Rn × {+1, −1}
let 0 < η < 1
α ← 0
b ← 0
r ← max{|x| : (x, y) ∈ D}
repeat
    for i = 1 to l
        if sign(∑_{j=1}^{l} αj yj (xj • xi) − b) ≠ yi then
            αi ← αi + 1
            b ← b − η yi r²
        end if
    end for
until sign(∑_{j=1}^{l} αj yj (xj • xk) − b) = yk for all k = 1, . . . , l
return (α, b)
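A minimal NumPy sketch of the dual algorithm above, assuming separable data; dual_perceptron and max_epochs are illustrative additions. Note that the data enters only through the dot products xj • xi, which can be precomputed once.

import numpy as np

def dual_perceptron(X, y, eta=0.1, max_epochs=1000):
    """Dual perceptron: learn the misclassification counts alpha and b.

    X: (l, n) array of points, y: (l,) array of labels in {+1, -1}.
    """
    l = X.shape[0]
    alpha, b = np.zeros(l), 0.0
    r = np.max(np.linalg.norm(X, axis=1))    # r = max |x| over the data
    G = X @ X.T                              # G[j, i] = xj . xi, computed once
    for _ in range(max_epochs):              # safeguard, not in the original
        mistakes = 0
        for i in range(l):
            if np.sign((alpha * y) @ G[:, i] - b) != y[i]:
                alpha[i] += 1                # count the misclassification
                b = b - eta * y[i] * r**2
                mistakes += 1
        if mistakes == 0:                    # D perfectly classified
            return alpha, b
    raise RuntimeError("data may not be linearly separable")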
Duality
The two perceptron learning algorithms give rise to different representations of the induced decision surface.
[Figure: the same set of + and − training points shown twice, illustrating the primal (w, b) and dual (α, b) representations of the induced decision surface.]
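A small check of this equivalence, using dual_perceptron from the sketch above on hypothetical toy data: expanding α back into a normal vector labels the training set exactly as the dual form of the decision function does.

import numpy as np

X = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -1.5], [-2.0, -0.5]])
y = np.array([1, 1, -1, -1])

alpha, b = dual_perceptron(X, y)         # dual representation (alpha, b)
w = (alpha * y) @ X                      # expand into the primal normal vector

primal_labels = np.sign(X @ w - b)                 # sign(w . xi - b)
dual_labels = np.sign((alpha * y) @ (X @ X.T) - b) # dual form, same surface
assert np.all(primal_labels == dual_labels)
assert np.all(primal_labels == y)                  # D perfectly classified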
Duality
The dual algorithm can still give rise to degenerate decision surfaces.
[Figure: a set of + and − training points separated by a degenerate decision surface that lies very close to some of the points.]