
Optimization, Linear Feasibility, Projection Methods, Hybrid Method, Convergence Rate, Expected Finiteness, Conclusion

Solving Systems of Linear Inequalities with Randomized Projections

Jamie Haddock

Graduate Group in Applied Mathematics, Department of Mathematics, University of California, Davis

Math Club, March 2, 2016

Joint work with Jesus De Loera and Deanna Needell


Optimization

I think about problems of the sort:

min f(x)

s.t. g(x) ≤ 0

These sorts of problems are all around us!

Today we’ll consider a specific form of optimization problem...


Linear Programs

I think about problems of the sort:

min $c^T x$    (LP)

s.t. $Ax \le b$

Here $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, and we are optimizing over $x \in \mathbb{R}^n$.

But, even this can be simplified...


Linear Feasibility Problem

In fact, we’ll consider the linear feasibility problem (LF):

Find x such that Ax ≤ b or conclude one does not exist.

It can be shown that (LP) and (LF) are equivalent.
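To make the condition Ax ≤ b concrete, here is a minimal Python sketch that tests whether a given point is feasible. The helper names `residuals` and `is_feasible`, the tolerance `tol`, and the unit-square example are assumptions made for illustration; they are not from the talk.

```python
def residuals(A, b, x):
    """Return the list of a_i^T x - b_i, one entry per row a_i of A."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) - b_i
            for row, b_i in zip(A, b)]


def is_feasible(A, b, x, tol=1e-9):
    """True if every inequality a_i^T x <= b_i holds up to the tolerance."""
    return all(r <= tol for r in residuals(A, b, x))


# Hypothetical example: the unit square 0 <= x <= 1, 0 <= y <= 1 written as Ax <= b.
A = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
b = [1.0, 0.0, 1.0, 0.0]

inside = is_feasible(A, b, [0.5, 0.5])   # point inside the square
outside = is_feasible(A, b, [2.0, 0.5])  # violates x <= 1
print(inside, outside)
```

A positive residual marks a violated inequality, which is the quantity the projection methods later in the talk act on.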



Reminder: a linear equation represents a hyperplane (in the proper dimension), so a linear inequality defines a halfspace.

[Figure: the hyperplane $a_i^T x = b_i$ with its normal vector $a_i$.]



LF can be interpreted as seeking a point within a (possibly empty) polyhedron $P = \{x \mid Ax \le b\}$:

[Figure: the polyhedron P, with the hyperplane $a_i^T x = b_i$ and its normal vector $a_i$.]


How to Solve LF

Isn’t the linear feasibility problem easy to solve?

Answer: Sort of...

Good news: Our geometric intuition for the problem gives us a good idea for how to solve it!

Motzkin Kaczmarz


Projection Methods

If we want all of the linear inequalities to be satisfied (meaning we want our point to lie on the correct side of all the hyperplanes), then we need that each of the linear inequalities is satisfied.

Tautology.

So... If we have some point that isn’t satisfying one of the inequalities, we should force it to satisfy that inequality!
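One way to "force" a point to satisfy a single inequality is to project it orthogonally onto the corresponding halfspace. The sketch below assumes that reading; the function name `project_onto_halfspace` and the example numbers are illustrative, not from the talk.

```python
def project_onto_halfspace(a, b_i, x):
    """Orthogonally project x onto the halfspace {y : a^T y <= b_i}.

    If x already satisfies the inequality it is returned unchanged.
    """
    residual = sum(a_j * x_j for a_j, x_j in zip(a, x)) - b_i
    if residual <= 0:
        return list(x)
    # Move along -a by exactly the amount needed to land on the hyperplane.
    scale = residual / sum(a_j * a_j for a_j in a)
    return [x_j - scale * a_j for a_j, x_j in zip(a, x)]


# Hypothetical example: the constraint x <= 1 in the plane, starting from (3, 2).
x1 = project_onto_halfspace([1.0, 0.0], 1.0, [3.0, 2.0])
print(x1)  # [1.0, 2.0]
```

The projected point lies on the hyperplane, so the chosen inequality holds with equality afterwards; other inequalities may still be violated, which is why the methods iterate.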


Projection Methods

[Figure: the polyhedron P, a starting point x0, and the next iterate x1.]


Motzkin’s Relaxation Method(s)

Method

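Suppose $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, and $P := \{x \in \mathbb{R}^n : Ax \le b\}$ is nonempty. Fix $0 < \lambda \le 2$. Given $x_0 \in \mathbb{R}^n$, iteratively construct approximations to $P$ in the following way:

1. If $x_k$ is feasible, stop.

2. Choose $i_k \in [m]$ as $i_k := \operatorname{argmax}_{i \in [m]} \; a_i^T x_{k-1} - b_i$.

3. Define $x_k := x_{k-1} - \lambda \dfrac{a_{i_k}^T x_{k-1} - b_{i_k}}{\|a_{i_k}\|^2} \, a_{i_k}$.

4. Repeat.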
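The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the names `motzkin`, `max_iter`, and `tol`, and the unit-square test system, are assumptions added here.

```python
def dot(u, v):
    return sum(u_j * v_j for u_j, v_j in zip(u, v))


def motzkin(A, b, x0, lam=1.0, max_iter=1000, tol=1e-9):
    """Motzkin's relaxation method; returns (point, iterations used)."""
    x = list(x0)
    for k in range(max_iter):
        res = [dot(row, x) - b_i for row, b_i in zip(A, b)]
        # Step 2: pick the most violated constraint (largest residual).
        i = max(range(len(A)), key=lambda j: res[j])
        # Step 1: if even the largest residual is nonpositive, x is feasible.
        if res[i] <= tol:
            return x, k
        # Step 3: relaxed projection onto the chosen hyperplane.
        step = lam * res[i] / dot(A[i], A[i])
        x = [x_j - step * a_j for x_j, a_j in zip(x, A[i])]
    return x, max_iter


# Hypothetical example: the unit square written as Ax <= b.
A = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
b = [1.0, 0.0, 1.0, 0.0]
x_final, iters = motzkin(A, b, [3.0, 2.0])
print(x_final, iters)  # [1.0, 1.0] 2
```

Each iteration computes all m residuals to find the maximum, which is the per-step cost that becomes significant for large systems.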


Motzkin’s Method

[Figure: successive iterates x0, x1, x2, x3 of Motzkin’s method and the polyhedron P.]


Randomized Kaczmarz Method

Method

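Suppose $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, and $P := \{x \in \mathbb{R}^n : Ax \le b\}$ is nonempty. Given $x_0 \in \mathbb{R}^n$, iteratively construct approximations to $P$ in the following way:

1. If $x_k$ is feasible, stop.

2. Choose $i_k \in [m]$ with probability $\dfrac{\|a_{i_k}\|^2}{\|A\|_F^2}$.

3. Define $x_k := x_{k-1} - \dfrac{(a_{i_k}^T x_{k-1} - b_{i_k})^+}{\|a_{i_k}\|^2} \, a_{i_k}$.

4. Repeat.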
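A minimal Python sketch of the randomized Kaczmarz iteration for inequalities follows. It is an illustration under stated assumptions: the names `randomized_kaczmarz`, `num_iter`, and `seed` are invented here, and the sketch runs a fixed number of iterations instead of testing feasibility at every step.

```python
import random


def dot(u, v):
    return sum(u_j * v_j for u_j, v_j in zip(u, v))


def randomized_kaczmarz(A, b, x0, num_iter=200, seed=0):
    """Randomized Kaczmarz for Ax <= b, run for a fixed number of steps."""
    rng = random.Random(seed)
    # Row i is drawn with probability ||a_i||^2 / ||A||_F^2;
    # random.choices normalizes the weights for us.
    weights = [dot(row, row) for row in A]
    x = list(x0)
    for _ in range(num_iter):
        i = rng.choices(range(len(A)), weights=weights, k=1)[0]
        # Positive part: do nothing if constraint i is already satisfied.
        violation = max(dot(A[i], x) - b[i], 0.0)
        step = violation / weights[i]
        x = [x_j - step * a_j for x_j, a_j in zip(x, A[i])]
    return x


# Hypothetical example: the unit square written as Ax <= b.
A = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
b = [1.0, 0.0, 1.0, 0.0]
x_rk = randomized_kaczmarz(A, b, [3.0, 2.0])
print(x_rk)
```

Each step touches a single row of A, so the per-iteration cost does not grow with the number of constraints m.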


Kaczmarz Method

[Figure: successive iterates x0, x1, x2, x3, x4 of the Kaczmarz method and the polyhedron P.]


Motivation

Motzkin’s Method
Pro: convergence produces a monotone decreasing distance sequence
Con: computationally expensive for large systems

Kaczmarz Method
Pro: computationally inexpensive, able to analyze the expected convergence rate
Con: slow convergence near the polyhedral solution set


A Hybrid Method

Method (SKMM)

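Suppose $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, and $P := \{x \in \mathbb{R}^n : Ax \le b\}$ is nonempty. Fix $0 < \lambda \le 2$. Given $x_0 \in \mathbb{R}^n$, iteratively construct approximations to $P$ in the following way:

1. If $x_k$ is feasible, stop.

2. Choose $\tau_k \subset [m]$ to be a sample of $\beta$ constraints chosen uniformly at random from among the rows of $A$.

3. From among these $\beta$ rows, choose $i_k := \operatorname{argmax}_{i \in \tau_k} \; a_i^T x_{k-1} - b_i$.

4. Define $x_k := x_{k-1} - \lambda \dfrac{(a_{i_k}^T x_{k-1} - b_{i_k})^+}{\|a_{i_k}\|^2} \, a_{i_k}$.

5. Repeat.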
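The hybrid iteration can be sketched in Python as below. This is a hedged illustration: the names `skm`, `num_iter`, and `seed` are assumptions, the sample is drawn without replacement via `random.sample`, and the loop runs a fixed number of steps rather than checking feasibility each time.

```python
import random


def dot(u, v):
    return sum(u_j * v_j for u_j, v_j in zip(u, v))


def skm(A, b, x0, beta, lam=1.0, num_iter=200, seed=0):
    """Sampling Kaczmarz-Motzkin sketch for Ax <= b with sample size beta."""
    rng = random.Random(seed)
    m = len(A)
    x = list(x0)
    for _ in range(num_iter):
        # Sample beta distinct rows uniformly at random.
        sample = rng.sample(range(m), beta)
        # Among the sampled rows, take the one with the largest residual.
        i = max(sample, key=lambda j: dot(A[j], x) - b[j])
        # Positive part: no move if the chosen row is already satisfied.
        violation = max(dot(A[i], x) - b[i], 0.0)
        step = lam * violation / dot(A[i], A[i])
        x = [x_j - step * a_j for x_j, a_j in zip(x, A[i])]
    return x


# Hypothetical example: the unit square written as Ax <= b.
A = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
b = [1.0, 0.0, 1.0, 0.0]
x_skm = skm(A, b, [3.0, 2.0], beta=2)
x_full = skm(A, b, [3.0, 2.0], beta=len(A))  # beta = m: greedy row choice
print(x_skm, x_full)
```

With `beta=len(A)` every row is sampled, so the row choice is the same greedy one Motzkin's method makes; smaller `beta` trades that greediness for cheaper iterations.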


A Hybrid Method

[Figure: successive iterates x0, x1, x2, x3, x4 of the hybrid method and the polyhedron P.]


Generalized Method

Note that both previous methods are captured by the class of SKMM methods:

1. The Kaczmarz method is SKMM where the sample size is β = 1 and the relaxation parameter is λ = 1.

2. Motzkin’s Relaxation methods are SKMM where the sample size is β = m.


An Important Reminder

These methods may not actually stop with a solution...

However, we can ensure that our iterate points get arbitrarily close to the solution set, P!



Experimental Results


Motzkin’s Method Convergence Rate

Theorem (Agmon)

For a normalized system, $\|a_i\| = 1$ for all $i = 1, \dots, m$, if the feasible region $P := \{x \mid Ax \le b\}$ is nonempty, then the relaxation methods converge linearly:

$d(x_k, P)^2 \le \left(1 - \dfrac{2\lambda - \lambda^2}{m L_2^2}\right)^k d(x_0, P)^2.$


Random Kaczmarz Method Convergence Rate

Theorem (Lewis, Leventhal)

If the feasible region $P := \{x \mid Ax \le b\}$ is nonempty, then the Randomized Kaczmarz method with relaxation parameter $\lambda$ converges linearly in expectation:

$E[d(x_k, P)^2] \le \left(1 - \dfrac{2\lambda - \lambda^2}{\|A\|_F^2 L_2^2}\right)^k d(x_0, P)^2.$


SKM Method Convergence Rate

Theorem (De Loera, H., Needell)

If the feasible region (for normalized $A$) is nonempty, then the SKM methods with samples of size $\beta$ converge at least linearly in expectation. In each iteration,

$E[d(x_k, P)^2] \le \left(1 - \dfrac{2\lambda - \lambda^2}{S_{k-1} L_2^2}\right) d(x_{k-1}, P)^2$

where $S_{k-1} = \max\{m - s_{k-1},\, m - \beta + 1\}$ and $s_{k-1}$ is the number of constraints satisfied by $x_{k-1}$. Clearly then,

$E[d(x_k, P)^2] \le \left(1 - \dfrac{2\lambda - \lambda^2}{m L_2^2}\right)^k d(x_0, P)^2.$
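The step from the per-iteration bound to the k-step bound can be spelled out as follows. This is a sketch that reads the per-iteration expectation as conditional on $x_{k-1}$, which is an assumption about the intended statement.

```latex
% s_{k-1} >= 0 and beta >= 1 give a uniform bound on S_{k-1}:
S_{k-1} = \max\{m - s_{k-1},\; m - \beta + 1\} \le m .

% For 0 < \lambda \le 2 we have 2\lambda - \lambda^2 = \lambda(2 - \lambda) \ge 0, hence
1 - \frac{2\lambda - \lambda^2}{S_{k-1} L_2^2}
  \;\le\; 1 - \frac{2\lambda - \lambda^2}{m L_2^2} .

% Taking expectations over the earlier iterations and applying the bound k times:
E\!\left[d(x_k, P)^2\right]
  \;\le\; \left(1 - \frac{2\lambda - \lambda^2}{m L_2^2}\right) E\!\left[d(x_{k-1}, P)^2\right]
  \;\le\; \cdots
  \;\le\; \left(1 - \frac{2\lambda - \lambda^2}{m L_2^2}\right)^{k} d(x_0, P)^2 .
```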


Improved Rate


If the feasible region $P = \{x \mid Ax \le b\}$ is generic and nonempty (for normalized $A$), then an SKM method with samples of size $\beta \le m - n$ is guaranteed an increased convergence rate after some $K$:

$E[d(x_k, P)^2] \le \left(1 - \dfrac{2\lambda - \lambda^2}{m L_2^2}\right)^{K} \left(1 - \dfrac{2\lambda - \lambda^2}{(m - \beta + 1) L_2^2}\right)^{k-K} d(x_0, P)^2.$
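To get a feel for how much the second factor improves on the first, the short sketch below evaluates both per-iteration factors. All numbers (m = 100, β = 50, λ = 1, and in particular L2 = 1) are hypothetical placeholders chosen for illustration; the talk does not give values for them.

```python
def contraction_factor(lam, S, L2):
    """Per-iteration factor 1 - (2*lam - lam**2) / (S * L2**2)."""
    return 1.0 - (2.0 * lam - lam ** 2) / (S * L2 ** 2)


# Hypothetical values, not taken from the talk.
m, beta, lam, L2 = 100, 50, 1.0, 1.0

early = contraction_factor(lam, m, L2)            # factor used for the first K steps
late = contraction_factor(lam, m - beta + 1, L2)  # factor used after step K
print(early, late)
```

A smaller factor means faster guaranteed decay, so under these placeholder values the bound after step K shrinks roughly twice as fast per iteration as the bound before it.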


Finiteness of Motzkin’s Method

Theorem (Telgen)

Either the relaxation method* detects feasibility of the system $Ax \le b$ (with $A$ normalized) within $k = \left\lceil \dfrac{2^{4L}}{n \lambda (2 - \lambda)} \right\rceil$ iterations, or the system is infeasible.

*with x0 = 0


Expected Finiteness of SKM methods


If the system Ax ≤ b is feasible, then with high probability the Sampling Kaczmarz-Motzkin method* with relaxation parameter 0 < λ < 2 will detect feasibility within a given number of steps.

*with x0 = 0


Conclusions


Future work:

1. Provide theoretical guidance for selection of the optimal sample size, β, and optimal overshooting parameter, λ, for a given (class of) system(s).

2. Describe the K after which the convergence rate is guaranteed to be improved.


Acknowledgements

Thanks to you for attending!

Are there any questions?


References I

Kaczmarz, S. (1937). Angenäherte Auflösung von Systemen linearer Gleichungen. Bull. Internat. Acad. Polon. Sci. Lettres A, pages 335–357.

Leventhal, D. and Lewis, A. S. (2010). Randomized methods for linear constraints: convergence rates and conditioning. Math. Oper. Res., 35(3):641–654.

Motzkin, T. S. and Schoenberg, I. J. (1954). The relaxation method for linear inequalities. Canadian J. Math., 6:393–404.

Needell, D. (2010). Randomized Kaczmarz solver for noisy linear systems. BIT, 50(2):395–403.


References II

Needell, D., Srebro, N., and Ward, R. (2013). Stochastic gradient descent and the randomized Kaczmarz algorithm. Submitted.

Needell, D. and Tropp, J. A. (2013). Paved with good intentions: Analysis of a randomized block Kaczmarz method. Linear Algebra Appl.

Schrijver, A. (1986). Theory of linear and integer programming. Wiley-Interscience Series in Discrete Mathematics. John Wiley & Sons, Ltd., Chichester.

Strohmer, T. and Vershynin, R. (2009). A randomized Kaczmarz algorithm with exponential convergence. J. Fourier Anal. Appl., 15:262–278.
