Frank-Wolfe Algorithms for Saddle Point problems
Gauthier Gidel (1), Tony Jebara (2), Simon Lacoste-Julien (3)
(1) INRIA Paris, Sierra Team
(2) Department of CS, Columbia University
(3) Department of CS & OR (DIRO), Université de Montréal
10th December 2016
Gauthier Gidel Frank-Wolfe Algorithms for SP 10th December 2016
Overview
- The Frank-Wolfe algorithm (FW) has gained popularity in the last couple of years.
- Main advantage: FW only needs a linear minimization oracle (LMO).
- Extend FW properties to solve saddle point problems.
- Straightforward extension but non-trivial analysis.
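Saddle point and link with variational inequalities
Let $\mathcal{L} : X \times Y \to \mathbb{R}$, where $X$ and $Y$ are convex and compact.
Saddle point problem: solve
$$\min_{x \in X} \max_{y \in Y} \mathcal{L}(x, y)$$
A solution $(x^*, y^*)$ is called a saddle point.
- Necessary stationary conditions:
$$\langle x - x^*, \nabla_x \mathcal{L}(x^*, y^*) \rangle \ge 0$$
$$\langle y - y^*, -\nabla_y \mathcal{L}(x^*, y^*) \rangle \ge 0$$
- Variational inequality:
$$\forall z \in X \times Y \quad \langle z - z^*, g(z^*) \rangle \ge 0$$
  where $(x^*, y^*) = z^*$ and $g(z) = (\nabla_x \mathcal{L}(z), -\nabla_y \mathcal{L}(z))$.
- Sufficient condition: global solution if $\mathcal{L}$ is convex-concave: $\forall (x, y) \in X \times Y$, $x' \mapsto \mathcal{L}(x', y)$ is convex and $y' \mapsto \mathcal{L}(x, y')$ is concave.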
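The variational inequality can be checked numerically on a small bilinear game $\mathcal{L}(x, y) = x^\top M y$ over two probability simplices. The sketch below is only an illustration: the function names and the 2x2 matrix used in the usage note are assumptions, not part of the slides. Since $\langle z' - z, g(z) \rangle$ is linear in $z'$, its minimum over a product of simplices is reached at vertices, so it is enough to look at the smallest coordinate of each block of $g(z)$.

```python
def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def vi_residual(M, x, y):
    """min over z' in simplex x simplex of <z' - z, g(z)> for L(x, y) = x^T M y.

    A value of 0 means the variational inequality holds at z = (x, y);
    a negative value means some feasible direction violates it.
    """
    gx = matvec(M, y)                               # grad_x L = M y
    gy = [-c for c in matvec(transpose(M), x)]      # -grad_y L = -M^T x
    # Linear functions over a simplex are minimized at a vertex.
    rx = min(gx) - sum(a * b for a, b in zip(x, gx))
    ry = min(gy) - sum(a * b for a, b in zip(y, gy))
    return rx + ry
```

For the illustrative matrix `M = [[2, -1], [-1, 1]]`, both players mixing with weights `(0.4, 0.6)` gives a residual of zero up to rounding, while the pure-strategy pair `([1, 0], [1, 0])` gives a strictly negative residual.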
Motivations: games and robust learning
- Zero-sum games with two players:
$$\min_{x \in \Delta(I)} \max_{y \in \Delta(J)} x^\top M y$$
- Generative Adversarial Network (GAN)
- Robust learning [1]: we want to learn
$$\min_{\theta \in \Theta} \frac{1}{n} \sum_{i=1}^{n} \ell(f_\theta(x_i), y_i) + \lambda \Omega(\theta)$$
  with an uncertainty regarding the data:
$$\min_{\theta \in \Theta} \max_{\omega \in \Delta_n} \sum_{i=1}^{n} \omega_i \, \ell(f_\theta(x_i), y_i) + \lambda \Omega(\theta)$$
  Minimize the worst case → gives robustness.
[1] J. Wen, C. Yu, and R. Greiner. "Robust Learning under Uncertain Test Distributions: Relating Covariate Shift to Model Misspecification." In: ICML. 2014.
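Because the robust objective is linear in $\omega$, its inner maximum over the simplex $\Delta_n$ is attained at a vertex: all the weight goes to the example with the largest loss. A minimal Python sketch of that inner step follows; the function names are illustrative assumptions, and the regularizer $\lambda \Omega(\theta)$ is left out because it does not depend on $\omega$.

```python
def worst_case_weights(losses):
    """Vertex of the simplex maximizing sum_i w_i * losses[i]."""
    i_max = max(range(len(losses)), key=lambda i: losses[i])
    return [1.0 if i == i_max else 0.0 for i in range(len(losses))]


def robust_loss(losses):
    """Worst-case weighted loss over the simplex, i.e. the largest loss."""
    w = worst_case_weights(losses)
    return sum(wi * li for wi, li in zip(w, losses))
```

This is also the linear oracle a Frank-Wolfe method would call on the $\omega$ block: a single pass over the per-example losses, with no projection.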
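Problem with hard projection
The structured SVM:
$$\min_{\omega \in \mathbb{R}^d} \lambda \Omega(\omega) + \frac{1}{n} \sum_{i=1}^{n} \underbrace{\max_{y \in \mathcal{Y}_i} \big( L_i(y) - \langle \omega, \phi_i(y) \rangle \big)}_{\text{structured hinge loss}}$$
Regularization: penalized → constrained.
$$\min_{\Omega(\omega) \le \beta} \max_{\alpha \in \Delta(|\mathcal{Y}|)} b^\top \alpha - \omega^\top M \alpha$$
Hard to project when:
- Structured sparsity norm (group lasso norm).
- The output $\mathcal{Y}$ is structured: exponential size.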
Standard approaches in literature
The simplest algorithm to solve saddle point problems is the projected gradient algorithm.
$$x^{(t+1)} = P_X\big(x^{(t)} - \eta \nabla_x \mathcal{L}(x^{(t)}, y^{(t)})\big)$$
$$y^{(t+1)} = P_Y\big(y^{(t)} + \eta \nabla_y \mathcal{L}(x^{(t)}, y^{(t)})\big)$$
For non-smooth optimization,
$$\frac{1}{T} \sum_{t=1}^{T} \big(x^{(t)}, y^{(t)}\big) \xrightarrow[T \to \infty]{} (x^*, y^*)$$
Faster algorithm: projected extra-gradient algorithm.
Can use LMO to compute approximate projections [2].
[2] N. He and Z. Harchaoui. "Semi-proximal Mirror-Prox for Nonsmooth Composite Minimization". In: NIPS. 2015.
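The projected gradient updates and the averaging of iterates can be sketched on a bilinear game $x^\top M y$ over two simplices. This is an illustration under stated assumptions, not the authors' code: the function names, the constant step size `eta`, the horizon `T` and the sort-based simplex projection are all choices made for the example.

```python
def project_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    u = sorted(v, reverse=True)
    cumsum, theta = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        cumsum += uj
        candidate = (cumsum - 1.0) / j
        if uj - candidate > 0:      # holds exactly for the active prefix
            theta = candidate
    return [max(vi - theta, 0.0) for vi in v]


def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def projected_gda(M, x, y, eta=0.05, T=5000):
    """Simultaneous projected gradient descent (x) / ascent (y) on x^T M y.

    Returns the averages of the T iterates, which is the quantity the
    convergence statement on the slide is about.
    """
    Mt = [list(col) for col in zip(*M)]
    x_avg = [0.0] * len(x)
    y_avg = [0.0] * len(y)
    for _ in range(T):
        x_avg = [a + xi / T for a, xi in zip(x_avg, x)]
        y_avg = [a + yi / T for a, yi in zip(y_avg, y)]
        gx = matvec(M, y)           # grad_x L = M y
        gy = matvec(Mt, x)          # grad_y L = M^T x
        x = project_simplex([xi - eta * g for xi, g in zip(x, gx)])
        y = project_simplex([yi + eta * g for yi, g in zip(y, gy)])
    return x_avg, y_avg
```

Every iteration needs two Euclidean projections. On a simplex this is a sort, but on the structured sets discussed earlier it is the expensive step that an LMO-based method avoids.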
The FW algorithm
Algorithm: Frank-Wolfe algorithm
1: Let $x^{(0)} \in X$
2: for $t = 0 \dots T$ do
3:   Compute $r^{(t)} = \nabla f(x^{(t)})$
4:   Compute $s^{(t)} \in \operatorname{argmin}_{s \in X} \langle s, r^{(t)} \rangle$
5:   Compute $g_t := \langle x^{(t)} - s^{(t)}, r^{(t)} \rangle$
6:   if $g_t \le \varepsilon$ then return $x^{(t)}$
7:   Let $\gamma = \frac{2}{2+t}$ (or do line-search)
8:   Update $x^{(t+1)} := (1 - \gamma) x^{(t)} + \gamma s^{(t)}$
9: end for
Figure: One step of the FW algorithm
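The listing above translates almost line by line into Python. The sketch below is a minimal illustration using only the standard library; the function names, the simplex LMO and the returned gap history are assumptions made for the example, and the step size is the universal $\gamma = 2/(2+t)$ rather than a line-search.

```python
def lmo_simplex(r):
    """Linear minimization oracle on the probability simplex:
    the vertex whose coordinate has the smallest gradient entry."""
    i = min(range(len(r)), key=lambda k: r[k])
    return [1.0 if k == i else 0.0 for k in range(len(r))]


def frank_wolfe(grad, lmo, x0, T=1000, eps=1e-6):
    """Frank-Wolfe with step size 2 / (2 + t).

    grad: callable returning the gradient of f at x (a list of floats).
    lmo:  callable returning argmin_{s in X} <s, r> for a direction r.
    Returns the last iterate and the list of FW gaps g_t that were computed.
    """
    x = list(x0)
    gaps = []
    for t in range(T + 1):
        r = grad(x)                                   # step 3
        s = lmo(r)                                    # step 4
        g = sum((xi - si) * ri for xi, si, ri in zip(x, s, r))  # step 5
        gaps.append(g)
        if g <= eps:                                  # step 6: certificate reached
            return x, gaps
        gamma = 2.0 / (2.0 + t)                       # step 7
        x = [(1.0 - gamma) * xi + gamma * si for xi, si in zip(x, s)]  # step 8
    return x, gaps
```

A possible use is minimizing $f(x) = \frac{1}{2}\|x - b\|^2$ over the simplex with `grad = lambda x: [xi - bi for xi, bi in zip(x, b)]`. The iterate stays a convex combination of vertices, so it remains feasible without any projection.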
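SP-FW
Algorithm: Saddle point FW algorithm
1: Let $z^{(0)} = (x^{(0)}, y^{(0)}) \in X \times Y$
2: for $t = 0 \dots T$ do
3:   Compute $r^{(t)} := \big(\nabla_x \mathcal{L}(x^{(t)}, y^{(t)}), \; -\nabla_y \mathcal{L}(x^{(t)}, y^{(t)})\big)$
4:   Compute $s^{(t)} \in \operatorname{argmin}_{z \in X \times Y} \langle z, r^{(t)} \rangle$
5:   Compute $g_t := \langle z^{(t)} - s^{(t)}, r^{(t)} \rangle$
6:   if $g_t \le \varepsilon$ then return $z^{(t)}$
7:   Let $\gamma = \min\big(1, \frac{\nu}{C} g_t\big)$ or $\gamma = \frac{2}{2+t}$
8:   Update $z^{(t+1)} := (1 - \gamma) z^{(t)} + \gamma s^{(t)}$
9: end for
- One can define an FW extension with away step.
- $\gamma_t = \frac{1}{1+t} \Rightarrow z^{(t)} = \frac{1}{t} \sum_{i=0}^{t-1} s^{(i)}$.
- ($\gamma_t = \frac{1}{1+t}$) + bilinear objective ↔ fictitious play algorithm.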
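SP-FW can be sketched in Python in the same way as FW, using the fact that a linear function over the product $X \times Y$ is minimized block by block, so the joint LMO splits into one LMO per player. The code below is an illustration under assumptions: the function names are invented for the example, both sets are taken to be simplices in the usage note, and only the universal step size $\gamma = 2/(2+t)$ is implemented because the constants $\nu$ and $C$ of the adaptive step are not specified on the slides.

```python
def lmo_simplex(r):
    """Vertex of the probability simplex minimizing <s, r>."""
    i = min(range(len(r)), key=lambda k: r[k])
    return [1.0 if k == i else 0.0 for k in range(len(r))]


def sp_fw(grad_x, grad_y, lmo_x, lmo_y, x0, y0, T=500, eps=1e-6):
    """Saddle point Frank-Wolfe with step size 2 / (2 + t).

    grad_x, grad_y: callables (x, y) -> partial gradients of L.
    lmo_x, lmo_y:   linear minimization oracles on X and on Y.
    Returns the last iterates and the list of gaps g_t that were computed.
    """
    x, y = list(x0), list(y0)
    gaps = []
    for t in range(T + 1):
        rx = grad_x(x, y)                        # x block of r: grad_x L
        ry = [-g for g in grad_y(x, y)]          # y block of r: -grad_y L
        sx, sy = lmo_x(rx), lmo_y(ry)            # joint LMO, solved per block
        g = (sum((a - s) * r for a, s, r in zip(x, sx, rx))
             + sum((a - s) * r for a, s, r in zip(y, sy, ry)))
        gaps.append(g)
        if g <= eps:
            break
        gamma = 2.0 / (2.0 + t)
        x = [(1.0 - gamma) * a + gamma * s for a, s in zip(x, sx)]
        y = [(1.0 - gamma) * a + gamma * s for a, s in zip(y, sy)]
    return x, y, gaps
```

One way to try it is on $\mathcal{L}(x, y) = \frac{1}{2}\|x\|^2 + 0.1\, x^\top M y - \frac{1}{2}\|y\|^2$ over two 2-dimensional simplices with $M = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}$, whose saddle point is the uniform pair by symmetry. The gap $g_t$ is the sum of the two per-block gaps and is nonnegative, so it can serve as a stopping certificate even though the objective value itself may oscillate.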
Advantages of SP-FW
Same main property as FW:
Only LMO (linear minimization oracle).
Same other advantages as FW:
- Convergence certificate $g_t$ for free.
- Affine invariance of the algorithm.
- Sparsity of the iterates.
- Universal step size $\gamma_t := \frac{2}{2+t}$, adaptive step size $\gamma_t := \frac{\nu}{C} g_t$.
Main difference in the SP setting:
- No line-search.
When the constraint set is a "complicated" structured polytope, projections can be hard whereas the LMO might be tractable.
Theoretical contribution
SP extension of FW with away step:
Convergence: linear rate with adaptive step size. Sublinear rate with universal step size.
- Similar hypothesis as AFW for linear convergence:
  1. Strong convexity and smoothness of the function.
  2. $X$ and $Y$ polytopes.
- Additional assumption on the bilinearity:
$$\mathcal{L}(x, y) = f(x) + x^\top M y - g(y)$$
  $\|M\|$ smaller than the strong convexity constant.
- Proof uses recent advances on AFW.
- Partially answering a 30-year-old conjecture [3].
[3] J. Hammond. "Solving asymmetric variational inequality problems and systems of equations with generalized nonlinear programming algorithms". PhD thesis. MIT, 1984.
Difficulties for saddle point
Usual descent Lemma:
$$h_{t+1} \le h_t - \gamma_t \underbrace{g_t}_{\ge 0} + \gamma_t^2 \frac{L \|d^{(t)}\|^2}{2}$$
With $\gamma_t$ small enough the sequence decreases.
For a saddle point problem, the Lipschitz gradient property gives
$$\mathcal{L}_{t+1} - \mathcal{L}^* \le \mathcal{L}_t - \mathcal{L}^* - \gamma_t \underbrace{\left(g_t^{(x)} - g_t^{(y)}\right)}_{\text{arbitrary sign}} + \gamma_t^2 \frac{L \|d^{(t)}\|^2}{2}.$$
- Cannot control the oscillation of the sequence.
- Must introduce other quantities to establish convergence.
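To see where the arbitrary sign comes from, the saddle point inequality can be unpacked as below. This is a sketch under assumptions not spelled out on this slide: z = (x, y) is the joint iterate, d^{(t)} = (d_x^{(t)}, d_y^{(t)}) is the joint update direction, and the partial gaps are taken to be g_t^{(x)} = ⟨-∇_x L, d_x^{(t)}⟩ and g_t^{(y)} = ⟨∇_y L, d_y^{(t)}⟩.

```latex
% Assumed notation: z^{(t)} = (x^{(t)}, y^{(t)}), d^{(t)} = (d_x^{(t)}, d_y^{(t)}),
% g_t^{(x)} := \langle -\nabla_x \mathcal{L}(z^{(t)}), d_x^{(t)} \rangle,
% g_t^{(y)} := \langle  \nabla_y \mathcal{L}(z^{(t)}), d_y^{(t)} \rangle.
\begin{align*}
\mathcal{L}(z^{(t)} + \gamma_t d^{(t)})
  &\le \mathcal{L}(z^{(t)})
     + \gamma_t \langle \nabla \mathcal{L}(z^{(t)}), d^{(t)} \rangle
     + \gamma_t^2 \frac{L \|d^{(t)}\|^2}{2}, \\
\langle \nabla \mathcal{L}(z^{(t)}), d^{(t)} \rangle
  &= \langle \nabla_x \mathcal{L}(z^{(t)}), d_x^{(t)} \rangle
   + \langle \nabla_y \mathcal{L}(z^{(t)}), d_y^{(t)} \rangle
   = -\left( g_t^{(x)} - g_t^{(y)} \right).
\end{align*}
```

Under these assumptions both partial gaps are nonnegative, because d_x^{(t)} is a descent direction in x and d_y^{(t)} is an ascent direction in y. Their difference therefore has no fixed sign, while their sum stays nonnegative, which makes the sum a natural candidate for the other quantities mentioned above.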
Toy experiments
SP-AFW on a toy example, d = 30, with theoretical step-size $\gamma_t = \frac{\nu}{C} g_t$.
Figure: SP-AFW on a toy example, d = 30, with heuristic step-size $\gamma_t = \frac{g_t}{C + 2 \frac{\|M\|^2 D^2}{\mu}}$.
$C = 2 L D^2$
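The two step-size rules can be written as small helpers. This sketch assumes the flattened fractions on the slide read as γ_t = (ν / C) g_t and γ_t = g_t / (C + 2‖M‖²D²/μ) with C = 2LD². The function names are hypothetical, and the clipping to 1 is an added safeguard that does not appear on the slide.

```python
def curvature_constant(L, D):
    """C = 2 * L * D^2, with L the Lipschitz constant and D the diameter."""
    return 2.0 * L * D ** 2

def theoretical_step(gap, nu, C):
    """Assumed reading of the slide: gamma_t = (nu / C) * g_t."""
    # Clipping to 1 is an assumption of this sketch, not part of the slide.
    return min(1.0, nu / C * gap)

def heuristic_step(gap, C, M_norm, D, mu):
    """Assumed reading of the slide: gamma_t = g_t / (C + 2 ||M||^2 D^2 / mu)."""
    # Clipping to 1 is an assumption of this sketch, not part of the slide.
    return min(1.0, gap / (C + 2.0 * M_norm ** 2 * D ** 2 / mu))
```

With L = D = μ = ‖M‖ = 1 and a gap of 0.4, the heuristic rule gives a step of about 0.1. Both rules shrink the step as the gap g_t goes to zero.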
Conclusion
- SP-FW is one of the first SP solvers that works only with an LMO.
- The FW resurgence led to new structured problems.
- Same hope for SP-FW as for FW.
Call for applications!
- Many theoretical questions remain open.
- With a bilinear objective, this algorithm is closely related to the fictitious play algorithm.
- Rich interplay with the game theory literature.
Thank you!
Slides available on www.di.ens.fr/~gidel.