Calculating Hessian matrices - Télécom ParisTech
Student: Robert Mansel Gower
Advisor: Margarida Pinheiro Mello
July 25, 2011
Contents
1 Motivation
2 Computational graph
3 Gradient
    Forward Gradient
    Partial derivatives on computational graph
4 Hessian
    Forward Hessian
    Hessian on computational graph
    New Reverse Hessian algorithm
    Comparative tests
Motivation
Motivation from nonlinear programming
Second-order Taylor approximations are very common in nonlinear programming.
Hessians are desirable in interior-point and augmented Lagrangian methods.
Sensitivity analysis.
Computational graph
Function Representation
The function $f(h(x_{-1}), g(x_{-1}, x_0))$ is evaluated as:
$v_{-1} = x_{-1}$
$v_0 = x_0$
$v_1 = h(v_{-1})$
$v_2 = g(v_{-1}, v_0)$
$v_3 = f(v_2, v_1)$
[Figure: computational graph with nodes −1, 0, 1, 2, 3, each labelled with its assignment above]
Indices of matrices and vectors are shifted by $-n$: for $y \in \mathbb{R}^m$, $y = (y_{1-n}, \dots, y_{m-n})^T$.
Node numbering is in order of evaluation.
($j$ is a predecessor of $i$) ≡ $j \in P(i)$.
($i$ is a successor of $j$) ≡ $i \in S(j)$.
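As a rough illustration of the notation above, the graph can be stored as predecessor sets and the successor sets derived from them. This is only a sketch: the names `predecessors` and `successors_from` are hypothetical, chosen for this example; the node indices and the sets $P(i)$, $S(j)$ follow the slide.

```python
# Predecessor sets P(i) for f(h(x_{-1}), g(x_{-1}, x_0)).
# Independent nodes are -1 and 0 (indices shifted by -n, with n = 2);
# intermediate nodes are 1, 2, 3, numbered in order of evaluation.
predecessors = {
    -1: (),        # v_{-1} = x_{-1}
    0: (),         # v_0 = x_0
    1: (-1,),      # v_1 = h(v_{-1})
    2: (-1, 0),    # v_2 = g(v_{-1}, v_0)
    3: (2, 1),     # v_3 = f(v_2, v_1)
}


def successors_from(pred):
    """Build S(j) = {i : j in P(i)} from the predecessor sets."""
    succ = {j: [] for j in pred}
    for i in sorted(pred):
        for j in pred[i]:
            succ[j].append(i)
    return succ


successors = successors_from(predecessors)
print(successors)
```

Because nodes are numbered in order of evaluation, every predecessor index is smaller than the index of the node that uses it.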
Computational graph
Function Evaluation ≡ Computational Graph
Nodes for independent variables: $v_{i-n} = x_{i-n}$, for $i = 1, \dots, n$.
Independent nodes: $Z = \{1-n, \dots, 0\}$.
Nodes for intermediate variables: $v_i = \phi_i(v_{P(i)})$, for $i = 1, \dots, \ell$.
Intermediate nodes: $V = \{1, \dots, \ell\}$.
Function evaluation ≡ the graph $G = (Z \cup V, E)$ together with a set $\phi$ of elemental functions whose derivatives are coded.
$\mathrm{TIME}(\mathrm{eval}(f(x))) = O(\ell + n)$.
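A minimal sketch of such an evaluation, assuming illustrative elemental functions ($\phi_1 = \sin$, $\phi_2$ = product, $\phi_3$ = sum) that are not taken from the slides. The names `tape` and `evaluate` are hypothetical. Each node is visited once, which matches the $O(\ell + n)$ cost when every elemental function takes a bounded number of arguments.

```python
import math

# Hypothetical elemental functions, chosen only for illustration:
# phi_1 = sin, phi_2 = product, phi_3 = sum.
n = 2
tape = [
    (1, math.sin, (-1,)),
    (2, lambda a, b: a * b, (-1, 0)),
    (3, lambda a, b: a + b, (2, 1)),
]


def evaluate(x, tape, n):
    """One sweep over the nodes in index order; x is (x_{1-n}, ..., x_0)."""
    v = {i - n: x[i - 1] for i in range(1, n + 1)}   # independent nodes Z
    for i, phi, preds in tape:                        # intermediate nodes V
        v[i] = phi(*(v[j] for j in preds))
    return v


v = evaluate((0.5, 2.0), tape, n)
print(v[3])  # value of x_{-1} * x_0 + sin(x_{-1}) at the chosen point
```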
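Gradient
Forward Gradient
Forward Gradient: The first attempt
Set of elemental functions = sums, multiplications and unary functions.
$v_i = \phi_i(v_{P(i)})$
$$\nabla v_i = \sum_{j \in P(i)} \frac{\partial \phi_i}{\partial v_j} \nabla v_j.$$
Each node $j$ passes on $\frac{\partial \phi_i}{\partial v_j} \nabla v_j$ to each successor $i$.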
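The recurrence above could be sketched as follows. The elemental functions (sin, product, sum) and the names `tape` and `forward_gradient` are assumptions made for this example, not part of the slides. Each node stores a full gradient vector of length $n$.

```python
import math

n = 2
# Each entry: (node index, predecessors, phi, partial derivatives of phi).
# The elemental functions (sin, product, sum) are illustrative assumptions.
tape = [
    (1, (-1,), math.sin, lambda a: (math.cos(a),)),
    (2, (-1, 0), lambda a, b: a * b, lambda a, b: (b, a)),
    (3, (2, 1), lambda a, b: a + b, lambda a, b: (1.0, 1.0)),
]


def forward_gradient(x, tape, n):
    v, grad = {}, {}
    for i in range(1, n + 1):
        v[i - n] = x[i - 1]
        # The gradient of an independent variable is a unit vector.
        grad[i - n] = [1.0 if k == i - 1 else 0.0 for k in range(n)]
    for i, preds, phi, dphi in tape:
        args = [v[j] for j in preds]
        v[i] = phi(*args)
        partials = dphi(*args)
        # grad v_i = sum over j in P(i) of (d phi_i / d v_j) * grad v_j
        grad[i] = [
            sum(p * grad[j][k] for p, j in zip(partials, preds))
            for k in range(n)
        ]
    return v, grad


v, grad = forward_gradient((0.5, 2.0), tape, n)
print(grad[3])  # gradient of f with respect to (x_{-1}, x_0)
```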
Gradient
Forward Gradient
Summary of Forward Gradient
For each node $i$ one stores $\nabla v_i = \left( \frac{\partial v_i}{\partial x_{1-n}}, \dots, \frac{\partial v_i}{\partial x_0} \right)$.
Memory complexity: $O(n\ell)$.
For each node visit, perform $n$-dimensional vector arithmetic.
Time complexity: $O(n\ell)$.
Storing and calculating all the $\nabla v_i$ is expensive and unnecessary.
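Gradient
Partial derivatives on computational graph
$f(x) = \phi_3(\phi_1(x_{-1}), \phi_2(x_{-1}, x_0))$
[Figure: computational graph with nodes −1, 0, 1, 2, 3 and inputs $x_{-1}$, $x_0$; edges weighted by $\frac{\partial \phi_1}{\partial x_{-1}}$, $\frac{\partial \phi_2}{\partial x_0}$, $\frac{\partial \phi_2}{\partial x_{-1}}$, $\frac{\partial \phi_3}{\partial v_2}$, $\frac{\partial \phi_3}{\partial v_1}$]
$$\frac{\partial f}{\partial x_0} = \frac{\partial \phi_3}{\partial v_2} \frac{\partial \phi_2}{\partial x_0} + 0$$
$$\frac{\partial f}{\partial x_{-1}} = \frac{\partial \phi_3}{\partial v_2} \frac{\partial \phi_2}{\partial v_{-1}} + \frac{\partial \phi_3}{\partial v_1} \frac{\partial \phi_1}{\partial v_{-1}}$$
$$\frac{\partial f}{\partial x_i} = \sum_{p \,\mid\, \text{path from } i \text{ to } \ell} (\text{weight of path } p)$$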
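The path-sum formula can be checked on the small example graph. In this sketch the edge weights come from assumed elemental functions ($\phi_1 = \sin$, $\phi_2$ = product, $\phi_3$ = sum) evaluated at an arbitrary point; the names `weight` and `path_weight_sum` are hypothetical.

```python
import math

# Edge weights d phi_i / d v_j stored as weight[(j, i)], evaluated at a point.
# The elemental functions (phi_1 = sin, phi_2 = product, phi_3 = sum) are assumptions.
x_m1, x_0 = 0.5, 2.0
weight = {
    (-1, 1): math.cos(x_m1),  # d phi_1 / d x_{-1}
    (-1, 2): x_0,             # d phi_2 / d x_{-1}
    (0, 2): x_m1,             # d phi_2 / d x_0
    (1, 3): 1.0,              # d phi_3 / d v_1
    (2, 3): 1.0,              # d phi_3 / d v_2
}


def path_weight_sum(start, end, weight):
    """Sum, over all paths from start to end, of the product of edge weights."""
    if start == end:
        return 1.0
    return sum(
        w * path_weight_sum(i, end, weight)
        for (j, i), w in weight.items()
        if j == start
    )


df_dx_m1 = path_weight_sum(-1, 3, weight)
df_dx_0 = path_weight_sum(0, 3, weight)
print(df_dx_m1, df_dx_0)
```

Enumerating paths explicitly is only practical for tiny graphs, since the number of paths can grow exponentially with the number of nodes.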
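Gradient
Partial derivatives on computational graph
Reverse Gradient - Accumulating paths
$f(x) = \phi_3(\phi_1(x_{-1}), \phi_2(x_{-1}, x_0))$
[Figure: computational graph with nodes −1, 0, 1, 2, 3; edges weighted by $\frac{\partial \phi_1}{\partial x_{-1}}$, $\frac{\partial \phi_2}{\partial x_0}$, $\frac{\partial \phi_2}{\partial x_{-1}}$, $\frac{\partial \phi_3}{\partial v_2}$, $\frac{\partial \phi_3}{\partial v_1}$]
Sweeping the nodes in reverse order (3, 2, 1), each node accumulates an adjoint $\bar{v}_i$:
Start: $\bar{v}_3 = 1$.
Node 3: $\bar{v}_1 = \frac{\partial \phi_3}{\partial v_1}$, $\bar{v}_2 = \frac{\partial \phi_3}{\partial v_2}$.
Node 2: $\bar{v}_{-1} = \frac{\partial \phi_2}{\partial x_{-1}} \bar{v}_2$, $\bar{v}_0 = \frac{\partial \phi_2}{\partial x_0} \bar{v}_2$.
Node 1: $\bar{v}_{-1} = \frac{\partial \phi_1}{\partial x_{-1}} \bar{v}_1 + \frac{\partial \phi_2}{\partial x_{-1}} \bar{v}_2$.
Result: $\frac{\partial f}{\partial x_{-1}} = \bar{v}_{-1}$, $\frac{\partial f}{\partial x_0} = \bar{v}_0$.
$$\bar{v}_i = \sum_{\text{path from } i \text{ to } \ell} (\text{path weight})$$
$$\bar{v}_j = \sum_{i \in S(j)} \frac{\partial \phi_i}{\partial v_j} \bar{v}_i$$
$\mathrm{TIME}(\nabla f(x)) = O(\mathrm{TIME}(f(x)))$, that is, equal up to a constant factor.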
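A minimal sketch of this reverse sweep, under the same illustrative assumptions as before (elemental functions sin, product and sum; hypothetical names `tape` and `reverse_gradient`). One forward sweep stores the node values, and one reverse sweep accumulates a single scalar adjoint per node instead of a gradient vector per node.

```python
import math

n = 2
# Each entry: (node index, predecessors, phi, partial derivatives of phi).
# The elemental functions (sin, product, sum) are illustrative assumptions.
tape = [
    (1, (-1,), math.sin, lambda a: (math.cos(a),)),
    (2, (-1, 0), lambda a, b: a * b, lambda a, b: (b, a)),
    (3, (2, 1), lambda a, b: a + b, lambda a, b: (1.0, 1.0)),
]


def reverse_gradient(x, tape, n):
    # Forward sweep: evaluate and keep all node values.
    v = {i - n: x[i - 1] for i in range(1, n + 1)}
    for i, preds, phi, _ in tape:
        v[i] = phi(*(v[j] for j in preds))
    # Reverse sweep: node i adds (d phi_i / d v_j) * vbar_i to each predecessor j.
    vbar = {i: 0.0 for i in v}
    vbar[tape[-1][0]] = 1.0
    for i, preds, _, dphi in reversed(tape):
        partials = dphi(*(v[j] for j in preds))
        for j, p in zip(preds, partials):
            vbar[j] += p * vbar[i]
    return [vbar[i - n] for i in range(1, n + 1)]


gradient = reverse_gradient((0.5, 2.0), tape, n)
print(gradient)  # (df/dx_{-1}, df/dx_0)
```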
Hessian
Forward Hessian
Forward Hessian: McCormick and Jackson 1986
vi = φi (vP(i))
∇vi =∑
j∈P(i)
∂φi
∂vj∇vj
v ′′i =∑
j ,k∈P(i)
∇vj ·∂2φi
∂vj∂vk· ∇vT
k +∑
j∈P(i)
∂φi
∂vj· v ′′j
![Page 51: Calculating Hessian matrices - Télécom ParisTech · Calculating Hessian matrices Calculating Hessian matrices Student Robert Mansel Gower gowerrobert@gmail.com Advisor Margarida](https://reader036.vdocuments.net/reader036/viewer/2022081507/5eda4af6b3745412b5711909/html5/thumbnails/51.jpg)
Calculating Hessian matrices
Hessian
Forward Hessian
Forward Hessian summary
For each node, calculate and store an n × n matrix.
Is it necessary to calculate the gradient and Hessian of each node?
Gain a deeper understanding of the problem using the gradient graph.
Calculating the Hessian using the computational graph
Function's computational graph + adjoint nodes $\bar v_i$ and their dependencies = gradient's computational graph.
Interpret a partial derivative on the augmented graph as a second-order derivative.
Eliminate unnecessary symmetries on the augmented graph.
Hessian on computational graph
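The adjoint variables of the Reverse Gradient satisfy
$$\bar v_j = \sum_{i \in S(j)} \bar v_i\,\frac{\partial\phi_i}{\partial v_j} \equiv \bar\varphi_j.$$
The gradient's graph has $2(\ell + n)$ nodes: $(v_{1-n}, \dots, v_\ell)$ and $(\bar v_{1-n}, \dots, \bar v_\ell)$.
Node $\bar\jmath \longleftrightarrow \bar v_j$.
$\bar\imath \in P(\bar\jmath)$ iff $j \in P(i)$.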
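[Figure: computational graph of f with nodes −1, 0, 1, 2, 3, shown next to its mirror copy of adjoint nodes.]
$$f(x) = \sin(x_{-1})(x_{-1} + x_0)$$
Function evaluation (left) and adjoint sweep (right):
$$\begin{array}{ll}
v_{-1} = x_{-1} & \bar v_3 = 1 \\
v_0 = x_0 & \bar v_2 = \bar v_3 v_1 \\
v_1 = \sin(v_{-1}) & \bar v_1 = \bar v_3 v_2 \\
v_2 = (v_{-1} + v_0) & \bar v_0 = \bar v_2 \cdot 1 \\
v_3 = v_1 v_2 & \bar v_{-1} = \bar v_2 \cdot 1 + \bar v_1 \cos(v_{-1})
\end{array}$$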
Second-order derivatives are sums of path weights on the gradient's graph, shown here for $i = j = -1$ (paths from node $-1$ to the adjoint node $\overline{-1}$, which carries $\bar v_{-1}$):
$$\frac{\partial^2 f}{\partial x_i\,\partial x_j} = \sum_{p \text{ from } i \text{ to } \bar\jmath} \mathrm{Weight}(p)$$
The edge from $-1$ to $1$ and its mirror edge from $\bar 1$ to $\overline{-1}$ carry the same weight $\cos(v_{-1})$:
$$\frac{\partial\phi_1}{\partial v_{-1}} = \cos(v_{-1}), \qquad \frac{\partial\bar\varphi_{-1}}{\partial\bar v_1} = \cos(v_{-1})$$
$$\frac{\partial\bar\varphi_j}{\partial\bar v_k} = c_{kj} = \frac{\partial\phi_k}{\partial v_j}$$
The edges from $2$ to $\bar 1$ and from $1$ to $\bar 2$ both carry the weight $\bar v_3$:
$$\frac{\partial\bar\varphi_1}{\partial v_2} = \bar v_3, \qquad \frac{\partial\bar\varphi_2}{\partial v_1} = \bar v_3$$
$$\frac{\partial\bar\varphi_j}{\partial v_k} = c_{kj} = \frac{\partial\bar\varphi_k}{\partial v_j}$$
$$c_{kj} = \sum_{i \in S(k) \cap S(j)} \bar v_i\,\frac{\partial^2\phi_i}{\partial v_j\,\partial v_k}$$
$$f(x) = \sin(x_{-1})(x_{-1} + x_0) = \phi_3(\phi_1(x_{-1}), \phi_2(x_{-1}, x_0))$$
Summing the weights of the three paths from node $-1$ to node $\overline{-1}$, one path at a time:
$$\frac{\partial^2 f}{\partial x_{-1}^2} = c_{1,-1}\,c_{2,1}\,c_{2,-1} + c_{2,-1}\,c_{2,1}\,c_{1,-1} + c_{-1,-1}$$
[Figure: folded graph with nodes −1, 0, 1, 2, 3.]
Fold mirror subgraph.
More symmetry:
$k \dashrightarrow j$ iff $k \dashleftarrow j$
$c_{kj} = c_{jk}$
$$\frac{\partial^2 f}{\partial x_{-1}^2} = c_{1,-1}\,c_{2,1}\,c_{2,-1} + c_{2,-1}\,c_{2,1}\,c_{1,-1} + c_{-1,-1} = 2\,c_{1,-1}\,c_{2,1}\,c_{2,-1} + c_{-1,-1}$$
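The folded path sum can be checked numerically. The sketch below assumes the edge weights read off the example above (c for linear edges are first partials, c for nonlinear edges are adjoint-weighted second partials); the function name `d2f_dxm1_squared` is hypothetical.

```python
import math

def d2f_dxm1_squared(x_m1, x_0):
    """Second derivative of f = sin(x_{-1}) * (x_{-1} + x_0) w.r.t. x_{-1},
    computed as 2 * c_{1,-1} * c_{2,1} * c_{2,-1} + c_{-1,-1}."""
    v2 = x_m1 + x_0
    vb3 = 1.0
    vb1 = vb3 * v2                       # adjoint of node 1
    c_1_m1 = math.cos(x_m1)              # dphi_1/dv_{-1}
    c_2_m1 = 1.0                         # dphi_2/dv_{-1}
    c_2_1 = vb3 * 1.0                    # nonlinear edge {1, 2}: vbar_3 * d2phi_3/dv_1 dv_2
    c_m1_m1 = vb1 * (-math.sin(x_m1))    # nonlinear loop at -1: vbar_1 * d2phi_1/dv_{-1}^2
    return 2.0 * c_1_m1 * c_2_1 * c_2_m1 + c_m1_m1

result = d2f_dxm1_squared(0.5, 1.5)
```

The result agrees with the closed form 2 cos(x−1) − (x−1 + x0) sin(x−1) obtained by differentiating f twice by hand.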
[Figure: a nonlinear edge {m, k}, reached by a path from i to m and by a path from j to k.]
$$\frac{\partial^2 f}{\partial x_i\,\partial x_j} = \sum_{\text{nonlinear edge } \{m,k\}} \Bigg(\sum_{\{p \,\mid\, \text{from } i \text{ to } m\}} \text{weight of } p\Bigg)\, c_{mk}\, \Bigg(\sum_{\{p \,\mid\, \text{from } j \text{ to } k\}} \text{weight of } p\Bigg).$$
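As a quick check of this formula, consider the mixed derivative of the earlier example $f(x) = \sin(x_{-1})(x_{-1} + x_0)$. The worked instance assumes that each nonlinear edge $\{m, k\}$ with $m \neq k$ is counted in both orientations, which is how the two equal terms in $\partial^2 f / \partial x_{-1}^2$ arise on the preceding slides. The nonlinear edges are $\{1, 2\}$ with $c_{12} = c_{21} = \bar v_3 = 1$ and the loop $\{-1, -1\}$ with $c_{-1,-1} = -\bar v_1 \sin(v_{-1})$.

$$\frac{\partial^2 f}{\partial x_{-1}\,\partial x_0}
= \underbrace{\cos(v_{-1})}_{-1 \to 1}\, c_{12}\, \underbrace{1}_{0 \to 2}
+ \underbrace{1}_{-1 \to 2}\, c_{21}\, \underbrace{0}_{\text{no path } 0 \to 1}
+ \underbrace{1}_{\text{trivial path at } -1}\, c_{-1,-1}\, \underbrace{0}_{\text{no path } 0 \to -1}
= \cos(x_{-1})$$

This agrees with differentiating $\partial f / \partial x_0 = \sin(x_{-1})$ with respect to $x_{-1}$ directly.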
New Reverse Hessian algorithm
Building shortcuts
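$P(m) = \{i, j\}$.
$(m, k) \in \text{path} \;\Rightarrow\; \text{path} \ni (i, m, k)$ or $\text{path} \ni (j, m, k)$.
[Figure: Pushing the edge {m, k}. Nodes i and j are the predecessors of m; the edge {m, k} with weight $c_{mk}$ gives rise to the shortcut edges {i, k} with weight $c_{im}c_{mk}$ and {j, k} with weight $c_{jm}c_{mk}$.]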
Simple example of edge pushing execution
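[Figure: computational graph with input nodes −2, −1, 0 and intermediate nodes 1 to 6; successive frames highlight the node being swept, from 6 down to 1.]
$$f(x) = (x_{-2} + 1)(x_{-1} + 1)\,3(x_0 + 1)$$
$$\begin{array}{ll}
v_1 = v_{-2} + 1 & \bar v_6 = 1 \\
v_2 = v_{-1} + 1 & \bar v_5 = v_4 \\
v_3 = v_0 + 1 & \bar v_4 = v_5 \\
v_4 = v_1 v_2 & \\
v_5 = 3 v_3 & \\
v_6 = v_4 v_5 &
\end{array}$$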
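Resulting sparsity pattern of the Hessian (X marks a nonzero entry):
$$f'' = \begin{pmatrix} 0 & X & X \\ X & 0 & X \\ X & X & 0 \end{pmatrix}$$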
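Pushing of nonlinear edges
[Figure: sweeping node i, whose predecessors are j and k. Three cases are shown: a loop edge {i, i} with weight $w_{ii}$, an edge {i, k} with weight $w_{ik}$, and an edge {i, p} with weight $w_{ip}$ to another node p. In each case the edge is replaced by edges among j, k and p whose weights are the original weight multiplied by the partial derivatives $c_{ji}$ and $c_{ki}$; the edge {k, k} created from {i, k} carries the factor 2.]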
The pseudo-code of edge pushing
Input: $x \in \mathbb{R}^n$
for $i = \ell, \dots, 1$ do
    Create nonlinear edges if $\phi_i$ is nonlinear;
    Push nonlinear edges adjacent to $i$;
end
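The loop above can be sketched in pure Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: edge weights are stored in a dictionary `W` in both orientations, the tape format and the helpers `add_const`, `scale` and `mul` are hypothetical, and the reverse gradient sweep is interleaved so that the adjoints are available when edges are created. The tape encodes the slide example f(x) = (x−2 + 1)(x−1 + 1) 3(x0 + 1).

```python
from collections import defaultdict

def add_const(p, k):
    """Node v = v_p + k (linear: no second derivatives)."""
    return {"f": lambda v: v[p] + k, "d1": lambda v: {p: 1.0}, "d2": lambda v: {}}

def scale(p, k):
    """Node v = k * v_p (linear)."""
    return {"f": lambda v: k * v[p], "d1": lambda v: {p: k}, "d2": lambda v: {}}

def mul(a, b):
    """Node v = v_a * v_b with a != b (nonlinear: creates the edge {a, b})."""
    return {"f": lambda v: v[a] * v[b],
            "d1": lambda v: {a: v[b], b: v[a]},
            "d2": lambda v: {(a, b): 1.0, (b, a): 1.0}}

def edge_pushing(tape, x):
    """Return the dense Hessian w.r.t. the inputs x (dict: node index -> value)."""
    v = dict(x)
    for i in sorted(tape):                        # forward sweep
        v[i] = tape[i]["f"](v)
    vbar = defaultdict(float)
    vbar[max(tape)] = 1.0
    W = defaultdict(float)                        # W[a, b] == W[b, a]: nonlinear edges
    for i in sorted(tape, reverse=True):
        c = tape[i]["d1"](v)                      # first partials w.r.t. predecessors
        # Create nonlinear edges contributed by phi_i.
        for (j, k), h in tape[i]["d2"](v).items():
            W[j, k] += vbar[i] * h
        # Push every nonlinear edge adjacent to i onto the predecessors of i.
        row = {p: w for (a, p), w in W.items() if a == i}
        for key in [key for key in W if i in key]:
            del W[key]
        for p, w in row.items():
            if p == i:                            # loop edge {i, i}
                for j in c:
                    for k in c:
                        W[j, k] += c[j] * c[k] * w
            else:                                 # edge {i, p}; j == p adds twice
                for j in c:
                    W[j, p] += c[j] * w
                    W[p, j] += c[j] * w
        # Reverse gradient update of the adjoints.
        for j in c:
            vbar[j] += vbar[i] * c[j]
    inputs = sorted(x)
    return [[W[a, b] for b in inputs] for a in inputs]

tape = {1: add_const(-2, 1.0), 2: add_const(-1, 1.0), 3: add_const(0, 1.0),
        4: mul(1, 2), 5: scale(3, 3.0), 6: mul(4, 5)}
hessian = edge_pushing(tape, {-2: 1.0, -1: 2.0, 0: 3.0})
```

For this example the diagonal of `hessian` is zero and every off-diagonal entry is nonzero, which matches the sparsity pattern of f'' shown for the example. Only edges that actually exist are stored, which is why the approach suits large sparse Hessians.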
Comparative tests
Competitor for edge pushing: graph coloring
The edge pushing implementation is aimed at large sparse Hessians.
State-of-the-art competitor: graph coloring methods (Gebremedhin, Manne, Pothen, Walther, Tarafdar):
Efficient Computation of Sparse Hessians Using Coloring and Automatic Differentiation (2009)
What Color Is Your Jacobian? Graph Coloring for Computing Derivatives (2005)
Steps of the graph coloring approach:
Calculated once: Sparsity Pattern [1], Color, Compact.
Repeated for each $x \in \mathbb{R}^n$: Derive Compact, Recover.
[Figure: $f''(x) \Rightarrow f''(x)S$]
[1] Uses Walther's 2008 algorithm.
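The compress-and-recover idea behind these steps can be illustrated on a small tridiagonal matrix. The sketch below makes several assumptions: the Hessian is given explicitly (in practice the compact matrix f''(x)S would come from automatic differentiation), the coloring `j % 3` is chosen by hand so that the columns meeting any one row all receive different colors, and the helper names `compress` and `recover` are hypothetical.

```python
def compress(H, color, ncolors):
    """Return B = H S, where S[j][c] = 1 iff column j has color c."""
    B = [[0.0] * ncolors for _ in H]
    for i, row in enumerate(H):
        for j, h in enumerate(row):
            B[i][color[j]] += h
    return B

def recover(B, pattern, color):
    """Read each structurally nonzero entry (i, j) directly from the compact matrix."""
    n = len(B)
    H = [[0.0] * n for _ in range(n)]
    for i, j in pattern:
        H[i][j] = B[i][color[j]]
    return H

# A 6 x 6 symmetric tridiagonal test matrix.
n = 6
H = [[0.0] * n for _ in range(n)]
for i in range(n):
    H[i][i] = 2.0 + i
    if i + 1 < n:
        H[i][i + 1] = H[i + 1][i] = -1.0 - i

pattern = [(i, j) for i in range(n) for j in range(n) if abs(i - j) <= 1]
color = [j % 3 for j in range(n)]      # calculated once from the sparsity pattern
B = compress(H, color, 3)              # n x 3 compact matrix instead of n x n
H_recovered = recover(B, pattern, color)
```

The pattern and the coloring depend only on the structure of f'', so they are computed once; only the compact matrix and the recovery are redone for each new x.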
![Page 108: Calculating Hessian matrices - Télécom ParisTech · Calculating Hessian matrices Calculating Hessian matrices Student Robert Mansel Gower gowerrobert@gmail.com Advisor Margarida](https://reader036.vdocuments.net/reader036/viewer/2022081507/5eda4af6b3745412b5711909/html5/thumbnails/108.jpg)
Calculating Hessian matrices
Hessian
Comparative tests
Calculated once: Sparsity Pattern¹, Color, Compact.
Repeated for each x ∈ Rⁿ: Derive Compact, Recover.
f″(x) ⇒ f″(x)S
¹ Uses Walther's 2008 algorithm.
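The compaction f″(x) ⇒ f″(x)S and the recovery step can be illustrated with a small sketch. It assumes a tridiagonal Hessian and a hand-picked cyclic 3-coloring in which columns that share a row never share a color, so every nonzero can be read directly from the compact matrix. The names `compress_and_recover`, `hess_times`, `pattern`, and `colors` are hypothetical, and this is not the Star or Acyclic procedure used in the tests; `hess_times` stands in for whatever routine returns Hessian-matrix products.

```python
import numpy as np

def compress_and_recover(hess_times, pattern, colors):
    """Build the seed matrix S from a coloring, form the compact Hessian
    B = H S, and recover H entry by entry (direct recovery)."""
    colors = np.asarray(colors)
    n = colors.size
    # Seed matrix: column c of S is the indicator of the columns with color c.
    S = np.zeros((n, colors.max() + 1))
    S[np.arange(n), colors] = 1.0
    B = hess_times(S)  # compact Hessian, shape (n, number of colors)
    # Direct recovery: valid only because, within each row, the nonzero
    # columns all have different colors (an assumption on the coloring).
    H = np.zeros((n, n))
    rows, cols = np.nonzero(pattern)
    H[rows, cols] = B[rows, colors[cols]]
    return H, B

# Toy tridiagonal Hessian standing in for f''(x).
n = 6
off = -1.0 - 0.1 * np.arange(n - 1)
H_true = np.diag(2.0 + np.arange(n)) + np.diag(off, 1) + np.diag(off, -1)
pattern = H_true != 0
colors = np.arange(n) % 3  # cyclic coloring with 3 colors

H_rec, B = compress_and_recover(lambda S: H_true @ S, pattern, colors)
print(B.shape, np.allclose(H_rec, H_true))
```

Only three Hessian-vector products are needed here instead of n, which is the saving the coloring buys on every evaluation after the pattern and coloring have been computed once.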
Invests a large initial time in 1st run ⇒ fast subsequent runs.
Two different coloring methods with different recoveries: Star and Acyclic.
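Test set chosen from CUTE, n = 50 000.

| Name | Pattern | # colors (Star) | # colors (Acyclic) |
|---|---|---|---|
| cosine | B 1 | 3 | 2 |
| chainwoo | B 2 | 3 | 3 |
| bc4 | B 1 | 3 | 2 |
| cragglevy | B 1 | 3 | 2 |
| pspdoc | B 2 | 5 | 3 |
| scon1dls | B 2 | 5 | 3 |
| morebv | B 2 | 5 | 3 |
| augmlagn | 5 × 5 diagonal blocks | 5 | 5 |
| lminsurf | B 5 | 11 | 6 |
| brybnd | B 5 | 13 | 7 |
| arwhead | arrow | 2 | 2 |
| nondquar | arrow + B 1 | 4 | 3 |
| sinquad | frame + diagonal | 3 | 3 |
| bdqrtic | arrow + B 3 | 8 | 5 |
| noncvxu2 | irregular | 12 | 7 |
| ncvxbqp1 | irregular | 12 | 7 |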
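Numeric results: edge pushing versus colouring methods (times in seconds)

| Name | Star 1st | Star 2nd | Acyclic 1st | Acyclic 2nd | e p |
|---|---|---|---|---|---|
| cosine | 9.93 | 0.16 | 9.68 | 2.52 | 0.15 |
| chainwoo | 35.07 | 0.33 | 33.24 | 5.08 | 0.30 |
| bc4 | 10.02 | 0.25 | 10.00 | 2.56 | 0.25 |
| cragglevy | 28.17 | 0.79 | 28.15 | 2.60 | 0.48 |
| pspdoc | 10.31 | 0.35 | 10.27 | 4.39 | 0.23 |
| scon1dls | 11.00 | 0.59 | 10.97 | 4.96 | 0.40 |
| morebv | 10.36 | 0.46 | 10.33 | 4.49 | 0.35 |
| augmlagn | 15.99 | 0.68 | 8.36 | 16.74 | 0.27 |
| lminsurf | 9.30 | 1.01 | 9.24 | 3.89 | 0.35 |
| brybnd | 11.87 | 2.44 | 11.73 | 12.63 | 1.68 |
| arwhead | 176.50 | 0.16 | 45.86 | 0.24 | 0.20 |
| nondquar | 166.59 | 0.18 | 28.64 | 2.57 | 0.12 |
| sinquad | 606.72 | 0.26 | 888.57 | 1.51 | 0.32 |
| bdqrtic | 262.64 | 1.34 | 96.87 | 7.80 | 0.80 |
| noncvxu2 | 29.69 | 1.10 | 29.27 | 7.76 | 0.28 |
| ncvxbqp1 | 13.51 | 2.42 | – | – | 0.37 |
| Averages | 87.98 | 0.78 | 82.08 | 5.32 | 0.41 |
| Variances | 25 083.44 | 0.54 | 50 313.10 | 19.32 | 0.14 |

1st and 2nd denote the first and subsequent runs; e p denotes edge pushing.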
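The summary rows for the two fastest columns can be re-derived from the per-function timings. The sketch below copies the Star 2nd-run and edge pushing values from the results and assumes, as a guess about how the slide was produced, that "Variances" means the sample variance (denominator n − 1). The variable names are illustrative only.

```python
from statistics import mean, variance

# Timings in seconds, in the order the test functions are listed
# (cosine, chainwoo, ..., ncvxbqp1).
star_2nd = [0.16, 0.33, 0.25, 0.79, 0.35, 0.59, 0.46, 0.68,
            1.01, 2.44, 0.16, 0.18, 0.26, 1.34, 1.10, 2.42]
edge_pushing = [0.15, 0.30, 0.25, 0.48, 0.23, 0.40, 0.35, 0.27,
                0.35, 1.68, 0.20, 0.12, 0.32, 0.80, 0.28, 0.37]

# Summary statistics; variance() uses the n - 1 denominator.
summary = {
    "Star 2nd": (round(mean(star_2nd), 2), round(variance(star_2nd), 2)),
    "edge pushing": (round(mean(edge_pushing), 2), round(variance(edge_pushing), 2)),
}

# Per-function comparison of the two methods.
faster = sum(e < s for e, s in zip(edge_pushing, star_2nd))
ties = sum(e == s for e, s in zip(edge_pushing, star_2nd))
slower = sum(e > s for e, s in zip(edge_pushing, star_2nd))

for name, (avg, var) in summary.items():
    print(f"{name}: average {avg}, variance {var}")
print(f"edge pushing faster on {faster}, tied on {ties}, slower on {slower}")
```

Under the sample-variance assumption the computed values agree with the listed averages and variances for these two columns to two decimals.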
Graphical comparison: Star 2nd run versus edge pushing.
[Bar chart: time (in seconds) for each test function, cosine through ncvxbqp1, with one series for edge pushing and one for Star. The plotted values are the Star 2nd-run and edge pushing (e p) timings from the numeric results.]
Summing up
Graph representation:
  New algorithm.
  New perspective.
  Propagate from known contributions (nonlinear edges).
Algebraic representation:
  New correctness.
  New algorithms.
edge pushing:
  Exploits the symmetry and sparsity.
  Promising test results.
  Lives up to Griewank's 16th rule:
  "The calculation of gradients by nonincremental reverse makes the corresponding computational graph symmetric, a property that should be exploited and maintained in accumulating Hessians."
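The idea of propagating from known contributions can be sketched as a reverse sweep that creates second-order weights at nonlinear operations and pushes them back to the inputs. What follows is a simplified illustration written for this transcript, not the implementation that was benchmarked: the tape format, the three supported operations, and the names `local_derivatives` and `reverse_hessian` are assumptions, and for readability it stores both (j, k) and (k, j) instead of exploiting symmetry.

```python
import math
from collections import defaultdict

OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b, "sin": math.sin}

def local_derivatives(op, args, v):
    """First (c) and second (d) partial derivatives of one elementary operation."""
    c, d = defaultdict(float), defaultdict(float)
    if op == "add":
        for a in args:
            c[a] += 1.0
    elif op == "mul":
        a, b = args
        c[a] += v[b]
        c[b] += v[a]
        d[(a, b)] += 1.0
        d[(b, a)] += 1.0
    elif op == "sin":
        (a,) = args
        c[a] += math.cos(v[a])
        d[(a, a)] += -math.sin(v[a])
    return c, d

def reverse_hessian(tape, x):
    """tape: list of (op, argument node indices); nodes 0..n-1 are the inputs."""
    n = len(x)
    v = list(x)
    for op, args in tape:                      # forward sweep
        v.append(OPS[op](*(v[j] for j in args)))
    vbar = [0.0] * len(v)
    vbar[-1] = 1.0                             # adjoint of the output node
    W = defaultdict(dict)                      # sparse symmetric weights W[a][b]

    def add(a, b, val):
        W[a][b] = W[a].get(b, 0.0) + val

    for i in range(len(v) - 1, n - 1, -1):     # reverse sweep
        c, d = local_derivatives(*tape[i - n], v)
        row = W.pop(i, {})
        w_ii = row.pop(i, 0.0)
        for p, w in row.items():               # push edges {p, i} to predecessors of i
            W[p].pop(i, None)
            for j, cj in c.items():
                add(j, p, cj * w)
                add(p, j, cj * w)
        for j, cj in c.items():                # push the loop {i, i}
            for k, ck in c.items():
                add(j, k, cj * ck * w_ii)
        for (j, k), djk in d.items():          # create edges from the nonlinearity of node i
            add(j, k, vbar[i] * djk)
        for j, cj in c.items():                # usual reverse-mode adjoint update
            vbar[j] += vbar[i] * cj
    hessian = [[W[a].get(b, 0.0) for b in range(n)] for a in range(n)]
    return vbar[:n], hessian

# f(x0, x1) = x0 * sin(x1) + sin(x0 * x1)
tape = [("sin", (1,)), ("mul", (0, 2)), ("mul", (0, 1)), ("sin", (4,)), ("add", (3, 5))]
x0, x1 = 0.7, -1.3
grad, hess = reverse_hessian(tape, [x0, x1])
print(grad, hess)
```

Linear operations such as `add` create no new weights, so work is only done where the graph has nonlinear edges or weights already pushed from later nodes.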
THANK YOU
QUESTIONS?