Pohang University of Science and Technology (POSTECH), Department of Industrial and Management Engineering
Large Sparse Linear System
JeYong Lee
Statistics and Data Science Lab.
August 5, 2020
Contents
1. Before we study the iterative method
2. Krylov subspace method
3. Arnoldi process
4. Lanczos process
5. GMRES
6. Next seminar
Before we study iterative method
• Gram-Schmidt process
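Given linearly independent vectors $v_1, v_2, v_3, \dots$, we can construct orthonormal vectors $e_1, e_2, e_3, \dots$ such that $\mathrm{span}\{e_1, \dots, e_k\} = \mathrm{span}\{v_1, \dots, v_k\}$ for every $k$.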
Method for orthonormalizing a set of vectors
• Modified Gram-Schmidt process
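[1] The Gram-Schmidt process, Wikipedia
Instead of computing the vector $u_k$ in a single step as
$$u_k = v_k - \mathrm{proj}_{u_1}(v_k) - \mathrm{proj}_{u_2}(v_k) - \cdots - \mathrm{proj}_{u_{k-1}}(v_k), \qquad \mathrm{proj}_u(v) = \frac{\langle v, u \rangle}{\langle u, u \rangle}\, u,$$
it is computed as a sequence of updates, which reduces the effect of rounding errors:
$$u_k^{(1)} = v_k - \mathrm{proj}_{u_1}(v_k), \quad u_k^{(2)} = u_k^{(1)} - \mathrm{proj}_{u_2}\big(u_k^{(1)}\big), \quad \dots, \quad u_k^{(k-1)} = u_k^{(k-2)} - \mathrm{proj}_{u_{k-1}}\big(u_k^{(k-2)}\big), \quad e_k = \frac{u_k^{(k-1)}}{\|u_k^{(k-1)}\|}$$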
Algorithm: Modified Gram-Schmidt
1. $q_1 = a_1 / \|a_1\|_2$
2. For $i = 1$ to $n$
3.     $v_i = a_i$
4.     For $j = 1$ to $i - 1$   (orthogonalization)
5.         $r_{ji} = \langle v_i, q_j \rangle$
6.         $v_i = v_i - r_{ji} q_j$
7.     $r_{ii} = \|v_i\|_2$   (normalizing)
8.     $q_i = v_i / r_{ii}$
[1] Trefethen and Bau. (1997). Numerical Linear Algebra
CGS : 9.1852e-12 MGS : 8.3750e-14
CGS : 2.9912 MGS : 2.1554e-11
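A minimal NumPy sketch of the two variants may make the difference concrete. The function names `classical_gram_schmidt` and `modified_gram_schmidt` and the 8 x 8 Hilbert test matrix are assumptions chosen for illustration; this is not the experiment behind the numbers quoted above. The only difference between the two functions is which vector the projection coefficient is computed from.

```python
import numpy as np

def classical_gram_schmidt(A):
    """CGS: coefficients come from the ORIGINAL column a_i."""
    m, n = A.shape
    Q, R = np.zeros((m, n)), np.zeros((n, n))
    for i in range(n):
        v = A[:, i].copy()
        for j in range(i):
            R[j, i] = Q[:, j] @ A[:, i]   # original column
            v -= R[j, i] * Q[:, j]
        R[i, i] = np.linalg.norm(v)
        Q[:, i] = v / R[i, i]
    return Q, R

def modified_gram_schmidt(A):
    """MGS: coefficients come from the partially orthogonalized vector v."""
    m, n = A.shape
    Q, R = np.zeros((m, n)), np.zeros((n, n))
    for i in range(n):
        v = A[:, i].copy()
        for j in range(i):
            R[j, i] = Q[:, j] @ v         # current working vector
            v -= R[j, i] * Q[:, j]
        R[i, i] = np.linalg.norm(v)
        Q[:, i] = v / R[i, i]
    return Q, R

# Ill-conditioned test matrix (Hilbert matrix), an illustrative assumption.
n = 8
idx = np.arange(1, n + 1)
A = 1.0 / (idx[:, None] + idx[None, :] - 1)

Q_cgs, R_cgs = classical_gram_schmidt(A)
Q_mgs, R_mgs = modified_gram_schmidt(A)

# Loss of orthogonality, measured as ||Q^T Q - I||.
err_cgs = np.linalg.norm(Q_cgs.T @ Q_cgs - np.eye(n))
err_mgs = np.linalg.norm(Q_mgs.T @ Q_mgs - np.eye(n))
print(f"CGS: {err_cgs:.4e}  MGS: {err_mgs:.4e}")
```

Both variants reproduce $A = QR$ to rounding accuracy; what differs is how close $Q^T Q$ stays to the identity.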
• CGS vs MGS in the sense of the numerical stability
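CGS: $v_j \leftarrow v_j - (q_k^T a_j)\, q_k$   vs.   MGS: $v_j \leftarrow v_j - (q_k^T v_j)\, q_k$
Here $a_j$ is the original column, $v_j$ is the partially orthogonalized working vector, and $q_k$ ($k < j$) are the orthonormal vectors already computed.
Now we consider the orthogonalization process of the two methods.
Suppose an error is made in computing $q_2$ in CGS, so that $q_1^T q_2 = \delta$ is small but nonzero. This error is not corrected in later steps, and it accumulates.
[Figure: CGS vs. MGS, orthogonality with respect to $q_1$ and $q_2$]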
• QR factorization
[1] Trefethen and Bau. (1997). Numerical Linear Algebra
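Assume that $A \in \mathbb{C}^{m \times n}$ ($m \ge n$) has full column rank $n$.
For many applications, we are interested in the column spaces of the matrix $A$.
We want the sequence $q_1, q_2, q_3, \dots$ to have the property
$$\langle a_1, a_2, \dots, a_j \rangle = \langle q_1, q_2, \dots, q_j \rangle, \qquad j = 1, \dots, n$$
That is, $a_1, a_2, \dots, a_n$ can be expressed as linear combinations of $q_1, q_2, \dots, q_n$ [1]:
$$\begin{bmatrix} \vdots & \vdots & & \vdots \\ a_1 & a_2 & \cdots & a_n \\ \vdots & \vdots & & \vdots \end{bmatrix} = \begin{bmatrix} \vdots & \vdots & & \vdots \\ q_1 & q_2 & \cdots & q_n \\ \vdots & \vdots & & \vdots \end{bmatrix} \begin{bmatrix} r_{11} & \cdots & r_{1n} \\ & \ddots & \vdots \\ 0 & & r_{nn} \end{bmatrix}$$
where the diagonal entries $r_{kk}$ are nonzero.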
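As a quick check of the span property, the following sketch factors a random matrix with `numpy.linalg.qr` (reduced mode, the default) and verifies that each column $a_j$ is a combination of $q_1, \dots, q_j$ only. The matrix size and random seed are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))      # full column rank with probability 1
Q, R = np.linalg.qr(A)               # reduced QR: Q is 6x4, R is 4x4

orthonormal = np.allclose(Q.T @ Q, np.eye(4))
upper_triangular = np.allclose(R, np.triu(R))
reconstructs = np.allclose(Q @ R, A)

# a_j uses only q_1..q_j because R is upper triangular.
nested_spans = all(
    np.allclose(Q[:, :j] @ R[:j, j - 1], A[:, j - 1]) for j in range(1, 5)
)
print(orthonormal, upper_triangular, reconstructs, nested_spans)
```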
Direct and iterative methods [1]
[1] Trefethen and Bau. (1997). Numerical Linear Algebra
• Direct method
    • Solves the problem by a finite sequence of operations
    • In the absence of rounding errors, it would deliver an exact solution
    • Operates directly on the elements of the matrix
    • Requires $O(m^3)$ work for a general matrix $A \in \mathbb{C}^{m \times m}$
• Iterative method
    • Solves the problem by finding successive approximations to the solution, starting from an initial guess
    • Useful even for linear problems involving a large number of variables, where direct methods would be prohibitively expensive
    • Can exploit sparsity structure, so that the total work may drop to $O(m^2)$
• Exploiting Sparsity in the A∙x
[1] E. Chow and Y. Saad, (2014) preconditioned Krylov subspace methods for sampling multivariate Gaussian distribution. SIAM scientific computing
[Figure: an $n \times n$ matrix $A$ (dense or sparse) multiplied by an $n \times 1$ vector $x$]
Dense case:
    • Row $i$: $a_{i1} \cdot x_1 + \dots + a_{in} \cdot x_n$ costs $2n - 1$ flops ($n$ multiplications and $n - 1$ additions)
    • Total flops: $2n^2 - n$
Sparse case:
    • Row $i$ with $N(A)_i$ nonzero entries, e.g. $a_{11} \cdot x_1 + 0 \cdot x_2 + \dots + 0 \cdot x_n$, costs $2N(A)_i - 1$ flops
    • Total flops: $2N(A) - n$
$N(A)$ is the number of nonzero elements of the sparse matrix.
The notation $\nu$, the number of nonzero elements per row, is often used in practice. [1]
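The flop counts above can be checked with a small sketch of a matrix-vector product in compressed sparse row (CSR) storage. The helper `csr_matvec`, its flop counter, and the tridiagonal test matrix are illustrative assumptions, not taken from the cited paper; the count assumes every row has at least one nonzero.

```python
import numpy as np

def csr_matvec(data, indices, indptr, x):
    """Compute y = A x for A in CSR form and count the flops performed."""
    n = len(indptr) - 1
    y = np.zeros(n)
    flops = 0
    for i in range(n):
        start, end = indptr[i], indptr[i + 1]
        y[i] = data[start:end] @ x[indices[start:end]]
        nnz_row = end - start
        flops += max(2 * nnz_row - 1, 0)   # nnz_row multiplies, nnz_row - 1 adds
    return y, flops

# Tridiagonal test matrix: 2 on the diagonal, -1 on the off-diagonals.
n = 6
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

# Build the CSR arrays; np.nonzero returns indices in row-major order.
rows, cols = np.nonzero(A)
data = A[rows, cols]
indices = cols
indptr = np.concatenate(([0], np.cumsum(np.bincount(rows, minlength=n))))

x = np.arange(1.0, n + 1)
y, sparse_flops = csr_matvec(data, indices, indptr, x)
dense_flops = 2 * n**2 - n
print(sparse_flops, dense_flops)   # 2*N(A) - n versus 2*n^2 - n
```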
Krylov Subspace
• The definition of the Krylov subspace
[1] Krylov Subspace. Wikipedia
[2] Ilse C.F. Ipsen and Carl D. Meyer. (1997) The Idea Behind Krylov Method. The American Mathematical Monthly 105(10) ·November
Def. Krylov subspace [1]
The linear subspace spanned by the images of $b$ under the first $r$ powers of $A$ (starting from $A^0 = I$):
$$\mathcal{K}_r(A, b) = \mathrm{span}\{b, Ab, A^2 b, \dots, A^{r-1} b\}$$
• Why Krylov subspace? [2]
• Assume you have to solve the linear system $Ax = b$ where $A$ is large and sparse.
• Solving this system by Gaussian elimination requires $O(n^3)$ operations.
• Matrix-vector multiplications, however, can be computed much more cheaply.
• So it is not difficult to work with $\mathcal{K}_n$ even when $A$ is very large.
• We will use iterative methods based on the Krylov subspace to solve linear systems.
• The Arnoldi process is a well-known algorithm that underlies many of the algorithms that follow.
• Our final goal is to reduce the number of operations from $O(n^3)$ to $O(n^2)$.
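To illustrate why the Krylov subspace is cheap to work with, here is a small sketch that builds the Krylov matrix using nothing but repeated matrix-vector products. The function name `krylov_matrix` and the diagonal test matrix are assumptions made for the example.

```python
import numpy as np

def krylov_matrix(A, b, r):
    """Return the n x r matrix [b, Ab, ..., A^(r-1) b] via matrix-vector products only."""
    K = np.empty((len(b), r))
    K[:, 0] = b
    for k in range(1, r):
        K[:, k] = A @ K[:, k - 1]   # one matvec per new column
    return K

A = np.diag([1.0, 2.0, 3.0, 4.0])
b = np.ones(4)
K = krylov_matrix(A, b, 3)
print(K)
print(np.linalg.matrix_rank(K))     # dimension of the Krylov subspace K_3(A, b)
```

The columns of this matrix are generally far from orthogonal, which is one motivation for building an orthonormal basis of the same subspace instead.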
Arnoldi process
• The algorithm of the Arnoldi process
[1] Saad, Y. (2003). Iterative Methods for Sparse Linear Systems. SIAM, 2nd edition.
Algorithm. Arnoldi process [1]
1. Choose a vector $v_1$ such that $\|v_1\|_2 = 1$
2. For $j = 1, 2, \dots, m$, Do
3.     Compute $h_{ij} = \langle A v_j, v_i \rangle$ for $i = 1, 2, \dots, j$
4.     Compute $w_j = A v_j - \sum_{i=1}^{j} h_{ij} v_i$
5.     $h_{j+1,j} = \|w_j\|_2$
6.     If $h_{j+1,j} = 0$ then Stop
7.     $v_{j+1} = w_j / h_{j+1,j}$
8. EndDo
Steps 3 and 4 are the same as CGS:
• Each $A v_j$ is the given vector to be orthogonalized.
• The $v_i$ are the orthonormal vectors that form the basis of the Krylov subspace.
Algorithm. Arnoldi process (MGS) [1]
1. Choose a vector $v_1$ such that $\|v_1\|_2 = 1$
2. For $j = 1, 2, \dots, m$, Do
3.     Compute $w_j = A v_j$
4.     For $i = 1, 2, \dots, j$, Do
5.         $h_{ij} = \langle w_j, v_i \rangle$
6.         $w_j = w_j - h_{ij} v_i$
7.     EndDo
8.     $h_{j+1,j} = \|w_j\|_2$. If $h_{j+1,j} = 0$ then Stop
9.     $v_{j+1} = w_j / h_{j+1,j}$
10. EndDo
Steps 4 to 7 are the same as MGS:
• Each $A v_j$ is the given vector to be orthogonalized.
• The $v_i$ are the orthonormal vectors that form the basis of the Krylov subspace.
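The MGS form of the Arnoldi process can be sketched in NumPy as follows. The function name `arnoldi_mgs`, the breakdown tolerance `tol`, and the random test problem are assumptions for illustration; the sketch handles real matrices only and, on breakdown, returns the truncated basis together with the square Hessenberg matrix.

```python
import numpy as np

def arnoldi_mgs(A, v1, m, tol=1e-12):
    """Run m steps of Arnoldi (MGS). Returns V (n x (m+1)) and H ((m+1) x m)."""
    n = len(v1)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v1 / np.linalg.norm(v1)
    for j in range(m):
        w = A @ V[:, j]                     # the vector to be orthogonalized
        for i in range(j + 1):
            H[i, j] = V[:, i] @ w           # coefficient from the updated w (MGS)
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] <= tol:              # breakdown: subspace is invariant
            return V[:, :j + 1], H[:j + 1, :j + 1]
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 8))
b = rng.standard_normal(8)
m = 5
V, H = arnoldi_mgs(A, b, m)

# Orthonormality of the basis and the relation A V_m = V_{m+1} H_m (bar).
print(np.linalg.norm(V.T @ V - np.eye(m + 1)))
print(np.linalg.norm(A @ V[:, :m] - V @ H))
```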
The Details of the Arnoldi process
[1] Saad, Y. (2003). Iterative Methods for Sparse Linear Systems. SIAM, 2nd edition.
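1. $\mathcal{K}_m = \langle b, Ab, A^2 b, \dots, A^{m-1} b \rangle = \langle v_1, v_2, \dots, v_m \rangle$
Assume that the Arnoldi process does not stop before the $m$-th step. Then the vectors $\{v_1, v_2, \dots, v_m\}$ form an orthonormal basis of the Krylov subspace $\mathcal{K}_m(A, v_1)$. The Arnoldi process can therefore be described as the systematic construction of orthonormal bases for successive Krylov subspaces.
2. $V_m = [\, v_1 \;\; v_2 \;\; \cdots \;\; v_m \,]$ is the $n \times m$ matrix whose columns are the basis vectors, and $\bar{H}_m$ is the $(m+1) \times m$ Hessenberg matrix with entries $h_{ij}$.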
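$$A v_j = h_{1j} v_1 + h_{2j} v_2 + \dots + h_{j+1,j} v_{j+1} = \sum_{i=1}^{j+1} h_{ij} v_i \qquad \text{for } j = 1, 2, \dots, m$$
Recall the orthogonalization step in the Arnoldi process:
$$h_{j+1,j} v_{j+1} = A v_j - (h_{1j} v_1 + h_{2j} v_2 + \dots + h_{jj} v_j) \qquad \text{for } j = 1, 2, \dots, m$$
In matrix form:
$$A V_m = V_{m+1} \bar{H}_m, \qquad \bar{H}_m = \begin{bmatrix} H_m \\ h_{m+1,m}\, e_m^T \end{bmatrix}, \qquad V_m^T A V_m = \begin{bmatrix} I_{m \times m} & 0_{m \times 1} \end{bmatrix} \bar{H}_m = H_m$$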
The Details of the Arnoldi process [1][2]
[1] Saad, Y. (2003). Iterative Methods for Sparse Linear Systems. SIAM, 2nd edition.
[2] Trefethen and Bau. (1997). Numerical Linear Algebra
• $h_{n+1,n}$ serves as the stopping criterion: the Arnoldi process breaks down at step $n$ when $h_{n+1,n} = 0$.
• Breakdown means that $\mathcal{K}_n$ is an invariant subspace of $A$, i.e. $A\mathcal{K}_n \subseteq \mathcal{K}_n$.
• It follows that $\mathcal{K}_n = \mathcal{K}_{n+1} = \mathcal{K}_{n+2} = \cdots$.
• The subspace has reached the point where it cannot be extended any further.
• Each eigenvalue of $H_n$ is then an eigenvalue of $A$.
• If $A$ is nonsingular, then the solution $x$ of the system $Ax = b$ lies in $\mathcal{K}_n$.
This is why we regard the Arnoldi process as the underlying algorithm of the iterative methods based on the Krylov subspace.
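A small constructed example can illustrate the breakdown properties listed above. The diagonal matrix and the starting vector $b$, which has components in only three eigenvector directions, are assumptions chosen so that breakdown occurs at step 3; the threshold `1e-10` is a numerical stand-in for the exact test $h_{n+1,n} = 0$.

```python
import numpy as np

A = np.diag([1.0, 2.0, 3.0, 4.0, 5.0])
b = np.array([1.0, 1.0, 1.0, 0.0, 0.0])   # lies in a 3-dimensional invariant subspace

n_max = 5
V = np.zeros((5, n_max + 1))
H = np.zeros((n_max + 1, n_max))
V[:, 0] = b / np.linalg.norm(b)
steps = 0
for j in range(n_max):                     # Arnoldi with MGS orthogonalization
    w = A @ V[:, j]
    for i in range(j + 1):
        H[i, j] = V[:, i] @ w
        w -= H[i, j] * V[:, i]
    H[j + 1, j] = np.linalg.norm(w)
    steps = j + 1
    if H[j + 1, j] < 1e-10:                # breakdown detected
        break
    V[:, j + 1] = w / H[j + 1, j]

Hn = H[:steps, :steps]
Vn = V[:, :steps]
ritz = np.sort(np.linalg.eigvals(Hn).real) # eigenvalues of H_n
x = np.linalg.solve(A, b)                  # exact solution of Ax = b
x_in_Kn = np.allclose(Vn @ (Vn.T @ x), x)  # projecting onto K_n leaves x unchanged
print(steps, ritz, x_in_Kn)
```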
Lanczos process
• The Motivation of the Lanczos process [1][2]
[1] Saad, Y. (2003). Iterative Methods for Sparse Linear Systems. SIAM, 2nd edition.
[2] Trefethen and Bau. (1997). Numerical Linear Algebra
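Assume that the Arnoldi process is applied to a real symmetric matrix $A$ (that is, some additional structure is available).
Then the Hessenberg matrix $H_n$ becomes a symmetric tridiagonal matrix.
This reduces the $(n+1)$-term recurrence (at step $n$) to a three-term recurrence, which is much cheaper:
Hessenberg matrix (Arnoldi): $A v_j = h_{1j} v_1 + h_{2j} v_2 + \dots + h_{j+1,j} v_{j+1}$
Tridiagonal matrix (Lanczos): $A v_j = \beta_j v_{j-1} + \alpha_j v_j + \beta_{j+1} v_{j+1}$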
• The algorithm of the Lanczos process [1][2]
[1] Saad, Y. (2003). Iterative Methods for Sparse Linear Systems. SIAM, 2nd edition.
[2] Trefethen and Bau. (1997). Numerical Linear Algebra
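Algorithm. Lanczos process [1]
1. Choose an initial vector $v_1$ such that $\|v_1\|_2 = 1$. Set $\beta_1 \equiv 0$, $v_0 \equiv 0$
2. For $j = 1, 2, \dots, m$, Do
3.     $w_j := A v_j - \beta_j v_{j-1}$
4.     $\alpha_j := \langle w_j, v_j \rangle$
5.     $w_j := w_j - \alpha_j v_j$
6.     $\beta_{j+1} := \|w_j\|_2$. If $\beta_{j+1} = 0$ then Stop
7.     $v_{j+1} := w_j / \beta_{j+1}$
8. EndDo
Numerical stability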
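The three-term recurrence can be sketched in NumPy as follows. The function name `lanczos`, the return layout, and the random symmetric test matrix are assumptions for illustration. No reorthogonalization is performed, so for larger $m$ the computed vectors may lose orthogonality in floating-point arithmetic.

```python
import numpy as np

def lanczos(A, v1, m):
    """Run m Lanczos steps on a real symmetric A.

    Returns V (n x (m+1)), alpha (alpha_1..alpha_m) and beta (beta_2..beta_{m+1}).
    """
    n = len(v1)
    V = np.zeros((n, m + 1))
    alpha = np.zeros(m)
    beta = np.zeros(m + 1)                 # beta[0] plays the role of beta_1 = 0
    V[:, 0] = v1 / np.linalg.norm(v1)
    v_prev = np.zeros(n)                   # v_0 = 0
    for j in range(m):
        w = A @ V[:, j] - beta[j] * v_prev
        alpha[j] = w @ V[:, j]
        w -= alpha[j] * V[:, j]
        beta[j + 1] = np.linalg.norm(w)
        if beta[j + 1] == 0.0:             # breakdown
            return V[:, :j + 1], alpha[:j + 1], beta[1:j + 1]
        v_prev = V[:, j]
        V[:, j + 1] = w / beta[j + 1]
    return V, alpha, beta[1:]

rng = np.random.default_rng(0)
B = rng.standard_normal((10, 10))
A = (B + B.T) / 2                          # real symmetric test matrix
m = 4
V, alpha, beta = lanczos(A, rng.standard_normal(10), m)

# Tridiagonal matrix T_m built from the recurrence coefficients.
T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
Vm = V[:, :m]
print(np.linalg.norm(Vm.T @ A @ Vm - T))
```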
GMRES
• GMRES (Generalized Minimal RESidual) [1][2]
[1] Saad, Y. (2003). Iterative Methods for Sparse Linear Systems. SIAM, 2nd edition.
[2] Trefethen and Bau. (1997). Numerical Linear Algebra
Now we consider how the Arnoldi process can be used to solve the system of equations $Ax = b$. The standard algorithm of this kind (for nonsymmetric systems) is known as GMRES.
Idea: approximate the exact solution $x_*$ by the vector $x_n \in \mathcal{K}_n$ that minimizes the norm of the residual $r_n = b - A x_n$.
[1] The least squares polynomial approximation problem
This means that $x_n$ can be represented as a linear combination of the columns of the Krylov matrix $K_n$, or of the orthonormal basis $\{v_1, v_2, \dots, v_n\}$.
Thus the problem is to find a coefficient vector $c \in \mathbb{C}^n$ such that
$$c = \arg\min_{c \in \mathbb{C}^n} \|A K_n c - b\|_2$$
Consider $x_n = V_n y$ instead of $K_n c$, where $y \in \mathbb{C}^n$:
$$y = \arg\min_{y \in \mathbb{C}^n} \|A V_n y - b\|_2$$
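$$\arg\min_y \|A V_n y - b\|_2$$
$$= \arg\min_y \|V_{n+1} \bar{H}_n y - b\|_2 \qquad \text{since } A V_n = V_{n+1} \bar{H}_n$$
$$= \arg\min_y \|\bar{H}_n y - V_{n+1}^* b\|_2 \qquad \text{since both vectors lie in the column space of } V_{n+1} \text{, whose columns are orthonormal}$$
$$= \arg\min_y \big\|\bar{H}_n y - \|b\|_2\, e_1\big\|_2$$
The last step holds because we set the initial vector $v_1 = b / \|b\|_2$, so that
$$V_{n+1}^* b = \begin{bmatrix} v_1^* \\ \vdots \\ v_{n+1}^* \end{bmatrix} \cdot \|b\|_2\, v_1 = \|b\|_2\, e_1$$
At step $n$ of GMRES we solve this problem for $y$, then set $x_n = V_n y$.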
Algorithm. GMRES [2]
1. $v_1 = b / \|b\|_2$
2. For $n = 1, 2, \dots$ Do
3.     < step $n$ of Arnoldi iteration >
4.     Find $y$ to minimize $\big\|\bar{H}_n y - \|b\|_2\, e_1\big\|_2$ $(= \|r_n\|_2)$
5.     $x_n = V_n y$
6. EndDo
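The steps above can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions, not a production implementation: the initial guess is $x_0 = 0$, the small least-squares problem is re-solved from scratch with `numpy.linalg.lstsq` at every step (practical codes usually update a QR factorization with Givens rotations), and the residual is recomputed explicitly. The function name `gmres`, the tolerances, and the test matrix are hypothetical.

```python
import numpy as np

def gmres(A, b, max_iter, tol=1e-10):
    """Unrestarted GMRES with x0 = 0. Returns the iterate x and ||b - A x||."""
    n = len(b)
    beta = np.linalg.norm(b)
    V = np.zeros((n, max_iter + 1))
    H = np.zeros((max_iter + 1, max_iter))
    V[:, 0] = b / beta
    x, res = np.zeros(n), beta
    for k in range(max_iter):
        # Step k+1 of the Arnoldi iteration (MGS).
        w = A @ V[:, k]
        for i in range(k + 1):
            H[i, k] = V[:, i] @ w
            w -= H[i, k] * V[:, i]
        H[k + 1, k] = np.linalg.norm(w)
        breakdown = H[k + 1, k] <= 1e-14
        if not breakdown:
            V[:, k + 1] = w / H[k + 1, k]
        # Small least-squares problem: minimize ||H_bar y - ||b|| e_1||.
        rhs = np.zeros(k + 2)
        rhs[0] = beta
        y, *_ = np.linalg.lstsq(H[:k + 2, :k + 1], rhs, rcond=None)
        x = V[:, :k + 1] @ y
        res = np.linalg.norm(b - A @ x)
        if res <= tol * beta or breakdown:
            break
    return x, res

rng = np.random.default_rng(0)
n = 20
A = 4 * np.eye(n) + 0.5 * rng.standard_normal((n, n)) / np.sqrt(n)  # nonsymmetric
b = rng.standard_normal(n)
x, res = gmres(A, b, max_iter=n)
print(res)
```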
FOM (Full Orthogonalization Method)
[1] Saad, Y. (2003). Iterative Methods for Sparse Linear Systems. SIAM, 2nd edition.
[2] Trefethen and Bau. (1997). Numerical Linear Algebra
Algorithm. FOM [1]
• FOM based on GMRES [1][2]
Next Seminar
[1] Saad, Y. (2003). Iterative Methods for Sparse Linear Systems. SIAM, 2nd edition.
[2] Trefethen and Bau. (1997). Numerical Linear Algebra
• Conjugate Gradient Method
    • For symmetric positive definite systems, $Ax = b$
    • GMRES is the method for a general matrix $A$
• Preconditioning
    • For fast convergence: a transformation that conditions a given problem into a form that is more suitable for numerical solution methods
    • Reduces the condition number and repositions the spectrum of the matrix $A$
• Related paper application
    • Sampling random multivariate Gaussian samples