MATH 685/ CSI 700/ OR 682 Lecture Notes

Lecture 4. Least squares

TRANSCRIPT

Page 1: MATH 685/ CSI 700/ OR 682  Lecture Notes

MATH 685/ CSI 700/ OR 682 Lecture Notes

Lecture 4.

Least squares

Page 2: MATH 685/ CSI 700/ OR 682  Lecture Notes

Method of least squares

• Measurement errors are inevitable in observational and experimental sciences

• Errors can be smoothed out by averaging over many cases, i.e., taking more measurements than are strictly necessary to determine parameters of system

• Resulting system is overdetermined, so usually there is no exact solution

• In effect, higher dimensional data are projected into lower dimensional space to suppress irrelevant detail

• Such projection is most conveniently accomplished by method of least squares
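The data-fitting equations and plots from the following slides did not survive the transcript, so the sketch below (Python/NumPy, with made-up measurement values) illustrates the bullets above: six noisy measurements overdetermine a two-parameter straight-line model, and least squares projects the data onto the model's column space.

```python
import numpy as np

# Hypothetical data: six measurements but only two model parameters,
# so the linear system is overdetermined and has no exact solution.
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1, 10.9])  # noisy samples of y = 1 + 2t

# Design matrix for the straight-line model y ~ x0 + x1*t
A = np.column_stack([np.ones_like(t), t])

# Least squares projects the 6-dimensional data vector y onto the
# 2-dimensional column space of A, averaging out the measurement errors.
x, res, rank, sv = np.linalg.lstsq(A, y, rcond=None)
print(x)  # close to [1, 2]
```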

Page 3: MATH 685/ CSI 700/ OR 682  Lecture Notes

Linear least squares

Page 4: MATH 685/ CSI 700/ OR 682  Lecture Notes

Data fitting

Page 5: MATH 685/ CSI 700/ OR 682  Lecture Notes

Data fitting

Page 6: MATH 685/ CSI 700/ OR 682  Lecture Notes

Example

Page 7: MATH 685/ CSI 700/ OR 682  Lecture Notes

Example

Page 8: MATH 685/ CSI 700/ OR 682  Lecture Notes

Example

Page 9: MATH 685/ CSI 700/ OR 682  Lecture Notes

Existence/Uniqueness

Page 10: MATH 685/ CSI 700/ OR 682  Lecture Notes

Normal Equations
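The slide body is not preserved in the transcript; as a reminder of the standard formulation, the least squares solution of min ||Ax - b|| satisfies the normal equations A^T A x = A^T b, and A^T A is symmetric positive definite when A has full column rank. A minimal sketch, assuming SciPy's Cholesky routines:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def normal_equations_solve(A, b):
    """Solve min ||Ax - b||_2 via the normal equations A^T A x = A^T b.

    Assumes A has full column rank, so A^T A is symmetric positive
    definite and Cholesky factorization applies.
    """
    AtA = A.T @ A
    Atb = A.T @ b
    c, low = cho_factor(AtA)        # Cholesky: AtA = L L^T
    return cho_solve((c, low), Atb)
```

Note that forming A^T A squares the condition number of A, which is among the shortcomings discussed on later slides.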

Page 11: MATH 685/ CSI 700/ OR 682  Lecture Notes

Orthogonality

Page 12: MATH 685/ CSI 700/ OR 682  Lecture Notes

Orthogonality

Page 13: MATH 685/ CSI 700/ OR 682  Lecture Notes

Orthogonal Projector

Page 14: MATH 685/ CSI 700/ OR 682  Lecture Notes

Pseudoinverse
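This slide's content is not in the transcript; for orientation, a minimal sketch of the pseudoinverse in NumPy (pinv computes A^+ from the SVD; for full column rank A, A^+ = (A^T A)^{-1} A^T and x = A^+ b solves the least squares problem):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))   # full column rank with probability 1
b = rng.standard_normal(5)

# x = A^+ b is the least squares solution; pinv forms A^+ via the SVD.
x = np.linalg.pinv(A) @ b
print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))  # True
```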

Page 15: MATH 685/ CSI 700/ OR 682  Lecture Notes

Sensitivity and Conditioning

Page 16: MATH 685/ CSI 700/ OR 682  Lecture Notes

Sensitivity and Conditioning

Page 17: MATH 685/ CSI 700/ OR 682  Lecture Notes

Solving normal equations

Page 18: MATH 685/ CSI 700/ OR 682  Lecture Notes

Example

Page 19: MATH 685/ CSI 700/ OR 682  Lecture Notes

Example

Page 20: MATH 685/ CSI 700/ OR 682  Lecture Notes

Shortcomings

Page 21: MATH 685/ CSI 700/ OR 682  Lecture Notes

Augmented system method

Page 22: MATH 685/ CSI 700/ OR 682  Lecture Notes

Augmented system method
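The augmented-system equations are not in the transcript. The standard formulation couples the residual r = b - Ax and the solution x in one symmetric (m+n) x (m+n) system. A minimal sketch, using a dense solver for illustration only; in practice a symmetric indefinite factorization would be used:

```python
import numpy as np

def augmented_solve(A, b):
    """Solve the augmented system [[I, A], [A^T, 0]] [r; x] = [b; 0].

    Row one says r = b - Ax; row two says A^T r = 0, i.e. the normal
    equations. Assumes A has full column rank so the system is nonsingular.
    """
    m, n = A.shape
    K = np.block([[np.eye(m), A],
                  [A.T, np.zeros((n, n))]])
    rhs = np.concatenate([b, np.zeros(n)])
    sol = np.linalg.solve(K, rhs)
    return sol[m:]   # x is the trailing n components; sol[:m] is r
```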

Page 23: MATH 685/ CSI 700/ OR 682  Lecture Notes

Orthogonal Transformations

Page 24: MATH 685/ CSI 700/ OR 682  Lecture Notes

Triangular Least Squares

Page 25: MATH 685/ CSI 700/ OR 682  Lecture Notes

Triangular Least Squares

Page 26: MATH 685/ CSI 700/ OR 682  Lecture Notes

QR Factorization

Page 27: MATH 685/ CSI 700/ OR 682  Lecture Notes

Orthogonal Bases

Page 28: MATH 685/ CSI 700/ OR 682  Lecture Notes

Computing QR factorization

• To compute QR factorization of m × n matrix A, with m > n, we annihilate subdiagonal entries of successive columns of A, eventually reaching upper triangular form

• Similar to LU factorization by Gaussian elimination, but uses orthogonal transformations instead of elementary elimination matrices

• Possible methods include
  • Householder transformations
  • Givens rotations
  • Gram-Schmidt orthogonalization
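Before the individual annihilation methods, a minimal sketch of why QR helps: orthogonal transformations preserve the 2-norm, so min ||Ax - b|| reduces to the triangular system R x = Q^T b once A = QR is available (a library QR is used here for brevity):

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3))   # m = 8 > n = 3, overdetermined
b = rng.standard_normal(8)

Q, R = np.linalg.qr(A)            # reduced QR: Q is 8x3, R is 3x3
x = solve_triangular(R, Q.T @ b)  # back substitution on R x = Q^T b

print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))  # True
```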

Page 29: MATH 685/ CSI 700/ OR 682  Lecture Notes

Householder Transformation

Page 30: MATH 685/ CSI 700/ OR 682  Lecture Notes

Example

Page 31: MATH 685/ CSI 700/ OR 682  Lecture Notes

Householder QR factorization

Page 32: MATH 685/ CSI 700/ OR 682  Lecture Notes

Householder QR factorization

Page 33: MATH 685/ CSI 700/ OR 682  Lecture Notes

Householder QR factorization

• For solving linear least squares problem, product Q of Householder transformations need not be formed explicitly

• R can be stored in upper triangle of array initially containing A

• Householder vectors v can be stored in (now zero) lower triangular portion of A (almost)

• Householder transformations most easily applied in this form anyway
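A minimal sketch of Householder QR matching the scheme just described: R overwrites the upper triangle, Q is never formed, and the Householder vectors are retained so Q^T b can be applied on demand (stored in a list here rather than A's lower triangle, for clarity):

```python
import numpy as np

def householder_qr(A):
    """Annihilate subdiagonal entries column by column.

    Returns R and the Householder vectors v; Q is never formed
    explicitly. Assumes A has full column rank.
    """
    A = A.astype(float)
    m, n = A.shape
    vs = []
    for k in range(n):
        x = A[k:, k]
        v = x.copy()
        # Choose sign to avoid cancellation: v = x + sign(x_0)*||x||*e_1
        v[0] += np.copysign(np.linalg.norm(x), x[0])
        v /= np.linalg.norm(v)
        vs.append(v)
        # Apply H = I - 2 v v^T to the remaining submatrix
        A[k:, k:] -= 2.0 * np.outer(v, v @ A[k:, k:])
    return np.triu(A[:n, :]), vs

def apply_qt(vs, b):
    """Apply Q^T to b by replaying the stored Householder vectors."""
    b = b.astype(float)
    for k, v in enumerate(vs):
        b[k:] -= 2.0 * v * (v @ b[k:])
    return b

# Usage: solve min ||Ax - b|| via R x = (Q^T b)[:n]
rng = np.random.default_rng(2)
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)
R, vs = householder_qr(A)
x = np.linalg.solve(R, apply_qt(vs, b)[:3])
print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))  # True
```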

Page 34: MATH 685/ CSI 700/ OR 682  Lecture Notes

Example

Page 35: MATH 685/ CSI 700/ OR 682  Lecture Notes

Example

Page 36: MATH 685/ CSI 700/ OR 682  Lecture Notes

Example

Page 37: MATH 685/ CSI 700/ OR 682  Lecture Notes

Example

Page 38: MATH 685/ CSI 700/ OR 682  Lecture Notes

Givens Rotations

Page 39: MATH 685/ CSI 700/ OR 682  Lecture Notes

Givens Rotations

Page 40: MATH 685/ CSI 700/ OR 682  Lecture Notes

Example

Page 41: MATH 685/ CSI 700/ OR 682  Lecture Notes

Givens QR factorization

Page 42: MATH 685/ CSI 700/ OR 682  Lecture Notes

Givens QR factorization

• Straightforward implementation of Givens method requires about 50% more work than Householder method, and also requires more storage, since each rotation requires two numbers, c and s, to define it

• These disadvantages can be overcome, but doing so requires more complicated implementation

• Givens can be advantageous for computing QR factorization when many entries of matrix are already zero, since those annihilations can then be skipped
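A minimal sketch of a single Givens rotation: the pair (c, s) mentioned above is chosen to annihilate one entry, which is why rotations can be targeted at just the nonzeros of an already sparse matrix:

```python
import numpy as np

def givens(a, b):
    """Compute c, s so that [[c, s], [-s, c]] @ [a, b] = [r, 0]."""
    if b == 0.0:
        return 1.0, 0.0
    r = np.hypot(a, b)
    return a / r, b / r

A = np.array([[3.0, 1.0],
              [4.0, 2.0]])
c, s = givens(A[0, 0], A[1, 0])
G = np.array([[c, s], [-s, c]])
print(G @ A)  # (2,1) entry is now zero; (1,1) entry is hypot(3,4) = 5
```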

Page 43: MATH 685/ CSI 700/ OR 682  Lecture Notes

Gram-Schmidt orthogonalization

Page 44: MATH 685/ CSI 700/ OR 682  Lecture Notes

Gram-Schmidt algorithm

Page 45: MATH 685/ CSI 700/ OR 682  Lecture Notes

Modified Gram-Schmidt

Page 46: MATH 685/ CSI 700/ OR 682  Lecture Notes

Modified Gram-Schmidt QR factorization
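The algorithm bodies for these slides are not in the transcript; a minimal sketch of the standard modified variant, which subtracts each new q's component from all remaining columns immediately (the classical variant defers this and is less stable in floating point):

```python
import numpy as np

def mgs_qr(A):
    """Modified Gram-Schmidt QR. Assumes A has full column rank."""
    A = A.astype(float).copy()
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for k in range(n):
        R[k, k] = np.linalg.norm(A[:, k])
        Q[:, k] = A[:, k] / R[k, k]
        # Immediately remove the q_k component from the remaining columns
        R[k, k+1:] = Q[:, k] @ A[:, k+1:]
        A[:, k+1:] -= np.outer(Q[:, k], R[k, k+1:])
    return Q, R
```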

Page 47: MATH 685/ CSI 700/ OR 682  Lecture Notes

Rank Deficiency

• If rank(A) < n, then QR factorization still exists, but yields singular upper triangular factor R, and multiple vectors x give minimum residual norm

• Common practice selects minimum residual solution x having smallest norm

• Can be computed by QR factorization with column pivoting or by singular value decomposition (SVD)

• Rank of matrix is often not clear-cut in practice, so relative tolerance is used to determine rank
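A minimal sketch of the last two points: rank estimated with a relative tolerance on the singular values, and the minimum-norm least squares solution as returned by NumPy's SVD-based lstsq:

```python
import numpy as np

def numerical_rank(A, rtol=1e-12):
    """Count singular values above a relative tolerance (illustrative choice)."""
    s = np.linalg.svd(A, compute_uv=False)
    return int(np.sum(s > rtol * s[0]))

# For rank-deficient A, lstsq returns the minimum-norm solution
# among all x that minimize the residual.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])       # rank 1: second column = 2 * first
b = np.array([1.0, 2.0, 3.0])
x = np.linalg.lstsq(A, b, rcond=None)[0]
print(numerical_rank(A), x)      # rank 1; minimum-norm x = [0.2, 0.4]
```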

Page 48: MATH 685/ CSI 700/ OR 682  Lecture Notes

Near Rank Deficiency

Page 49: MATH 685/ CSI 700/ OR 682  Lecture Notes

QR with Column Pivoting

Page 50: MATH 685/ CSI 700/ OR 682  Lecture Notes

QR with Column Pivoting

Page 51: MATH 685/ CSI 700/ OR 682  Lecture Notes

Singular Value Decomposition

Page 52: MATH 685/ CSI 700/ OR 682  Lecture Notes

Example: SVD

Page 53: MATH 685/ CSI 700/ OR 682  Lecture Notes

Applications of SVD

Page 54: MATH 685/ CSI 700/ OR 682  Lecture Notes

Pseudoinverse

Page 55: MATH 685/ CSI 700/ OR 682  Lecture Notes

Orthogonal Bases

Page 56: MATH 685/ CSI 700/ OR 682  Lecture Notes

Lower-rank Matrix Approximation
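This slide's body is not in the transcript; the standard result (Eckart-Young) is that truncating the SVD to the k largest singular values gives the best rank-k approximation in both the 2-norm and the Frobenius norm. A minimal sketch:

```python
import numpy as np

def best_rank_k(A, k):
    """Best rank-k approximation: keep the k largest singular triplets."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

A = np.arange(12.0).reshape(3, 4)        # a rank-2 matrix
A1 = best_rank_k(A, 1)
print(np.linalg.matrix_rank(A1))         # 1
```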

Page 57: MATH 685/ CSI 700/ OR 682  Lecture Notes

Total Least Squares

• Ordinary least squares is applicable when right-hand side b is subject to random error but matrix A is known accurately

• When all data, including A, are subject to error, then total least squares is more appropriate

• Total least squares minimizes orthogonal distances, rather than vertical distances, between model and data

• Total least squares solution can be computed from SVD of [A, b]
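A minimal sketch of the SVD-based computation just described: the total least squares solution comes from the right singular vector of [A, b] associated with the smallest singular value (assuming its last component is nonzero):

```python
import numpy as np

def tls(A, b):
    """Total least squares via the SVD of the augmented matrix [A, b]."""
    n = A.shape[1]
    C = np.column_stack([A, b])
    _, _, Vt = np.linalg.svd(C)
    v = Vt[-1]             # right singular vector for smallest sigma
    return -v[:n] / v[n]   # assumes v[n] != 0
```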

Page 58: MATH 685/ CSI 700/ OR 682  Lecture Notes

Comparison of Methods

Page 59: MATH 685/ CSI 700/ OR 682  Lecture Notes

Comparison of Methods

Page 60: MATH 685/ CSI 700/ OR 682  Lecture Notes

Comparison of Methods