linear models and matrix algebra - ecourse2.ccu.edu.tw

Linear Models and Matrix Algebra
Chapters 4–5 of Chiang and Wainwright (2005)

Tsung-Chih Lai

Fall 2021

Tsung-Chih Lai (CCU) Linear Models and Matrix Algebra Fall 2021 1 / 39

Matrix Algebra

• Matrix algebra:

• gives us a shorthand way of writing a large system of equations.

• allows us to test for the existence of solutions to simultaneous systems.

• allows us to solve a simultaneous system.

• A drawback is that it only works for linear systems. However, we can often convert nonlinear systems to linear ones, e.g., by taking logarithms:

$$y = a x^{b} \;\Rightarrow\; \ln y = \ln a + b \ln x$$
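The log transformation above can be checked numerically. The sketch below uses hypothetical values for a, b, and x (chosen only for illustration) and confirms that both sides of the linearized equation agree.

```python
import math

# Hypothetical parameter values, chosen only for this check.
a, b, x = 2.0, 3.0, 5.0

y = a * x ** b                        # nonlinear form: y = a x^b
lhs = math.log(y)                     # ln y
rhs = math.log(a) + b * math.log(x)   # ln a + b ln x

# The two sides agree up to floating-point rounding.
print(math.isclose(lhs, rhs))
```

The relationship is linear in ln x and ln y, which is what makes matrix methods applicable after the transformation.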


Matrices and Vectors

• Given two equations,

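$$y = 10 - x \;\Rightarrow\; x + y = 10$$

$$y = 2 + 3x \;\Rightarrow\; -3x + y = 2$$

• In matrix form:

$$\underbrace{\begin{bmatrix} 1 & 1 \\ -3 & 1 \end{bmatrix}}_{\text{matrix of coefficients}} \underbrace{\begin{bmatrix} x \\ y \end{bmatrix}}_{\text{vector of unknowns}} = \underbrace{\begin{bmatrix} 10 \\ 2 \end{bmatrix}}_{\text{vector of constants}}$$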
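To see that the matrix form encodes the same two equations, the following minimal sketch multiplies the coefficient matrix by a candidate solution. The helper name `mat_vec` and the candidate values x = 2, y = 8 are assumptions introduced for this illustration, not part of the slides.

```python
def mat_vec(A, x):
    """Return the product Ax for a matrix A (list of rows) and a vector x (list)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

A = [[1, 1],
     [-3, 1]]          # matrix of coefficients
d = [10, 2]            # vector of constants
candidate = [2, 8]     # assumed solution (x, y), to be verified

result = mat_vec(A, candidate)
print(result == d)     # True when the candidate satisfies both equations
```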


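Matrices and Vectors

• In general, suppose there are m equations with n unknowns,

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= d_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= d_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= d_m
\end{aligned}$$

• In matrix form:

$$\underbrace{\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}}_{A} \underbrace{\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}}_{x} = \underbrace{\begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_m \end{bmatrix}}_{d} \;\Leftrightarrow\; Ax = d$$

where $a_{ij}$ is the coefficient found in the i-th row and the j-th column for i = 1, . . . , m and j = 1, . . . , n.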


Vectors as Special Matrices

• The numbers of rows and columns define the dimension of a matrix; say, A is an m × n matrix.

• A matrix containing only 1 column is called a column vector, e.g.,

$$x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}$$

is an n × 1 column vector.

• If a column vector is transposed to a horizontal array, denoted by the prime symbol ′, we have a row vector, i.e.,

$x' = \begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix}$ is a 1 × n row vector.

• A 1 × 1 vector is known as a scalar, e.g., [4] = 4 is a scalar.


Matrix Operations

• If two matrices A = [aij] and B = [bij] have the same dimension, then

A = B iff aij = bij

A+B = [aij + bij]

A−B = [aij − bij]

for all i and j.

• Example:
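As an illustration with hypothetical matrices (chosen for this sketch, not taken from the slides), element-wise addition and subtraction could be written as follows. The helper names `mat_add` and `mat_sub` are assumptions.

```python
def _check_same_dimension(A, B):
    """Raise an error unless A and B have the same number of rows and columns."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same dimension")

def mat_add(A, B):
    """Return A + B = [a_ij + b_ij]."""
    _check_same_dimension(A, B)
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_sub(A, B):
    """Return A - B = [a_ij - b_ij]."""
    _check_same_dimension(A, B)
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[4, 9], [2, 1]]
B = [[2, 0], [0, 7]]

print(mat_add(A, B))   # [[6, 9], [2, 8]]
print(mat_sub(A, B))   # [[2, 9], [2, -6]]
```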


Matrix Operations

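• Suppose we want to multiply a matrix by a scalar

$$\underbrace{k}_{1 \times 1}\,\underbrace{A}_{m \times n}$$

• We multiply every element in A by the scalar k, i.e.,

$$kA = [k a_{ij}] = \begin{bmatrix} k a_{11} & k a_{12} & \cdots & k a_{1n} \\ k a_{21} & k a_{22} & \cdots & k a_{2n} \\ \vdots & \vdots & & \vdots \\ k a_{m1} & k a_{m2} & \cdots & k a_{mn} \end{bmatrix}$$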

• Example:
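A minimal sketch of scalar multiplication, using a hypothetical scalar and matrix chosen only for illustration (the helper name `scalar_mul` is an assumption):

```python
def scalar_mul(k, A):
    """Return kA = [k * a_ij]: every element of A is multiplied by k."""
    return [[k * a for a in row] for row in A]

k = 7
A = [[3, -1],
     [0, 5]]

print(scalar_mul(k, A))   # [[21, -7], [0, 35]]
```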


Multiplication of Matrices

• To multiply two matrices A and B, A must have the same number of columns as B has rows:

$$\underbrace{A}_{m \times n}\,\underbrace{B}_{n \times p} = \underbrace{C}_{m \times p}$$

• The product matrix C will have the same number of rows as A and the same number of columns as B.

• Each element $c_{ij}$ of C is computed by:

(i) multiplying each element in the i-th row of A by the corresponding element in the j-th column of B, and

(ii) summing up these products.


Examples

• Example 1:

• Example 2:

• In fact, this corresponds to the inner product of the two vectors u = [4 8 4] and v = [9 8 7]:

u · v = (4 × 9) + (8 × 8) + (4 × 7) = 128
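The row-by-column rule can be sketched in a few lines of Python. The function name `mat_mul` is an assumption; the vectors u and v are the ones used in the inner-product example above, written as a 1 × 3 and a 3 × 1 matrix.

```python
def mat_mul(A, B):
    """Return C = AB, where c_ij is the sum over k of a_ik * b_kj."""
    n = len(B)                      # number of rows of B
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    p = len(B[0])                   # number of columns of B
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]

u = [[4, 8, 4]]        # 1 x 3 row vector
v = [[9], [8], [7]]    # 3 x 1 column vector

print(mat_mul(u, v))   # [[128]], a 1 x 1 matrix holding the inner product
```

Multiplying in the opposite order, `mat_mul(v, u)`, is also conformable but yields a 3 × 3 matrix rather than a scalar.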


Example 3

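• Suppose $x = [x_1 \;\cdots\; x_n]'$ is a column vector, so that $x' = [x_1 \;\cdots\; x_n]$ and

$$x'x = \begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = x_1^2 + \cdots + x_n^2 = \sum_{i=1}^{n} x_i^2$$

is a scalar.

• However, xx′ is an n × n matrix, i.e.,

$$xx' = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} = \begin{bmatrix} x_1^2 & x_1x_2 & \cdots & x_1x_n \\ x_2x_1 & x_2^2 & \cdots & x_2x_n \\ \vdots & \vdots & \ddots & \vdots \\ x_nx_1 & x_nx_2 & \cdots & x_n^2 \end{bmatrix}$$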
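The contrast between x′x (a scalar) and xx′ (an n × n matrix) can be seen with a small hypothetical vector. The function names below are assumptions introduced for this sketch.

```python
def inner_product(x):
    """Return x'x, the sum of squared elements (a scalar)."""
    return sum(xi * xi for xi in x)

def outer_product(x):
    """Return xx', the n x n matrix whose (i, j) element is x_i * x_j."""
    return [[xi * xj for xj in x] for xi in x]

x = [1, 2, 3]   # hypothetical vector, n = 3

print(inner_product(x))   # 14
print(outer_product(x))   # [[1, 2, 3], [2, 4, 6], [3, 6, 9]]
```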


Example 4

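• Let Ax = d be

$$\begin{bmatrix} 6 & 3 & 1 \\ 1 & 4 & -2 \\ 4 & -1 & 5 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 22 \\ 12 \\ 10 \end{bmatrix}$$

• This produces

$$\begin{aligned}
6x_1 + 3x_2 + x_3 &= 22 \\
x_1 + 4x_2 - 2x_3 &= 12 \\
4x_1 - x_2 + 5x_3 &= 10
\end{aligned}$$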


Division in Matrix Algebra

• In ordinary algebra, $a/b = ab^{-1} = b^{-1}a = c$ is well defined iff $b \neq 0$.

• But in matrix algebra, $A/B$ is NOT defined. Instead,

$$AB^{-1} = C$$

is well defined iff $B^{-1}$, the inverse of B, exists.

• Even if $B^{-1}$ exists, it is generally NOT true that

$$AB^{-1} = B^{-1}A$$

• We will explore these differences later.


Linear Dependence

• Suppose we have two equations:

x1 + 2x2 = 1 (1)

3x1 + 6x2 = 3 (2)

• There is no unique solution because (2) is equal to 3 times (1), i.e., the two equations are linearly dependent.

• A set of vectors v1, . . . , vn is said to be linearly dependent iff at least one of them can be expressed as a linear combination of the remaining vectors, or equivalently,

$$\sum_{i=1}^{n} k_i v_i = 0$$

for scalars k1, . . . , kn not all equal to zero. Otherwise, they are linearly independent.
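As a quick check of the definition, equations (1) and (2) can be written as the vectors of their coefficients and constants, [1 2 1] and [3 6 3]. The sketch below (the helper name `linear_combination` is an assumption) shows that the weights k1 = 3 and k2 = −1, which are not all zero, produce the zero vector.

```python
def linear_combination(ks, vs):
    """Return k_1 v_1 + ... + k_n v_n for scalars ks and equal-length vectors vs."""
    length = len(vs[0])
    return [sum(k * v[j] for k, v in zip(ks, vs)) for j in range(length)]

v1 = [1, 2, 1]   # coefficients and constant of equation (1)
v2 = [3, 6, 3]   # coefficients and constant of equation (2)

# 3*v1 - 1*v2 is the zero vector, so v1 and v2 are linearly dependent.
print(linear_combination([3, -1], [v1, v2]))   # [0, 0, 0]
```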


Vector Space

• For any two linearly independent 2-dimensional vectors u and v, the set of all linear combinations of u and v constitutes the 2-dimensional vector space, $\mathbb{R}^2$.

• The vectors u and v are called a basis; the most commonly used basis consists of the unit vectors e1 = [1 0]′ and e2 = [0 1]′.

• For any vectors a and b (not necessarily 2-dimensional), the distance between a and b is

d(a, b)

with the following properties:

(i) d(a, b) = 0 iff a = b.

(ii) d(a, b) = d(b, a).

(iii) d(a, b) ≤ d(a, c) + d(c, b) (triangle inequality).


Metric Space

• A vector space equipped with a distance function satisfying the previous properties is called a metric space.

• Let a = (a1, . . . , an) and b = (b1, . . . , bn); then the Euclidean n-space is the n-dimensional vector space equipped with the Euclidean distance

$$d(a, b) = \sqrt{(a_1 - b_1)^2 + \cdots + (a_n - b_n)^2} = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2} = \sqrt{(a - b)'(a - b)}$$
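A minimal sketch of the Euclidean distance, with hypothetical points chosen so the result is a whole number. The function name `euclidean_distance` is an assumption; the last two lines check the symmetry and triangle-inequality properties for these particular points only.

```python
import math

def euclidean_distance(a, b):
    """Return the square root of the sum of squared coordinate differences."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

a = [1, 2]
b = [4, 6]
c = [0, 0]

print(euclidean_distance(a, b))   # 5.0, since sqrt(3^2 + 4^2) = 5
print(euclidean_distance(a, b) == euclidean_distance(b, a))
print(euclidean_distance(a, b) <= euclidean_distance(a, c) + euclidean_distance(c, b))
```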


Commutative, Associative, and Distributive Laws

• For ordinary algebra we have:

• commutative law of addition: a + b = b + a

• commutative law of multiplication: ab = ba

• associative law of addition: (a + b) + c = a + (b + c)

• associative law of multiplication: (ab)c = a(bc)

• distributive law: a(b + c) = ab + ac

• In matrix algebra, most but not all of these laws are true.


Laws of Matrix Addition

• Matrix addition is commutative:

A + B = B + A

• Matrix addition is also associative:

A + (B + C) = (A + B) + C


Laws of Matrix Multiplication

• Matrix multiplication is in general NOT commutative:

AB ≠ BA

• Example:

• However, matrix multiplication is associative:

A(BC) = (AB)C = ABC

and distributive:

A(B +C) = AB +AC

(B +C)A = BA+CA
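The laws above can be checked on small hypothetical 2 × 2 matrices (the matrices and helper names are assumptions chosen for this sketch). A check on one example illustrates the laws but does not prove them.

```python
def mat_mul(A, B):
    """Return AB using the row-by-column rule."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_add(A, B):
    """Return A + B element by element."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[1, 2], [3, 4]]
B = [[0, -1], [6, 7]]
C = [[1, 0], [2, 1]]

print(mat_mul(A, B))   # [[12, 13], [24, 25]]
print(mat_mul(B, A))   # [[-3, -4], [27, 40]], so AB != BA here

# Associative law: A(BC) = (AB)C
print(mat_mul(A, mat_mul(B, C)) == mat_mul(mat_mul(A, B), C))

# Distributive laws: A(B + C) = AB + AC and (B + C)A = BA + CA
print(mat_mul(A, mat_add(B, C)) == mat_add(mat_mul(A, B), mat_mul(A, C)))
print(mat_mul(mat_add(B, C), A) == mat_add(mat_mul(B, A), mat_mul(C, A)))
```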


Identity Matrices

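• An identity matrix is a square matrix with ones on its principal diagonal and zeros everywhere else:

$$I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad I_n = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}$$

• In matrix algebra, the identity matrix plays the same role as the scalar 1: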

IA = AI = A

AIB = (AI)B = A(IB) = AB


Null Matrices

• A null matrix is simply a matrix where all elements equal zero:

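$$0_{2 \times 2} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \qquad 0_{m \times n} = \begin{bmatrix} 0 & \cdots & 0 \\ \vdots & & \vdots \\ 0 & \cdots & 0 \end{bmatrix}$$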

which plays the same role as scalar 0:

0+A = A+ 0 = A

A0 = 0

0A = 0

but A0 and 0A are not necessarily the same. (why?)


Idiosyncrasies of Matrix Algebra

• In matrix algebra, even if AB = 0, it does not necessarily mean that either A or B is 0.

• Example:
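One pair of matrices with this property is sketched below. The specific matrices are an illustrative choice, not necessarily those used in the lecture, and `mat_mul` is an assumed helper name.

```python
def mat_mul(A, B):
    """Return AB using the row-by-column rule."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[2, 4],
     [1, 2]]
B = [[-2, 4],
     [1, -2]]

# Neither A nor B is a null matrix, yet their product is.
print(mat_mul(A, B))   # [[0, 0], [0, 0]]
```

Both A and B here are singular (each has one row that is a multiple of the other), which is what allows the product to vanish.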


Transposes

• The transpose of A is denoted as A′ or A⊤, in which the rows and columns are interchanged.

• Example:

• Properties of transpose:

(i) (A′)′ = A

(ii) (A+B)′ = A′ +B′

(iii) (AB)′ = B′A′
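The three transpose properties can be verified on hypothetical matrices. The matrices and helper names below are assumptions chosen for this sketch.

```python
def transpose(A):
    """Return A' by turning the columns of A into rows."""
    return [list(col) for col in zip(*A)]

def mat_add(A, B):
    """Return A + B element by element."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_mul(A, B):
    """Return AB using the row-by-column rule."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[3, 8, -9], [1, 0, 4]]        # 2 x 3
C = [[2, 0, 1], [5, 5, 5]]         # 2 x 3, same dimension as A
B = [[1, 2], [0, 1], [4, -1]]      # 3 x 2, conformable with A

print(transpose(A))                                   # [[3, 1], [8, 0], [-9, 4]]
print(transpose(transpose(A)) == A)                   # property (i)
print(transpose(mat_add(A, C)) == mat_add(transpose(A), transpose(C)))   # (ii)
print(transpose(mat_mul(A, B)) == mat_mul(transpose(B), transpose(A)))   # (iii)
```

Note the reversed order in property (iii): A′B′ would be a 3 × 3 matrix here, whereas (AB)′ is 2 × 2.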


Inverses

• The inverse of A is denoted as A−1 and is uniquely defined only if A is a square and nonsingular matrix, where

AA−1 = A−1A = I

• Example:

• Properties of inverse:

(i) (A−1)−1 = A

(ii) (AB)−1 = B−1A−1

(iii) (A′)−1 = (A−1)′


Solving a Linear System

• If Ax = d, then x = A−1d provided that A−1 exists.

• Example:
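As a sketch of x = A⁻¹d, consider the two-equation system x + y = 10, −3x + y = 2 from the beginning of these notes. The inverse below is stated as an assumption and then verified in code by checking that AA⁻¹ = I; exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Return AB using the row-by-column rule."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 1],
     [-3, 1]]
d = [[10],
     [2]]

# Candidate inverse (assumed here, then checked below).
A_inv = [[Fraction(1, 4), Fraction(-1, 4)],
         [Fraction(3, 4), Fraction(1, 4)]]

identity = [[1, 0], [0, 1]]
print(mat_mul(A, A_inv) == identity)   # confirms A_inv is the inverse of A

x = mat_mul(A_inv, d)                  # x = A^{-1} d
print(x == [[2], [8]])                 # the solution is x = 2, y = 8
```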


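Linear Dependence and Determinants

• Suppose we have the following equations

x1 + 2x2 = 1 (3)

2x1 + 4x2 = 2 (4)

where (4) is twice (3). Therefore, there is no unique solution for x1, x2.

• In matrix form: let Ax = d where

$$\underbrace{\begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}}_{A} \underbrace{\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}}_{x} = \underbrace{\begin{bmatrix} 1 \\ 2 \end{bmatrix}}_{d}$$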

• The determinant of the coefficient matrix A is

|A| = (1)(4)− (2)(2) = 0

• A determinant of zero tells us that the equations are linearly dependent.


Linear Dependence and Determinants

• In general, the determinant of a square matrix, A, is written as |A|.

• Determinants are defined only for square matrices.

• For the 2 × 2 case, |A| is uniquely determined by

$$|A| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}$$

• |A| = 0 implies linear dependence among the rows (and columns) of A.


Examples

• Example 1:

• Example 2:
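The 2 × 2 formula can be sketched as a small function (the name `det2` is an assumption). The first matrix is the linearly dependent coefficient matrix from equations (3) and (4); the second is a hypothetical nonsingular matrix chosen for illustration.

```python
def det2(A):
    """Return a11*a22 - a12*a21 for a 2 x 2 matrix A."""
    (a11, a12), (a21, a22) = A
    return a11 * a22 - a12 * a21

dependent = [[1, 2],
             [2, 4]]      # second row is twice the first
independent = [[10, 4],
               [8, 5]]    # hypothetical example

print(det2(dependent))     # 0, so the rows are linearly dependent
print(det2(independent))   # 18, so the rows are linearly independent
```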


3 × 3 Case

• However, this cross-diagonal method does not work for matrices larger than 3 × 3.
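For reference, the cross-diagonal method for a 3 × 3 matrix adds the three products taken along the diagonals running down to the right and subtracts the three products taken along the diagonals running up to the right. Written out with the usual element notation $a_{ij}$, this gives:

```latex
|A| =
\begin{vmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33}
\end{vmatrix}
= a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32}
- a_{13}a_{22}a_{31} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33}
```

A 4 × 4 determinant has 24 such product terms, which a diagonal pattern of this kind cannot capture; this is why a general method is needed for larger matrices.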


Laplace Expansion

• The Laplace expansion process evaluates the determinant of a matrix, A, by means of subdeterminants of A.

• Given

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$

• By deleting the first row and first column, we get a subdeterminant

$$|M_{11}| = \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}$$

which is called the minor of the element $a_{11}$.

• Similarly, define $|M_{ij}|$ as the subdeterminant obtained by deleting the i-th row and the j-th column.


Cofactors

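• A cofactor is a minor with a specific algebraic sign:

$$|C_{ij}| = (-1)^{i+j}|M_{ij}|$$

• For example,

$$|C_{11}| = (-1)^{2}|M_{11}| = |M_{11}|$$

$$|C_{12}| = (-1)^{3}|M_{12}| = -|M_{12}|$$

• The determinant of A is given by

$$\begin{aligned}
|A| &= a_{11}|C_{11}| + a_{12}|C_{12}| + a_{13}|C_{13}| \\
&= a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}
\end{aligned}$$

• Note that Laplace expansion can be used along any row or any column.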


Example
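A recursive sketch of Laplace expansion along a chosen row is given below. The function names and the 3 × 3 matrix are assumptions made for this illustration; indices are 0-based in the code, so the sign (−1)^(i+j) is unchanged because both indices shift by one.

```python
def minor(A, i, j):
    """Return the submatrix of A with row i and column j deleted (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det(A, row=0):
    """Return |A| by Laplace expansion along the given row (0-based)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    # Sum of element times cofactor; each minor is expanded along its first row.
    return sum((-1) ** (row + j) * A[row][j] * det(minor(A, row, j))
               for j in range(n))

A = [[2, 1, 3],
     [4, 5, 6],
     [7, 8, 9]]

print(det(A))          # -9, expanding along the first row
print(det(A, row=1))   # -9, expanding along the second row
print(det(A, row=2))   # -9, expanding along the third row
```

The three calls agree, which illustrates the note that the expansion may be taken along any row.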


Basic Properties of Determinants

(i) The interchange of rows and columns does not affect the value of a determinant, i.e., |A| = |A′|.

(ii) The interchange of any two rows (or any two columns) will alter the sign, but not the numerical value, of the determinant.

(iii) The multiplication of any one row (or one column) by a scalar k will change the value of the determinant k-fold.

(iv) The addition/subtraction of a multiple of any row (column) to/from another row (column) will leave the value of the determinant unaltered.

(v) If one row (column) is a multiple of another row (column), the value of the determinant will be zero.
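Each of the five properties can be checked on a hypothetical 3 × 3 matrix. The matrix, the helper names, and the scalar values are assumptions chosen for this sketch, and a single example illustrates the properties rather than proving them.

```python
def det(A):
    """Return |A| by Laplace expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j]
               * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

A = [[2, 1, 3],
     [4, 5, 6],
     [7, 8, 9]]
base = det(A)                                           # -9

transposed = [list(col) for col in zip(*A)]             # (i)   |A'| = |A|
swapped = [A[1], A[0], A[2]]                            # (ii)  rows 1 and 2 interchanged
scaled = [[3 * a for a in A[0]], A[1], A[2]]            # (iii) row 1 multiplied by k = 3
added = [A[0], [b + 2 * a for a, b in zip(A[0], A[1])], A[2]]   # (iv) row 2 + 2 * row 1
proportional = [A[0], [2 * a for a in A[0]], A[2]]      # (v)   row 2 = 2 * row 1

print(det(transposed) == base)       # True
print(det(swapped) == -base)         # True
print(det(scaled) == 3 * base)       # True
print(det(added) == base)            # True
print(det(proportional) == 0)        # True
```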


Rank of a Matrix

• The rank of a matrix is the maximum number of linearly independent rows in the matrix.

• If A is an m × n matrix, then the rank of A satisfies

rank(A) ≤ min{m, n}

• How to use determinants to find the rank of a matrix?

Step 1 Suppose A is n × n and |A| = 0, so that rank(A) < n.

Step 2 Then delete one row and one column, and find the determinant of this new (n − 1) × (n − 1) matrix.

Step 3 Continue this process, checking the submatrices of each size, until you find a non-zero determinant. The rank of A is the dimension of the largest submatrix whose determinant is non-zero.
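The steps above can be sketched as a brute-force search over square submatrices, from the largest size downward. This is an illustrative implementation under the assumption that the matrix is small (the number of submatrices grows quickly), and the function names are hypothetical. It also handles non-square matrices by starting from size min{m, n}.

```python
from itertools import combinations

def det(A):
    """Return |A| by Laplace expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j]
               * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def rank(A):
    """Return the size of the largest square submatrix with non-zero determinant."""
    m, n = len(A), len(A[0])
    for k in range(min(m, n), 0, -1):           # try the largest size first
        for rows in combinations(range(m), k):  # keep k rows ...
            for cols in combinations(range(n), k):  # ... and k columns
                sub = [[A[r][c] for c in cols] for r in rows]
                if det(sub) != 0:
                    return k
    return 0                                    # only a null matrix has rank 0

print(rank([[1, 2], [2, 4]]))                   # 1
print(rank([[2, 1, 3], [4, 5, 6], [7, 8, 9]]))  # 3
print(rank([[1, 2, 3], [2, 4, 6]]))             # 1
```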


Matrix Inversion

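• Given an n × n matrix A with |A| ≠ 0, the cofactor matrix of A (denoted by C) is a matrix whose elements are the cofactors of the elements of A.

• For example, if $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$, then

$$C = \begin{bmatrix} |C_{11}| & |C_{12}| \\ |C_{21}| & |C_{22}| \end{bmatrix} = \begin{bmatrix} a_{22} & -a_{21} \\ -a_{12} & a_{11} \end{bmatrix}$$

• The inverse of A is given by

$$A^{-1} = \frac{1}{|A|}\,\operatorname{adj} A$$

where adj A is the adjoint matrix of A and is defined as C′.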


Matrix Inversion

• The general procedure for finding the inverse of a square matrix A involves the following steps:

(i) find |A| to see if |A| = 0 (if so, then A−1 is undefined).

(ii) find the cofactors of all the elements of A, and arrange them as a matrix C.

(iii) take the transpose of C to get adj A.

(iv) divide adj A by |A|.


Example
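The four-step procedure can be sketched as follows. The function names and the 2 × 2 matrix are assumptions chosen for illustration, and exact fractions are used so that the check AA⁻¹ = I holds without rounding error. The sketch assumes integer (or Fraction) entries.

```python
from fractions import Fraction

def minor(A, i, j):
    """Return the submatrix of A with row i and column j deleted (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det(A):
    """Return |A| by Laplace expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def inverse(A):
    """Return the inverse of A as (1/|A|) adj A, following steps (i) to (iv)."""
    n = len(A)
    d = det(A)                                        # step (i)
    if d == 0:
        raise ValueError("matrix is singular; the inverse is undefined")
    C = [[(-1) ** (i + j) * det(minor(A, i, j))       # step (ii): cofactor matrix
          for j in range(n)] for i in range(n)]
    adj = [list(col) for col in zip(*C)]              # step (iii): adj A = C'
    return [[Fraction(a, d) for a in row] for row in adj]   # step (iv)

def mat_mul(A, B):
    """Return AB using the row-by-column rule."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[3, 1],
     [0, 2]]
A_inv = inverse(A)   # entries 1/3, -1/6 in the first row and 0, 1/2 in the second

print(mat_mul(A, A_inv) == [[1, 0], [0, 1]])   # True
```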


Cramer’s Rule

• Suppose there are two equations

a1x1 + a2x2 = d1

b1x1 + b2x2 = d2

written in matrix form as Ax = d, where

$$\underbrace{\begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix}}_{A} \underbrace{\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}}_{x} = \underbrace{\begin{bmatrix} d_1 \\ d_2 \end{bmatrix}}_{d}$$

with $|A| = a_1b_2 - b_1a_2 \neq 0$.

• After some manipulation, we have

$$x_1 = \frac{d_1b_2 - d_2a_2}{a_1b_2 - b_1a_2}$$

where the denominator is |A|, and the numerator is the same as the denominator except that d1 and d2 replace a1 and b1, respectively.


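Cramer’s Rule

• Therefore, we can rewrite x1 as

$$x_1 = \frac{\begin{vmatrix} d_1 & a_2 \\ d_2 & b_2 \end{vmatrix}}{\begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix}} = \frac{d_1b_2 - d_2a_2}{a_1b_2 - b_1a_2}$$

where d = [d1 d2]′ replaces column 1 of A in the numerator.

• Similarly,

$$x_2 = \frac{\begin{vmatrix} a_1 & d_1 \\ b_1 & d_2 \end{vmatrix}}{\begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix}} = \frac{a_1d_2 - b_1d_1}{a_1b_2 - b_1a_2}$$

where d = [d1 d2]′ replaces column 2 of A in the numerator.

• Generally, to find xi for i = 1, . . . , n, replace column i of A with the vector d, calculate the determinant of the resulting matrix, and divide it by |A|.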


Example
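As an illustration of the general rule, the sketch below applies Cramer's rule to the three-equation system from Example 4 (6x1 + 3x2 + x3 = 22, x1 + 4x2 − 2x3 = 12, 4x1 − x2 + 5x3 = 10). The function names are assumptions, and the choice of this system is made here for illustration; it is not necessarily the example used in the lecture.

```python
from fractions import Fraction

def det(A):
    """Return |A| by Laplace expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j]
               * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def cramer(A, d):
    """Solve Ax = d by replacing column i of A with d and dividing by |A|."""
    det_A = det(A)
    if det_A == 0:
        raise ValueError("|A| = 0: Cramer's rule does not apply")
    solution = []
    for i in range(len(A)):
        # A_i is A with its i-th column replaced by the vector d.
        A_i = [row[:i] + [d[r]] + row[i + 1:] for r, row in enumerate(A)]
        solution.append(Fraction(det(A_i), det_A))
    return solution

A = [[6, 3, 1],
     [1, 4, -2],
     [4, -1, 5]]
d = [22, 12, 10]

x = cramer(A, d)
print(det(A))            # 52
print(x == [2, 3, 1])    # True: x1 = 2, x2 = 3, x3 = 1
```

Cramer's rule requires n + 1 determinants of size n, so for large systems it is mainly of theoretical interest; for small systems like this one it gives each unknown directly.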

