05. matrix modelsweb.yonsei.ac.kr/hgjung/lectures/mat203/05 matrix models.pdf · 2014. 12. 29. ·...

E-mail: [email protected]

web.yonsei.ac.kr/hgjung

5. Matrix Models


5.1. Dynamical Systems and Markov Chains

In this section, we will show how matrix methods can be used to analyze the behavior of physical systems that evolve over time.



A dynamical system is a finite set of variables whose values change with time.

The value of a variable at a point in time is called the state of the variable at that time, and the vector formed from these states is called the state of the dynamical system at that time.

Dynamical Systems



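Example 1

Suppose that two competing television news channels, channel 1 and channel 2, each have 50% of the viewer market at some initial point in time. Assume that over each one-year period channel 1 captures 10% of channel 2’s share, and channel 2 captures 20% of channel 1’s share. What is each channel’s market share after one year?

Let $x_1(t)$ and $x_2(t)$ denote the market shares of channel 1 and channel 2 at time t (in years), so that $x_1(0) = x_2(0) = 0.5$. Over one year channel 1 keeps 80% of its own share and gains 10% of channel 2’s share, while channel 2 keeps 90% of its own share and gains 20% of channel 1’s share:

$x_1(1) = 0.8(0.5) + 0.1(0.5) = 0.45$

$x_2(1) = 0.2(0.5) + 0.9(0.5) = 0.55$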



Example 2

Track the market shares of channel 1 and channel 2 in Example 1 over a five-year period.
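The five-year tracking asked for in Example 2 can be sketched in a few lines of plain Python. The coefficients 0.8, 0.1, 0.2, and 0.9 come from Example 1; the helper name `step` and the list `history` are illustrative choices, not part of the lecture.

```python
def step(P, x):
    """Return the matrix-vector product P x (P is a list of rows)."""
    return [sum(p * v for p, v in zip(row, x)) for row in P]

# Yearly update matrix assembled from the coefficients in Example 1:
# row 1 gives channel 1's new share, row 2 gives channel 2's new share.
P = [[0.8, 0.1],
     [0.2, 0.9]]

x = [0.5, 0.5]        # initial market shares of channel 1 and channel 2
history = [x]         # history[t] is the state after t years
for year in range(1, 6):
    x = step(P, x)
    history.append(x)
    print(f"year {year}: channel 1 = {x[0]:.5f}, channel 2 = {x[1]:.5f}")
```

Each state is obtained from the previous one, so the shares always sum to 1 and channel 1's share shrinks year by year.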



In many dynamical systems the states of the variables are not known with certainty but can be expressed as probabilities; such dynamical systems are called stochastic processes.

Markov Chains

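In a stochastic process with n possible states, the state vector at each time t has the form

$$x(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_n(t) \end{bmatrix}$$

where $x_i(t)$ is the probability that the system is in state i at time t. The entries in this vector must add up to 1 since they account for all n possibilities. In general, a vector with nonnegative entries that add up to 1 is called a probability vector.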



A square matrix, each of whose columns is a probability vector, is called a stochastic matrix.




Example 4

Suppose that a tagged lion can migrate over three adjacent game reserves in search of food, reserve 1, reserve 2, and reserve 3. Based on data about the food resources, researchers conclude that the monthly migration pattern of the lion can be modeled by a Markov chain with a 3×3 transition matrix P.

Assuming that t is in months and the lion is released in reserve 2 at time t=0, track its probable locations over a six-month period.

Let $x_1(k)$, $x_2(k)$, and $x_3(k)$ be the probabilities that the lion is in reserve 1, 2, or 3, respectively, at time t=k, and let $x(k) = [x_1(k), x_2(k), x_3(k)]^T$ be the state vector at that time. Since the lion is released in reserve 2, the initial state vector is $x(0) = [0, 1, 0]^T$.



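Markov Chains in Terms of Powers of the Transition Matrix

In a Markov chain with transition matrix P, successive state vectors are related by

$$x(k+1) = P\,x(k)$$

For brevity, writing $x_k$ for $x(k)$, this becomes

$$x_{k+1} = P x_k$$

Alternatively,

$$x_1 = P x_0, \quad x_2 = P x_1 = P^2 x_0, \quad x_3 = P x_2 = P^3 x_0, \quad \ldots$$

from which it follows that

$$x_k = P^k x_0$$

so the state vector $x_k$ can be computed directly from the initial state $x_0$ without computing all of the intermediate states.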
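As a minimal sketch of computing a state directly from a power of the transition matrix, the snippet below forms $P^k$ by repeated squaring and applies it to the initial state. The matrix is the one implied by Example 1; the helper names `mat_mul`, `mat_pow`, and `state_after` are hypothetical.

```python
def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_pow(P, k):
    """Compute P**k for a square matrix P by square-and-multiply."""
    n = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    base = P
    while k > 0:
        if k & 1:                      # this binary digit of k is set
            result = mat_mul(result, base)
        base = mat_mul(base, base)     # P, P^2, P^4, ...
        k >>= 1
    return result

def state_after(P, x0, k):
    """Return x_k = P^k x_0 without forming x_1, ..., x_{k-1}."""
    Pk = mat_pow(P, k)
    return [sum(Pk[i][j] * x0[j] for j in range(len(x0)))
            for i in range(len(x0))]

P = [[0.8, 0.1],
     [0.2, 0.9]]                       # transition matrix from Example 1
x5 = state_after(P, [0.5, 0.5], 5)     # market shares after five years
print(x5)
```

Repeated squaring needs only about $\log_2 k$ matrix products, which matters when k is large.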



We will say that a sequence of vectors

$$x_1, x_2, \ldots, x_k, \ldots$$

approaches a limit q, or that it converges to q, if all entries in $x_k$ can be made as close as we like to the corresponding entries in the vector q by taking k sufficiently large. We denote this by writing

$$x_k \to q \quad \text{as } k \to \infty$$



By imposing a mild condition on the transition matrix of a Markov chain, we can guarantee that the state vectors will approach a limit.


The vector q, which is called the steady-state vector of the Markov chain, is a fixed point of the transition matrix P (that is, $Pq = q$) and hence can be found by solving the homogeneous linear system

$$(I - P)q = 0$$

subject to the requirement that the solution be a probability vector.
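As a worked illustration, assume the 2×2 transition matrix implied by the update equations of Example 1 (this choice of matrix is an assumption made for the illustration; it is not necessarily the matrix used in the examples that follow). Solving the fixed-point system by hand gives:

```latex
\[
P = \begin{bmatrix} 0.8 & 0.1 \\ 0.2 & 0.9 \end{bmatrix},
\qquad
(I - P)\,q =
\begin{bmatrix} 0.2 & -0.1 \\ -0.2 & 0.1 \end{bmatrix}
\begin{bmatrix} q_1 \\ q_2 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\;\Longrightarrow\; q_2 = 2 q_1 .
\]
\[
q_1 + q_2 = 1
\;\Longrightarrow\;
q = \begin{bmatrix} 1/3 \\ 2/3 \end{bmatrix},
\qquad
Pq = \begin{bmatrix} 0.8/3 + 0.2/3 \\ 0.2/3 + 1.8/3 \end{bmatrix}
   = \begin{bmatrix} 1/3 \\ 2/3 \end{bmatrix} = q .
\]
```

The two rows of $I - P$ are multiples of each other, so the homogeneous system alone determines q only up to scale; the probability-vector requirement supplies the missing condition.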



When the transition matrix for a Markov chain is


Example 8

Find the steady-state vector q.



When the transition matrix for a Markov chain is


Example 9

Find the steady-state vector q.


5.2. Leontief Input-Output Models

In 1973 the economist Wassily Leontief was awarded the Nobel prize for his work on economic modeling in which he used matrix methods to study the relationship between different sectors in an economy. In this section we will discuss some of the ideas developed by Leontief.



A simple economy might be divided into three sectors – manufacturing, agriculture, and utilities. Typically, a sector will produce certain outputs but will require inputs from the other sectors and itself.

We can imagine an economy to be a network in which inputs and outputs flow in and out of the sectors; the study of such flow is called input-output analysis.

Inputs and Outputs in An Economy

There may exist sectors that consume outputs without producing anything themselves (the consumer market, for example). Those sectors are called open sectors. Economies with no open sectors are called closed economies, and economies with one or more open sectors are called open economies.

Our primary goal will be to determine the output levels that are required for the productive sectors to sustain themselves and satisfy the demand of the open sector.



Assume that the inputs required by the productive sectors to produce one dollar’s worth of output are in accordance with the following table.

Leontief Model of An Open Economy

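The matrix C formed from the entries of this table is called the consumption matrix (or technology matrix) of the economy, and its column vectors are called the consumption vectors of the sectors.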



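Suppose that the open sector wants the economy to supply it with manufactured goods, agricultural products, and utilities with dollar values $d_1$, $d_2$, and $d_3$, respectively. The column vector $d = [d_1, d_2, d_3]^T$ is called the outside demand vector.

Suppose that the dollar values of output that the three productive sectors must produce to do this are $x_1$, $x_2$, and $x_3$. The column vector $x = [x_1, x_2, x_3]^T$ is called the production vector.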



Since the product-producing sectors consume some of their own output, the dollar value of their output must cover their own needs plus the outside demand.

The portion of the production vector x that will be consumed by the three productive sectors is $Cx$, which is called the intermediate demand vector.




Thus, if the outside demand vector is d, then x must satisfy the equation

$$x - Cx = d$$

which we will find convenient to rewrite as

$$(I - C)x = d \qquad (2)$$

The matrix $I - C$ is called the Leontief matrix and (2) is called the Leontief equation.



To meet the outside demand, the vector x must satisfy the Leontief equation (2).


Example 1



The same ideas apply to an open economy with n product-producing sectors.

Productive Open Economies

A production vector x that meets the demand d of the outside sector must satisfy the Leontief equation

$$(I - C)x = d$$

If the matrix $I - C$ is invertible, then this equation has the unique solution

$$x = (I - C)^{-1} d$$
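A minimal sketch of solving the Leontief equation numerically is given below. The two-sector consumption matrix `C` and demand vector `d` are made-up illustrative values, not the table from the lecture, and `solve` is a small hand-written Gaussian elimination used so that the example needs no external library.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                   # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def production_vector(C, d):
    """Solve the Leontief equation (I - C) x = d for x."""
    n = len(C)
    leontief = [[(1.0 if i == j else 0.0) - C[i][j] for j in range(n)]
                for i in range(n)]
    return solve(leontief, d)

# Hypothetical two-sector economy (illustrative numbers only).
C = [[0.50, 0.25],
     [0.25, 0.50]]
d = [30.0, 60.0]
x = production_vector(C, d)
print(x)
```

For these values the production vector is (160, 200): the sectors consume Cx = (130, 140) of their own output, leaving exactly the outside demand (30, 60).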



An economy for which $(I-C)^{-1}$ has nonnegative entries is said to be productive. Such economies are particularly nice because every demand can be met by some appropriate level of production.

From Theorem 3.6.7, if the column sums of the consumption matrix C are all less than 1, then $(I-C)^{-1}$ exists and has nonnegative entries.
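The column-sum condition is easy to check mechanically. The sketch below is only an illustration; the function names and the two sample matrices are hypothetical, not taken from the lecture.

```python
def column_sums(C):
    """Return the list of column sums of a matrix given as a list of rows."""
    return [sum(row[j] for row in C) for j in range(len(C[0]))]

def column_sums_below_one(C):
    """True when every column sum of the consumption matrix is less than 1."""
    return all(s < 1 for s in column_sums(C))

# Illustrative consumption matrices (made-up values).
C_ok = [[0.50, 0.25],
        [0.25, 0.50]]       # column sums 0.75 and 0.75
C_bad = [[0.60, 0.10],
         [0.50, 0.20]]      # first column sums to 1.10

print(column_sums_below_one(C_ok), column_sums_below_one(C_bad))
```

A column sum below 1 means the sector spends less than one dollar on inputs for each dollar of output it produces.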


Example 2


5.3. Gauss-Seidel and Jacobi Iteration

Many mathematical models lead to large linear systems in which the coefficient matrix has a high proportion of zeros. In such cases the system and its coefficient matrix are said to be sparse.

In this section we will discuss two methods that are appropriate for linear systems with sparse coefficient matrices.



Iterative Methods

Let Ax=b be a linear system of n equations in n unknowns with an invertible coefficient matrix A. An iterative method for solving such a system is an algorithm that generates a sequence of vectors

$$x_1, x_2, \ldots, x_k, \ldots$$

called iterates, that converges to the exact solution x in the sense that the entries of $x_k$ can be made as close as we like to the entries of x by making k sufficiently large.



The basic procedure for designing iterative methods is to devise matrices B and c that allow the system Ax=b to be rewritten in the form

$$x = Bx + c \qquad (1)$$

This modified equation is then solved by forming the recurrence relation

$$x_{k+1} = Bx_k + c \qquad (2)$$

and proceeding as follows:

1) Choose an arbitrary initial approximation $x_0$.

2) Substitute $x_0$ into the right side of (2) and compute the first iterate $x_1 = Bx_0 + c$.

3) Substitute $x_1$ into the right side of (2) and compute the second iterate $x_2 = Bx_1 + c$.

4) Keep repeating the procedure of Steps 2 and 3, substituting each new iterate into the right side of (2), thereby producing a third iterate, a fourth iterate, and so on. Generate as many iterates as may be required to achieve the desired accuracy.



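Jacobi Iteration

Let Ax=b be a linear system of n equations in n unknowns in which A is invertible and has nonzero diagonal entries. Let D be the n×n diagonal matrix formed from the diagonal entries of A. The matrix D is invertible since its diagonal entries are nonzero, and hence we can rewrite Ax=b as

$$(A - D)x + Dx = b$$

$$Dx = (D - A)x + b$$

$$x = D^{-1}(D - A)x + D^{-1}b \qquad (3)$$

Equation (3) has the form $x = Bx + c$ with $B = D^{-1}(D - A)$ and $c = D^{-1}b$, so the corresponding recurrence relation is

$$x_{k+1} = D^{-1}(D - A)x_k + D^{-1}b \qquad (4)$$

The iteration algorithm that uses this formula is called Jacobi iteration or the method of simultaneous displacements.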
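The following is a minimal sketch of Jacobi iteration written component by component. The 3×3 system is a made-up strictly diagonally dominant example with exact solution (1, -1, 2), not the system from the lecture's example; the stopping test (successive iterates agree when rounded to four decimal places) and the `max_iter` safeguard are choices made for this sketch.

```python
def jacobi(A, b, x0=None, decimals=4, max_iter=500):
    """Jacobi iteration; returns (approximate solution, iterations used)."""
    n = len(A)
    x = list(x0) if x0 is not None else [0.0] * n
    for k in range(1, max_iter + 1):
        # Every entry of the new iterate is built from the previous iterate only.
        x_new = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
        # Stop when successive iterates agree after rounding.
        if all(round(u, decimals) == round(v, decimals)
               for u, v in zip(x_new, x)):
            return x_new, k
        x = x_new
    raise RuntimeError("Jacobi iteration did not settle within max_iter steps")

# Hypothetical system with exact solution (1, -1, 2).
A = [[10.0, 1.0, -1.0],
     [1.0, -10.0, 1.0],
     [-1.0, 1.0, 10.0]]
b = [7.0, 13.0, 18.0]
solution, iterations = jacobi(A, b)
print(solution, iterations)
```

Because `x_new` is assembled from the old vector `x` in one pass, all components are updated "simultaneously", which is where the name method of simultaneous displacements comes from.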



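If Ax=b is the linear system

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n
\end{aligned}$$

then the individual equations in (3) are

$$\begin{aligned}
x_1 &= \frac{1}{a_{11}}\left(b_1 - a_{12}x_2 - a_{13}x_3 - \cdots - a_{1n}x_n\right) \\
x_2 &= \frac{1}{a_{22}}\left(b_2 - a_{21}x_1 - a_{23}x_3 - \cdots - a_{2n}x_n\right) \\
&\;\;\vdots \\
x_n &= \frac{1}{a_{nn}}\left(b_n - a_{n1}x_1 - a_{n2}x_2 - \cdots - a_{n,n-1}x_{n-1}\right)
\end{aligned}$$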



Use Jacobi iteration to approximate the solution of the system

Example 1

Stop the process when the entries in two successive iterates are the same when rounded to four decimal places.



which we can write in matrix form as




Jacobi iteration is reasonable for small linear systems but the convergence tends to be too slow for large systems.

Since the new x-values are expected to be more accurate than their predecessors, it seems reasonable that better accuracy might be obtained by using the new x-values as soon as they become available. If this is done, then the resulting algorithm is called Gauss-Seidel iteration or the method of successive displacements.

Gauss-Seidel Iteration
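A minimal sketch of Gauss-Seidel iteration follows. It differs from a Jacobi sweep only in that each updated entry is written back into the working vector immediately. The system is a made-up strictly diagonally dominant example with exact solution (1, -1, 2); the rounding-based stopping test and `max_iter` are assumptions of this sketch.

```python
def gauss_seidel(A, b, x0=None, decimals=4, max_iter=500):
    """Gauss-Seidel iteration; returns (approximate solution, sweeps used)."""
    n = len(A)
    x = list(x0) if x0 is not None else [0.0] * n
    for k in range(1, max_iter + 1):
        previous = x[:]
        for i in range(n):
            # x[j] with j < i already holds the value computed in this sweep.
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
        # Stop when successive iterates agree after rounding.
        if all(round(u, decimals) == round(v, decimals)
               for u, v in zip(x, previous)):
            return x, k
    raise RuntimeError("Gauss-Seidel iteration did not settle within max_iter steps")

# Hypothetical system with exact solution (1, -1, 2).
A = [[10.0, 1.0, -1.0],
     [1.0, -10.0, 1.0],
     [-1.0, 1.0, 10.0]]
b = [7.0, 13.0, 18.0]
solution, sweeps = gauss_seidel(A, b)
print(solution, sweeps)
```

Updating `x` in place is what makes the displacements "successive": the second equation already sees the new value of the first unknown.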



Example 1



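Convergence

We define a square matrix

$$A = [a_{ij}]_{n \times n}$$

to be strictly diagonally dominant if the absolute value of each diagonal entry is greater than the sum of the absolute values of the remaining entries in the same row; that is,

$$|a_{ii}| > \sum_{j \neq i} |a_{ij}| \qquad \text{for } i = 1, 2, \ldots, n$$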



Not strictly diagonally dominant

Example 3

Strictly diagonally dominant since



Convergence
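The definition of strict diagonal dominance translates directly into a row-by-row test. The sketch below is illustrative; the function name and both sample matrices are hypothetical, not the matrices from Example 3.

```python
def is_strictly_diagonally_dominant(A):
    """True if |a_ii| exceeds the sum of |a_ij|, j != i, in every row."""
    n = len(A)
    return all(
        abs(A[i][i]) > sum(abs(A[i][j]) for j in range(n) if j != i)
        for i in range(n)
    )

dominant = [[10, 1, -1],
            [1, -10, 1],
            [-1, 1, 10]]      # each row: 10 > 1 + 1
not_dominant = [[1, 2],
                [3, 4]]       # first row fails: 1 < 2

print(is_strictly_diagonally_dominant(dominant))
print(is_strictly_diagonally_dominant(not_dominant))
```

The comparison is strict, so a row in which the diagonal entry merely equals the sum of the other absolute values does not qualify.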


5.4. The Power Method

In this section we will discuss an algorithm that can be used to approximate the eigenvalue with greatest absolute value and a corresponding eigenvector.



The Power Method

There are many applications in which some vector $x_0$ in $R^n$ is multiplied repeatedly by an n×n matrix A to produce a sequence

$$x_0, \; Ax_0, \; A^2x_0, \; \ldots, \; A^kx_0, \; \ldots$$

We call a sequence of this form a power sequence generated by A.



Example 1

then λ1=-4 is dominant.

then there is no dominant eigenvalue.




(1) can also be expressed as



We will not prove Theorem 5.4.2, but we can make it plausible geometrically in the 2×2 case where A is a symmetric matrix with distinct positive eigenvalues λ1 and λ2, one of which is dominant.

Since we are assuming that A is symmetric and has distinct eigenvalues, Theorem 4.4.11 tells us that the eigenspaces corresponding to λ1 and λ2 are perpendicular lines through the origin.

Thus, the assumption that x0 is a unit vector that is not orthogonal to the eigenspace corresponding to λ1 implies that x0 does not lie in the eigenspace corresponding to λ2.



To help understand the geometric effect of multiplying $x_0$ by A, it will be useful to split $x_0$ into the sum

$$x_0 = v_0 + w_0 \qquad (4)$$

where $v_0$ and $w_0$ are the orthogonal projections of $x_0$ on the eigenspaces of λ1 and λ2, respectively.

This enables us to express $Ax_0$ as

$$Ax_0 = Av_0 + Aw_0 = \lambda_1 v_0 + \lambda_2 w_0$$

which tells us that multiplying $x_0$ by A “scales” the terms $v_0$ and $w_0$ in (4) by λ1 and λ2, respectively.



However, λ1 is larger than λ2, so the scaling is greater in the direction of $v_0$ than in the direction of $w_0$.

Thus, it seems reasonable that by repeatedly multiplying by A and normalizing we will produce a sequence of vectors $x_k$ that lie on the unit circle and converge to a unit vector x in the eigenspace of λ1.

Moreover, if $x_k$ converges to x, then it also seems reasonable that $Ax_k \cdot x_k$ will converge to

$$Ax \cdot x = \lambda_1 x \cdot x = \lambda_1 \|x\|^2 = \lambda_1$$

which is the dominant eigenvalue of A.



The Power Method with Euclidean Scaling
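The procedure described above (multiply by A, normalize to unit length, and estimate the eigenvalue by $Ax_k \cdot x_k$) can be sketched as follows. The symmetric matrix used here is an assumption: it was chosen to have eigenvalues 1 and 5, and it is not necessarily the matrix of the lecture's example. The helper names are hypothetical.

```python
import math

def mat_vec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def power_method_euclidean(A, x0, steps):
    """Power method with Euclidean scaling; returns (eigenvalue estimate, unit vector)."""
    norm0 = math.sqrt(dot(x0, x0))
    x = [v / norm0 for v in x0]               # start from a unit vector
    for _ in range(steps):
        y = mat_vec(A, x)
        norm = math.sqrt(dot(y, y))
        x = [v / norm for v in y]             # x_k = A x_{k-1} / ||A x_{k-1}||
    eigenvalue = dot(mat_vec(A, x), x)        # lambda^(k) = A x_k . x_k
    return eigenvalue, x

# Assumed symmetric matrix with eigenvalues 1 and 5.
A = [[3.0, 2.0],
     [2.0, 3.0]]
lam, x = power_method_euclidean(A, [1.0, 0.0], steps=10)
print(lam, x)
```

The starting vector (1, 0) is not orthogonal to the dominant eigenspace, which is the condition the geometric argument relies on.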

Example 2

In Example 6 of Section 4.4, we found the eigenvalues of A to be λ=1 and λ=5, so the dominant eigenvalue of A is λ=5.

We also showed that the eigenspace corresponding to λ=5 is the line in vector form as

Thus, two dominant unit eigenvectors are




Thus, $\lambda^{(5)}$ approximates the dominant eigenvalue to five decimal place accuracy and $x_5$ approximates the dominant eigenvector correctly to three decimal place accuracy.



The Power Method with Maximum Entry Scaling

Here max(x) denotes the maximum absolute value of the entries in x, and the quotient

$$\frac{Ax \cdot x}{x \cdot x}$$

is called the Rayleigh quotient of A.
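A minimal sketch of the power method with maximum entry scaling is shown below: each iterate is divided by the largest absolute value among its entries, and the eigenvalue is estimated with the Rayleigh quotient. The matrix is an assumed symmetric example with eigenvalues 1 and 5, and the helper names are hypothetical.

```python
def mat_vec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def power_method_max_entry(A, x0, steps):
    """Power method with maximum entry scaling; returns (Rayleigh quotient, vector)."""
    x = list(x0)
    for _ in range(steps):
        y = mat_vec(A, x)
        scale = max(abs(v) for v in y)        # max(A x_{k-1})
        x = [v / scale for v in y]            # largest entry now has size 1
    rayleigh = dot(mat_vec(A, x), x) / dot(x, x)
    return rayleigh, x

# Assumed symmetric matrix with eigenvalues 1 and 5.
A = [[3.0, 2.0],
     [2.0, 3.0]]
lam, x = power_method_max_entry(A, [1.0, 0.0], steps=10)
print(lam, x)
```

Compared with Euclidean scaling, this variant avoids square roots; the price is that the iterates are no longer unit vectors, so the eigenvalue estimate must divide by $x \cdot x$.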



Example 3



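Rate of Convergence

If A is a symmetric matrix whose distinct eigenvalues can be arranged so that

$$|\lambda_1| > |\lambda_2| \geq |\lambda_3| \geq \cdots \geq |\lambda_k|$$

then the “rate” at which the Rayleigh quotients converge to the dominant eigenvalue λ1 depends on the ratio

$$\frac{|\lambda_1|}{|\lambda_2|}$$

The convergence is slow when this ratio is near 1 and rapid when it is large.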



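Stopping Procedures

A rule for deciding when to stop an iterative process is called a stopping procedure.

In applications one usually knows the relative error E that can be tolerated in the dominant eigenvalue, so a natural rule would be to stop as soon as the relative error

$$\left| \frac{\lambda - \lambda^{(k)}}{\lambda} \right|$$

falls below E. However, there is a problem in computing the relative error since the eigenvalue λ is unknown. In practice it is replaced by the estimated relative error

$$\left| \frac{\lambda^{(k)} - \lambda^{(k-1)}}{\lambda^{(k)}} \right|$$

which uses only quantities that have already been computed.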
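A stopping procedure based on the estimated relative error can be sketched as below. The iteration uses Euclidean scaling; the tolerance, the `max_iter` safeguard, the assumed test matrix (eigenvalues 1 and 5), and all function names are illustrative assumptions.

```python
import math

def mat_vec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def power_method_with_stopping(A, x0, tolerance, max_iter=1000):
    """Iterate until the estimated relative error drops below `tolerance`."""
    norm0 = math.sqrt(dot(x0, x0))
    x = [v / norm0 for v in x0]
    previous = None
    for k in range(1, max_iter + 1):
        y = mat_vec(A, x)
        norm = math.sqrt(dot(y, y))
        x = [v / norm for v in y]
        estimate = dot(mat_vec(A, x), x)      # lambda^(k)
        if previous is not None:
            # |lambda^(k) - lambda^(k-1)| / |lambda^(k)|; assumes estimate != 0.
            if abs((estimate - previous) / estimate) < tolerance:
                return estimate, x, k
        previous = estimate
    raise RuntimeError("tolerance not reached within max_iter iterations")

A = [[3.0, 2.0],
     [2.0, 3.0]]                              # assumed matrix, eigenvalues 1 and 5
lam, x, k = power_method_with_stopping(A, [1.0, 0.0], tolerance=1e-8)
print(lam, k)
```

The estimated relative error compares successive approximations rather than the true eigenvalue, so it measures how much the estimate is still changing, not how far it is from λ.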



Many Internet search engines compare words in search phrases to words on pages and in page titles to determine a list of sites relevant to the search. Recently, the power method has been used to develop new kinds of search algorithms that are based on hyperlinks between pages, rather than content.

- PageRank algorithm used in Google

- HITS (Hypertext Induced Topic Search) used in Clever of IBM

The basic idea behind both methods is to construct appropriate matrices that describe the referencing structure of pages appropriate to the search, and then use the dominant eigenvectors of those matrices to list the pages in descending order of importance according to certain criteria.

An initial collection S0 of relevant sites → search set S → HITS algorithm to order those sites

An Application of The Power Method to Internet Searches



To explain the HITS algorithm, suppose that the search set S contains n sites, and define the adjacency matrix for S to be the n×n matrix $A=[a_{ij}]$ in which $a_{ij}=1$ if site i references site j and $a_{ij}=0$ if it does not.


Example 5



There are two basic roles that a site can play in the search engine process: the site may be a hub, meaning that it references many other sites, or it may be an authority, meaning that it is referenced by many other sites.

In general, if A is an adjacency matrix for n Internet sites, then the column sums of A measure the authority aspect of the sites and the row sums of A measure their hub aspect.

We call the vector h0 of row sums of A the initial hub vector of A, and we call the vector a0 of column sums of A the initial authority vector of A. The entries in the hub vector are called hub weights and those in the authority vector authority weights.


Example 6



It seems reasonable that if site 1 is to be considered the greatest authority, then more weight should be given to hubs that link to that site, and if site 4 is to be considered the major hub, then more weight should be given to sites that it links to.

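Once Clever has calculated the initial authority vector $a_0$, it then uses the information in that vector to create new hub and authority vectors $h_1$ and $a_1$ using the formulas

$$h_1 = \frac{A a_0}{\|A a_0\|}, \qquad a_1 = \frac{A^T h_1}{\|A^T h_1\|}$$

Repeating this process generates the interrelated sequences

$$h_k = \frac{A a_{k-1}}{\|A a_{k-1}\|} \qquad (18)$$

$$a_k = \frac{A^T h_k}{\|A^T h_k\|} \qquad (19)$$

Substituting (18) into (19) shows that $A^T h_k$ is a positive scalar multiple of $(A^TA)a_{k-1}$, which means that we can write (19) as

$$a_k = \frac{(A^TA)\,a_{k-1}}{\|(A^TA)\,a_{k-1}\|} \qquad (20)$$

Similarly, sequence (18) can be expressed as

$$h_k = \frac{(AA^T)\,h_{k-1}}{\|(AA^T)\,h_{k-1}\|} \qquad (21)$$

The matrices $AA^T$ and $A^TA$ are symmetric, and have positive dominant eigenvalues. Thus, Theorem 5.4.2 ensures that (20) and (21) will converge to dominant eigenvectors of $A^TA$ and $AA^T$, respectively.

The entries of these limiting vectors are the authority and hub weights that Clever uses to rank the search sites in order of importance as hubs and authorities.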
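The alternating hub and authority updates can be sketched as follows, assuming Euclidean normalization at each step and an initial authority vector made of the column sums of A. The four-site adjacency matrix is a made-up example, the site labels are zero-based indices, and the function names are hypothetical.

```python
import math

def normalize(v):
    """Scale v to unit Euclidean length (assumes v is not the zero vector)."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def hits(A, iterations=50):
    """Return (hub weights, authority weights) for adjacency matrix A."""
    n = len(A)
    # a_0: column sums of A, i.e. how many sites reference each site.
    a = [float(sum(A[i][j] for i in range(n))) for j in range(n)]
    h = [0.0] * n
    for _ in range(iterations):
        # h_k is A a_{k-1}, normalized.
        h = normalize([sum(A[i][j] * a[j] for j in range(n)) for i in range(n)])
        # a_k is A^T h_k, normalized.
        a = normalize([sum(A[i][j] * h[i] for i in range(n)) for j in range(n)])
    return h, a

# Hypothetical search set: A[i][j] = 1 if site i references site j.
A = [[0, 1, 1, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 0],
     [0, 1, 1, 0]]
hub, authority = hits(A)
ranking = sorted(range(len(A)), key=lambda j: -authority[j])
print(ranking)
```

In this small example site 2 is referenced by three sites and site 1 by two, so they come first and second in the authority ranking, while sites 0 and 3 (referenced by nobody) receive authority weight 0.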



Suppose that the Clever search engine produces 10 Internet sites in its search set and that the adjacency matrix for those sites is


Example 7

Use the HITS algorithm to rank the sites in decreasing order of authority for the Clever search engine.



From the entries in $a_{10}$ we conclude that sites 1, 6, 7, and 9 are probably irrelevant to the search and that the remaining sites should be searched in the order

Site 5, site 8, site 2, site 10, site 3 and 4 (a tie)





Although Theorems 5.4.2 and 5.4.3 are stated for symmetric matrices, they hold for certain nonsymmetric matrices as well. For example, we will show later in the text that these theorems are true for any n×n matrix A that has n linearly independent eigenvectors and a dominant eigenvalue and for stochastic matrices.

Indeed, Theorem 5.1.3 is essentially a statement about convergence of power sequences. More precisely, it can be proved that every stochastic matrix has a dominant eigenvalue of λ=1 and that every regular stochastic matrix has a unique dominant eigenvector that is also a probability vector.

Variations of the Power Method
