non stationary markov chain with constant casuative matrices

7/31/2019 Non Stationary Markov Chain With Constant Casuative Matrices

1/15

A Matrix Approach to Nonstationary ChainsAuthor(s): Frank Harary, Benjamin Lipstein and George P. H. StyanReviewed work(s):Source: Operations Research, Vol. 18, No. 6 (Nov. - Dec., 1970), pp. 1168-1181Published by: INFORMSStable URL: http://www.jstor.org/stable/169413 .Accessed: 08/05/2012 10:56

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at .http://www.jstor.org/page/info/about/policies/terms.jsp

JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of

content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new formsof scholarship. For more information about JSTOR, please contact [email protected].

INFORMS is collaborating with JSTOR to digitize, preserve and extend access to Operations Research.

http://www.jstor.org
http://www.jstor.org/action/showPublisher?publisherCode=informshttp://www.jstor.org/stable/169413?origin=JSTOR-pdfhttp://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/page/info/about/policies/terms.jsphttp://www.jstor.org/stable/169413?origin=JSTOR-pdfhttp://www.jstor.org/action/showPublisher?publisherCode=informs


2/15

A MATRIX APPROACH TO NONSTATIONARY

CHAINS

Frank Harary

University of Michigan, Ann Arbor, Michigan

Benjamin Lipstein

Sullivan, Stauffer, Colwell & Bayles, Inc., New York, N. Y.

andGeorge P. H. Styan

McGill University, Montreal, Quebec, Canada

(Received December 15, 1969)

A finite discrete nonstationary Markov chain is completely characterized(after the initial probability distribution has taken effect) by its timesequence of transition probability matrices Pi. The ith causative matrixCi is defined as the product P`7' (if it exists) times Pi+. Thus, the causa-tive matrices are analogous to derivatives in calculus as an indication ofrate of change. The eigenvalues and eigenvectors of a constant causativematrix C have been found useful in their connection with the tendency ofthe chain to be convergent or divergent. Results for two-state chains arepresented in some detail. A comprehensive bibliography of papers on non-stationary chains is included.

m HE FIELD OF Markov chains applies to a wide range of subjectI matter. The assumption that the transition probabilities of a Markov

chain are stationary leads to a rich body of theorems, which serve as a good

first approximation to the real world. There are, nevertheless, very manyreal situations to which the model of a stationary chain (M\arkov chainwith stationary transition probabilities) is inappropriate.

This paper develops a matrix approach (cf. LIPSTEIN[12,13] for studyingnonstationary chains. We explore the notion of a causative matrix that,when multiplying a transition probability matrix, yields the immediatelysubsequent one. The special model involving a constant causative matrixis studied in detail.

The general case of nonstationary chains has received only little atten-

tion in the literature. HAJNAL t3'41 MOTT [15"16' and SARYMSAKOV 201seem to be the only papers in English. LINNIK,[0'1 SARYMSAKOV, 17,18,19]and SARYMSAKOV ND MUSTAFIN 21] study the subject in Russian.

All of these papers consider nonstationary chains from a point of view1168


3/15

A Matrix Approach o Nonstationary Chains 1169

different from ours. We show that the basic descriptive characteristics ofnonstationary chains can be captured and described by means of a sequenceof causative matrices.

Nonstationary chains are characterized by either convergent or diver-gent behavior. When convergent, the chain tends toward complete inde-pendence, as represented by a Bernoulli process.

We contend that nonstationary chains in general can be studied moreeffectively through identifying a functional relation between their causativematrices. The constant chain constitutes a special case of such a relation.Higher-order nonstationary chains may be susceptible to systematic studythrough the derivation of higher-order causative matrices analogous tohigher-order differences or differentials.

A stationary chain is a stochastic process containing a finite number nof states E1, E2, *, * E, such that (a) there is an initial distribution (a,, a2,***, an), where ak is the probability that the first state is Ek, and (b) thereis a transition probability matrix

pll P12 p inp P21 P22 ... P2n

Pni Pn2 ... Pnn

where pij is the conditional probability of Ej occurring given that Ei is thepresent state. In view of (a) and (b), we have: O


4/15

1170 Frank Harary, Benjamin Lipstein, and George P. H. Styan

then the chain is nonstationary. Another kind of special case results fromthe situations in which the general transition matrix at time t, Pt, has as itsentries pij=fij(t). In this case, every entry is a function of t, and theprobabilistic behavior of the entire nonstationary process is known when fis given.

But there are many situations in which only the first few transitionmatrices Pt are known, and the problem is to predict the future behaviorof the chain. This is a problem of real interest in studying chains, bothstationary and nonstationary. In the case of stationary chains, the pre-diction of future behavior of the events is well known (cf. KEMENY ANDSNELL, [9 and STYAN AND SMITH 221). The graphical structure of thechain contributes additional information, as mentioned in HARARY ANDLIPSTEIN, 5] and HARARY, NORMAN, AND CARTWRIGHT. ] Our concernis to devise methods for a similar treatment of some nonstationary chains.

CAUSATIVE MATRICES

CONSIDER A nonstationary chain, as given by a sequence of transitionmatrices P1, P2, P3, . In order to describe the change occurring fromeach of these transition matrices to the next, we introduce an accompanyingsequence of matrices Ci, C2, , which will be called causative matricesdefined by the equations

P1C1= P2, P2C2= P3, *, PtCt=Pt+l, *. (1)

Each causative matrix can be immediately expressed in terms of the transi-tion matrices, provided they are all nonsingular:

C1= P1 P2, and in general, Ct = P-1Pt+?. (2)

We emphasize that the causative matrices Ct are merely devices for de-scribing the change involved between each transition matrix and the nextone.

In these terms, a stationary chain is the special case obtained by takingevery Ct =I, the identity matrix of order n. Of course when all the transi-tion matrices are different, none of the causative matrices will be the iden-tity. We note that the assumption that every Pt is nonsingular is not astrong restriction of generality, the reason being that even a small changein the values of the entries of a singular matrix will result in nonsingularity(cf. HOUSEHOLDER [8]).

To illustrate, given the two stochastic matrices

/0.7 0 0.3\ (0.9 o 0.i\'P1= 0.2 0.8 0), P2= 0 0.8 0.2),

\.1 0 0.9/ \.3 0 0.7/


5/15


we find/1.2 0 -0.2'

Cl=P]71P2= -0.3 1.0 0.3\0.2 0 0.8

It is not entirely coincidental hat the causative matrix C1 n this example resemblesthe identity matrix n the sense that the diagonal entries are near 1 and the otherentries near 0.

The causative matrices Ct have all been the ones that multiply theprobability matrix Pt at each time stage t on the right to obtain the nextmatrix Pt+,. There is no a priori reason for choosing right multiplicationfor this purpose rather than left. Thus, we may refer to the matrices Ct asright causative matrices and introduce the corresponding sequence Dt ofleft causative matrices induced by the nonstationary chain whose probabilitymatrix sequence is P1, P2, ', by the equations

D1Pi= P2, D2P2= P3, , DtPt= Pt+,) (3)

Analogously to (2), we have, when each Pt is nonsingular,

D1= P2PL1, D2= P3P1,2 , Dt= Pt+?Pt1 . (4)

In general, the left and right causative matrices are different. Just as theaction on a given matrix M by a permutation matrix A on the left, AM,permutes the rows of M, and, on the right, MA, permutes its columns, acorresponding effect is found with left and right causative matrices empha-sizing the rows and columns, respectively. In the language of binary rela-tions, this phenomenon is known as "directional duality" (cf. HARARY,NORMAN, AND CARTWRIGHT 1).

An example nvolving 2 X2 matrices will illustrate his point:

Pi=8

?0.2) P2(0.9

0.18)020.8/ 0.2 0.8/

(1 13 0.13 ) -pp 1= (117 -0.17)

On the other hand, when the roles of Pi and P2 are interchanged n this 2 X2 ex-ample, so that

P /= 0.9 0.1 and P2-(0.8 0.2\\0.2 0.8/ 0. 2 0.8/we find that

(0.89 0 11) and Dl=P2p10=(86 0.14)

We will use right causative matrices for the remainder of this section.By a (right) constant chain we will mean a nonstationary chain P1,

P2 ,in which all the (right) causative matrices are equal; we will call


6/15

1172 Frank Harary, Benjamin ipstein, nd George P. H. Styan

this matrix C. In this case, we verify at once that

Pt+s=PtC8, 8=0, 1, (5)Since C= PP1P2 = P2 1P3, it follows at once that P3 is expressible in terms ofP1 and P2 by the equation P3 = P2P1 1P2. In general, we find that

Pt+,= PYPtlPt. (6)

Therefore, every transition matrix Pt of a constant chain may be expressedin terms of P, and P2 by the equations

Pt= ( P2P-1) t-2p2 = VAp) t-2.( ,3 7

By definition of a constant chain, every transition matrix is determined assoon as P1 and C are given, and in fact Pt = P1Ct-1 ollows from (5). Con-siderable information about a constant chain can be obtained from C alone.

To ease the algebra, we set P1 = Q. Then, if P(u, v) denotes the transi-tion probability matrix from time periods u to v, we have

P(to+r, to+r+-1) =QC), (r=O, 1, .) (8)

where Q = P =P(to, to+ 1). The starting point to is arbitrary.Hence

P(to, to+r_+1) =Q .QCQC2. Qrr=H:r (QCg) = T1, (9)

say, is the transition probability matrix from time periods to to to+r-+l.The limiting properties of a constant chain depend on the convergence ordivergence of C' and Tr, as r-* o. LIPSTEIN 121 conjectured that Crdiverged only when the largest characteristic root of C exceeded one inabsolute value, and implied that if all roots of C equalled unity, then CI.That this is not so will be illustrated in the next section for the case of 2states.

THEOREM 1. Let Q and R be stochastic matrices of order n. Then if Q isnonsingular, the causative matrices

C=QRW, D=RQ1, (10)

have row sums of unity but may have negative elements.Proof. Let e= (1, 1, *, 1)' be a column vector with each component

unity. Then Qe = Re = e. Hence Q-'e = e and so Ce = Q-1Re = Q-1e = e.Similarly, De= e. Examples of C and D with at least one element negativewere given above, just below (4).

It follows from Theorem 1 that a causative matrix has a characteristicroot of unity. When all other roots are less than one in absolute value,Cr converges to el' as r-- oo, where 1' is the left-h and characteristic vectorof C corresponding to the unit root (the righthand vector is e). LIP-


7/15


STEIN[121 suggested that in such a case Tr also converged to el'. This isimmediate only for C stochastic; it may happen that C is not stochasticbut Tr and 1' have all elements nonnegative. In this latter case, we havebeen unable to prove in general that Tr and C have the same limit, sincethe number of component matrices in the product Tr increases with r. Inthis next section we study the situation for the two-state case. The resultsof HAJNAL, [],4 MOTT, [15161 and SARYMSAKOV 1201 do not seem to assist inobtaining more general results.

The question taken up in the next section is: when does Tr cease to bestochastic (C nonstochastic)? That is, what values can C take on so thatthe chain may have a constant causative matrix?

TWO-STATE NONSTATIONARY CHAINS

IN THIS SECTION, we outline a theory of two-state nonstationary chainswith constant causative matrix, with emphasis on presenting limitingvalues. Let the nonstationary chain have two states with constant causa-tive matrix

c U()z- ) ) ( -d, d (

thus

u= [bc- (1-a) (1-d)]/(a+-b-1) (12)-(a+d-ad+bc-1)/(a+b- 1)

andX= (c+d- 1)/(a+b- 1) = [tr-(R) lJ/[tr(Q) -1] (13)

is the nonunit characteristic root of C. It follows immediately that C' hascharacteristic roots 1 and XA, o that tr(C8) = 1+ . Hence we may write

C (= i n z1-us)) (s=, 2, ... ) (14)

where we take u1 u. To evaluate us, we find from Cs+'=CCs=CsC that

us+i-us-X5+uX)=u-X+Xus. (s= 1, 2, ... ) (15)

When X=1, we obtain us+, us -1+ u, while otherwise

us=[u-X+Xs(1-u)j/(1-X). (16)Hence,

us-tr X +sks \l x1 X X /(X=1 (17)


8/15


From (17) we obtain at once

lim u = {- X / (1X) (X=U=1)

and this limit does not exist otherwise.As an immediate corollary of (17) and (18), we find that the limit of

C8 is given bylirn C=e(u-X, 1-u)/(1-X). (-1


9/15


When 0?X


10/15


We summarize the above results as:THEOREM 2. Let X be the nonunit characteristic root of the causative matrixC. Whenever 0< X< 1, H` (QC8) is stochastic f and only if C is. When-ever -1


11/15


when a+b-1>0 and O


12/15


causative matrix, which represents the rate of change from one transitionmatrix to another. This approach to nonstationary chains appears to benew, as a search of the literature shows.

d

1.0

\ /? "-_ _\ \>g (0.6, 0.9)

\\ \ -X=+1

0.5 \O \U=X I

1 \~u=Q \

(0.4,0U.) Ul+

C

0 0.5 1.0

Region C QCs stochastic

1 stochastic all s21 3 not stochastic all s4, 5 not stochastic s bounded by (32/33)

Fig. 1. Regions in the (c, d) plane for stochastic QSS.

The causative matrix contains information describing the behavior andproperties of the transition probability matrices. The eigenvalues andeigenvectors of the causative matrix reflect the character of the sequence oftransition matrices throwing light on the convergent or divergent natureof the chain.


13/15

A Matrix Approach to Nonstationary Chains 1179

This analysis has provided interesting insights into real-world phenom-ena that have been empirically validated. It has effectively provided thebasis for powerful models of previously intractable situations. From amathematical point of view, the work opens a new domain for a probabi-listic investigation of chains.

d

0.9* (0.64, 0.86)

(0.7, 0.8)

(0.73, 0.77)

IS0-O (0.8, 0.7)/SofW\~~~S

s03

QC,..., QC10 0=stochastic (1, 0.5)

QC stochastic,{ QC2, .not stochastic

{ QC, QC2 stochastic,

QC3, not stochastic

0.6 1

Fig. 2. A region in the (c, d) plane where QCS is stochastic only for s


14/15


We have concentrated here on the simple cases of constant causativematrices in the case of two states. Extensions to constant finite chainswith an arbitrary number of states as well as to other patterns of causativematrices are in progress.

REFERENCES

1. EHRENBERG, . S. C., "An Appraisal of Markov Brand-Switching Models,"J. Marketing Res. 2, 347-362 (1965).

2. , "On Clarifying M and M," J. Marketing Res. 5, 228-229 (1968).3. HAJNAL, J., "The Ergodic Properties of Nonhomogeneous Finite Markov

Chains," Proc. Cambridge Philos. Soc. 52, 67-77 (1956).4. - , "Weak Ergodicity in Nonhomogeneous Markov Chains," Proc. Cam-

bridge Philos. Soc. 54, 233-246 (1958).5. HARARY, FRANK, AND LIPSTEIN, BENJAMIN, "The Dynamics of Brand Loyalty:

a Markovian Approach," Opns. Res. 10, 19-40 (1962).6. , AND STYAN, GEORGE P. H., "A Matrix Approach to Nonstationary

Chains," Technical Report No. 120, Department of Statistics, Universityof Minnesota, May 20, 1969.

7. , NORMAN, ROBERT Z., AND CARTWRIGHT, DORWIN, Structural Models:An Introduction to the Theory of Directed Graphs, Wiley, New York, 1965.

8. HOUSEHOLDER, ALSTON S., The Theory of Matrices in Numerical Analysis,Blaisdell, New York, 1964.

9. KEMENY, OHNG., ANDSNELL, . LAURIE, inite Markov Chains, Van Nostrand,Princeton, 1960.

10. LJNNIK, Yu. V., "Inhomogeneous Markov Chains" (In Russian), Dokl. Akad.Nauk SSSR 60, 21-24 (1948).

11. , "On the Theory of Inhomogeneous Markov Chains" (In Russian),Izv. Akad. Nauk SSSR, Ser. Mat. 13, 65-94 (1949).

12. LIPSTEIN, BENJAMIN, "A Mathematical Model of Consumer Behavior, " J.Marketing Res. 2, 259-265 (1965).

13. , "Test Marketing: a Perturbation in the Market Place," ManagementSci. Ser. B 14, 437-448 (1968).14. MASSY, WILLIAM ., AND MORRISON, DONALD G., "Comments on Ehrenberg's

Appraisal of Brand-Switching Models," J. Marketing Res. 5, 225-228 (1968).15. MoTT, J. L., "Conditions for the Ergodicity of Nonhomogeneous Finite Markov

Chains," Proc. Royal Soc. Edinburgh Sect. A. 64, 369-380 (1957).16. , "The Central Limit Theorem for a Convergent Nonhomogeneous

Finite Markov Chain," Proc. Royal Soc. Edinburgh Sect. A. 65, 109-120(1959).

17. SARYMSAKOV, . A., "On the Ergodic Principle for Nonhomogeneous Markov

Chains" (In Russian), Dokl. Akad. Nauk SSSR 90, 25-28 (1953).18. , "On the Theory of Inhomogeneous Markov Chains" (In Russian)

Dokl. Akad. Nauk UzSSR, 8, 3-7 (1956).19. , "Inhomogeneous Markov Chains" (In Russian), Dokl. Akad. Nauk

SSSR 120, 465-467 (1958).


15/15


20. , "Inhomogeneous Markov Chains," Theory Prob. Apple. 6, 178-185(1961).

21. ,AND MUSTAFIN, KH. A., "On the Ergodic Theorem for InhomogeneousMarkov Chains" (In Russian), Trudy Sredneaziatskif GosudarstvennytUniversitet Tashkent, Ser. Fiz.-Mat. Nauki 74 (15), 1-37 (1957).

22. STYAN, GEORGE P. H., AND SMITH, HARRY, JR., "Markov Chains Applied toMarketing," J. Marketing Res. 1(1), 50-55 (1964).

non stationary markov chain with constant casuative matrices

Documents