The Google Markov Chain: Convergence and Eigenvalues


  • 8/13/2019 The Google Markov Chain - Convergence and Eigenvalues


U.U.D.M. Project Report 2012:14

Degree project in mathematics, 15 credits (Examensarbete i matematik, 15 hp)
Supervisor and examiner: Jakob Björnberg

June 2012

Department of Mathematics, Uppsala University

The Google Markov Chain: convergence speed and eigenvalues

Fredrik Backåker


Acknowledgments

I would like to thank my supervisor Jakob Björnberg for helping me write this thesis.


The Google Markov Chain: convergence speed and eigenvalues

Contents

1 Introduction
2 Definitions and background
  2.1 Markov chains
  2.2 The Google PageRank
3 Convergence speed
  3.1 General theory of convergence speed
  3.2 Convergence speed and eigenvalues of Google's Markov chain
4 Simulations
  4.1 Multiplicity of the second eigenvalue
  4.2 Quality of the limit distribution
5 Conclusion
6 References
Appendices
  Matlab code


1 Introduction

There are many different search engines on the internet that help us find the information we want. These search engines use different methods to rank pages and display them in an order where the most relevant and important information is shown first. In this thesis, we study a mathematical method that is part of how PageRank, the ranking method of the search engine Google, determines the order in which pages are displayed in a search. The method uses pages as states in a stochastic Markov chain, where the outgoing links from a page are the transitions and the corresponding transition probabilities are divided equally among the outgoing links of that page. The transition probability matrix given by this is then used to compute a stationary distribution, where the page with the largest stationary value is ranked first, the page with the second largest is ranked second, and so on. The method comes in two variants, with or without a dampening factor. The variant without a dampening factor is the one just described. In the other variant, which we study in this thesis, a dampening factor (often set to 0.85) is introduced, mainly to ensure that the stationary distribution is unique. This variant is considered the most useful one, and in this thesis we take a brief look at how the dampening factor affects the computation of PageRank.

We will begin by going through some basic definitions for Markov chains and explain the Google PageRank in more detail. In the section after that, we go through some general theory about the rate of convergence for Markov chains, since it turns out that the eigenvalues of a transition probability matrix are connected to the speed of convergence to its steady state. Further, we look at the second largest eigenvalue of the Google Markov chain and its algebraic multiplicity, which are the main factors affecting the convergence rate of the chain. Next, we go through some results on how the second eigenvalue of the Google Markov chain is bounded by the dampening factor, which makes the choice of the dampening factor very important. We end with some simulations that check how different properties of PageRank are affected by the choice of the dampening factor and, in particular, which value of the dampening factor is best adapted to a fast convergence speed of the Google Markov chain.


2 Definitions and background

2.1 Markov chains

A discrete-time Markov chain is a stochastic process {X_n} with finite state space S that satisfies the Markov property:

    P(X_n = x_n | X_0 = x_0, …, X_{n-1} = x_{n-1}) = P(X_n = x_n | X_{n-1} = x_{n-1})

for all x_0, …, x_n ∈ S and n ≥ 1. In other words, the next step of a Markov chain is independent of the past and depends only on the most recent state. The chain is called time-homogeneous if the transition probabilities do not change over time, i.e. if for each i, j ∈ S, p_ij = P(X_n = j | X_{n-1} = i) does not depend on n. In this case the probabilities p_ij are the Markov chain's transition probabilities when moving from state i to state j. Also let p_ij^(m) = P(X_{m+n} = j | X_n = i) denote the transition probabilities in m steps, m = 0, 1, 2, …. The probabilities can be collected in a transition probability matrix, here denoted by P:

    P = [ p_00  p_01  ⋯
          p_10  p_11  ⋯
          ⋮     ⋮     ⋱ ]

This matrix is called a stochastic matrix if all of its row vectors sum to one: Σ_j p_ij = 1. The Markov chain is said to be irreducible if it is possible to reach each state j from any other state i in some number of steps; more formally, if P(X_n = j | X_0 = i) > 0 for some n ≥ 0, for all i, j ∈ S.

A state i has period k if any return to state i occurs in multiples of k steps: k = greatest common divisor of the set {n : P(X_n = i | X_0 = i) > 0}.

If all the states in a Markov chain have period one, the chain is said to be aperiodic, i.e. the greatest common divisor of the return time to any state from itself is one. The following result is standard and we do not prove it.

Proposition 1. A Markov chain that is irreducible and aperiodic with finite state space has a unique stationary distribution π, a probability vector such that π = πP. Additionally, the transition probabilities converge to a steady state as the number of steps goes to infinity, in the sense that lim_{m→∞} p_ij^(m) = π_j for all i, j ∈ S.
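Proposition 1 can be checked numerically on a small chain. The following sketch uses Python with NumPy purely for illustration (the appendix of this thesis uses Matlab); the 3-state matrix is an arbitrary example, not one taken from the thesis.

```python
import numpy as np

# An arbitrary irreducible, aperiodic chain on 3 states (all entries positive).
P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])

# Stationary distribution: the left eigenvector of P for eigenvalue 1,
# normalized so that its entries sum to one, giving pi = pi P.
w, V = np.linalg.eig(P.T)
pi = np.real(V[:, np.argmax(np.real(w))])
pi = pi / pi.sum()

# Proposition 1: every row of P^m approaches pi as m grows.
Pm = np.linalg.matrix_power(P, 50)
row_gap = np.abs(Pm - pi).max()
```

Here `row_gap` is at the level of rounding error, illustrating that p_ij^(m) → π_j independently of the starting state i.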

2.2 The Google PageRank

The Google PageRank is one of many methods that the search engine Google uses to determine the importance or relevance of a page. This method uses a special Markov chain to compute the rank of web pages, and this rank determines the order in which the pages are listed in a Google search.


Let all the web pages Google communicates with be denoted by the state space W. The size of W is n, several billion pages. Let C = (c_ij) denote the connectivity matrix of W, i.e. the n×n matrix with c_ij = 1 if there is a hyperlink from page i to page j and c_ij = 0 otherwise. The number of outgoing links from page i is the row sum

    s_i = Σ_{j=1}^n c_ij.

If s_i = 0, page i has no outgoing links and is called a dangling node. Let T = (t_ij) be given by t_ij = c_ij / s_i if s_i ≥ 1, and t_ij = 1/n if i is a dangling node. By this, T can be seen as a transition probability matrix of the Markov chain with state space W. Furthermore, to define the Google Markov chain we include an additional parameter d, a dampening factor that can be set between 0 and 1. The transition probability matrix of the Google Markov chain is defined by

    P = dT + (1 − d)(1/n)E,

where E is the n×n matrix of all ones. This Markov chain can be described as a "random surfer" who, with probability d, clicks on an outgoing link on the current web page with equal probabilities or, if the page has no outgoing links, chooses another page at random in W; with probability 1 − d, the surfer instead jumps to a page chosen at random among all n pages. The Google Markov chain is finite, and for d < 1 every entry of P is strictly positive, so the chain is then also irreducible and aperiodic, and Proposition 1 applies.


3 Convergence speed

3.1 General theory of convergence speed

Since the Google PageRank involves many billions of pages, one would like to know how fast it can be computed. This can be done by determining how fast the transition probability matrix of the Google Markov chain converges to its steady state, as in Proposition 1. To find this rate of convergence, we need to go through some definitions and theorems.

Let A be a square stochastic matrix of dimension m, w a non-zero vector and λ a scalar such that

    Aw = λw,

which is equivalent to (A − λI)w = 0 (where I is the identity matrix). Then λ is said to be a right eigenvalue of A corresponding to the eigenvector w. In words, an eigenvector of a matrix is a non-zero vector that remains parallel to the original vector after being multiplied by the matrix, and the eigenvalue of that eigenvector is the factor by which it is scaled under the multiplication.

Eigenvectors can be either left or right eigenvectors; the most commonly used are the right ones, as described above. We say that λ is a left eigenvalue if z^T A = λ z^T, where z is a non-zero vector (the left eigenvector). Each left eigenvalue is a right eigenvalue and vice versa, because if λ' is a left eigenvalue then

    z^T A = λ' z^T
    (z^T A)^T = λ' z  ⟹  A^T z = λ' z
    (A^T − λ'I) z = 0  ⟹  0 = det(A^T − λ'I) = det((A − λ'I)^T) = det(A − λ'I).

This shows that λ' is also a right eigenvalue.

Theorem 1. λ = 1 is always an eigenvalue of a stochastic m×m matrix A, associated with the right eigenvector v = 1, the vector with all entries equal to 1. If a stationary distribution π exists, then the corresponding left eigenvector is u = π.

Proof: A1 = 1 since the rows of A sum to one, and πA = π by the definition of a stationary distribution. ∎

Let λ_1, …, λ_m be the m eigenvalues of A, assume that these eigenvalues are distinct, and let u_1, …, u_m be the corresponding left eigenvectors and v_1, …, v_m the corresponding right eigenvectors. We do not prove the following well-known fact.

Theorem 2. Let A be an irreducible, aperiodic and stochastic m×m matrix. Then λ_1 = 1 satisfies λ_1 > |λ_i| for every other eigenvalue λ_i.

We now perform some calculations that illustrate the relevance of the second largest eigenvalue for the convergence speed.


Proposition 2. The left and right eigenvectors form a biorthogonal system: u_i^T v_j = 0 whenever λ_i ≠ λ_j.

Proof of Proposition 2: The equations for the eigenvectors are u_i^T A = λ_i u_i^T and A v_j = λ_j v_j. By multiplication we find that

    u_i^T A v_j = λ_i u_i^T v_j
    u_i^T A v_j = λ_j u_i^T v_j
    (λ_i − λ_j) u_i^T v_j = 0
    u_i^T v_j = 0 if λ_i ≠ λ_j,

and since the eigenvalues are assumed distinct, the following holds:

    u_i^T v_j = 0 if i ≠ j, 1 ≤ i, j ≤ m.  (1)

By this we see that eigenvectors of different eigenvalues are orthogonal to each other. ∎

Further, we can scale the eigenvectors so that

    u_i^T v_i = 1 for all i ∈ [1, m].  (2)

Collect the left eigenvectors u_i of A in U, so that u_1 is the first column of U, u_2 the second, and so on. Collect the right eigenvectors v_i in V in the same way:

    U = (u_1, u_2, …, u_m),  V = (v_1, v_2, …, v_m).

From (1) and (2) we get

    U^T V = I,  (3)

and also, from the theory of matrices, that V U^T = I. Further, let D be the diagonal matrix with the eigenvalues of A as entries:

    D = diag(λ_1, λ_2, …, λ_m).

Since V consists of the right eigenvectors, we get the equation

    A V = V D.  (4)

By (3) and (4) we get U^T A V = U^T V D = D, which can be rewritten as

    A = V D U^T = Σ_{i=1}^m λ_i v_i u_i^T.

We then take the nth power of A to get A^n = V D^n U^T, since

    A^n = (V D U^T)(V D U^T) ⋯ (V D U^T)    [n factors]
        = V D (U^T V) D (U^T V) ⋯ (U^T V) D U^T
        = V D^n U^T.

By this we get the spectral decomposition

    A^n = Σ_{i=1}^m λ_i^n v_i u_i^T.
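The decomposition A^n = V D^n U^T is easy to verify numerically. In the following Python/NumPy sketch (the matrix is an arbitrary illustration, not from the thesis), the rows of the inverse of V serve as the left eigenvectors u_i^T; this choice automatically carries the normalization u_i^T v_i = 1.

```python
import numpy as np

# An arbitrary stochastic matrix with distinct eigenvalues (illustration only).
A = np.array([[0.6, 0.4, 0.0],
              [0.2, 0.5, 0.3],
              [0.1, 0.3, 0.6]])

lam, V = np.linalg.eig(A)   # columns of V: right eigenvectors v_i
Ut = np.linalg.inv(V)       # rows of Ut: left eigenvectors u_i^T, with Ut V = I

n = 7
An_spectral = V @ np.diag(lam**n) @ Ut               # A^n = V D^n U^T
err = np.abs(An_spectral - np.linalg.matrix_power(A, n)).max()
```

`Ut @ V` is the identity matrix, which is exactly the biorthogonality and scaling relations above, and `err` is at the level of rounding error.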


We can rewrite this as

    A^n − λ_1^n v_1 u_1^T = Σ_{i=2}^m λ_i^n v_i u_i^T.

Further, let the eigenvalues other than λ_1, i.e. λ_2, λ_3, …, λ_m, be arranged such that λ_1 > |λ_2| ≥ … ≥ |λ_m|. The results above show that λ_2 governs the size of the difference A^n − λ_1^n v_1 u_1^T when all eigenvalues are distinct. In fact [4], one can show that (5) below holds also when the eigenvalues are not distinct. Ensure (by rearranging the eigenvalues if necessary) that, if any |λ_i| for i ≥ 3 is equal to |λ_2|, then r_i, the algebraic multiplicity of λ_i, is less than or equal to r_2. This arrangement of the eigenvalues is needed in the following Perron–Frobenius-type theorem, where the algebraic multiplicity of the second eigenvalue enters the convergence speed of a transition probability matrix to its steady state. We state the theorem and then look at an example to make the rate of convergence and the algebraic multiplicity more clear.

Theorem 3. Let the eigenvectors be chosen so that u_i^T v_i = 1, where u_i is the left eigenvector and v_i the right eigenvector. Then we get the formula

    A^n = λ_1^n v_1 u_1^T + O(n^{r_2 − 1} |λ_2|^n).  (5)

Example 1 (Rates of convergence via the Perron–Frobenius Theorem). If P is a stochastic, irreducible and aperiodic matrix with state space S = {1, …, m}, then the first eigenvalue is λ_1 = 1 with eigenvectors v_1 = 1 and u_1 = π, and therefore by (5)

    P^n = 1π + O(n^{r_2 − 1} |λ_2|^n),

where 1π denotes the matrix whose every row is π. If the eigenvalues are distinct and their absolute values are all different, we even get

    P^n = 1π + O(|λ_2|^n).

By this we see that a smaller |λ_2| gives a higher rate of convergence. If we do not arrange the eigenvalues by multiplicity as described above, we might state a convergence speed that is not true. For example, suppose the eigenvalues 0.5 and −0.5 both occur, with algebraic multiplicities 4 and 1 respectively. Then, since we choose as λ_2 the eigenvalue with the largest multiplicity, we get from Theorem 3 that

    P^n = 1π + O(n^{4−1} 0.5^n).

If we instead ordered the eigenvalues so that λ_2 = −0.5 with r_2 = 1, we would get

    P^n = 1π + O(n^{1−1} 0.5^n),

which is not a true bound for the rate of convergence.

3.2 Convergence speed and eigenvalues of Google's Markov chain

Now that we know the importance of the second eigenvalue, we continue by looking at some results of Haveliwala and Kamvar [2]. These results relate the second eigenvalue of the Google Markov chain's transition probability matrix to the dampening factor d, which by Theorem 3 is relevant to the convergence speed.

Theorem 4. The transition probability matrix of the Google Markov chain, P = dT + (1 − d)(1/n)E, has a second eigenvalue that satisfies |λ_2| ≤ d.


Theorem 5. If there exist at least two irreducible closed subsets in T, then the second eigenvalue of P is λ_2 = d.

In [2], the result is stated and proved for any rank-one row-stochastic matrix E, but in our case we only consider the choice of E as the n×n matrix of all ones, which is also the most common choice in applications. Upon reading [2] we found that the proofs of their results become considerably simpler in our case, and we therefore give the proofs for these simpler cases in detail. The proofs of both theorems rely on the following lemma:

Lemma 1. If u is a left eigenvector of P, u^T P = λ u^T, with λ ≠ 1, then u is a left eigenvector of T with eigenvalue λ/d, i.e. u^T T = (λ/d) u^T. Conversely, if v is a right eigenvector of T, Tv = μv, satisfying 1^T v = 0, then v is a right eigenvector of P with eigenvalue dμ.

Proof: Suppose u^T P = λ u^T with λ ≠ 1. Applying both sides to the vector 1 and using that P1 = 1 gives u^T 1 = λ u^T 1, so (1 − λ) u^T 1 = 0 and hence u^T 1 = 0. Since E = 1·1^T, this gives u^T E = (u^T 1) 1^T = 0, and the eigenvalue equation reduces to

    λ u^T = u^T P = d u^T T + (1 − d)(1/n) u^T E = d u^T T.

Dividing by d gives u^T T = (λ/d) u^T. Conversely, suppose Tv = μv with 1^T v = 0. Then Ev = 1(1^T v) = 0, so

    Pv = dTv + (1 − d)(1/n)Ev = dμv. ∎

With this we can continue to the proofs of the theorems.

Proof of Theorem 4: We consider the possible values of d. When d = 0 we get P = (1/n)E, which has λ_2 = 0, so the theorem holds. When d = 1 we get P = T, and since T is stochastic, |λ_2| ≤ 1 = d, so the theorem holds in this case as well. For 0 < d < 1, all entries of P are strictly positive, so by Theorem 2 the eigenvalue 1 has no other eigenvalue of equal modulus and λ_2 ≠ 1. Let u be a left eigenvector of P associated with λ_2. By Lemma 1, λ_2/d is an eigenvalue of T, and since T is stochastic, |λ_2/d| ≤ 1; hence |λ_2| ≤ d. ∎


Proof of Theorem 5: It is a standard fact (page 12 of [3]) that the multiplicity of the eigenvalue 1 of a stochastic matrix equals the number of irreducible closed subsets. So there are two linearly independent eigenvectors a, b of T with

    Ta = a,  Tb = b.

We want to construct a vector x with Tx = x and 1^T x = 0. Set x = αa + βb, where α and β are scalars. This gives Tx = αTa + βTb = αa + βb, so Tx = x for all α, β. Further, we want 1^T x = α 1^T a + β 1^T b to equal 0. If 1^T a = 0 and 1^T b = 0 we can choose any α, β; otherwise we may assume 1^T a ≠ 0 and choose

    α = −β (1^T b)/(1^T a),

with β ≠ 0, so that x ≠ 0. In either case x is an eigenvector of T with eigenvalue 1 satisfying 1^T x = 0, so by Lemma 1, x is an eigenvector of P with eigenvalue d·1 = d. By Theorem 4, no eigenvalue of P other than 1 exceeds d in modulus, so λ_2 = d. ∎

Since T most likely has at least two irreducible closed subsets (as mentioned in Section 2.2, this happens for instance when two pages link only to each other; see also Figure 1 below), it is clear that the choice of the dampening factor is related to the rate at which the Google Markov chain converges.

Figure 1: Pages 3 and 5 create an irreducible closed subset, and so do pages 1 and 2.
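Lemma 1 implies that, apart from the eigenvalue 1, the spectrum of P is the spectrum of T scaled by d. The following Python/NumPy sketch checks this, together with the bound of Theorem 4, on an arbitrary irreducible 3-state T (illustration only, not from the thesis):

```python
import numpy as np

# An arbitrary irreducible 3-state T (illustration only).
T = np.array([[0.2, 0.8, 0.0],
              [0.3, 0.3, 0.4],
              [0.5, 0.0, 0.5]])
n, d = 3, 0.85
P = d * T + (1 - d) / n * np.ones((n, n))

eig_T = np.linalg.eigvals(T)
eig_P = np.linalg.eigvals(P)

# Expected spectrum of P: keep one eigenvalue 1, multiply the rest of eig(T) by d.
keep = np.argmin(np.abs(eig_T - 1))
expected = np.append(d * np.delete(eig_T, keep), 1.0)

# largest distance from an expected eigenvalue to the nearest eigenvalue of P
gap = max(min(abs(e - a) for a in eig_P) for e in expected)

# Theorem 4: the second largest modulus is at most d
second = np.sort(np.abs(eig_P))[-2]
```

The same check with a T containing two irreducible closed subsets would show `second` landing exactly on d, which is Theorem 5.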

To make these results on the second eigenvalue more clear, we look at some small examples.

Example 2. Consider a small web of 3 pages with the transition probability matrix

    T = [ 0    1/2  1/2
          1/2  0    1/2
          1/2  1/2  0   ]

and a dampening factor set to 0.85. Then the transition probability matrix of the Google Markov chain in this case is

    P = 0.85 T + (0.15/3) E = [ 1/20   19/40  19/40
                                19/40  1/20   19/40
                                19/40  19/40  1/20  ].

The eigenvalues of P are computed from det[P − λI] = 0. Using Matlab, the second eigenvalue is found to be λ_2 = −0.4250, and in accordance with Theorem 4 we have |λ_2| ≤ d: indeed 0.4250 < 0.85.
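The numbers in Example 2 can be reproduced in a few lines; the following Python/NumPy sketch performs the same computation the thesis did in Matlab:

```python
import numpy as np

T = np.array([[0, 0.5, 0.5],
              [0.5, 0, 0.5],
              [0.5, 0.5, 0]])
d = 0.85
P = d * T + (1 - d) / 3 * np.ones((3, 3))

lam = np.sort(np.linalg.eigvals(P).real)  # P is symmetric, so eigenvalues are real
lam2 = lam[0]                             # the second eigenvalue, -0.425
```

The entries of P match the matrix above (for instance P[0, 1] = 19/40 = 0.475), and λ_2 = −0.425 with |λ_2| = 0.425 ≤ d = 0.85, as Theorem 4 requires.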

Example 3. Consider another small web with 3 pages, where the transition probability matrix is

    T = [ 0  1/2  1/2
          0  1    0
          0  0    1   ].

Here pages 2 and 3 each link only to themselves, so T has two irreducible closed subsets, and by Theorem 5 the second eigenvalue of P equals d.
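For the matrix of Example 3, pages 2 and 3 are absorbing, so T has two irreducible closed subsets and Theorem 5 gives λ_2 = d. A quick numerical check in Python/NumPy:

```python
import numpy as np

T = np.array([[0, 0.5, 0.5],
              [0, 1.0, 0.0],
              [0, 0.0, 1.0]])
d = 0.85
P = d * T + (1 - d) / 3 * np.ones((3, 3))

moduli = np.sort(np.abs(np.linalg.eigvals(P)))[::-1]  # largest first
lam2 = moduli[1]
```

Since T is triangular, eig(T) = {1, 1, 0}, so eig(P) = {1, d, 0} and λ_2 = d = 0.85 regardless of the link structure inside the closed subsets.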


4 Simulations

4.1 Multiplicity of the second eigenvalue

Figure 2: d = 0.1    Figure 3: d = 0.01

4.2 Quality of the limit distribution

The next question is why d should not simply be set to a very small number; after all, smaller d means faster convergence, and in the extreme case d = 0 convergence would be almost immediate. However, since then P = (1/n)E, the limit distribution is always uniform, regardless of the structure of the internet itself. Intuitively, larger d means that P is "closer" to T, and hence the PageRank should give a more realistic picture. We have not found any quantitative formulation of this intuition in the literature, and have therefore run simulations to obtain the PageRank for different d. In these tests we simulate transitions for networks of size 1000. The following figures show the number of times each state in P is visited when simulating 10,000 transitions from a starting state chosen at random (the x-axis shows the states and the y-axis the number of visits).

Figure 3: d = 0.1    Figure 4: d = 0.5



Figure 5: d = 0.85    Figure 6: d = 0.99

These visits are then counted and used to compute a steady state for P and, in turn, the PageRank of the pages. From the figures we see that the number of times a state is reached is evenly spread for lower values of d, giving a more uniform steady state than for higher d. The PageRank for low d would then essentially be calculated from (1/n)E, and most of the structure from T would be lost. To investigate this further, we ran a test with a specified T (see Figure 7).

Figure 7: States … and … create an irreducible closed subset.

This time we simulated 1000 transitions for different values of d, and did this 1000 times for each d to get the mean distance between the uniform distribution and our simulated stationary distribution.



Figure 8: d = 0.1    Figure 9: d = 0.5

Figure 10: d = 0.85    Figure 11: d = 0.99

In the figures above we see the number of visits in a single simulation for each value of d tested, and this time it is even clearer that lower values of d give a more uniform steady state. The distances we measured between the uniform distribution and our steady state for these d confirm this:

    Value of d:                          0.1      0.5      0.85     0.99
    Distance Σ_{i=1}^m (π_i − 1/m)²:     0.0028   0.0030   0.0053   0.089

Smaller values of d give a shorter distance to the uniform distribution. To make this obvious, we plotted the distance from simulations for values of d between 0.01 and 0.99 (the x-axis shows the value of d and the y-axis the distance).
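The trend in the table, with the stationary distribution moving away from the uniform one as d grows, can also be reproduced without simulation noise by computing π(d) exactly on a small example. The 4-page web below is hypothetical and far smaller than the 1000-state networks used above:

```python
import numpy as np

# Hypothetical 4-page web; pages 3 and 4 form an irreducible closed subset.
T = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
m = 4

def distance_to_uniform(d):
    """Squared distance sum_i (pi_i - 1/m)^2 of the stationary distribution."""
    P = d * T + (1 - d) / m * np.ones((m, m))
    w, V = np.linalg.eig(P.T)
    pi = np.real(V[:, np.argmax(np.real(w))])
    pi = pi / pi.sum()
    return float(np.sum((pi - 1.0 / m) ** 2))

dists = [distance_to_uniform(d) for d in (0.1, 0.5, 0.85, 0.99)]
```

The distances increase with d, mirroring the table: small d forces π toward the uniform distribution, while large d lets the closed subset absorb most of the stationary mass.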



Figure 12: The distance for different values of d.

5 Conclusion

From the results in Section 3 we have learned that the second largest eigenvalue of the Google Markov chain, together with its algebraic multiplicity, directly determines the convergence speed of the chain to its stationary distribution. Furthermore, we have seen that the dampening factor restrains the second largest eigenvalue to be at most the value of the dampening factor, with equality when there exist two or more irreducible closed subsets in the set of pages we use. In Section 4 we see, from simulations of different networks, that different values of the dampening factor do not change the multiplicity of the second eigenvalue, so a very small dampening factor would give a faster convergence speed than larger choices such as 0.85. But from the tests of the quality of the limit distribution, we discover that setting the dampening factor to a low value changes the structure of the transition probability matrix of the Google Markov chain and pushes the limit distribution toward the uniform distribution. Since outgoing links between pages are the main input in this method of computing a Google PageRank, a uniform stationary distribution would mean that all pages get the same PageRank. We would then have obtained a fast computation of a PageRank in which almost all information from the original network is lost, which would not be very useful. From these results we see that setting the dampening factor to 0.85, as the creators of Google did, gives a good combination of quality of the limit distribution and fast convergence speed for the Google Markov chain.


Matlab code

% Program for determining the multiplicity of the second eigenvalue.
% n is the size of the smaller matrices put into the diagonal of a bigger matrix.
% p is the probability of a link.
% d is the dampening factor.
% loop is the number of simulations.
function [mult] = randNet(n,p,d,loop)
mult = zeros(1,loop);
for k = 1:loop
    N = 4*n;
    T = zeros(N);
    % place four random n-by-n 0/1 link blocks on the diagonal
    for i = 1:4
        A = (rand(n) <= p).*ones(n);
        a = (i-1)*n + 1;
        b = a + n - 1;
        T(a:b,a:b) = A;
    end
    % remove self-links and give every dangling node one random outgoing link
    for i = 1:N
        T(i,i) = 0;
        if sum(T(i,:)) == 0
            j = randi([1,N-1]);
            if j >= i
                T(i,j+1) = 1;
            else
                T(i,j) = 1;
            end
        end
    end
    T = T./repmat(sum(T,2),1,N);      % normalize the rows
    P = d*T + (1-d)/N*ones(N);        % the Google Markov chain
    % count the eigenvalues whose modulus is close to d
    Eig = eig(P);
    sec = abs(Eig) > d - 0.001;
    mult(1,k) = sum(sec) - 1;         % multiplicity of the second eigenvalue
end
hist(mult)

% Program for determining PageRank and the distance to the uniform distribution.
% n, p and d are as above; trans is the number of transitions.
function [dist] = Pagerank(n,p,d,trans)
N = 4*n;
T = zeros(N);
for i = 1:4
    A = (rand(n) <= p).*ones(n);
    a = (i-1)*n + 1;
    b = a + n - 1;
    T(a:b,a:b) = A;
end
for i = 1:N
    T(i,i) = 0;
    if sum(T(i,:)) == 0
        j = randi([1,N-1]);
        if j >= i
            T(i,j+1) = 1;
        else
            T(i,j) = 1;
        end
    end
end
T = T./repmat(sum(T,2),1,N);
P = d*T + (1-d)/N*ones(N);
% simulate the chain by inverse transform sampling and count the visits
states = zeros(1,trans);
states(1) = randi([1,N]);
levels = cumsum(P,2);                 % row-wise cumulative probabilities
pi = zeros(1,N);
for i = 1:trans-1
    u = rand;
    j = 1;
    while u > levels(states(i),j)
        j = j + 1;
    end
    states(i+1) = j;
    pi(j) = pi(j) + 1;
end
pi = pi/sum(pi);                      % empirical stationary distribution
unif = ones(1,N)/N;
dist = norm(pi - unif);               % distance to the uniform distribution

% Script plotting the mean distance against d (used for Figure 12),
% with n, p and trans set as desired.
res = zeros(1,1000);
d = 0.99;
for m = 1:99
    for l = 1:1000
        res(1,l) = Pagerank(n,p,d,trans);
    end
    dampen(1,m) = mean(res);
    dval(1,m) = d;
    d = d - 0.01;
end
plot(dval(1:99),dampen(1:99))