The Google Markov Chain: Convergence and Eigenvalues


  • 8/13/2019 The Google Markov Chain - Convergence and Eigenvalues


U.U.D.M. Project Report 2012:14

Degree project in mathematics, 15 credits (Examensarbete i matematik, 15 hp)
Supervisor and examiner: Jakob Björnberg

June 2012

Department of Mathematics, Uppsala University

The Google Markov Chain: convergence speed and eigenvalues

Fredrik Backåker


Acknowledgments

I would like to thank my supervisor Jakob Björnberg for helping me write this thesis.


The Google Markov Chain: convergence speed and eigenvalues

Contents

1 Introduction
2 Definitions and background
  2.1 Markov chains
  2.2 The Google PageRank
3 Convergence speed
  3.1 General theory of convergence speed
  3.2 Convergence speed and eigenvalues of Google's Markov chain
4 Simulations
  4.1 Multiplicity of the second eigenvalue
  4.2 Quality of the limit distribution
5 Conclusion
6 References
Appendices
  Matlab code


1 Introduction

There are many different search engines on the internet that help us find the information we want. These search engines use different methods to rank pages and display them in an order where the most relevant and important information is shown first. In this thesis, we study a mathematical method that is part of how PageRank, the ranking method of the search engine Google, determines the order in which pages are displayed in a search. The method uses pages as states in a stochastic Markov chain, where the outgoing links from a page are the transitions and the corresponding transition probabilities are divided equally among the outgoing links of that page. The transition probability matrix given by this is then used to compute a stationary distribution, where the page with the largest stationary value is ranked first, the page with the second largest is ranked second, and so on. The method comes in two variants, with or without a dampening factor. The variant without a dampening factor is the one just described. In the other variant, which we study in this thesis, a dampening factor (often set to 0.85) is introduced, mainly to ensure that the stationary distribution is unique. This variant is considered the most useful one, and in this thesis we take a brief look at how the dampening factor affects the computation of PageRank.

We will begin by going through some basic definitions for Markov chains and explain the Google PageRank in more detail. In the section after that, we go through some general theory about the rate of convergence for Markov chains, since it turns out that the eigenvalues of a transition probability matrix are connected to the speed of convergence to its steady state. Further, we look at the second largest eigenvalue of the Google Markov chain and its algebraic multiplicity, which are the main factors affecting the convergence rate of the chain. Next, we go through some results on how the second eigenvalue of the Google Markov chain is bounded by the dampening factor, which makes the choice of the dampening factor very important. We end with some simulations that check how different properties of PageRank are affected by the choice of the dampening factor and, in particular, which value of the dampening factor is best adapted to a fast convergence speed of the Google Markov chain.


2 Definitions and background

2.1 Markov chains

A discrete-time Markov chain is a stochastic process {X_n} with finite state space S that satisfies the Markov property:

    P(X_n = x_n | X_0 = x_0, …, X_{n-1} = x_{n-1}) = P(X_n = x_n | X_{n-1} = x_{n-1})

for all x_0, …, x_n ∈ S and n ≥ 1. In other words, the next step of a Markov chain is independent of the past and depends only on the most recent state. The chain is called time-homogeneous if the transition probabilities do not change over time, i.e. if for each i, j ∈ S, p_ij = P(X_n = j | X_{n-1} = i) does not depend on n. In this case the probabilities p_ij are the Markov chain's transition probabilities when moving from state i to state j. Also let p_ij^(m) = P(X_{m+n} = j | X_n = i) denote the transition probabilities in m steps, m = 0, 1, 2, …. The probabilities can be collected in a transition probability matrix, here denoted by P:

    P = [ p_00  p_01  ⋯
          p_10  p_11  ⋯
          ⋮     ⋮     ⋱ ]

This matrix is called a stochastic matrix if all of its row vectors sum to one: Σ_j p_ij = 1. The Markov chain is said to be irreducible if it is possible to reach each state j from any other state i in some number of steps; more formally, if P(X_n = j | X_0 = i) > 0 for some n ≥ 0, for all i, j ∈ S.

A state i has period k if any return to state i occurs in multiples of k steps: k = greatest common divisor of the set {n : P(X_n = i | X_0 = i) > 0}.

If all the states in a Markov chain have period one, the chain is said to be aperiodic, i.e. the greatest common divisor of the return time to any state from itself is one. The following result is standard and we do not prove it.

Proposition 1. A Markov chain that is irreducible and aperiodic with finite state space has a unique stationary distribution π, a probability vector such that π = πP. Additionally, the transition probabilities converge to a steady state as the number of steps goes to infinity, in the sense that lim_{m→∞} p_ij^(m) = π_j for all i, j ∈ S.
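Proposition 1 can be checked numerically on a small chain. The following sketch uses Python with NumPy purely for illustration (the appendix of this thesis uses Matlab); the 3-state matrix is an arbitrary example, not one taken from the thesis.

```python
import numpy as np

# An arbitrary irreducible, aperiodic chain on 3 states (all entries positive).
P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])

# Stationary distribution: the left eigenvector of P for eigenvalue 1,
# normalized so that its entries sum to one, giving pi = pi P.
w, V = np.linalg.eig(P.T)
pi = np.real(V[:, np.argmax(np.real(w))])
pi = pi / pi.sum()

# Proposition 1: every row of P^m approaches pi as m grows.
Pm = np.linalg.matrix_power(P, 50)
row_gap = np.abs(Pm - pi).max()
```

Here `row_gap` is at the level of rounding error, illustrating that p_ij^(m) → π_j independently of the starting state i.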

2.2 The Google PageRank

The Google PageRank is one of many methods that the search engine Google uses to determine the importance or relevance of a page. This method uses a special Markov chain to compute the rank of web pages, and this rank determines the order in which the pages are listed in a Google search.


Let all the web pages Google communicates with be denoted by the state space W. The size of W is n, several billion pages. Let C = (c_ij) denote the connectivity matrix of W, i.e. the n×n matrix with c_ij = 1 if there is a hyperlink from page i to page j and c_ij = 0 otherwise. The number of outgoing links from page i is the row sum

    s_i = Σ_{j=1}^n c_ij.

If s_i = 0, page i has no outgoing links and is called a dangling node. Let T = (t_ij) be given by t_ij = c_ij / s_i if s_i ≥ 1, and t_ij = 1/n if i is a dangling node. By this, T can be seen as a transition probability matrix of the Markov chain with state space W. Furthermore, to define the Google Markov chain we include an additional parameter d, a dampening factor that can be set between 0 and 1. The transition probability matrix of the Google Markov chain is defined by

    P = dT + (1 − d)(1/n)E,

where E is the n×n matrix of all ones. This Markov chain can be described as a "random surfer" who, with probability d, clicks on an outgoing link on the current web page with equal probabilities or, if the page has no outgoing links, chooses another page at random in W; with probability 1 − d, the surfer instead jumps to a page chosen at random among all n pages. The Google Markov chain is finite, and for d < 1 every entry of P is strictly positive, so the chain is then also irreducible and aperiodic, and Proposition 1 applies.


3 Convergence speed

3.1 General theory of convergence speed

Since the Google PageRank involves many billions of pages, one would like to know how fast it can be computed. This can be done by determining how fast the transition probability matrix of the Google Markov chain converges to its steady state, as in Proposition 1. To find this rate of convergence, we need to go through some definitions and theorems.

Let A be a square stochastic matrix of dimension m, w a non-zero vector and λ a scalar such that

    Aw = λw,

which is equivalent to (A − λI)w = 0 (where I is the identity matrix). Then λ is said to be a right eigenvalue of A corresponding to the eigenvector w. In words, an eigenvector of a matrix is a non-zero vector that remains parallel to the original vector after being multiplied by the matrix, and the eigenvalue of that eigenvector is the factor by which it is scaled under the multiplication.

Eigenvectors can be either left or right eigenvectors; the most commonly used are the right ones, as described above. We say that λ is a left eigenvalue if z^T A = λ z^T, where z is a non-zero vector (the left eigenvector). Each left eigenvalue is a right eigenvalue and vice versa, because if λ' is a left eigenvalue then

    z^T A = λ' z^T
    (z^T A)^T = λ' z  ⟹  A^T z = λ' z
    (A^T − λ'I) z = 0  ⟹  0 = det(A^T − λ'I) = det((A − λ'I)^T) = det(A − λ'I).

This shows that λ' is also a right eigenvalue.

Theorem 1. λ = 1 is always an eigenvalue of a stochastic m×m matrix A, associated with the right eigenvector v = 1, the vector with all entries equal to 1. If a stationary distribution π exists, then the corresponding left eigenvector is u = π.

Proof: A1 = 1 since the rows of A sum to one, and πA = π by the definition of a stationary distribution. ∎

Let λ_1, …, λ_m be the m eigenvalues of A, assume that these eigenvalues are distinct, and let u_1, …, u_m be the corresponding left eigenvectors and v_1, …, v_m the corresponding right eigenvectors. We do not prove the following well-known fact.

Theorem 2. Let A be an irreducible, aperiodic and stochastic m×m matrix. Then λ_1 = 1 satisfies λ_1 > |λ_i| for every other eigenvalue λ_i.

We now perform some calculations that illustrate the relevance of the second largest eigenvalue for the convergence speed.


Proposition 2. The left and right eigenvectors form a biorthogonal system: u_i^T v_j = 0 whenever λ_i ≠ λ_j.

Proof of Proposition 2: The equations for the eigenvectors are u_i^T A = λ_i u_i^T and A v_j = λ_j v_j. By multiplication we find that

    u_i^T A v_j = λ_i u_i^T v_j
    u_i^T A v_j = λ_j u_i^T v_j
    (λ_i − λ_j) u_i^T v_j = 0
    u_i^T v_j = 0 if λ_i ≠ λ_j,

and since the eigenvalues are assumed distinct, the following holds:

    u_i^T v_j = 0 if i ≠ j, 1 ≤ i, j ≤ m.  (1)

By this we see that eigenvectors of different eigenvalues are orthogonal to each other. ∎

Further, we can scale the eigenvectors so that

    u_i^T v_i = 1 for all i ∈ [1, m].  (2)

Collect the left eigenvectors u_i of A in U, so that u_1 is the first column of U, u_2 the second, and so on. Collect the right eigenvectors v_i in V in the same way:

    U = (u_1, u_2, …, u_m),  V = (v_1, v_2, …, v_m).

From (1) and (2) we get

    U^T V = I,  (3)

and also, from the theory of matrices, that V U^T = I. Further, let D be the diagonal matrix with the eigenvalues of A as entries:

    D = diag(λ_1, λ_2, …, λ_m).

Since V consists of the right eigenvectors, we get the equation

    A V = V D.  (4)

By (3) and (4) we get U^T A V = U^T V D = D, which can be rewritten as

    A = V D U^T = Σ_{i=1}^m λ_i v_i u_i^T.

We then take the nth power of A to get A^n = V D^n U^T, since

    A^n = (V D U^T)(V D U^T) ⋯ (V D U^T)    [n factors]
        = V D (U^T V) D (U^T V) ⋯ (U^T V) D U^T
        = V D^n U^T.

By this we get the spectral decomposition

    A^n = Σ_{i=1}^m λ_i^n v_i u_i^T.
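The decomposition A^n = V D^n U^T is easy to verify numerically. In the following Python/NumPy sketch (the matrix is an arbitrary illustration, not from the thesis), the rows of the inverse of V serve as the left eigenvectors u_i^T; this choice automatically carries the normalization u_i^T v_i = 1.

```python
import numpy as np

# An arbitrary stochastic matrix with distinct eigenvalues (illustration only).
A = np.array([[0.6, 0.4, 0.0],
              [0.2, 0.5, 0.3],
              [0.1, 0.3, 0.6]])

lam, V = np.linalg.eig(A)   # columns of V: right eigenvectors v_i
Ut = np.linalg.inv(V)       # rows of Ut: left eigenvectors u_i^T, with Ut V = I

n = 7
An_spectral = V @ np.diag(lam**n) @ Ut               # A^n = V D^n U^T
err = np.abs(An_spectral - np.linalg.matrix_power(A, n)).max()
```

`Ut @ V` is the identity matrix, which is exactly the biorthogonality and scaling relations above, and `err` is at the level of rounding error.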


We can rewrite this as

    A^n − λ_1^n v_1 u_1^T = Σ_{i=2}^m λ_i^n v_i u_i^T.

Further, let the eigenvalues other than λ_1, i.e. λ_2, λ_3, …, λ_m, be arranged such that λ_1 > |λ_2| ≥ … ≥ |λ_m|. The results above show that λ_2 governs the size of the difference A^n − λ_1^n v_1 u_1^T when all eigenvalues are distinct. In fact [4], one can show that (5) below holds also when the eigenvalues are not distinct. Ensure (by rearranging the eigenvalues if necessary) that, if any |λ_i| for i ≥ 3 is equal to |λ_2|, then r_i, the algebraic multiplicity of λ_i, is less than or equal to r_2. This arrangement of the eigenvalues is needed in the following Perron–Frobenius-type theorem, where the algebraic multiplicity of the second eigenvalue enters the convergence speed of a transition probability matrix to its steady state. We state the theorem and then look at an example to make the rate of convergence and the algebraic multiplicity more clear.

Theorem 3. Let the eigenvectors be chosen so that u_i^T v_i = 1, where u_i is the left eigenvector and v_i the right eigenvector. Then we get the formula

    A^n = λ_1^n v_1 u_1^T + O(n^{r_2 − 1} |λ_2|^n).  (5)

Example 1 (Rates of convergence via the Perron–Frobenius Theorem). If P is a stochastic, irreducible and aperiodic matrix with state space S = {1, …, m}, then the first eigenvalue is λ_1 = 1 with eigenvectors v_1 = 1 and u_1 = π, and therefore by (5)

    P^n = 1π + O(n^{r_2 − 1} |λ_2|^n),

where 1π denotes the matrix whose every row is π. If the eigenvalues are distinct and their absolute values are all different, we even get

    P^n = 1π + O(|λ_2|^n).

By this we see that a smaller |λ_2| gives a higher rate of convergence. If we do not arrange the eigenvalues by multiplicity as described above, we might state a convergence speed that is not true. For example, suppose the eigenvalues 0.5 and −0.5 both occur, with algebraic multiplicities 4 and 1 respectively. Then, since we choose as λ_2 the eigenvalue with the largest multiplicity, we get from Theorem 3 that

    P^n = 1π + O(n^{4−1} 0.5^n).

If we instead ordered the eigenvalues so that λ_2 = −0.5 with r_2 = 1, we would get

    P^n = 1π + O(n^{1−1} 0.5^n),

which is not a true bound for the rate of convergence.

3.2 Convergence speed and eigenvalues of Google's Markov chain

Now that we know the importance of the second eigenvalue, we continue by looking at some results of Haveliwala and Kamvar [2]. These results relate the second eigenvalue of the Google Markov chain's transition probability matrix to the dampening factor d, which by Theorem 3 is relevant to the convergence speed.

Theorem 4. The transition probability matrix of the Google Markov chain, P = dT + (1 − d)(1/n)E, has a second eigenvalue that satisfies |λ_2| ≤ d.


Theorem 5. If there exist at least two irreducible closed subsets in T, then the second eigenvalue of P is λ_2 = d.

In [2], the result is stated and proved for any rank-one row-stochastic matrix E, but in our case we only consider the choice of E as the n×n matrix of all ones, which is also the most common choice in applications. Upon reading [2] we found that the proofs of their results become considerably simpler in our case, and we therefore give the proofs for these simpler cases in detail. The proofs of both theorems rely on the following lemma:

Lemma 1. If u is a left eigenvector of P, u^T P = λ u^T, with λ ≠ 1, then u is a left eigenvector of T with eigenvalue λ/d, i.e. u^T T = (λ/d) u^T. Conversely, if v is a right eigenvector of T, Tv = μv, satisfying 1^T v = 0, then v is a right eigenvector of P with eigenvalue dμ.

Proof: Suppose u^T P = λ u^T with λ ≠ 1. Applying both sides to the vector 1 and using that P1 = 1 gives u^T 1 = λ u^T 1, so (1 − λ) u^T 1 = 0 and hence u^T 1 = 0. Since E = 1·1^T, this gives u^T E = (u^T 1) 1^T = 0, and the eigenvalue equation reduces to

    λ u^T = u^T P = d u^T T + (1 − d)(1/n) u^T E = d u^T T.

Dividing by d gives u^T T = (λ/d) u^T. Conversely, suppose Tv = μv with 1^T v = 0. Then Ev = 1(1^T v) = 0, so

    Pv = dTv + (1 − d)(1/n)Ev = dμv. ∎

With this we can continue to the proofs of the theorems.

Proof of Theorem 4: We consider the possible values of d. When d = 0 we get P = (1/n)E, which has λ_2 = 0, so the theorem holds. When d = 1 we get P = T, and since T is stochastic, |λ_2| ≤ 1 = d, so the theorem holds in this case as well. For 0 < d < 1, all entries of P are strictly positive, so by Theorem 2 the eigenvalue 1 has no other eigenvalue of equal modulus and λ_2 ≠ 1. Let u be a left eigenvector of P associated with λ_2. By Lemma 1, λ_2/d is an eigenvalue of T, and since T is stochastic, |λ_2/d| ≤ 1; hence |λ_2| ≤ d. ∎


Proof of Theorem 5: It is a standard fact (page 12 of [3]) that the multiplicity of the eigenvalue 1 of a stochastic matrix equals the number of irreducible closed subsets. So there are two linearly independent eigenvectors a, b of T with

    Ta = a,  Tb = b.

We want to construct a vector x with Tx = x and 1^T x = 0. Set x = αa + βb, where α and β are scalars. This gives Tx = αTa + βTb = αa + βb, so Tx = x for all α, β. Further, we want 1^T x = α 1^T a + β 1^T b to equal 0. If 1^T a = 0 and 1^T b = 0 we can choose any α, β; otherwise we may assume 1^T a ≠ 0 and choose

    α = −β (1^T b)/(1^T a),

with β ≠ 0, so that x ≠ 0. In either case x is an eigenvector of T with eigenvalue 1 satisfying 1^T x = 0, so by Lemma 1, x is an eigenvector of P with eigenvalue d·1 = d. By Theorem 4, no eigenvalue of P other than 1 exceeds d in modulus, so λ_2 = d. ∎

Since T most likely has at least two irreducible closed subsets (as mentioned in Section 2.2, this happens for instance when two pages link only to each other; see also Figure 1 below), it is clear that the choice of the dampening factor is related to the rate at which the Google Markov chain converges.

Figure 1: Pages 3 and 5 create an irreducible closed subset, and so do pages 1 and 2.
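Lemma 1 implies that, apart from the eigenvalue 1, the spectrum of P is the spectrum of T scaled by d. The following Python/NumPy sketch checks this, together with the bound of Theorem 4, on an arbitrary irreducible 3-state T (illustration only, not from the thesis):

```python
import numpy as np

# An arbitrary irreducible 3-state T (illustration only).
T = np.array([[0.2, 0.8, 0.0],
              [0.3, 0.3, 0.4],
              [0.5, 0.0, 0.5]])
n, d = 3, 0.85
P = d * T + (1 - d) / n * np.ones((n, n))

eig_T = np.linalg.eigvals(T)
eig_P = np.linalg.eigvals(P)

# Expected spectrum of P: keep one eigenvalue 1, multiply the rest of eig(T) by d.
keep = np.argmin(np.abs(eig_T - 1))
expected = np.append(d * np.delete(eig_T, keep), 1.0)

# largest distance from an expected eigenvalue to the nearest eigenvalue of P
gap = max(min(abs(e - a) for a in eig_P) for e in expected)

# Theorem 4: the second largest modulus is at most d
second = np.sort(np.abs(eig_P))[-2]
```

The same check with a T containing two irreducible closed subsets would show `second` landing exactly on d, which is Theorem 5.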

To make these results on the second eigenvalue more clear, we look at some small examples.

Example 2. Consider a small web of 3 pages with the transition probability matrix

    T = [ 0    1/2  1/2
          1/2  0    1/2
          1/2  1/2  0   ]

and a dampening factor set to 0.85. Then the transition probability matrix of the Google Markov chain in this case is

    P = 0.85 T + (0.15/3) E = [ 1/20   19/40  19/40
                                19/40  1/20   19/40
                                19/40  19/40  1/20  ].

The eigenvalues of P are computed from det[P − λI] = 0. Using Matlab, the second eigenvalue is found to be λ_2 = −0.4250, and in accordance with Theorem 4 we have |λ_2| ≤ d: indeed 0.4250 < 0.85.
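The numbers in Example 2 can be reproduced in a few lines; the following Python/NumPy sketch performs the same computation the thesis did in Matlab:

```python
import numpy as np

T = np.array([[0, 0.5, 0.5],
              [0.5, 0, 0.5],
              [0.5, 0.5, 0]])
d = 0.85
P = d * T + (1 - d) / 3 * np.ones((3, 3))

lam = np.sort(np.linalg.eigvals(P).real)  # P is symmetric, so eigenvalues are real
lam2 = lam[0]                             # the second eigenvalue, -0.425
```

The entries of P match the matrix above (for instance P[0, 1] = 19/40 = 0.475), and λ_2 = −0.425 with |λ_2| = 0.425 ≤ d = 0.85, as Theorem 4 requires.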

Example 3. Consider another small web with 3 pages, where the transition probability matrix is

    T = [ 0  1/2  1/2
          0  1    0
          0  0    1   ].

Here pages 2 and 3 each link only to themselves, so T has two irreducible closed subsets, and by Theorem 5 the second eigenvalue of P equals d.
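For the matrix of Example 3, pages 2 and 3 are absorbing, so T has two irreducible closed subsets and Theorem 5 gives λ_2 = d. A quick numerical check in Python/NumPy:

```python
import numpy as np

T = np.array([[0, 0.5, 0.5],
              [0, 1.0, 0.0],
              [0, 0.0, 1.0]])
d = 0.85
P = d * T + (1 - d) / 3 * np.ones((3, 3))

moduli = np.sort(np.abs(np.linalg.eigvals(P)))[::-1]  # largest first
lam2 = moduli[1]
```

Since T is triangular, eig(T) = {1, 1, 0}, so eig(P) = {1, d, 0} and λ_2 = d = 0.85 regardless of the link structure inside the closed subsets.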


4 Simulations

4.1 Multiplicity of the second eigenvalue

Figure 2: d = 0.1    Figure 3: d = 0.01

4.2 Quality of the limit distribution

The next question is why d should not simply be set to a very small number; after all, smaller d means faster convergence, and in the extreme case d = 0 convergence would be almost immediate. However, since then P = (1/n)E, the limit distribution is always uniform, regardless of the structure of the internet itself. Intuitively, larger d means that P is "closer" to T, and hence the PageRank should give a more realistic picture. We have not found any quantitative formulation of this intuition in the literature, and have therefore run simulations to obtain the PageRank for different d. In these tests we simulate transitions for networks of size 1000. The following figures show the number of times each state in P is visited when simulating 10,000 transitions from a starting state chosen at random (the x-axis shows the states and the y-axis the number of visits).

Figure 3: d = 0.1    Figure 4: d = 0.5



Figure 5: d = 0.85    Figure 6: d = 0.99

These visits are then counted and used to compute a steady state for P and, in turn, the PageRank of the pages. From the figures we see that the number of times a state is reached is evenly spread for lower values of d, giving a more uniform steady state than for higher d. The PageRank for low d would then essentially be calculated from (1/n)E, and most of the structure from T would be lost. To investigate this further, we ran a test with a specified T (see Figure 7).

Figure 7: States … and … create an irreducible closed subset.

This time we simulated 1000 transitions for different values of d, and did this 1000 times for each d to get the mean distance between the uniform distribution and our simulated stationary distribution.



Figure 8: d = 0.1    Figure 9: d = 0.5

Figure 10: d = 0.85    Figure 11: d = 0.99

In the figures above we see the number of visits in a single simulation for each value of d tested, and this time it is even clearer that lower values of d give a more uniform steady state. The distances we measured between the uniform distribution and our steady state for these d confirm this:

    Value of d:                          0.1      0.5      0.85     0.99
    Distance Σ_{i=1}^m (π_i − 1/m)²:     0.0028   0.0030   0.0053   0.089

Smaller values of d give a shorter distance to the uniform distribution. To make this obvious, we plotted the distance from simulations for values of d between 0.01 and 0.99 (the x-axis shows the value of d and the y-axis the distance).
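The trend in the table, with the stationary distribution moving away from the uniform one as d grows, can also be reproduced without simulation noise by computing π(d) exactly on a small example. The 4-page web below is hypothetical and far smaller than the 1000-state networks used above:

```python
import numpy as np

# Hypothetical 4-page web; pages 3 and 4 form an irreducible closed subset.
T = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
m = 4

def distance_to_uniform(d):
    """Squared distance sum_i (pi_i - 1/m)^2 of the stationary distribution."""
    P = d * T + (1 - d) / m * np.ones((m, m))
    w, V = np.linalg.eig(P.T)
    pi = np.real(V[:, np.argmax(np.real(w))])
    pi = pi / pi.sum()
    return float(np.sum((pi - 1.0 / m) ** 2))

dists = [distance_to_uniform(d) for d in (0.1, 0.5, 0.85, 0.99)]
```

The distances increase with d, mirroring the table: small d forces π toward the uniform distribution, while large d lets the closed subset absorb most of the stationary mass.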



Figure 12: The distance for different values of d.

5 Conclusion

From the results in Section 3 we have learned that the second largest eigenvalue of the Google Markov chain, together with its algebraic multiplicity, directly determines the convergence speed of the chain to its stationary distribution. Furthermore, we have seen that the dampening factor restrains the second largest eigenvalue to be at most the value of the dampening factor, with equality when there exist two or more irreducible closed subsets in the set of pages we use. In Section 4 we see, from simulations of different networks, that different values of the dampening factor do not change the multiplicity of the second eigenvalue, so a very small dampening factor would give a faster convergence speed than larger choices such as 0.85. But from the tests of the quality of the limit distribution, we discover that setting the dampening factor to a low value changes the structure of the transition probability matrix of the Google Markov chain and pushes the limit distribution toward the uniform distribution. Since outgoing links between pages are the main input in this method of computing a Google PageRank, a uniform stationary distribution would mean that all pages get the same PageRank. We would then have obtained a fast computation of a PageRank in which almost all information from the original network is lost, which would not be very useful. From these results we see that setting the dampening factor to 0.85, as the creators of Google did, gives a good combination of quality of the limit distribution and fast convergence speed for the Google Markov chain.


Matlab code

% Program for determining the multiplicity of the second eigenvalue.
% n is the size of the smaller matrices put into the diagonal of a bigger matrix.
% p is the probability of a link.
% d is the dampening factor.
% loop is the number of simulations.
function [mult] = randNet(n,p,d,loop)
mult = zeros(1,loop);
for k = 1:loop
    N = 4*n;
    T = zeros(N);
    % place four random n-by-n 0/1 link blocks on the diagonal
    for i = 1:4
        A = (rand(n) <= p).*ones(n);
        a = (i-1)*n + 1;
        b = a + n - 1;
        T(a:b,a:b) = A;
    end
    % remove self-links and give every dangling node one random outgoing link
    for i = 1:N
        T(i,i) = 0;
        if sum(T(i,:)) == 0
            j = randi([1,N-1]);
            if j >= i
                T(i,j+1) = 1;
            else
                T(i,j) = 1;
            end
        end
    end
    T = T./repmat(sum(T,2),1,N);      % normalize the rows
    P = d*T + (1-d)/N*ones(N);        % the Google Markov chain
    % count the eigenvalues whose modulus is close to d
    Eig = eig(P);
    sec = abs(Eig) > d - 0.001;
    mult(1,k) = sum(sec) - 1;         % multiplicity of the second eigenvalue
end
hist(mult)

% Program for determining PageRank and the distance to the uniform distribution.
% n, p and d are as above; trans is the number of transitions.
function [dist] = Pagerank(n,p,d,trans)
N = 4*n;
T = zeros(N);
for i = 1:4
    A = (rand(n) <= p).*ones(n);
    a = (i-1)*n + 1;
    b = a + n - 1;
    T(a:b,a:b) = A;
end
for i = 1:N
    T(i,i) = 0;
    if sum(T(i,:)) == 0
        j = randi([1,N-1]);
        if j >= i
            T(i,j+1) = 1;
        else
            T(i,j) = 1;
        end
    end
end
T = T./repmat(sum(T,2),1,N);
P = d*T + (1-d)/N*ones(N);
% simulate the chain by inverse transform sampling and count the visits
states = zeros(1,trans);
states(1) = randi([1,N]);
levels = cumsum(P,2);                 % row-wise cumulative probabilities
pi = zeros(1,N);
for i = 1:trans-1
    u = rand;
    j = 1;
    while u > levels(states(i),j)
        j = j + 1;
    end
    states(i+1) = j;
    pi(j) = pi(j) + 1;
end
pi = pi/sum(pi);                      % empirical stationary distribution
unif = ones(1,N)/N;
dist = norm(pi - unif);               % distance to the uniform distribution

% Script plotting the mean distance against d (used for Figure 12),
% with n, p and trans set as desired.
res = zeros(1,1000);
d = 0.99;
for m = 1:99
    for l = 1:1000
        res(1,l) = Pagerank(n,p,d,trans);
    end
    dampen(1,m) = mean(res);
    dval(1,m) = d;
    d = d - 0.01;
end
plot(dval(1:99),dampen(1:99))