hw4 is due today
TRANSCRIPT
HW4 is due today
Project report due: Monday December 10 at 11:59am Pacific Time Note this is 1 day earlier than originally announced! Hand-in a paper copy and submit a PDF at
http://snap.stanford.edu/submit/
Project poster session: Monday December 10 12:00-3:00pm
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 1
CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University
http://cs224w.stanford.edu
The link prediction task: Given G[t0,t0β] a graph on edges
up to time t0β output a ranked list L of links (not in G[t0,t0β]) that are predicted to appear in G[t1,t1β]
Evaluation: n = |Enew|: # new edges that appear during
the test period [t1,t1β] Take top n elements of L and count correct edges
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 3
[LibenNowell-Kleinberg β03]
G[t0, tβ0] G[t1, tβ1]
Predict links in a evolving collaboration network
Core: Since network data is very sparse Consider only nodes with in-degree and
out-degree of at least 3 11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 4
[LibenNowell-Kleinberg β03]
Methodology: For each pair of nodes (x,y) compute score c(x,y) For example: # of common neighbors c(x,y) of x and y
Sort pairs (x,y) by the decreasing score c(x,y) Note: Only consider/predict edges where
both endpoints are in the core (deg. > 3)
Predict top n pairs as new links See which of these links actually
appear in G[t1, tβ1]
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 5
X
Different scoring functions c(x,y) Graph distance: (negated) Shortest path length Common neighbors: |Ξ π₯ β© Ξ(π¦)| Jaccardβs coefficient: Ξ π₯ β© Ξ π¦ /|Ξ π₯ βͺ Ξ(π¦)| Adamic/Adar: β 1/log |Ξ(π§)|π§βΞ π₯ β©Ξ(π¦) Preferential attachment: |Ξ π₯ | β |Ξ(π¦)| PageRank: ππ₯(π¦) + ππ¦(π₯) ππ₯ π¦ β¦ stationary distribution weight of y under the random walk: with prob. 0.15, jump to x with prob. 0.85, go to random neighbor of current node
Then, for a particular choice of c(Β·) For every pair of nodes (x,y) compute c(x,y) Sort pairs (x,y) by the decreasing score c(x,y) Predict top n pairs as new links
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 6
[LibenNowell-Kleinberg β03]
Ξ(x) β¦ neighbors of node x
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 7
[LibenNowell-Kleinberg β 03]
Performance score: Fraction of new edges that are guessed correctly.
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 8
Can we learn to predict new friends? Facebookβs People You May Know Letβs look at the data: 92% of new friendships on
FB are friend-of-a-friend More common friends helps
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 9
[WSDM β11]
w
v
u
z
Goal: Recommend a list of possible friends Supervised machine learning setting: Labeled training examples: For every user π have a list of others she
will create links to {π£1 β¦ π£π} in the future Use FB network from May 2012 and {π£1 β¦ π£π}
are the new friendships you created since then These are the βpositiveβ training examples
Use all other users as βnegativeβ example
Task: For a given node π score nodes {π£1 β¦ π£π}
higher than any other node in the network 10 11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu
βpositiveβ nodes βnegativeβ nodes
s
Green nodes are the nodes
to which s creates links in
the future
s
How to combine node/edge features and the network structure? Estimate strength of each friendship (π’, π£) using: Profile of user π’, profile of user π£ Interaction history of users π’ and π£ This creates a weighted graph Do Personalized PageRank from π
and measure the βproximityβ (the visiting prob.) of any other node π€ from π Sort nodes π€ by decreasing
βproximityβ 11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 11
βpositiveβ nodes βnegativeβ nodes
s
Let π be the center node Let ππ½(π’, π£) be a function that
assigns strength πππ to edge π,π ππ’π’ = ππ½ π’, π£ = exp ββ π½π β xπ’π’ ππ πππ is a feature vector of (π,π)
Features of node π’ Features of node π£ Features of edge (π’, π£) Note: π· is the weight vector we will later estimate!
Do Random Walk with Restarts from π where transitions are according to edge strengths ππ’π’
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 12
[WSDM β11]
βpositiveβ nodes βnegativeβ nodes
s
How to estimate edge strengths? How to set parameters Ξ² of fΞ²(u,v)?
Idea: Set π½ such that it (correctly) predicts the known future labels
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 13
s
Network
s
Set edge strengths auv = fΞ²(u,v)
Random Walk with Restarts on the weighted graph.
Each node w has a PageRank proximity pw
Sort nodes by the decreasing PageRank
score pw
Recommend top k nodes with the highest proximity pw to node s
πππ β¦. Strength of edge (π,π) Random walk transition matrix:
PageRank transition matrix:
Where with prob. πΌ we jump back to node π
Compute PageRank vector: π = ππ π
Rank nodes π€ by decreasing ππ€
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 14
[WSDM β11]
βpositiveβ nodes βnegativeβ nodes
s
Positive examples π« = {π π, β¦ ,π π}
Negative examples π³ = {πππππ πππ ππ}
What do we want?
Note: Exact solution to this problem may not exist So we make the constrains βsoftβ (i.e., optional)
15
[WSDM β11]
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu
βpositiveβ nodes βnegativeβ nodes
s
We prefer small weights π½
Every positive example has to have higher PageRank score than every negative example
Want to minimize: Loss: β(π₯) = 0 if π₯ < 0, or π₯2 else
16
[WSDM β11]
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
-1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1
pl=pd pl < pd pl > pd
Loss
Penalty for violating the constraint that ππ > ππ
How to minimize π(π·) ?
Both pl and pd depend on Ξ² Given Ξ² assign edge weights ππ’π£ = ππ½(π’, π£) Using π = [ππ’π£] compute PageRank scores ππ€ Rank nodes by the decreasing score
Idea: Start with some random π½(0) Evaluate the derivative of πΉ(π½) and
do a small step in the opposite direction π½(π‘+1) = π½(π‘) β π ππΉ π½(π‘)
ππ½
Repeat until convergence
17
[WSDM β11]
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu
s
π½(0)
π(π·) π½(50)
π½(100)
Whatβs the derivative ππΉ π½(π‘)
ππ½ ?
We know: that is So:
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 18
[WSDM β11]
Easy!
We just got: Few details: Computing ππππ’/ππ½ is easy. Remember:
We want πππππ½
but it appears on both sides of the equation. Notice the whole thing looks like a PageRank equation: π₯ = π β π₯ + π§
As with PageRank we can use the power-iteration to solve it:
Start with a random ππππ½
(0)
Then iterate: ππππ½
(π‘+1)= π β ππ
ππ½
(π‘)+
ππππππ½
β π
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 19
[WSDM β11]
ππ’π’ = ππ½ π’, π£
= exp βοΏ½π½π β xπ’π’ ππ
To optimize π(π·), use gradient descent: Pick a random starting point π½(0) Using current π½(π‘) compute edge strenghts and
the transition matrix π Compute PageRank scores π Compute the gradient with
respect to weight vector π½(π‘) Update π½(π‘+1)
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 20
Iteration, (t)
Loss
, h(
Β·)
Facebook Iceland network 174,000 nodes (55% of population) Avg. degree 168 Avg. person added 26 friends/month
For every node s: Positive examples: π· = { new friendships π created in Nov β09 } Negative examples: πΏ = { other nodes π did not create new links to } Limit to friends of friends: On avg. there are 20,00 FoFs (maximum is 2 million)!
21
[WSDM β11]
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu
s
Node and Edge features for learning: Node: Age, Gender, Degree Edge: Age of an edge, Communication, Profile
visits, Co-tagged photos Evaluation: Precision at top 20 We produce a list of 20 candidates By taking top 20 nodes π₯ with highest PageRank score ππ₯
Measure to what fraction of these nodes π actually links to
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 22
Facebook: predict future friends Adamic-Adar already works great Supervised Random Walks (SRW) gives slight
improvement
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 23
Arxiv Hep-Ph collaboration network: Poor performance of unsupervised methods SRW gives a boost of 25%!
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 24
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 25
Take a user (yellow circle) and discover her social circles:
Why is it useful? To organize friend lists Control privacy
and access settings Filter content
Facebook, Twitter, Google+: Groups, lists, circles
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 26
Given ego node π’ and a network of her friends Find circles! Use network as well as user profile information For each circle we want to know why it is there!
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 27
Suppose we know all the circles πͺπ in the ego-network
We model the prob. of edge π,π where: π(π₯,π¦)β¦ is a feature vector describing (π₯,π¦) Are π₯ and π¦ from same school, same town, same age, ...
ππβ¦ parameter vector that we aim to estimate High ππ[π] means being similar in feature π is important for circle π
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 28
π(π)β¦ feature vector of edge π = (π₯,π¦) Are π₯ and π¦ from same
school, same town, same age, ...
ππβ¦ circle node βsimilarityβ parameter High ππ[π] means being
similar in feature π is important for circle π
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 29
Two ways to create feature vectors π(π,π)
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 30
π(π₯,π¦)
π(π₯,π¦)
Given an ego-graph G And edges features π(π,π) Want to discover: Circle node memberships πΆ and Circle parameters ππ
such that we maximize the likelihood:
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 31
To see the details of this is accomplished see: Discovering Social Circles in Ego Networks by J. McAuley, J.L. http://arxiv.org/abs/1210.8182
Facebook: Ask people to go through their
friend lists and hand label the circles
11/29/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 32
~30% circles donβt overlap ~30% overlap ~30% are nested