who will follow whom? exploiting semantics for link prediction in attention-information networks

Who will follow whom?Exploiting Semantics for Link Prediction in

Attention-Information NetworksInternational Semantic Web Conference 2012. Boston, US

Matthew Rowe1 Milan Stankovic2,3 Harith Alani4

1School of Computing and Communications, Lancaster University, Lancaster, UK

2Hypios Research, 187 rue du Temple, 75003 Paris, France

3Universit Paris-Sorbonne, 28 rue Serpente, 75006 Paris

4Knowledge Media Institute, The Open University, Milton Keynes, UK

@mrowebot | [email protected]://www.matthew-rowe.com | http://www.lancs.ac.uk/staff/rowem/

http://www.matthew-rowe.com

http://www.lancs.ac.uk/staff/rowem/

Background Problem Formulation Approach Experiments Summary

Attention Information Networks

I The intersection of information and social networks[Yin et al., 2011]:

I Users can follow other users: u subscribes to vI u = Follower, v = Followee

I User u is paying attention to the content from user v

u v

I Users become ’Information Hubs’ [Romero and Kleinberg, 2010]I Tune in to get real time event information

I E.g. #Sandy, #Arabspring, #Londonriots

I People become social sensors

u v

I Attention is paid to the information that users publish

Who will follow whom? Exploiting Semantics for Link Prediction 2 / 22


Attention Economics

I Large uptake/adoption of Attention-Information Networks:I 31.9% increase in Twitter users in 2011

I Attention becomes a limited commodityI “What counts now is what is most scarce now, namely attention.”

[Goldhaber, 1997]

I Users must consider who they wish to subscribe toI Whose content do I wish to receive?I Who interests me?

I If we can understand who will follow whom & follower decisions:I Predict social capital based on expected network growth;I Facilitate audience building

I Of interest to Digital Marketing firms - i.e. boosting client’s presence



Outline

Problem FormulationRelated WorkFollower-Decision HypothesesFormulating the Problem

ApproachFeaturesConcept Disambiguation with User Contexts

ExperimentsDatasetExperimental SetupResults: Prediction AccuracyResults: Follower-Decision Patterns



Related Work

I Network-topology approaches [Golder and Yardi, 2010,Yin et al., 2011, Backstrom and Leskovec, 2011]:

I Path structures, common followers and common friends

I Local metadata approaches [Schifanella et al., 2010,Leroy et al., 2010, Brzozowski and Romero, 2011]:

I Common tags (Flickr, YouTube), group information (on Flickr)

I Local metadata approaches use tags or group memberships, but noconcepts

I No examination of the follower-decision behaviour patterns

I No exploration of divergent follower decision behaviour



Follower-Decision Hypotheses

H1. Following a user is performed when there is a topicalaffinity between the follower and the followee[Schifanella et al., 2010] found social and topical homophily to becorrelated on Flickr

H2. Users who do not focus on specific topics do not basetheir follower-decisions on topical information but on socialfactorsUnfocussed users show divergent decision behaviour

H3. Users who are more socially connected are driven by socialrather than topical factorsHigh-degree users are driven by social network effects



Formulating the Problem

I A directed social network is a graph: G = 〈V ,E 〉, where:I V denotes the set of users (nodes), and;I E is the set of edges (〈u, v〉 ∈ E) between nodes.

I meaning that u follows v

I An egocentric social network (egonet) of u is denoted by Γ(u)I Γ−(u) denotes in the follower network (incoming edges)I Γ+(u) denotes in the followee network (outgoing edges)

I A given user u is provided with a set of recommended users R(u)I R(u) ∩ Γ+(u) = ∅

I Goal: induce a function between users and recommendations:f : V × R → {0, 1}



Predicting Links in Attention-Information Networks

I Given our problem setting we want to:

1. Identify the best performing general model;2. Explore follower-decision behaviour and how this differs

I Problem is a binary classification task: pairwise features between uand each of his recommendations (v ∈ R(u))

I To enable accurate prediction and explore different factors behindlink creation we implement:

I Social features: based on the network-structureI Topical features: based on content published by u and vI Visibility features: based on the user noticing a followed

I We now explain the various features which are computed between uand v ∈ R(u)...



Social

Social features account for the topology of the network and the existenceof edges present within the network prior to recommendations

I Mutual Followers Count: Measures the overlap of the follower sets(i.e. the set of users connecting into a given user) between u and v .

I Mutual Followees Count Measures the overlap of the followee sets

I Mutual Friends Count Measures the overlap of the friends setsI i.e. Friendship is denoted by a bi-directional edge between nodes

I Mutual Neighbours Measures the overlap of the ego-centricnetworks of u and v whilst ignoring the directions of the links in thenetwork[Zhou et al., 2009, Yin et al., 2011, Backstrom and Leskovec, 2011]



Topical (I)

In Attention-Information Networks users pay attention the content ofother users

I Topical features use: a) tags, b) concept bags, c) concept graphs

I Tag Vectors: Examining tag/keyword overlap between u and v[Schifanella et al., 2010]

I Cosine Similarity: between the tag vectors of u and v

I Concept Bags: Examining overlap of conceptsI Return concepts from user content, then derive the concept bag

vectorI Cosine Similarity: similarity between the concept bag vectors of u

and vI Jensen-Shannon Divergence: probability distribution divergence

between concept bag vectors of u and vI Greater divergence means greater dissimilarity between topics



Topical (II)

u v

C1

C2 C3

I Concept Graphs: Semantic relatedness of users using graph-basedmetrics

I Measure distances between concepts from tags of u and v : d(ci , cj)I Distance measures have two varieties, based on input tags:I 1. Tag Intersection: Intersection of the tag sets of u and vI 2. All Tags: All tags from the tag sets of u and vI Measured three distances measures for d(ci , cj) using the above sets:

I Shortest Path: least number of steps from ci to cj (Bellman-Fordalgorithm)

I Hitting Time: number of steps for a random walker to leave ci andreach cj [Fouss et al., 2007]

I Commute Time: number of steps for a random walker to leave ciand reach cj , and then return to ci



Visibility

The presence of information published by a prospective followee couldinfluence users in their follower-decisions

I Retweet Count: total number of times a given user (v) has beenretweeted by members of the followee network belonging to u

I Mention Count: total number of times a given user (v) has beenmentioned by members of the followee network belonging to u

I Comment Count: total number of times a given user (v) has hadhis content commented on by members of the followee networkbelonging to u

I Weighted Counts: weight each count by reply-frequency withego-network member



Features Summary

Type Feature Name Output DomainSocial Mutual Followers Count {0} ∪N+

Mutual Followees Count {0} ∪N+

Mutual Friends Count {0} ∪N+

Mutual Neighbours Count {0} ∪N+

Topical Tag Vectors - Cosine [0, 1]Concept Bags - Cosine [0, 1]Concept Bags - JS-Divergence R

+

Concept Graphs - Int - Shortest Path N+

Concept Graphs - All - Shortest Path N+

Concept Graphs - Int - Hitting Time R+

Concept Graphs - All - Hitting Time R+

Concept Graphs - Int - Commute Time R+

Concept Graphs - All - Commute Time R+

Visibility Retweet Count {0} ∪N+

Mention Count {0} ∪N+

Comment Count {0} ∪N+

Weighted Retweet Count {0} ∪R+

Weighted Mention Count {0} ∪R+

Weighted Comment Count {0} ∪R+



Concept Disambiguation with User Contexts

I Distances across the concept graph capture semantic relatedness

I Distance metrics require a mapping between a tag and a concept...I Polysemy Problem: one tag can be mapped to multiple concepts

I [Cantador et al., 2011] propose ‘distributional aggregation’ tochoose the most representative tag for a web resource:

I Voting mechanism: Tag usage frequency amongst a collection ofusers

I Our voting mechanism: concept frequency given the userI For a given tag: count candidate concept frequency in concept bag

CTu , choose the most frequent



Dataset

I Knowledge Discovery and Data mining (KDD) Cup 2012 FollowerPrediction task

I Chinese microblogging platform Tencent WeiboI Users, recommendations, and outcomesI Follow-graph of usersI Set of tags found within each user’s contentI Tag-categorisation data and category graph

●

●

●

●●

●●●●

●●●●●●

●●●●●

●

●●

●●●●●●●●

●●●

●●●●

●●

●

●

●

●●

●●●●

●●

●

●●●●

●●

●

●

●

●●●●

●●

●●●

●●

●●●●●●●●●●

1 2 5 10 20 50

110

100

1000

categories (n)

Fre

quen

cy (

c(n)

)

(a) Categories per Tag

●

●

●

●

●●●●

●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●

●●

●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●

●

●●●●●●●●●●●●●●

●

●

●

●●●●●●●●●●●●●●●●●●●

●●

●●●●

●

●

●●●●●●

●●●●

●

●

●●●●●●●●

●●●●●●●●●●●●●●

●●●●●●●●

●●

●

●●●●●

●

●●●●●●

●

●

●

●●●●●●

●

●●●

●●●●

●

●●●●●●●●

●

●

●●●

●

●●

●●

●

●

●●

●

●

●●●

●

●

●

●●

●●●●

●

●●

●●●●

●●

●●●●●●

●●●●●

●●●

●

●

●●●●

●●●●●●●

●

●

●

●●

●

●

●

●

●

●●

●●●●●●

●

●●

●●

●

●●

●●

●●●●●

●

●

●

●

●●●

●●●

●

●

●

●●●●

●●

●●●

●

●●

●

●

●

●●●●●●●

●●●

●

●

●●

●●●●

●●

●

●●

●

●●●

●●

●●

●●

●●●●

●

●

●●

●●●

●●●●●●●

●●

●●●●

●●●●

●●

●

●

●

●●

●●

●

●

●

●

●

●

●

●●●

●

●●

●●●●

●●

●●●

●

●●●●

●

●●●●●●

●●

●●●●●●●

●

●●●●●

●●

●●●●●●●●●●●●

●●

●●●

●

●●●●●●●●●●●

●

●●●●●●●●●●●●●●●

●

●●●●●●●●●●

●

●●●●●●●●●●●●

1 5 50 500

110

010

000

recommendations (n)

Fre

quen

cy (

c(n)

)

(b) Recommend’ per User



Experimental Setup

I 1. General Follower Prediction: seeking a follower modelI Randomly selected 10% of users and built pairwise feature vectors

I 2. Binned Follower Prediction: seeking behaviour-specific modelsI Divided users into 10 bins based on: a) concept-bag entropy, b)

out-degreeI Selected all the users from low and high bins, built feature vectors

I Divided each dataset into an 80:20% split for training and testing

I For each experiment:

1. Model Selection2. Pattern Analysis

I Evaluation Measures:

1. Area Under the receiver operator characteristic Curve (AUC)2. Matthews correlation coefficient (MCC)



Results: Prediction Accuracy

I General Follower Prediction ModelI Topical features significantly betterI Models significantly outperform the random model

I Binned Follower Prediction ModelsI Concept entropy: low - topical features; high - social featuresI Degree: low and high - topical features

I Visibility features have little effect on predictions (majority are zero)

Full Entropy − Low Entropy − High Degree − Low Degree − High

0.0

0.2

0.4

0.6

0.8

1.0

SocialTopicalVisibilityAll

(c) AUC

Full Entropy − Low Entropy − High Degree − Low Degree − High

−0.

10.

00.

10.

20.

30.

4

SocialTopicalVisibilityAll

(d) MCC



Results: Follower-Decision Patterns

Connections are formed...

I In the General Follower Prediction Model when:I users share neighboursI users are closer in terms of the subjects they discuss

I In the Binned Follower Prediction ModelI for low entropy and low degree users when:

I same feature pattern as the general modelI for high entropy users when:

I users have an overlap of subscribersI tags differ, but similar concepts!

I for high degree users when:I users listen to the same peopleI users share a topical affinity with the same pattern as the general

model



Findings

I General behaviour pattern: topical homphilyI [Schifanella et al., 2010] found socially close users to have high tag

cosineI Our approach detects latent patterns based on concept graphs

I On common followers:I [Golder and Yardi, 2010, Brzozowski and Romero, 2011] found

mutual audience to correlate with link creationI We find that: mutual followers should be reduced in the general

model

I On common neighbours:I [Leroy et al., 2010] found an increase in mutual neighbours to

correlate with link creationI Similar effect in our findings

I Divergent behaviour for high entropy users: suggests a need forbespoke models



Conclusions

I Our approach for link prediction outperforms: a) a random baseline,b) existing network-structure approaches

I General follower-decision model identified topical homophily effects

I Accounting for behaviour uncovered different follower-decisions:I Unfocussed users follow users with whom they have conceptual

affinityI Concept-graphs allowed for latent effects to be identified

I Applicable over the linked data graph

I Can improve recommendations by accounting for behaviour andbuilding bespoke models:

I Growing the platform’s network and increasing social capitalI Understand who will follow whom, and audience growth



Future Work

I Apply our approach over Twitter and YouTube: are findingsconsistent?

I Extract concepts from content, measure distances across the LinkedData graph

I Inclusion of more nuanced user behaviourI Conjecture: performance is conditioned on time-sensitive user

behaviourI User Churn: detecting the complement of link creation

I 25 days of Twitter logs show this (red):

∆(u) = |Γ−t′ (u)| − |Γ−

t (u)| (1)

010

0020

0030

0040

0050

0060

00

∆

c(∆)

−40 −20 0 20 40



Questions

Twitter: @mrowebot

Email: [email protected]

WWW: http://www.matthew-rowe.comWWW: http://www.lancs.ac.uk/staff/rowem/


http://www.matthew-rowe.com

http://www.lancs.ac.uk/staff/rowem/

who will follow whom? exploiting semantics for link prediction in attention-information networks

Technology