PageRank
Identifying key users in social networks
Student: Ivan Todorović, 3231/2014
Mentor: Prof. Dr Veljko Milutinović
Introduction
• Social Networks – Connecting people
• Sustainable revenues
• Full advertising potential
• Key Users
• Novel PageRank
2/19
What is a Key User?
• Large community
• Affects a large number of persons
• Unlikely to leave the OSN
• Pay for Premium services
3/19
Users’ Connectivity in OSN
• Structural characteristics of the network
• Well-connected users
• Social Graph
• Centrality measures
– Degree
– Closeness
– Betweenness
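The three centrality measures listed above can be illustrated on a toy social graph. The following is a minimal sketch, not taken from the slides: the adjacency-list format and all function names are assumptions made for illustration, and the graph is treated as unweighted and undirected. Betweenness uses Brandes' algorithm, which is one common way to compute it.

```python
from collections import deque

def degree_centrality(graph):
    """Number of direct neighbors of each node."""
    return {v: len(nbrs) for v, nbrs in graph.items()}

def bfs_distances(graph, source):
    """Hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def closeness_centrality(graph):
    """(number of reachable nodes) / (sum of distances); 0 if isolated."""
    result = {}
    for v in graph:
        dist = bfs_distances(graph, v)
        total = sum(dist.values())
        result[v] = (len(dist) - 1) / total if total > 0 else 0.0
    return result

def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in score.items()}

# Hypothetical star-shaped friendship graph: "a" knows everyone.
social_graph = {"a": ["b", "c", "d"], "b": ["a"], "c": ["a"], "d": ["a"]}
print(degree_centrality(social_graph))
print(closeness_centrality(social_graph))
print(betweenness_centrality(social_graph))
```

In this toy graph user "a" scores highest on all three measures, which matches the intuition of a well-connected user.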
4/19
Users’ Communication Activity
• Exchange of information
• User interaction
• Activity Graph
• Strong/Weak connection
5/19
PageRank
• An algorithm used by Google
• PageRank is a link analysis algorithm
• Outputs a probability distribution
• Can be applied to any graph or network
• Personalized PageRank is used by Twitter
6/19
Novel PageRank
• Identify key users
• First step
– Derive a weighted activity graph
• Second step
– Determine users’ centrality scores
7/19
Weighted Activity Graph
• Users who actually communicate
• Graph Links
• Informational and Normative influence
8/19
Weighted Activity Graph
• Graph representation
– Symmetric adjacency matrix
• Weight of an undirected activity link
Cij – number of communication activities (i → j)
Cji – number of communication activities (j → i)
• Activity Graph
n – Number of users
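The first step can be sketched as follows. The slide text does not preserve the weight formula, so this sketch assumes the weight of an undirected link is the sum of the two directed counts, w_ij = Cij + Cji, which yields a symmetric n × n matrix. The function name `build_activity_graph` and the `(sender, receiver)` input format are hypothetical.

```python
def build_activity_graph(n, activities):
    """Build directed counts and a symmetric weight matrix.

    activities: iterable of (sender, receiver) index pairs,
    one pair per communication activity (e.g. a wall post).
    """
    counts = [[0] * n for _ in range(n)]
    for i, j in activities:
        if i != j:  # ignore self-directed activity
            counts[i][j] += 1
    # Assumption: undirected weight = activities in both directions.
    weights = [[counts[i][j] + counts[j][i] for j in range(n)]
               for i in range(n)]
    return counts, weights

# Hypothetical log: user 0 wrote to 1 twice, 1 replied once, 2 wrote to 1.
counts, weights = build_activity_graph(3, [(0, 1), (0, 1), (1, 0), (2, 1)])
for row in weights:
    print(row)
```

Users who never communicate keep a weight of 0 and therefore have no link in the activity graph, even if they are connected in the social graph.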
9/19
Users’ Centrality Scores
• PageRank used by Google
$$PR(i) = \frac{1 - d}{N} + d \sum_{j \in B_i} \frac{PR(j)}{O_j}$$
N – Total number of web pages
Oj – Number of outgoing links from page j
Bi – Set of web pages pointing to web page i
d – damping factor (usually set to 0.85)
• Novel PageRank
• Fi – Set of users connected to i
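The second step can be sketched with a power iteration. The Novel PageRank formula itself is not preserved in the slide text, so the code below is an assumption rather than the authors' exact method: each user j passes rank to each connected user i in proportion to the link weight divided by the total weight of j's links. With 0/1 weights on a directed link matrix this reduces to the Google formula. The uniform redistribution of rank from users without links is a further assumption added so that the scores sum to 1.

```python
def weighted_pagerank(weights, d=0.85, tol=1e-10, max_iter=1000):
    """Power iteration on a weight matrix (weights[j][i]: link j -> i)."""
    n = len(weights)
    strength = [sum(row) for row in weights]  # total outgoing weight
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        # Rank held by users without any link is spread uniformly.
        dangling = sum(rank[j] for j in range(n) if strength[j] == 0)
        new_rank = []
        for i in range(n):
            incoming = sum(rank[j] * weights[j][i] / strength[j]
                           for j in range(n) if weights[j][i] > 0)
            new_rank.append((1 - d) / n + d * (incoming + dangling / n))
        change = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if change < tol:
            break
    return rank

# Hypothetical symmetric activity weights for three users.
activity_weights = [[0, 3, 0],
                    [3, 0, 1],
                    [0, 1, 0]]
scores = weighted_pagerank(activity_weights)
print(scores)
```

In this example user 1 communicates with both other users and ends up with the highest score, while user 0 outranks user 2 because its link to user 1 is stronger.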
10/19
Demonstration and Evaluation
• Facebook dataset – New Orleans
– Set of users (63,731)
– Set of social links (817,090)
– Communication activity
– 832,277 wall posts
– BFS Crawler
11/19
Demonstration and Evaluation
12/19
Pros and Cons
• Great results
• Complexity O(n²)
• Social and Activity Graph
• Offline contacts
• Direction of posts/messages
• Privacy risks
13/19
Conclusion
• Potential to generate sustainable revenues
• Easy to implement
• Efficient
14/19
Improvements
• Text Mining to detect influence
• Scan user messages
• Detect positive/negative user response
• Use it to form directed activity graph
15/19
Improvements
A: Hey, check this movie (…)
B: Well, I don’t like comedy movies
– Detected negative response
A: Okay, maybe we could watch this one (…)
B: That trailer looks really good
– Influence confirmed
16/19
Improvements
• Distributed PageRank algorithm
• Monte Carlo approximation
• Perform K random walks in parallel
– Walk to a random neighbor (probability 1 - ε)
– Terminate in current node (probability ε)
• After walk termination
– Each node computes its PageRank value
• Complexity O(log n / ε)
17/19
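The random-walk idea above can be sketched on a single machine. This is a sequential illustration, not the distributed algorithm itself: it starts the same number of walks from every node and estimates each node's PageRank as the fraction of walks that terminate there, which corresponds to a damping factor of 1 - ε. The function name, the parameter `walks_per_node`, and the choice to stop a walk at a node without neighbors are assumptions.

```python
import random

def monte_carlo_pagerank(graph, walks_per_node=1000, eps=0.15, seed=0):
    """Estimate PageRank from the end points of short random walks."""
    rng = random.Random(seed)
    ended_at = {v: 0 for v in graph}
    for start in graph:
        for _ in range(walks_per_node):
            v = start
            # Continue with probability 1 - eps; a node without
            # neighbors ends the walk immediately (assumption).
            while graph[v] and rng.random() >= eps:
                v = rng.choice(graph[v])
            ended_at[v] += 1
    total = walks_per_node * len(graph)
    return {v: count / total for v, count in ended_at.items()}

# Hypothetical star-shaped graph: "a" is connected to everyone.
graph = {"a": ["b", "c", "d"], "b": ["a"], "c": ["a"], "d": ["a"]}
estimate = monte_carlo_pagerank(graph)
print(estimate)
```

Because every walk is independent, the loop over start nodes could be split across workers, which is what makes the approach attractive for large graphs.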
Literature
• Antonio Caso, Silvia Rossi, “Users Ranking in Online Social Networks to Support POI Selection in Small Groups”, University of Naples
• Wikipedia, “PageRank”, http://en.wikipedia.org/wiki/PageRank, December 2014.
• Julia Heidemann, Mathias Klier, Florian Probst, “Identifying Key Users in Online Social Networks – PageRank Based Approach”, Research Paper, University of Augsburg, University of Innsbruck
18/19
Thank you for your attention
Questions?
19/19