Structure, Tie Persistence and Event Detection in Large Phone and SMS Networks
Leman Akoglu and Bhavana Dalvi {lakoglu, bbd}@cs.cmu.edu
Carnegie Mellon University and iLab
Dataset used for this work was provided by iLab at Carnegie Mellon University.
• Tie Persistence (TP): the stability of a tie across time, measured as the number of time-ticks in which the link is observed divided by the total number P of time-ticks.
• User Perseverance (UP): the average persistence of all of a user's ties.
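The two definitions above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the record layout `(tick, u, v)`, the treatment of ties as undirected, and the function names `tie_persistence` and `user_perseverance` are all assumptions made for the example.

```python
from collections import defaultdict

def tie_persistence(observations, num_ticks):
    """Return TP per tie: (# time-ticks in which the tie is observed) / P.

    observations: iterable of (tick, u, v) records (hypothetical layout).
    Ties are treated as undirected here, which is an assumption.
    """
    ticks_seen = defaultdict(set)
    for tick, u, v in observations:
        if u == v:
            continue  # ignore self-calls in this sketch
        tie = (u, v) if u <= v else (v, u)
        ticks_seen[tie].add(tick)
    return {tie: len(ticks) / num_ticks for tie, ticks in ticks_seen.items()}

def user_perseverance(persistence):
    """Return UP per user: the average TP over all of the user's ties."""
    per_user = defaultdict(list)
    for (u, v), tp in persistence.items():
        per_user[u].append(tp)
        per_user[v].append(tp)
    return {user: sum(vals) / len(vals) for user, vals in per_user.items()}

# Toy data: P = 4 time-ticks, three users.
records = [
    (0, "a", "b"), (1, "b", "a"), (2, "a", "b"), (3, "a", "b"),
    (0, "a", "c"), (2, "c", "a"),
]
tp = tie_persistence(records, num_ticks=4)
up = user_perseverance(tp)
# tp[("a", "b")] is 1.0 (seen in all 4 ticks); tp[("a", "c")] is 0.5.
# up["a"] is 0.75, the mean of 1.0 and 0.5.
```

Repeated contacts within one time-tick count once, since TP only asks whether the link was observed in that tick.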
Tie Persistence
(2) Event Detection:
• Define a sliding window of size W (set to 5 days).
• Generate a correlation matrix C, with Cij being Pearson’s correlation between the time series of the pair (i,j) over window W.
• The principal eigenvector of C (the one with the largest eigenvalue) gives the “activity” of each node.
• Compare “activity” vectors over time by taking the dot product score Z (1 if same, 0 if perpendicular; flag for small Z).
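The event-detection steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the method as implemented for the poster: it compares each window's activity vector with that of the immediately preceding window, uses plain power iteration with a seeded random start to approximate the principal eigenvector, and treats a constant series as uncorrelated with everything. All function names are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant series is uncorrelated
    return cov / math.sqrt(var_x * var_y)

def principal_eigenvector(matrix, iterations=200, seed=0):
    """Approximate the top eigenvector of a symmetric PSD matrix.

    Power iteration from a seeded random start, so the start is
    (almost surely) not orthogonal to the principal eigenvector.
    """
    rng = random.Random(seed)
    vec = [rng.gauss(0.0, 1.0) for _ in matrix]
    norm = math.sqrt(sum(v * v for v in vec))
    vec = [v / norm for v in vec]
    for _ in range(iterations):
        nxt = [sum(c * v for c, v in zip(row, vec)) for row in matrix]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            break  # all-zero matrix: keep the current unit vector
        vec = [v / norm for v in nxt]
    return vec

def activity_vector(series, start, window):
    """Activity of each node for the window beginning at `start`."""
    chunks = [s[start:start + window] for s in series]
    corr = [[pearson(a, b) for b in chunks] for a in chunks]
    return principal_eigenvector(corr)

def window_similarity(series, window):
    """Z between consecutive windows: |dot product| of activity vectors."""
    steps = len(series[0]) - window + 1
    vectors = [activity_vector(series, t, window) for t in range(steps)]
    return [abs(sum(a * b for a, b in zip(u, v)))
            for u, v in zip(vectors, vectors[1:])]

# Toy data: one time series per node (3 nodes, 12 time-ticks).
x = [1, 3, 2, 5, 4, 6, 2, 7, 3, 8, 1, 9]
steady = [x, [2 * v + 1 for v in x], [-v for v in x]]
# Node 2 reverses its behavior halfway through.
changed = [x, [2 * v + 1 for v in x],
           [-v if t < 6 else v for t, v in enumerate(x)]]

z_steady = window_similarity(steady, window=4)
z_changed = window_similarity(changed, window=4)
```

The absolute value is taken because an eigenvector is only defined up to sign. In `z_steady` the correlation structure never changes, so every score stays close to 1; in `z_changed` the score drops for the windows that straddle the change, which is where a flag would be raised.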
Structure Analysis
Phone and SMS network
Tie Attributes
• Reciprocity (R): 1 if the tie is reciprocal in the time tick
• Topological Overlap (TO): # common neighbours
Node Attributes
• Degree (K): node degree
• Cluster Coefficient (C): # triads in which the node is involved
• User reciprocity (r): fraction of ties containing both incoming and outgoing calls
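The raw counts behind these attributes can be computed from a directed call list. The sketch below is only an illustration: the input layout `(caller, callee)` and the function name `structure_counts` are assumptions, and it returns the unnormalized counts named above (common neighbours, triads) rather than any particular normalization of topological overlap or clustering coefficient.

```python
from collections import defaultdict
from itertools import combinations

def structure_counts(calls):
    """Build per-node neighbour sets from directed (caller, callee) pairs."""
    out_nbrs, in_nbrs = defaultdict(set), defaultdict(set)
    for src, dst in calls:
        if src == dst:
            continue  # ignore self-calls in this sketch
        out_nbrs[src].add(dst)
        in_nbrs[dst].add(src)
    nodes = set(out_nbrs) | set(in_nbrs)
    nbrs = {n: out_nbrs[n] | in_nbrs[n] for n in nodes}

    # Degree (K): number of distinct contacts.
    degree = {n: len(nbrs[n]) for n in nodes}
    # User reciprocity (r): fraction of ties with calls in both directions.
    reciprocity = {n: len(out_nbrs[n] & in_nbrs[n]) / len(nbrs[n])
                   for n in nodes}
    # Number of triads (closed triangles) each node takes part in.
    triads = {n: sum(1 for a, b in combinations(sorted(nbrs[n]), 2)
                     if b in nbrs[a])
              for n in nodes}

    def common_neighbours(u, v):
        """Number of contacts shared by the endpoints of tie (u, v)."""
        return len(nbrs[u] & nbrs[v])

    return degree, reciprocity, triads, common_neighbours

calls = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "b"), ("d", "a")]
degree, reciprocity, triads, common_neighbours = structure_counts(calls)
# degree["a"] is 3 (contacts b, c, d); triads["a"] is 1 (the a-b-c triangle).
# reciprocity["b"] is 0.5: of b's two ties, only the one with a is two-way.
```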
Tie Persistence Event Detection
Methodology
(1) Feature extraction:
• Characterize nodes with 12 network-features F: degree (number of contacts), total weight (phone call duration), …
• One T×N time-series matrix per feature, with T=183 days and N=1.8M users.
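As a rough sketch of the feature-extraction step, the code below builds one T×N matrix per feature for two of the features: degree (number of contacts) and inweight (number of calls received). The record layout `(day, caller, callee)`, the function name `feature_matrices`, and the use of nested Python lists are assumptions for illustration; at the scale of the real data a sparse or array-based representation would be needed.

```python
def feature_matrices(records, num_days, users):
    """Return {feature name: T x N matrix} as nested lists.

    records: iterable of (day, caller, callee), with day in 0..num_days-1.
    users:   list fixing the column order of the N users.
    """
    column = {user: i for i, user in enumerate(users)}
    inweight = [[0] * len(users) for _ in range(num_days)]
    contacts = [[set() for _ in users] for _ in range(num_days)]
    for day, caller, callee in records:
        inweight[day][column[callee]] += 1          # calls received that day
        contacts[day][column[caller]].add(callee)   # distinct contacts that day
        contacts[day][column[callee]].add(caller)
    degree = [[len(c) for c in row] for row in contacts]
    return {"inweight": inweight, "degree": degree}

records = [(0, "a", "b"), (0, "c", "b"), (1, "b", "a"), (1, "b", "a")]
features = feature_matrices(records, num_days=2, users=["a", "b", "c"])
# features["inweight"] is [[0, 2, 0], [2, 0, 0]]
# features["degree"]   is [[1, 2, 1], [1, 1, 0]]
```

Each column of a matrix is the time series of one user for that feature, which is the input the event-detection step works on.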
(left) Z score vs. time with W=5 and F=inweight (number of calls received). The top 10 days with the largest Z score are highlighted with red bars. (middle) u(t) vs. r(t-1) for each node at T=Dec 26th. The top 5 nodes with the largest change are marked with red stars. (right) inweight vs. time for the top 5 marked nodes; notice the change in calling behavior during the Christmas week.
• How to predict whether a link will persist in the future?
• Which link and node attributes are important in prediction?
Around 2M users, 50M edges, 500M phone calls/SMS
6 months of data
Tie strength based on (a) # SMS (b) # Phone calls (c) Duration of phone calls