user modeling on demographic attributes in big mobile social networks · 2020-03-05 · user...
TRANSCRIPT
User Modeling on Demographic Attributes in Big Mobile Social Networks
Yang Yang
Northwestern University
User Modeling on Demographic Attributes in Big Mobile Social Networks.
Yuxiao Dong, Nitesh V. Chawla, Jie Tang, Yang Yang, Yang Yang.
ACM TOIS 2017
Inferring User Demographics and Social Strategies in Mobile Social Networks.
Yuxiao Dong, Yang Yang, Jie Tang, Yang Yang, Nitesh V. Chawla.
ACM KDD 2014
The Era of Digitally Networked World
http://wearesocial.com/uk/blog/2018/01/digital-in-2018-global-overview
As of 2018, there were 5.135 billion mobile subscriptions worldwide, reflecting high global penetration.
Users average 22 calls, 23 messages, and 110 status checks per day[2].
1. http://www.dailymail.co.uk/sciencetech/article-2449632/How-check-phone-The-average-person-does-110-times-DAY-6-seconds-evening.html
2. https://www.enisa.europa.eu/media/press-releases/using-national-roaming-to-mitigate-mobile-network-outages201d-new-report-by-eu-cyber-security-agency-enisa
Big Mobile Network Data
♣ A large nationwide mobile communication dataset
• Over 7 million users: 55% male / 45% female
• Over 1 billion call & message records between Aug. and Sep. 2008
• Reciprocal, undirected, and weighted networks: CALL & SMS
Figure: Europe and Mobile (CALL) population pyramids, with groups annotated as overrepresented or underrepresented.
User Profiling on Demographics
Human Social Needs & Social Strategies
• Human needs are defined according to the existential categories of
– being, having, doing, and interacting[1].
• Two basic social needs are to[2]
– Meet new people
– Strengthen existing relationships
• Social strategies are used by people to meet social needs[1,2,3].
– What are the social strategies of people with different demographics?
– Demographics: gender, age, social status, etc.
1. http://en.wikipedia.org/wiki/Fundamental_human_needs
2. M.J. Piskorski. Social strategies that work. Harvard Business Review. Nov. 2011.
3. V. Palchykov, K. Kaski, J. Kertesz, A.-L. Barabasi, R. I. M. Dunbar. Sex differences in intimate relationships. Scientific Reports 2012.
How do people of different gender and age
connect & interact with each other?
Micro: Ego, Social Tie, & Triad
Ego Networks
[Figure; axis label: clustering coefficient]
♣ Younger people are active in broadening their social circles, while
older people tend to maintain smaller but more closed connections.
Results shown are for the CALL network; similar observations hold for SMS.
How many different triadic social circles do we have?
♣ People expand both same-gender and opposite-gender social groups.
Results shown are for the CALL network; similar observations hold for SMS.
Demographic Triad Distribution
♣ Opposite-gender social groups disappear with age.
♣ Same-gender social groups last for a lifetime.
Results shown are for the CALL network; similar observations hold for SMS.
Null Model
♣ Users’ gender and age are randomly shuffled
♣ Randomly shuffle 10,000 times
♣ $x$: empirical result from real data
♣ $\tilde{x}$: shuffled results
♣ $\mu(\tilde{x})$: the average of shuffled data
♣ $\sigma(\tilde{x})$: the standard deviation of shuffled data
♣ $z(x)$: z-score, $z(x) = \dfrac{x - \mu(\tilde{x})}{\sigma(\tilde{x})}$
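The null-model procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `null_model_zscore` and `same_gender_edges`, the toy graph, and the choice of "number of same-gender edges" as the statistic are all assumptions made for the example. The slide uses 10,000 shuffles; the sketch uses fewer to stay fast.

```python
import random
import statistics


def null_model_zscore(edges, labels, statistic, n_shuffles=1000, seed=0):
    """Compare an empirical statistic with its label-shuffled distribution.

    Returns (x, mu, sigma, z), where x is the empirical value and mu/sigma
    are the mean and standard deviation over the shuffled label assignments.
    """
    rng = random.Random(seed)
    x = statistic(edges, labels)
    nodes = list(labels)
    values = [labels[n] for n in nodes]
    shuffled_stats = []
    for _ in range(n_shuffles):
        rng.shuffle(values)  # permute labels, keep the network fixed
        shuffled_stats.append(statistic(edges, dict(zip(nodes, values))))
    mu = statistics.mean(shuffled_stats)
    sigma = statistics.pstdev(shuffled_stats)
    z = (x - mu) / sigma if sigma > 0 else float("nan")
    return x, mu, sigma, z


def same_gender_edges(edges, gender):
    """Hypothetical statistic: number of edges joining same-gender users."""
    return sum(1 for u, v in edges if gender[u] == gender[v])


# Toy network: two same-gender triangles joined by one cross-gender edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
gender = {0: "F", 1: "F", 2: "F", 3: "M", 4: "M", 5: "M"}
x, mu, sigma, z = null_model_zscore(edges, gender, same_gender_edges, n_shuffles=2000)
print(x, round(mu, 2), round(z, 2))
```

A positive z here means same-gender edges occur more often in the toy graph than under random label assignment; the slides apply the same idea to demographic triads.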
Demographic Triad Distribution
Results shown are for the CALL network; similar observations hold for SMS.
♣ $x$: empirical result from real data
♣ $\mu(\tilde{x})$: the average of shuffled data
♣ $z(x)$: z-score
o z > 3.3: overrepresented
o z < -3.3: underrepresented
♣ The results are statistically significant
How frequently do you call your mom vs. your significant other?
♣ Interactions between young girls and boys are much more frequent
than those between two girls or two boys.
Results shown are for the CALL network; similar observations hold for SMS.
Color: number of calls per month
Social Tie Strength
♣ Cross-generation interactions between two females are more frequent than those
between two males or one male and one female.
Results shown are for the CALL network; similar observations hold for SMS.
Examples: female-female (e.g., mom--daughter); female-male (e.g., mom--son, dad--daughter); male-male (e.g., dad--son)
Social Strategies across the Lifespan
Younger → Older: more stable, fewer friends
Younger: more friends; same-gender and opposite-gender circles
Older: fewer friends; only same-gender, closed circles
Can we know who we are based on
our social networks?
Network Mining and Learning Paradigm
Network Mining Tasks:
♣ node attribute inference
♣ community detection
♣ similarity search
♣ link prediction
♣ social recommendation
♣ …
Node Centralities:
o degree
o betweenness
o clustering coefficient
o PageRank
o eigenvector
o …
Pipeline: feature engineering → hand-crafted feature matrix → machine learning models
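As a rough illustration of the feature-engineering step, the sketch below builds a small hand-crafted feature matrix with two of the listed centralities (degree and clustering coefficient) using only the standard library. The function names and the toy edge list are assumptions for the example; a real pipeline would add more columns (betweenness, PageRank, and so on) and feed the matrix to a classifier.

```python
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering_coefficient(adj, node):
    """Fraction of pairs of a node's neighbors that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def feature_matrix(edges):
    """One row per node: [degree, clustering coefficient]."""
    adj = build_adjacency(edges)
    nodes = sorted(adj)
    rows = [[float(len(adj[n])), clustering_coefficient(adj, n)] for n in nodes]
    return nodes, rows


# Toy network: a triangle a-b-c with a pendant node d attached to c.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
nodes, X = feature_matrix(edges)
for n, row in zip(nodes, X):
    print(n, row)
```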
Predicting User Demographic Attributes
♣ Infer Users’ Gender Y and Age Z Separately.
o Model correlations between gender Y and attributes X;
o Model correlations between age Z and attributes X;
[Figure labels: bag of nodes; bag of labels]
Demographic Prediction
♣ Infer Users’ Gender Y and Age Z Simultaneously.
o Model correlations between gender Y and attributes X, Network G and Y;
o Model correlations between age Z and attributes X, Network G and Z;
o Model interrelations between Y and Z;
WhoAmI Method
Random variable Y: Gender
Random variable Z: Age
Attribute factor f(): modeling traditional features X and the interrelations between gender and age
Dyadic factor g(): modeling social strategies on social ties
Triadic factor h(): modeling social strategies on social triads
Joint Distribution:
Code is available at: http://arnetminer.org/demographic
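The equation for the joint distribution is not preserved in this transcript. As a sketch only, assuming the usual factor-graph form (a normalized product of the three factor types named above), it could be written as follows. The argument lists of g and h, the index sets, and the normalization constant W are assumptions for illustration, not taken from the slide.

```latex
% Assumed form: product of attribute, dyadic, and triadic factors.
% V: users, E: social ties, C: social triads, W: normalization constant.
P(Y, Z \mid G, X) \;=\; \frac{1}{W}
  \prod_{v_i \in V} f(y_i, z_i, \mathbf{x}_i)
  \prod_{e_{ij} \in E} g(y_i, z_i, y_j, z_j)
  \prod_{c_{ijk} \in C} h(y_i, z_i, y_j, z_j, y_k, z_k)
```

Under this reading, f ties each user's gender and age to that user's own features, while g and h couple the labels of users who share a tie or a triad.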
WhoAmI: Objective Function
Objective function:
Model learning: gradient descent
Cycles in the graph? Use Loopy Belief Propagation:
K. P. Murphy, Y. Weiss, M. I. Jordan. Loopy Belief Propagation for Approximate Inference: An Empirical Study. In UAI’99
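To make the role of loopy belief propagation concrete, here is a minimal sum-product sketch on a tiny binary pairwise model with a cycle. It is a generic textbook-style illustration, not the WhoAmI implementation: the function name `loopy_bp`, the shared symmetric pairwise table, and the toy potentials are assumptions, and the real model also has triadic factors and two label variables per node.

```python
import math


def loopy_bp(unary, edges, pairwise, n_iters=50):
    """Sum-product loopy BP for binary variables on an undirected graph.

    unary:    {node: [phi(0), phi(1)]} local potentials
    edges:    list of undirected (u, v) pairs
    pairwise: symmetric 2x2 table psi[s_u][s_v] shared by all edges (an assumption)
    Returns approximate marginals {node: [b(0), b(1)]}.
    """
    nbrs = {n: [] for n in unary}
    msgs = {}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
        msgs[(u, v)] = [1.0, 1.0]  # message from u to v, indexed by v's state
        msgs[(v, u)] = [1.0, 1.0]
    for _ in range(n_iters):
        new = {}
        for (u, v) in msgs:
            out = []
            for sv in (0, 1):
                total = 0.0
                for su in (0, 1):
                    incoming = 1.0
                    for w in nbrs[u]:
                        if w != v:  # exclude the message coming back from v
                            incoming *= msgs[(w, u)][su]
                    total += unary[u][su] * pairwise[su][sv] * incoming
                out.append(total)
            norm = sum(out)
            new[(u, v)] = [o / norm for o in out]
        msgs = new  # synchronous update of all messages
    beliefs = {}
    for n in unary:
        b = [unary[n][s] * math.prod(msgs[(w, n)][s] for w in nbrs[n]) for s in (0, 1)]
        norm = sum(b)
        beliefs[n] = [x / norm for x in b]
    return beliefs


# Toy triangle: node "a" has a strong prior for state 0; ties favor equal states.
unary = {"a": [0.9, 0.1], "b": [0.5, 0.5], "c": [0.5, 0.5]}
edges = [("a", "b"), ("b", "c"), ("a", "c")]
pairwise = [[2.0, 1.0], [1.0, 2.0]]
beliefs = loopy_bp(unary, edges, pairwise)
print({n: [round(p, 3) for p in b] for n, b in beliefs.items()})
```

Because the triangle contains a cycle, the resulting marginals are approximations; the unlabeled nodes are pulled toward the state favored by their labeled neighbor.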
Experiments: Feature Definition
♣ Given one node v and its ego network:
• Individual feature:
• Individual attribute: degree, neighbor connectivity, clustering coefficient, embeddedness and weighted degree.
• Friend feature:
• Friend attribute: # of connections to female/male, young/young-adult/middle-age/senior friends (from labeled friends).
• Dyadic factor: both labeled and unlabeled friends for social tie structures in v’s ego network.
• Circle feature:
• Circle attribute: # of demographic triads, i.e., v-FF, v-FM, v-MM; v-AA, v-AB, v-AC, v-AD, v-BB, v-BC, v-BD, v-CC, v-CD, v-DD. (A/B/C/D denote young/young-adult/middle-age/senior.)
• Triadic factor: both labeled and unlabeled friends for social triad structures in v’s ego network.
♣ LRC/SVM/NB/RF/BAG/RBF:
• Individual/Friend/Circle Attributes
♣ FGM/DFG
• Individual/Friend/Circle Attributes
• Structure feature: Dyadic factors
• Structure feature: Triadic factors
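The circle attributes described above can be computed with a short routine. The sketch below counts triads in v's ego network by the labels of the two friends, producing keys such as v-FM. It assumes that a triad means a closed triangle (the two friends are also connected) and that only labeled friends are counted; the function name and toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def demographic_triads(adj, labels, v):
    """Count closed triads (v, a, b) in v's ego network by the friends' labels."""
    counts = Counter()
    for a, b in combinations(sorted(adj[v]), 2):
        # Assumption: a triad requires a tie between the two friends,
        # and both friends must have a known label.
        if b in adj[a] and a in labels and b in labels:
            key = "v-" + "".join(sorted((labels[a], labels[b])))
            counts[key] += 1
    return counts


# Toy ego network of v with friends p, q, r; ties p-q and q-r close two triads.
adj = {
    "v": {"p", "q", "r"},
    "p": {"v", "q"},
    "q": {"v", "p", "r"},
    "r": {"v", "q"},
}
gender = {"p": "F", "q": "M", "r": "M"}
counts = demographic_triads(adj, gender, "v")
print(dict(counts))
```

The same function works for age groups by passing a label dictionary with values A/B/C/D instead of F/M.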
WhoAmI: Experiments
♣ Data: mobile phone users
o >1.09 million users in CALL
o >304 thousand users in SMS
o 50% as training data
o 50% as test data
♣ Evaluation Metrics:
o Weighted Precision
o Weighted Recall
o Weighted F1 Measure
o Accuracy
♣ Baselines:
o LRC: Logistic Regression
o SVM: Support Vector Machine
o NB: Naïve Bayes
o RF: Random Forest
o BAG: Bagged Decision Tree
o RBF: Gaussian Radial Basis NN
o FGM: Factor Graph Model
o DFG (WhoAmI), the proposed method
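The slide does not define how the weighted metrics are averaged. A common convention, assumed here, is to compute precision, recall, and F1 per class and weight each by that class's share of the true labels. The sketch below follows that convention; `weighted_prf` and the toy labels are illustrative only.

```python
from collections import Counter


def weighted_prf(y_true, y_pred):
    """Support-weighted precision, recall, F1, plus plain accuracy."""
    support = Counter(y_true)
    n = len(y_true)
    wp = wr = wf = 0.0
    for c, s in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        pred_c = sum(1 for p in y_pred if p == c)
        prec = tp / pred_c if pred_c else 0.0
        rec = tp / s
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        w = s / n  # weight = class share among true labels
        wp += w * prec
        wr += w * rec
        wf += w * f1
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    return wp, wr, wf, accuracy


y_true = ["F", "F", "M", "M", "M"]
y_pred = ["F", "M", "M", "M", "F"]
wp, wr, wf, acc = weighted_prf(y_true, y_pred)
print(wp, wr, wf, acc)
```

Under this weighting, weighted recall always equals accuracy, so the two metrics carry the same information.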
Demographic Predictability
♣ Predictability of User Demographic Profiles
o The proposed WhoAmI (DFG) outperforms baselines by up to 10% in
terms of F1-Measure.
o We can infer 80% of users’ gender from the CALL network
o We can infer 73% of users’ age from the SMS network
o Phone call behavior reveals more about users’ gender than text messaging does
o Text messaging behavior reveals more about users’ age than phone calls do
Application 1: Postpaid → Prepaid
♣ Postpaid mobile users are required to create an account by providing
detailed demographic information (e.g., name, age, gender, etc.).
♣ Prepaid services (pay-as-you-go) allow users to be anonymous, with no need to provide any user-specific information. Prepaid users account for:
o 95% of mobile users in India
o 80% of mobile users in Latin America
o 70% of mobile users in China
o 65% of mobile users in Europe
o 33% of mobile users in the United States
♣ Train the model on postpaid users and infer prepaid users’ demographics
Application 1: Postpaid → Prepaid
Figure panels: CALL Gender, CALL Age, SMS Gender, SMS Age
♣ Vary the training ratio to match the proportion of postpaid users in each country
♣ Train the model on postpaid users and infer prepaid users’ demographics
Application 2: Coupled Networks
[Figure: timestamped records (2015.08.08 10:30, 10:48, 11:29, 11:01, ……, 2016.01.01 00:00, ……)]
Coupled Demographic Prediction
Coupled Network Data
♣ Real-world large mobile communication data
• Over 1 billion call & message records between Aug. and Sep. 2008
• Undirected and weighted networks
• Three major mobile operators Ea, Eb, Ec
k: average degree
cc: clustering coefficient
ac: associative coefficient
WhoAmI: Distributed Coupled Learning
MPI-based
Coupled Demographic Prediction
♣ Train the model on my own users and infer the demographics of my competitors’ users.
♣ Infer the gender of 73-79% and the age of 66-70% of a competitor’s users.
♣ Discover the evolution of social strategies across the lifespan
♣ Propose a probabilistic graphical model, the Multi-Label Factor Graph (WhoAmI), for node attribute prediction in networks
♣ Demonstrate the predictability of users’ gender and age from mobile communication networks, with two applications in telecommunications.
Thank you!
User Modeling on Demographic Attributes in Big Mobile Social Networks.
Yuxiao Dong, Nitesh V. Chawla, Jie Tang, Yang Yang, Yang Yang.
In ACM Transactions on Information Systems, 2017 (TOIS 2017).
Inferring User Demographics and Social Strategies in Mobile Social Networks.
Yuxiao Dong, Yang Yang, Jie Tang, Yang Yang, Nitesh V. Chawla.
In ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2014 (KDD’14).
Code is available at: http://arnetminer.org/demographic