
Post on 15-Jan-2016

Page 1: Predictive Semantic Social Media Analysis David A. Ostrowski

Predictive Semantic Social Media Analysis

David A. Ostrowski

System Analytics and Environmental Sciences

Research and Advanced Engineering

Ford Motor Company

Page 2

Social media

• Influential
• Sample of the web
  – News driven
• CRM
  – Real-time
  – Less biased
• Unique opportunities for analytics

Page 3

Opportunities

• Old Model
  – Reactionary
    • Damage control
    • Inquiries
    • Confirm positive reaction
• New Model
  – Preemptive
    • Focused engagement
      – Promotions
      – Events
      – Media
    • Anticipatory

Page 4

Social Dimensions

• Describes affiliations across a network

• Values / Community

• Reinforced by relationships

• Utilize to predict purchase behavior

Page 5

Relational Learning

• ‘Birds of a Feather’
• Leverage each local network for semantic understanding
• Relational learning => social dimensions
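The ‘birds of a feather’ idea can be sketched as a neighbor vote: an account with a missing label takes the most common label among its labeled neighbors. This is a minimal illustration only, assuming a relational-neighbor style classifier; the toy graph, the label names, and the function `rn_classify` are hypothetical and are not taken from the study.

```python
from collections import Counter

def rn_classify(graph, labels, max_iters=10):
    """Fill in missing labels by majority vote among already-labeled neighbors.

    graph  : dict mapping node -> list of neighbor nodes (hypothetical toy input)
    labels : dict mapping node -> known label; absent nodes are unlabeled
    """
    inferred = dict(labels)
    for _ in range(max_iters):
        updates = {}
        for node, neighbors in graph.items():
            if node in inferred:
                continue
            votes = Counter(inferred[n] for n in neighbors if n in inferred)
            if votes:
                # Highest count wins; ties are broken alphabetically for determinism.
                updates[node] = min(votes.items(), key=lambda kv: (-kv[1], str(kv[0])))[0]
        if not updates:
            break
        inferred.update(updates)
    return inferred

# Hypothetical local network: a, b, f carry known interest labels.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["b", "c", "e"],
    "e": ["d", "f"],
    "f": ["e"],
}
known = {"a": "outdoors", "b": "outdoors", "f": "music"}
result = rn_classify(graph, known)
```

Known labels are never overwritten; only the unlabeled accounts (c, d, e here) receive inferred labels.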

Page 6

Framework Overview

• Relational learning
  – Strengthen representation
  – Support knowledge
• Unsupervised classification
  – Generation of dimensions
• Supervised classification
  – Dimensions => behavior

[Diagram: Fb identifiers linked to movies, television shows, associations, schools, political affiliations, issue positions, values, buying habits, religious views]

Page 7

Framework Overview

[Diagram labels: local network; taxonomy labels; RN classification; features; k-means cluster; social dimension; higher-level features; supervised classification; behaviors]

Page 8

Case Study One

• 4,000 Facebook identifiers
• Associations to two vehicle lines
• Question:
  – What can we extract to distinguish between these two purchase behaviors?

Page 9

Relational Learning Step

• Extracted data from FB
• Consolidated interests
• Applied the RN algorithm
• Guided by taxonomy

[Figure: Facebook accounts; accuracy (y-axis, 0 to 100) versus missing labels, normalized (x-axis, 45 to 90); series: RN, Bayes, k-Means]

Page 10

Preliminary cluster statistics

Cluster     1    2    3    4    5    6
veh1 k=3   46   39   13
veh2 k=3   21   42   36
veh1 k=4   44   16   12   26
veh2 k=4   14   27   24   32
veh1 k=5   21    8    1  0.3   45
veh2 k=5   35   22   12   15   14
veh1 k=6    7   43    6   13    9   19
veh2 k=6   20   14   16    8    9   35

Normalized differences between vehicle lines
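One plausible reading of the table above is that each row gives the percentage of a vehicle line's accounts that fall into each cluster. The slides do not state the exact normalization, so the sketch below is an assumption; the function `cluster_shares` and the toy assignments are hypothetical and do not reproduce the study data.

```python
from collections import Counter

def cluster_shares(vehicle_lines, cluster_ids, k):
    """Return {vehicle_line: [percent of that line's accounts in cluster 1..k]}."""
    counts = {}
    for line, cid in zip(vehicle_lines, cluster_ids):
        counts.setdefault(line, Counter())[cid] += 1
    shares = {}
    for line, ctr in counts.items():
        total = sum(ctr.values())
        # Counter returns 0 for clusters that contain no account of this line.
        shares[line] = [round(100.0 * ctr[c] / total, 1) for c in range(1, k + 1)]
    return shares

# Hypothetical toy data: one vehicle line and one cluster id per account.
vehicle_lines = ["veh1", "veh1", "veh1", "veh1", "veh2", "veh2", "veh2", "veh2"]
cluster_ids = [1, 1, 2, 3, 2, 2, 3, 3]
shares = cluster_shares(vehicle_lines, cluster_ids, k=3)
```

Comparing the two rows cluster by cluster then shows where the vehicle lines separate, which is the kind of difference the table reports.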

Page 11

Extracted social dimensions

• Applied feature sets to k-means (k = 3 to 6)
• Each clustering attempts to characterize the relationship between vehicle line and a social dimension (value, interest, ...)
• All classifications are considered for behavioral training
• Also considered community detection
  – Via maximization of modularity using the leading eigenvector of the modularity matrix
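The clustering step can be sketched with a plain k-means (Lloyd's algorithm) run for each k from 3 to 6. The appendix lists R for this work, so the Python below is only an illustration; the binary interest vectors, the function `kmeans`, and its parameters are hypothetical.

```python
import random

def squared_distance(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns (assignments, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the centroids are stable
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids

# Hypothetical feature vectors: one row per account, one column per interest.
points = [
    (1, 1, 0, 0), (1, 1, 0, 0), (1, 0, 0, 0), (0, 0, 1, 1),
    (0, 0, 1, 1), (0, 1, 1, 0), (0, 0, 0, 1), (1, 1, 1, 0),
]
results = {k: kmeans(points, k)[0] for k in range(3, 7)}
```

Each entry of `results` is one candidate set of social dimensions; the cluster id per account can then serve as a higher-level feature for the supervised step.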

Page 12

Applied Supervised Classification for Behavior Prediction

• Applied feature sets to three machine learning algorithms
• Simple Bayes: precision .7, recall .69
• Weightily Averaged One-Dependence Estimators (WAODE): precision .69, recall .70
• J48: precision .69, recall .70
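For reference, precision and recall for one class can be computed from true and predicted labels as below. This is a minimal sketch on hypothetical labels; the slides do not say whether the reported figures are per-class or averaged, and `precision_recall` is an illustrative helper, not part of the tools listed in the appendix.

```python
def precision_recall(y_true, y_pred, positive):
    """Precision and recall for the class named by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical true and predicted vehicle lines for six accounts.
y_true = ["veh1", "veh1", "veh1", "veh2", "veh2", "veh2"]
y_pred = ["veh1", "veh1", "veh2", "veh2", "veh2", "veh1"]
precision, recall = precision_recall(y_true, y_pred, positive="veh1")
```

Here two of the three accounts predicted as veh1 are correct, and two of the three true veh1 accounts are found, so both values are 2/3.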

Page 13

Case Study 2

• 20,000 Facebook IDs across four vehicle lines
• Relational modeling
  – Similar performance to the first case study
• Social dimensions generated for k = 3 to 7
  – Not as much separation after k=6 clustering
• Precision, recall
  – Simple Bayes: .469, .483
  – WAODE: .591, .588
  – J48: .534, .536

Page 14

Next Steps

• Institutionalization
  – Extract / define exactly what our dimensions are explaining in our data sets
• Relate to specific associations
  – Values
  – Community

Page 15

Q/A

See me for friends and neighbors discount...

[email protected]

Page 16

Appendix (software)

• ‘R’ igraph
• ‘R’ km module
• Weka
• Ruby - Watir