final_project


Upload: ashwin-dinoriya

Post on 13-Apr-2017


TRANSCRIPT

Page 1: Final_Project
Page 2: Final_Project

Market Idea

Consider a client who wants to increase sales by targeting customers with promotional offers. We would provide them with Twitter analytics that identify the 10 most influential people within a given range who can influence those target customers.

Page 3: Final_Project

Tools Used

Page 4: Final_Project

Data Extraction and Visualization

Page 5: Final_Project

Data Extraction and Visualization

Page 6: Final_Project

Data Extraction and Visualization

Pre-processing

Page 7: Final_Project

Visualizations

Cluster Dendrogram

Page 8: Final_Project

Influential Users

Page 9: Final_Project

Sentiment Analysis

Page 10: Final_Project

Machine Learning

• Experimenting with Twitter-based sentiment analysis

• A lexicon-based approach has high precision but low coverage

• A data-driven machine learning approach

• Learns from a corpus of annotated tweets
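
To illustrate the precision versus coverage trade-off of a lexicon-based approach, here is a minimal Python sketch. The lexicon, the sample tweets, and the function name `lexicon_sentiment` are all hypothetical and not taken from the slides; a tweet with no lexicon word simply gets no label, which is what low coverage means in practice.

```python
import re

# Hypothetical toy lexicon; a real one would hold thousands of scored words.
LEXICON = {"love": 1, "great": 1, "fresh": 1, "bad": -1, "sick": -1, "worst": -1}

def lexicon_sentiment(tweet):
    """Return 'positive', 'negative', 'neutral', or None if no lexicon word matches."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    hits = [LEXICON[t] for t in tokens if t in LEXICON]
    if not hits:
        return None  # tweet is not covered by the lexicon
    score = sum(hits)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

# Hypothetical sample tweets, for illustration only.
tweets = [
    "I love the fresh burrito bowl",
    "Worst line ever, got sick",
    "lunch time",
    "omg that guac tho",
]
labels = [lexicon_sentiment(t) for t in tweets]

# Coverage: share of tweets that received any label at all.
coverage = sum(label is not None for label in labels) / len(labels)
```

A classifier trained on annotated tweets can still label the uncovered tweets, which is the motivation for the data-driven approach.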

Page 11: Final_Project

Machine Learning Steps

Step 1: Collect data

Step 2: Preprocess text data in R

Step 3: Feature hashing and filter-based feature selection

Step 4: Split the data into train and test sets

Step 5: Train the model for prediction

Step 6: Evaluate model performance
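
Steps 3 through 6 can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the tooling used in the project: the annotated tweets are hypothetical, the bucket count and learning rate are arbitrary, filter-based feature selection is omitted, and a small hand-written logistic regression stands in for the trained model.

```python
import hashlib
import math
import random

N_BUCKETS = 64  # assumed size of the hashed feature space

def hash_features(text, n_buckets=N_BUCKETS):
    """Step 3: map tokens to bucket counts with the hashing trick."""
    counts = {}
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).hexdigest()
        bucket = int(digest, 16) % n_buckets
        counts[bucket] = counts.get(bucket, 0) + 1
    return counts

def predict_proba(weights, bias, features):
    z = bias + sum(weights[b] * c for b, c in features.items())
    z = max(-30.0, min(30.0, z))  # clamp so exp() stays in range
    return 1.0 / (1.0 + math.exp(-z))

def train(rows, epochs=100, lr=0.1):
    """Step 5: logistic regression fitted by stochastic gradient descent."""
    weights, bias = [0.0] * N_BUCKETS, 0.0
    for _ in range(epochs):
        for features, label in rows:
            error = predict_proba(weights, bias, features) - label
            bias -= lr * error
            for b, c in features.items():
                weights[b] -= lr * error * c
    return weights, bias

def accuracy(weights, bias, rows):
    """Step 6: fraction of rows whose predicted class matches the label."""
    correct = sum((predict_proba(weights, bias, f) >= 0.5) == bool(y) for f, y in rows)
    return correct / len(rows)

# Hypothetical annotated tweets: 1 = positive, 0 = negative.
corpus = [
    ("love this burrito", 1), ("great guac today", 1),
    ("fresh and tasty bowl", 1), ("best lunch ever", 1),
    ("worst service ever", 0), ("got sick after eating", 0),
    ("cold and bad rice", 0), ("terrible long line", 0),
]
rows = [(hash_features(text), label) for text, label in corpus]

# Step 4: shuffle with a fixed seed, then split 75% train / 25% test.
random.Random(0).shuffle(rows)
cut = int(0.75 * len(rows))
train_rows, test_rows = rows[:cut], rows[cut:]

weights, bias = train(train_rows)
test_accuracy = accuracy(weights, bias, test_rows)
```

With only eight toy tweets the resulting accuracy is not meaningful; the point is the order of operations, in particular that the split happens before training so the evaluation uses unseen tweets.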

Page 12: Final_Project

Boosted Decision Tree

Logistic Regression

Support Vector Machine

Validation Score

Page 13: Final_Project

Top Influential People

[Figure: Top 10 By Number of Followers. Bar chart of Followers Count (axis 0K to 80K) by Screen Name.]

Screen names: RedEyeInc, alexmaxham, SchadenJake, burritojustice, MikeIsaac, treyturnerband, FoodPorn_MX, beastlyBETCH19, harrisj, ayooocam

Bar labels (Followers Count): 32,545; 76,388; 13,719; 61,953; 37,113; 14,183; 19,334; 16,364; 66,022; 74,091

Sum of Followers Count for each Screen Name. Color shows details about Screen Name. The data is filtered on Action (ScreenName), which keeps 855 members. The view is filtered on Screen Name, which keeps 10 of 855 members.
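
The top-N filter behind this chart can be approximated in a few lines of Python. This is a sketch only: the record layout, the function name `top_by_followers`, and the sample users are hypothetical. It also keeps the largest follower count seen per screen name rather than a sum, so that a user who appears in several tweets is not counted more than once.

```python
def top_by_followers(users, n=10):
    """Return the n (screen_name, followers_count) pairs with the most followers.

    users: iterable of dicts with 'screen_name' and 'followers_count' keys
    (an assumed layout, one dict per collected tweet).
    """
    best = {}
    for user in users:
        name = user["screen_name"]
        # Keep the largest count seen for each screen name.
        best[name] = max(best.get(name, 0), user["followers_count"])
    return sorted(best.items(), key=lambda item: item[1], reverse=True)[:n]

# Hypothetical records, for illustration only.
sample_users = [
    {"screen_name": "user_a", "followers_count": 120},
    {"screen_name": "user_b", "followers_count": 900},
    {"screen_name": "user_a", "followers_count": 125},
    {"screen_name": "user_c", "followers_count": 40},
]
top_two = top_by_followers(sample_users, n=2)
```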

Page 14: Final_Project

Followers vs Status Count

[Figure: All Followers vs Statuses. Statuses Count (0K to 1200K) plotted against Followers Count (0K to 80K).]

[Figure: Top 10 Followers vs Statuses. Statuses Count (0K to 1200K) plotted against Number of Followers (0K to 80K), colored by Screen Name and limited to the top 10 by SUM([FollowersCount]): RedEyeInc, alexmaxham, SchadenJake, burritojustice, MikeIsaac, treyturnerband, FoodPorn_MX, beastlyBETCH19, harrisj, ayooocam.]

Page 15: Final_Project

[Figure: Qdoba vs Chipotle Weekly Tweets. Bar chart of tweet counts (Value axis 0 to 2400) for ChipotleCompany and QdobaCompany on each Created day, 23 to 29.]

Bar labels: 2,156; 217; 365; 528; 635; 744; 1,021; 198; 2,245; 487; 1,242; 266; 1,827; 700

Chipotle Company and Qdoba Company for each Created Day. Color shows details about Chipotle Company and Qdoba Company.

Page 16: Final_Project

Retweet Analysis

[Figure: Weekly Tweet BoxPlot. Box plot of Retweet Count (axis 0K to 18K) for each Created day, 23 to 29.]

Retweet Count for each Created Day. Color shows details about Screen Name. Details are shown for Screen Name.
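
The numbers a box plot draws for each day (minimum, quartiles, maximum) can be computed with Python's standard library. This is a sketch with hypothetical retweet counts; the names `box_stats` and `retweets_by_day` are illustrative and not from the slides.

```python
import statistics

def box_stats(values):
    """Five-number summary of a list of retweet counts."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median, "q3": q3, "max": max(values)}

# Hypothetical retweet counts grouped by the day each tweet was created.
retweets_by_day = {
    23: [0, 2, 4, 6, 8],
    24: [1, 1, 3, 10, 50],
}
summary = {day: box_stats(counts) for day, counts in retweets_by_day.items()}
```

Retweet counts are typically very skewed, so the median and quartiles describe a typical tweet better than the mean does.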

Page 17: Final_Project

Implementation Using

Amazon Cloud Formation Stack

Amazon EMR Clusters

Mapper in Python

Master Slave

Reduce

Output
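
A streaming-style mapper and reducer in Python might look like the sketch below. The slides do not show the actual job, so everything here is an assumption: the tab-separated input layout (`day`, `brand`, `text`), the `brand|day` key, and the function names. On a cluster the two functions would live in separate scripts that read standard input; `run_local` only simulates the sort that the framework performs between the map and reduce phases.

```python
from itertools import groupby

def mapper(lines):
    """Emit 'brand|day<TAB>1' for each input line 'day<TAB>brand<TAB>text'."""
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 3:
            continue  # skip malformed records
        day, brand = parts[0], parts[1]
        yield f"{brand}|{day}\t1"

def reducer(sorted_lines):
    """Sum the counts for each key; input must already be sorted by key."""
    pairs = (line.rstrip("\n").split("\t") for line in sorted_lines)
    for key, group in groupby(pairs, key=lambda pair: pair[0]):
        total = sum(int(value) for _, value in group)
        yield f"{key}\t{total}"

def run_local(lines):
    """Simulate map, shuffle/sort, and reduce in a single process."""
    return list(reducer(sorted(mapper(lines))))

# Hypothetical input records, for illustration only.
records = [
    "23\tChipotle\tlunch",
    "23\tQdoba\tqueso",
    "23\tChipotle\tbowl",
    "24\tQdoba\tburrito",
    "broken line",
]
daily_counts = run_local(records)
```

Keying on both brand and day yields one count per bar of a per-day comparison chart.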

Page 18: Final_Project

Implementation Using

OUTPUT:

Page 19: Final_Project

Power of Sentiments

Page 20: Final_Project

Power of Sentiments

Page 21: Final_Project

#ThankYou