Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi

Timothy Spann, 2016. Future of Data – Princeton Meetup. © Hortonworks Inc. 2011 – 2016. All Rights Reserved.

Upload: timothy-spann

Post on 16-Apr-2017


Page 1: Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi

Timothy Spann, 2016

Future of Data – Princeton Meetup

Real-time Twitter Sentiment Analysis and Image Recognition

Page 2: Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi

The Future of Data: Actionable Intelligence

[Diagram labels: Internet of Anything, Data in Motion, Data at Rest, Storage, Groups 1 through 4]

Hortonworks' unique approach to data-in-motion and data-at-rest powers Actionable Intelligence

Page 3: Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi

Actionable Intelligence from Connected Data Platforms

Capturing perishable insights from data in motion

Ensuring rich, historical insights on data at rest

Necessary for modern data applications

[Diagram labels: Data in Motion (Hortonworks DataFlow), Data at Rest (Hortonworks Data Platform), Actionable Intelligence, Modern Data Applications]

Page 4: Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi

Data-in-motion: Hortonworks DataFlow Powered by Apache NiFi

Collect, conduct and curate real-time data

End-to-end security with encryption and rules

Traceability and real-time provenance

Delivers Instant, Perishable Insights

Page 5: Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi

Data-at-rest: Hortonworks Data Platform Powered by Apache Hadoop

Accumulate, Analyze, Act on All Data

Centralized Architecture for Multi-tenancy

Enterprise Operations, Governance and Security

Delivers Rich Historical Insights

Page 6: Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi

Learning More

• https://hortonworks.com/hadoop-tutorial/learning-ropes-apache-nifi/

• https://github.com/jfrazee/awesome-nifi

• https://dzone.com/articles/getting-started-with-apache-nifi-and-hdf

• https://nifi.apache.org/docs.html

• https://community.hortonworks.com/articles/4356/getting-started-with-nifi-expression-language-and.html

• https://dzone.com/articles/hdf-20-flow-processing-real-time-tweets-from-strat

• https://community.hortonworks.com/articles/64069/converting-a-large-json-file-into-csv.html

• https://community.hortonworks.com/articles/64122/incrementally-streaming-rdbms-data-to-your-hadoop.html

Page 7: Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi


http://hortonworks.com/blog/hdf-2-0-flow-processing-real-time-tweets-strata-hadoop-slack-tensorflow-phoenix-zeppelin/

Page 8: Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi

Quick Terms and Reference

FlowFile: Each piece of "User Data" (i.e., data that the user brings into NiFi for processing and distribution) is referred to as a FlowFile. A FlowFile is made up of two parts: Attributes and Content. The Content is the User Data itself. Attributes are key-value pairs that are associated with the User Data.

Processor: The Processor is the NiFi component that is responsible for creating, sending, receiving, transforming, routing, splitting, merging, and processing FlowFiles. It is the most important building block available to NiFi users to build their dataflows.

https://nifi.apache.org/docs/nifi-docs/html/getting-started.html
https://nifi.apache.org/docs/nifi-docs/html/overview.html
http://www.slideshare.net/aldrinpiri/apache-nifi-crash-course-san-jose-hadoop-summit-66967077
https://hortonworks.com/hadoop-tutorial/learning-ropes-apache-nifi/
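
The two-part FlowFile structure described on this slide can be pictured with a small Python sketch. This is only an illustration, not NiFi's actual API: the class name FlowFileSketch and the helper update_attribute are hypothetical, and the attribute names are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FlowFileSketch:
    """Toy stand-in for a NiFi FlowFile: content bytes plus string attributes."""
    content: bytes = b""
    attributes: Dict[str, str] = field(default_factory=dict)


def update_attribute(flowfile: FlowFileSketch, key: str, value: str) -> FlowFileSketch:
    """Return a copy with one attribute added or changed; content is untouched."""
    new_attributes = dict(flowfile.attributes)
    new_attributes[key] = value
    return FlowFileSketch(content=flowfile.content, attributes=new_attributes)


# The content is the user data itself; attributes are key-value metadata about it.
tweet = FlowFileSketch(content=b'{"text": "NiFi is great"}',
                       attributes={"filename": "tweet-1.json"})
tagged = update_attribute(tweet, "sentiment", "Positive")
print(tagged.attributes)
```

The point of the sketch is that a processor can change metadata (attributes) without touching the payload (content), which is roughly what an attribute-updating step in a flow does.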

Page 9: Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi

Input

GetTwitter

InvokeHTTP

Page 10: Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi

Processing

RouteOnAttribute

ExecuteStreamCommand

UpdateAttribute

Page 11: Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi

Output

PutHDFS: You need access to your Hadoop HDFS from the NiFi box, with this configuration file available: /etc/hadoop/conf/core-site.xml. Also create a directory to use, for example: hdfs dfs -mkdir /nifi-place

PutSQL: Create a connection pool and know your JDBC information.

1. Phoenix: URL=jdbc:phoenix:clusterzookeeper:2181:/hbase-unsecure, driver org.apache.phoenix.jdbc.PhoenixDriver, JAR file:///opt/demo/phoenix-client.jar, User=root, pool=2. You will need the JDBC JAR on the local file system.

2. MySQL: URL jdbc:mysql://mysqlserver-1:3306/data, driver com.mysql.jdbc.Driver, JAR /usr/share/java/mysql-connector-java.jar

PutSlack: You need to get your webhook URL from your Slack site. You can go to slack.com and get your own free room to test with. https://api.slack.com/incoming-webhooks
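
To see what a Slack incoming webhook expects before wiring up PutSlack, a minimal Python sketch (assuming Python 3) can build the same kind of request by hand. The function name build_slack_request and the webhook URL below are placeholders, not values from the talk; the only Slack-specific detail relied on is that an incoming webhook accepts a JSON body with a "text" field.

```python
import json
import urllib.request


def build_slack_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL: replace with the webhook URL issued for your own Slack site.
request = build_slack_request("https://hooks.slack.com/services/EXAMPLE",
                              "Tweet sentiment: Positive")
print(request.get_method(), json.loads(request.data.decode("utf-8")))
# To actually post the message: urllib.request.urlopen(request)
```

Nothing is sent until urlopen is called, so the sketch is safe to run while you are still waiting on a real webhook URL.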

Page 12: Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi

Local TensorFlow via Python or C++ Binary

python classify_image.py --image_file /opt/demo/dronedataold/Bebop2_20160920083655-0400.jpg
solar dish, solar collector, solar furnace (score = 0.98316)
window screen (score = 0.00196)
manhole cover (score = 0.00070)
radiator (score = 0.00041)
doormat, welcome mat (score = 0.00041)

bazel-bin/tensorflow/examples/label_image/label_image --image=/opt/demo/dronedataold/Bebop2_20160920083655-0400.jpg
tensorflow/examples/label_image/main.cc:204] solar dish (577): 0.983162
I tensorflow/examples/label_image/main.cc:204] window screen (912): 0.00196204
I tensorflow/examples/label_image/main.cc:204] manhole cover (763): 0.000704005
I tensorflow/examples/label_image/main.cc:204] radiator (571): 0.000408321
I tensorflow/examples/label_image/main.cc:204] doormat (972): 0.000406186
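
If the text printed by classify_image.py is captured by the flow, it usually has to be turned into structured values before it can be stored or routed. The sketch below, assuming Python 3, parses lines in the "label (score = value)" shape shown on this slide; the function name parse_scores and the regular expression are illustrative choices, not part of the original demo.

```python
import re
from typing import List, Tuple

# Matches lines such as "window screen (score = 0.00196)".
SCORE_LINE = re.compile(r"^(?P<label>.+?)\s*\(score\s*=\s*(?P<score>[0-9.]+)\)\s*$")


def parse_scores(output: str) -> List[Tuple[str, float]]:
    """Turn classify_image.py style output into (label, score) pairs."""
    results = []
    for line in output.splitlines():
        match = SCORE_LINE.match(line.strip())
        if match:
            results.append((match.group("label"), float(match.group("score"))))
    return results


sample = """solar dish, solar collector, solar furnace (score = 0.98316)
window screen (score = 0.00196)
manhole cover (score = 0.00070)"""

# Lines arrive in descending score order, so the first pair is the top guess.
top_label, top_score = parse_scores(sample)[0]
print(top_label, top_score)
```

Lines that do not match the pattern (log noise, blank lines) are skipped rather than raising an error, which tends to be the more forgiving choice when the input comes from another process.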

Page 13: Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi

Local Sentiment Analysis via Python

/opt/demo/sentiment/run.sh

    python /opt/demo/sentiment/sentiment.py "$@"

    from nltk.sentiment.vader import SentimentIntensityAnalyzer
    import sys

    # Score the text passed as the first command-line argument
    sid = SentimentIntensityAnalyzer()
    ss = sid.polarity_scores(sys.argv[1])
    print('Compound {0} Negative {1} Neutral {2} Positive {3}'.format(
        ss['compound'], ss['neg'], ss['neu'], ss['pos']))

or

    if ss['compound'] == 0.00:
        print('Neutral')
    elif ss['compound'] < 0.00:
        print('Negative')
    else:
        print('Positive')
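
The labeling rule on this slide can be isolated into a small function that is easy to test without NLTK installed. The first function below mirrors the slide's logic exactly. The second is a hypothetical variant that treats small scores as neutral, and its 0.05 band is an assumption for illustration, not something taken from the talk.

```python
def label_from_compound(compound: float) -> str:
    """Map a VADER compound score (range -1.0 to 1.0) to a coarse label."""
    if compound == 0.0:
        return "Neutral"
    if compound < 0.0:
        return "Negative"
    return "Positive"


def label_with_band(compound: float, band: float = 0.05) -> str:
    """Variant: scores within +/- band of zero count as neutral (band is an assumption)."""
    if abs(compound) < band:
        return "Neutral"
    return "Negative" if compound < 0.0 else "Positive"


for score in (0.0, -0.3, 0.6, 0.01):
    print(score, label_from_compound(score), label_with_band(score))
```

With the exact-zero rule, a barely positive tweet (compound 0.01) is labeled Positive; the banded variant calls it Neutral. Which behavior is preferable depends on how noisy the tweets are.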

Page 14: Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi

Installing NLTK for Python 2.7

https://pip.pypa.io/en/latest/installing/
http://www.nltk.org/install.html

    wget https://bootstrap.pypa.io/get-pip.py
    python get-pip.py
    sudo pip install -U nltk
    sudo pip install -U numpy

Installing TensorFlow for Python 2.7

Installing TensorFlow is a very difficult exercise; after getting NLTK you can start the process. You will need most of the development tools for Python, C, C++, Bazel, pip and more. A beefy machine with a lot of RAM, CPUs and GPUs would be useful.

Check out my install article for a guide: https://dzone.com/articles/deep-learning-resources
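
After the installs, it can help to confirm which packages Python can actually see before starting the flow. The sketch below assumes Python 3.4 or later (the slides target Python 2.7, where importlib.util.find_spec is not available); the helper name check_modules is hypothetical.

```python
import importlib.util
from typing import Dict, Iterable


def check_modules(names: Iterable[str]) -> Dict[str, bool]:
    """Report which top-level modules can be found, without importing them."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


status = check_modules(["nltk", "numpy", "tensorflow"])
for name, found in status.items():
    print("{0}: {1}".format(name, "installed" if found else "missing"))
```

Note that the VADER analyzer also needs its lexicon data, which is typically fetched once with nltk.download('vader_lexicon'); finding the nltk module alone does not guarantee the lexicon is present.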

Page 15: Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi


Results

Page 16: Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi


Results

Page 17: Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi

Installation

Download the binary from here: http://hortonworks.com/downloads/#dataflow
Or here: https://nifi.apache.org/download.html
Or on Mac: brew install nifi

https://nifi.apache.org/docs/nifi-docs/html/getting-started.html#starting-nifi

    bin/nifi.sh start
    bin/nifi.sh install (now it's installed as a service on Linux)

Page 18: Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi


Contact:

[email protected]/futureofdata-princeton

community.hortonworks.com/users/9304/tspann.html

Page 19: Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi

Hortonworks Community Connection

Read access for everyone, join to participate and be recognized

• Full Q&A Platform (like Stack Overflow)

• Knowledge Base Articles

• Code Samples and Repositories

Page 20: Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi

Community Engagement

Participate now at: community.hortonworks.com

4,000+ Registered Users

10,000+ Answers

15,000+ Technical Assets

One Website!