Predicting Consumer Behaviour via Hadoop

Slide ‹#› © 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com Predicting Consumer Behaviour via Hadoop

Upload: skillspeed

Post on 13-Jan-2017

Category:

Technology


TRANSCRIPT

Page 1: Predicting Consumer Behaviour via Hadoop

Predicting Consumer Behaviour via Hadoop

Page 2: Predicting Consumer Behaviour via Hadoop

Session Objectives

In this session you will understand

ᗍ Big Data and Hadoop
ᗍ HDFS
ᗍ MapReduce with examples and scenarios
ᗍ Predictive Analytics and its process
ᗍ Three Pillars of Predictive Analytics
ᗍ Applications of Predictive Analytics

Page 3: Predicting Consumer Behaviour via Hadoop

Big Data and its Challenges

Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.

Systems and enterprises generate huge amounts of data, from terabytes to even petabytes of information.

It is very difficult to manage data at this scale.

Page 4: Predicting Consumer Behaviour via Hadoop

Who Generates Big Data?

Have you ever wondered how Google, Facebook or LinkedIn manage to store and utilize such huge volumes of data?

Today, managing such Big Data is becoming a problem for all of us.

Page 5: Predicting Consumer Behaviour via Hadoop

Hadoop and its Characteristics

Apache Hadoop is a framework that allows the distributed processing of large data sets across clusters of commodity computers using a simple programming model.

It is an open-source data management technology with scale-out storage and distributed processing.

Hadoop Characteristics

ᗍ Flexible
ᗍ Reliable
ᗍ Economical
ᗍ Scalable

Page 6: Predicting Consumer Behaviour via Hadoop

Hadoop Ecosystem

[Diagram: layers of the Hadoop ecosystem]

ᗍ Flume: import of unstructured or semi-structured data
ᗍ Sqoop: import or export of structured data
ᗍ Apache Oozie: workflow
ᗍ Pig Latin: data analysis
ᗍ Hive: DW system
ᗍ MapReduce framework
ᗍ HBase
ᗍ Other YARN frameworks (MPI, Giraph)
ᗍ YARN: cluster resource management
ᗍ HDFS (Hadoop Distributed File System)

Page 7: Predicting Consumer Behaviour via Hadoop

Predictive Analysis

[Diagram: three stages of predictive analysis]

ᗍ Capture: data (sources, types, forms)
ᗍ Predict: data mining, text mining, statistical analytics
ᗍ Act: act on the model

Page 8: Predicting Consumer Behaviour via Hadoop

Why Predictive Analytics?

ᗍ Predictive analytics automatically synthesizes big data, mathematical sciences, business rules, and machine learning to make predictions and then suggests decision options to take advantage of a future opportunity

ᗍ The purpose of predictive analytics is to tell you what will happen in the future

ᗍ Predictive analytics is a branch of the data mining process

ᗍ An example of using predictive analytics is optimizing customer relationship management systems 

Page 9: Predicting Consumer Behaviour via Hadoop

Predictive Analytics – Process

[Diagram: process cycle with the following steps]

ᗍ Extract data needed
ᗍ Draw Hypothesis
ᗍ Check the data fits the tool
ᗍ Run Analysis
ᗍ Draw Conclusions
ᗍ Implement Results
ᗍ Monitor Progress

Page 10: Predicting Consumer Behaviour via Hadoop

Three Pillars of Predictive Analytics

Predictive Operational Analytics
ᗍ Plan
ᗍ Manage
ᗍ Maximize

Predictive Threat and Fraud Analytics
ᗍ Monitor
ᗍ Detect
ᗍ Control

Predictive Customer Analytics
ᗍ Acquire
ᗍ Grow
ᗍ Retain

Page 11: Predicting Consumer Behaviour via Hadoop

Most Common Predictive Modelling Tasks

ᗍ Classification
ᗍ Clustering
ᗍ Association
ᗍ Detection
ᗍ Estimation and Time Series
ᗍ Link Analysis
ᗍ Web and Text Mining

Page 12: Predicting Consumer Behaviour via Hadoop

Applications of Predictive Analytics

ᗍ Analytical customer relationship management (CRM)
ᗍ Clinical decision support systems
ᗍ Customer retention
ᗍ Direct marketing
ᗍ Fraud detection
ᗍ Risk management
ᗍ Underwriting

Page 13: Predicting Consumer Behaviour via Hadoop

What is Predictive Analytics all about?

Predictive analytics is really about solving problems with data

Predictive analytics is the technology that learns from experience (data) to predict the future behaviour of individuals in order to drive better decisions

Predictive analytics helps to connect data to effective action by drawing reliable conclusions about current conditions and future events

It enables businesses to use predictive models to exploit patterns found in historical data to identify potential risks and opportunities before they occur
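As a rough illustration of exploiting patterns in historical data, the sketch below estimates how often each customer segment bought again in the past and uses that rate to flag likely repeat buyers. The records, the segment names and the `threshold` value are all invented for illustration; a real predictive model would use far richer features and a proper learning algorithm.

```python
from collections import defaultdict

# Hypothetical historical records: (customer_segment, bought_again)
HISTORY = [
    ("student", True), ("student", False), ("student", True),
    ("retiree", False), ("retiree", False), ("retiree", True),
    ("professional", True), ("professional", True),
]

def repeat_purchase_rates(history):
    """Return the observed repeat-purchase rate for each segment."""
    bought = defaultdict(int)
    total = defaultdict(int)
    for segment, bought_again in history:
        total[segment] += 1
        if bought_again:
            bought[segment] += 1
    return {segment: bought[segment] / total[segment] for segment in total}

def predict_repeat_buyer(segment, rates, threshold=0.5):
    """Predict True when the segment's historical rate exceeds the threshold.

    Segments with no history are treated as unlikely repeat buyers.
    """
    return rates.get(segment, 0.0) > threshold

rates = repeat_purchase_rates(HISTORY)
for segment in sorted(rates):
    print(segment, predict_repeat_buyer(segment, rates))
```

The prediction here is only a lookup of past frequencies, which is enough to show the idea of acting on a pattern before the next purchase decision is made.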

Page 14: Predicting Consumer Behaviour via Hadoop

Map Reduce – Scenario

Let us consider a real-life scenario to understand the importance of “Map Reduce” in Hadoop

Suppose you are handling a project which has x tasks and takes one resource 100 hours to complete

1 resource x 100 hours = 100 hours

100 hours / 10 resources = 10 hours
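The arithmetic in this scenario can be checked with a few lines of Python. This is a minimal sketch that assumes the work divides into independent tasks and that coordination costs nothing; the function names and the choice of 100 one-hour tasks are illustrative, since the slide leaves the number of tasks as x.

```python
def ideal_hours(total_hours, resources):
    """Elapsed time if the work divides evenly and coordination is free."""
    return total_hours / resources

def elapsed_hours(task_hours, resources):
    """Elapsed time when whole tasks are dealt out round-robin to resources."""
    loads = [0.0] * resources
    for i, hours in enumerate(task_hours):
        loads[i % resources] += hours
    # The project finishes when the busiest resource finishes.
    return max(loads)

tasks = [1.0] * 100          # assumption: 100 tasks of one hour each
print(ideal_hours(100, 10))  # 10.0
print(elapsed_hours(tasks, 10))
print(elapsed_hours(tasks, 3))
```

With 3 resources the tasks no longer divide evenly, so one resource carries 34 hours of work and sets the finishing time. The simple 100/10 figure is a best case.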

Page 15: Predicting Consumer Behaviour via Hadoop

Map Reduce – Scenario

Similarly, work that takes 100 hours in one piece takes 100/10 = 10 hours when divided into 10 parallel parts

Page 16: Predicting Consumer Behaviour via Hadoop

More Scenarios on Map-Reduce

Problem Statement: Find the maximum stock market levels recorded in a span of 5 years
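One way to picture this problem as a MapReduce job is sketched below in plain Python. It runs in memory and does not use the Hadoop API; the record layout (symbol, date, closing level) and the sample values are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical records: (symbol, ISO date, closing level)
RECORDS = [
    ("ABC", "2011-03-01", 101.5),
    ("ABC", "2013-07-19", 140.2),
    ("XYZ", "2012-01-05", 55.0),
    ("ABC", "2015-11-30", 133.9),
    ("XYZ", "2014-09-12", 61.3),
]

def map_record(record):
    """Map step: emit (key, value) = (symbol, level)."""
    symbol, _date, level = record
    yield symbol, level

def reduce_max(symbol, levels):
    """Reduce step: keep the highest level seen for one symbol."""
    return symbol, max(levels)

def run_job(records):
    grouped = defaultdict(list)      # stands in for the shuffle phase
    for record in records:
        for key, value in map_record(record):
            grouped[key].append(value)
    return dict(reduce_max(k, v) for k, v in grouped.items())

maxima = run_job(RECORDS)
print(maxima)
```

Because taking a maximum can be done piece by piece, each mapper could also keep only its local maximum per symbol before the shuffle, which would reduce the data sent between nodes.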

Problem Statement: De-identify personal identifier information
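De-identification fits a map-only job, because each record can be processed without looking at any other. The sketch below masks identifier fields with a salted hash. The field names, the salt and the 12-character digest length are illustrative assumptions, and hashing of this kind is pseudonymization rather than full anonymization.

```python
import hashlib

SENSITIVE_FIELDS = {"name", "phone"}   # hypothetical choice of identifier fields
SALT = b"example-salt"                 # placeholder; a real job would keep this secret

def mask(value):
    """Replace a string value with a short, salted SHA-256 digest."""
    digest = hashlib.sha256(SALT + value.encode("utf-8")).hexdigest()
    return digest[:12]

def deidentify(record):
    """Map step: return a copy of the record with identifier fields masked."""
    return {
        field: mask(value) if field in SENSITIVE_FIELDS else value
        for field, value in record.items()
    }

record = {"name": "Jane Doe", "phone": "555-0100", "city": "Pune"}
clean = deidentify(record)
print(clean)
```

The same input always maps to the same masked value, so records belonging to one person can still be linked after masking.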

Page 17: Predicting Consumer Behaviour via Hadoop

Traditional Solution

[Diagram: Very Big Data → Split Data (several splits) → grep (one per split) → matches (one set per split) → cat → All matches]
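The split, grep and cat pipeline in this diagram can be imitated in a few lines of Python. This is a sketch only: the helper names, the sample log lines and the pattern are invented, and the chunks are searched one after another here, whereas in the traditional setup each grep would be started and collected by hand on a separate machine.

```python
import re

def split_data(lines, parts):
    """Cut the input into at most `parts` chunks of roughly equal size."""
    size = max(1, -(-len(lines) // parts))   # ceiling division, at least 1
    return [lines[i:i + size] for i in range(0, len(lines), size)]

def grep(pattern, chunk):
    """Return the lines in one chunk that match the pattern."""
    regex = re.compile(pattern)
    return [line for line in chunk if regex.search(line)]

def cat(results):
    """Concatenate the per-chunk matches into one list."""
    return [line for matches in results for line in matches]

lines = ["error: disk full", "ok", "error: timeout", "ok", "ok", "error: retry"]
chunks = split_data(lines, 4)
all_matches = cat(grep(r"^error", chunk) for chunk in chunks)
print(all_matches)
```

Every step (splitting, launching the searches, gathering the results) is the programmer's responsibility in this approach.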

Page 18: Predicting Consumer Behaviour via Hadoop

MapReduce Solution

[Diagram: Very Big Input → Split Data (several splits) → MAP → REDUCE → All matches; the MAP and REDUCE stages run inside the MapReduce Framework]
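The search in this diagram can be written as one map function and one reduce function, with the scheduling left to a framework. In the sketch below a thread pool stands in for the MapReduce framework; it is not Hadoop code, and the pattern and sample splits are invented for illustration.

```python
import re
from concurrent.futures import ThreadPoolExecutor

PATTERN = re.compile(r"^error")   # illustrative search pattern

def map_task(split):
    """MAP: scan one split and emit its matching lines."""
    return [line for line in split if PATTERN.search(line)]

def reduce_task(mapped):
    """REDUCE: merge the per-split outputs into the final result."""
    return [line for matches in mapped for line in matches]

def run(splits, workers=4):
    # The pool plays the role of the framework, which schedules the map tasks.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_task, splits))
    return reduce_task(mapped)

splits = [
    ["error: disk full", "ok"],
    ["error: timeout", "ok"],
    ["ok", "error: retry"],
]
all_matches = run(splits)
print(all_matches)
```

The programmer supplies only `map_task` and `reduce_task`; distributing the splits and collecting the results is handled by the surrounding machinery.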

Page 19: Predicting Consumer Behaviour via Hadoop

MapReduce Advantages

Two biggest advantages:

ᗍ Takes processing to the data
ᗍ Allows processing data in parallel

[Diagram: Data Center → Rack → Node; each Map Task runs next to an HDFS Block (a, b, c)]
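"Taking processing to the data" means preferring to run a map task on a node that already stores the block it reads. The toy scheduler below illustrates that preference only. The block and node names are made up, and Hadoop's real scheduling considers more than this sketch does.

```python
def assign_map_tasks(block_locations, free_nodes):
    """Assign each block's map task to a free node, preferring one that stores the block."""
    available = list(free_nodes)
    assignment = {}
    for block, holders in block_locations.items():
        if not available:
            break                    # no free node left; remaining blocks wait
        local = [node for node in available if node in holders]
        chosen = local[0] if local else available[0]
        assignment[block] = chosen
        available.remove(chosen)
    return assignment

# Hypothetical placement of HDFS blocks a, b and c on three nodes
block_locations = {
    "a": ["node1", "node2"],
    "b": ["node2", "node3"],
    "c": ["node1", "node3"],
}
assignment = assign_map_tasks(block_locations, ["node1", "node2", "node3"])
print(assignment)
```

In this run every map task lands on a node that holds its block, so no block has to cross the network before processing starts, and the three tasks can run in parallel.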

Page 20: Predicting Consumer Behaviour via Hadoop

MapReduce Flow

1. Input data is present in data nodes
2. Number of map tasks = number of input splits
3. Mappers produce intermediate data
4. Data is exchanged among nodes in “shuffling”
5. All data of the same key goes to the same reducer
6. Reducer output is stored at the output location

[Diagram: INPUT DATA → Map (Node 1), Map (Node 2) → Reduce (Node 1), Reduce (Node 1)]
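The numbered flow can be traced with a word count, the usual first MapReduce example. The sketch below runs everything in one Python process, so the "nodes" are only conceptual; the function names and the input text are illustrative.

```python
from collections import defaultdict

def mapper(line):
    """Step 3: emit intermediate (word, 1) pairs for one input line."""
    for word in line.lower().split():
        yield word, 1

def shuffle(pairs):
    """Steps 4 and 5: group the values so that each key ends up in one place."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reducer(key, values):
    """Step 6: combine all values for one key into the output record."""
    return key, sum(values)

def run_job(splits):
    intermediate = []
    for split in splits:             # step 2: one map task per input split
        for line in split:
            intermediate.extend(mapper(line))
    grouped = shuffle(intermediate)
    return dict(reducer(k, v) for k, v in grouped.items())

splits = [["big data big"], ["data hadoop"]]   # step 1: input held as two splits
counts = run_job(splits)
print(counts)
```

Both splits contain the word "data", and the shuffle is what brings those two intermediate pairs together so a single reducer call can add them up.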

Page 21: Predicting Consumer Behaviour via Hadoop

Job Trends – Hadoop

Page 22: Predicting Consumer Behaviour via Hadoop

Course Topics

Module 1: Introduction to Big Data and Hadoop
Module 2: HDFS Internals, Hadoop Configurations and Data Loading
Module 3: Introduction to Map Reduce
Module 4: Advanced Map Reduce Concepts
Module 5: Introduction to Pig
Module 6: Advanced Pig and Introduction to Hive
Module 7: Advanced Hive Concepts
Module 8: Extending Hive and HBase Introduction
Module 9: Advanced HBase and Oozie Introduction
Module 10: Project Set-up Discussion

Page 23: Predicting Consumer Behaviour via Hadoop

Why SkillSpeed?

ᗍ Course Curriculum from Industry Experts
ᗍ Instructor Led Live Virtual Sessions
ᗍ Lifetime access to Course Content via LMS
ᗍ 100% Placement Assistance
ᗍ 24x7 Support

Page 24: Predicting Consumer Behaviour via Hadoop

Corporate Partners

Page 25: Predicting Consumer Behaviour via Hadoop

Contact us

Lines open 24/7

To know more about the course, please contact:

IND: +91-90660-20904
USA: 1866-607-6547 (Toll Free)

Or reach us at [email protected]

