Big Data Analytics

Copyright © 2013 EMC Corporation. All Rights Reserved. Big Data Analytics. ICCBDA 2013: International Conference on Cloud and Big Data Analytics. David Dietrich, EMC Education Services. 8 February 2013. @imdaviddietrich. My Blog: http://infocus.emc.com/author/david_dietrich

Upload: emc-academic-alliance

Post on 15-May-2015


DESCRIPTION

Keynote talk by David Dietrich, EMC Education Services at ICCBDA 2013 : International Conference on Cloud and Big Data Analytics http://twitter.com/imdaviddietrich http://infocus.emc.com/author/david_dietrich/

TRANSCRIPT

Page 1: Big Data Analytics

Copyright © 2013 EMC Corporation. All Rights Reserved.

Big Data Analytics

ICCBDA 2013: International Conference on Cloud and Big Data Analytics

David Dietrich, EMC Education Services

8 February 2013

@imdaviddietrich

My Blog: http://infocus.emc.com/author/david_dietrich

Page 2: Big Data Analytics

Agenda

In other words…

• Level setting on Big Data

• Emerging Need for Advanced Analytics

• Road ahead & Skill development

Page 3: Big Data Analytics

Big Data

Page 4: Big Data Analytics

Page 5: Big Data Analytics

Page 6: Big Data Analytics

Big Data

Key Characteristics

• Large Volumes
• New Sources
• Low Latencies

Implications for the Enterprise

• New Platforms
• New Roles
• New Techniques

Page 7: Big Data Analytics

Four Main Types of Data Structures

http://www.google.com/#hl=en&sugexp=kjrmc&cp=8&gs_id=2m&xhr=t&q=data+scientist&pq=big+data&pf=p&sclient=psyb&source=hp&pbx=1&oq=data+sci&aq=0&aqi=g4&aql=f&gs_sm=&gs_upl=&bav=on.2,or.r_gc.r_pw.,cf.osb&fp=d566e0fbd09c8604&biw=1382&bih=651

The Red Wheelbarrow, by William Carlos Williams

View → Source

Structured Data

Semi-Structured Data

Quasi-Structured Data

Unstructured Data

Page 8: Big Data Analytics

Opportunities for a New Approach to Analytics

Big Data Ecosystem

Figure: the Big Data Ecosystem, drawn as four numbered groups: Data Devices, Data Collectors, Data Aggregators, and Data Users/Buyers.

Labels in the figure: Analytic Services, Advertising, Law Enforcement, Media, Banks, Government, Delivery Service, Private Investigators/Lawyers, Marketers, Employers, Individual, Websites, Information Brokers, Media Archives, Credit Bureaus, List Brokers, Catalog Co-Ops, Retail, Phone/TV, Government, Internet, Medical, Financial.

Page 9: Big Data Analytics

Industries Are Broadly Embracing Data Science

Retail

• CRM – Customer Scoring
• Store Siting and Layout
• Fraud Detection / Prevention
• Supply Chain Optimization

Advertising & Public Relations

• Demand Signaling
• Ad Targeting
• Sentiment Analysis
• Customer Acquisition

Financial Services

• Algorithmic Trading
• Risk Analysis
• Fraud Detection
• Portfolio Analysis

Media & Telecommunications

• Network Optimization
• Customer Scoring
• Churn Prevention
• Fraud Prevention

Manufacturing

• Product Research
• Engineering Analytics
• Process & Quality Analysis
• Distribution Optimization

Energy

• Smart Grid
• Exploration

Government

• Market Governance
• Counter-Terrorism
• Econometrics
• Health Informatics

Healthcare & Life Sciences

• Pharmaco-Genomics
• Bio-Informatics
• Pharmaceutical Research
• Clinical Outcomes Research

Page 10: Big Data Analytics

Emerging Need for Advanced Analytics

Page 11: Big Data Analytics

Business Drivers for Advanced Analytics

Driver: Examples

1. Desire to optimize business operations: Sales, pricing, profitability, efficiency
2. Desire to identify business risk: Customer churn, fraud, default
3. Predict new business opportunities: Upsell, cross-sell, best new customer prospects
4. Comply with laws or regulatory requirements: Anti-Money Laundering, Fair Lending, Basel II

Current Business Problems Provide Opportunities for Organizations to Become More Analytical & Data Driven

Page 12: Big Data Analytics

Big Data Requires New Approaches to Analytics

Data Science & Big Data Analytics

Chart axes: Time (Past to Future) and Business Value (Low to High). Regions shown: Business Intelligence and Data Science.

Predictive Analytics & Data Mining (Data Science)

Typical Techniques & Data Types

• Optimization, predictive modeling, forecasting, statistical analysis
• Structured/unstructured data, many types of sources, very large data sets

Common Questions

• What if…?
• What’s the optimal scenario for our business?
• What will happen next? What if these trends continue? Why is this happening?

Business Intelligence

Typical Techniques & Data Types

• Standard and ad hoc reporting, dashboards, alerts, queries, details on demand
• Structured data, traditional sources, manageable data sets

Common Questions

• What happened last quarter?
• How many did we sell?
• Where is the problem? In which situations?
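
The contrast between the two sets of questions can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the quarterly figures and function names are hypothetical, and a plain least-squares straight line stands in for the richer forecasting and predictive modeling techniques the slide lists.

```python
def units_sold_last_quarter(quarterly_units):
    """Business Intelligence style question: how many did we sell last quarter?"""
    return quarterly_units[-1]


def linear_trend_forecast(quarterly_units, quarters_ahead=1):
    """Data Science style question: what happens next if the trend continues?

    Fits a least-squares straight line through the history and extrapolates it.
    """
    n = len(quarterly_units)
    if n < 2:
        raise ValueError("need at least two quarters to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(quarterly_units) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, quarterly_units))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + quarters_ahead)


# Hypothetical units sold over the last four quarters.
history = [100, 110, 120, 130]
last_quarter = units_sold_last_quarter(history)  # looks backward
next_quarter = linear_trend_forecast(history)    # looks forward
```

The first function only reports what is already in the data; the second makes a claim about data that does not exist yet, which is why it needs a model and some skepticism about how long the trend will hold.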

Page 13: Big Data Analytics

Churn Analysis for Mobile Telco

Synopsis: A Mobile Telco wanted to understand why it was losing customers

Business challenge: Proactively detect mobile phone customers at risk of canceling contracts (customer churn) to retain customers and protect revenue

Traditional Approach to Churn Analysis

• Look at spending patterns
• Review recurrent problems

Approach with Big Data

• Analyze call history data
• Treat call history as a social network

Cell phone history portrayed as a social network
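
Treating call history as a social network can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used in the case study: the call records, the customer names, and the `churn_exposure` score (the share of a customer's contacts who have already canceled) are all hypothetical.

```python
from collections import defaultdict


def build_call_graph(call_records):
    """Build an undirected graph: customer -> set of customers they exchanged calls with."""
    graph = defaultdict(set)
    for caller, callee in call_records:
        if caller != callee:
            graph[caller].add(callee)
            graph[callee].add(caller)
    return graph


def churn_exposure(graph, churned):
    """For each still-active customer, the fraction of their contacts who already churned."""
    exposure = {}
    for customer, contacts in graph.items():
        if customer in churned or not contacts:
            continue
        exposure[customer] = len(contacts & churned) / len(contacts)
    return exposure


# Hypothetical call records as (caller, callee) pairs.
calls = [
    ("ann", "bob"), ("ann", "cy"), ("bob", "cy"),
    ("cy", "dee"), ("dee", "eve"), ("eve", "fay"),
]
churned = {"ann", "bob"}  # customers who have already canceled

graph = build_call_graph(calls)
exposure = churn_exposure(graph, churned)
# Active customers ranked from most to least exposed to churned contacts.
at_risk = sorted(exposure, key=exposure.get, reverse=True)
```

In this toy data, "cy" ranks first because two of their three contacts have already left. A real analysis would weight edges by call volume and recency, which this sketch ignores.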

Page 14: Big Data Analytics

Month 1

Example of Cell Phone Cancellation Outbreak

Page 15: Big Data Analytics

Month 2

Example of Cell Phone Cancellation Outbreak

Page 16: Big Data Analytics

Month 3

Example of Cell Phone Cancellation Outbreak

Page 17: Big Data Analytics

Month 4

Example of Cell Phone Cancellation Outbreak

Page 18: Big Data Analytics

Using Social Network Analysis to Improve Churn Prediction

If we had known two customers’ calling networks… could we have prevented five more from leaving?

High-risk cell phone churners can now be identified in 1 hour, saving $40 MM in the first year

Page 19: Big Data Analytics

Road ahead & Skill development

Page 20: Big Data Analytics

Growth of Data Scientist Opportunities

• “A significant constraint on realizing value from big data will be a shortage of talent, particularly of people with deep expertise in statistics and machine learning, and the managers and analysts who know how to operate companies by using insights from big data.”

• By 2018...the United States alone faces a shortage of 140,000 to 190,000 people with analytical expertise and 1.5 million managers and analysts with the skills to understand and make decisions based on the analysis of big data.

Source: McKinsey Global Institute; Big data: The next frontier for innovation, competition and productivity, May 2011

“Average "data scientist" salaries for job postings nationwide are 55% higher than average salaries for all job postings nationwide.”

Source: Job Trends from Indeed.com

Page 21: Big Data Analytics

People & Skills

Three Key Roles of the New Data Ecosystem

Role

• Deep Analytical Talent (Data Scientists). Projected U.S. talent gap: 140,000 to 190,000
• Data Savvy Professionals. Projected U.S. talent gap: 1.5 million
• Technology & Data Enablers

Note: Figures above reflect a projected talent gap in the US in 2018, as shown in the McKinsey May 2011 article Big Data: The next frontier for innovation, competition, and productivity

Page 22: Big Data Analytics

Profile of a Data Scientist

Curious & Creative

Technical

Quantitative

Communicative & Collaborative

Skeptical

Page 23: Big Data Analytics

Data Science and Big Data Analytics Course and EMCDSA Certification

Course Overview

• “Open” curriculum
• Practitioner’s approach
• Enables immediate participation on analytics projects
• Prepares for EMC Proven Professional Data Science Associate (EMCDSA) Certification

Details

Page 24: Big Data Analytics

Skills Matrix, Based on Recent Students

Figure: skills matrix with axes Technical Ability and Quantitative Skills. Groups plotted: Recent STEM Grads; Business Intelligence Professionals, IT; Quantitative Analysts, Statisticians; Business and data analysts; Data Scientists.

Page 25: Big Data Analytics

Specific Data Science Skills & Traits

EDW

Apply data science methods in their current roles

Page 26: Big Data Analytics

Other Ways to Learn about Big Data Analytics

Formal Training

• EMC Data Science & Big Data Analytics course
• STEM graduate programs and certificates
• Conferences on Analytics (Strata, PAW, ACM, ACL, INFORMS, ICCBDA…)
• Free Massive Open Online Courses (MOOCs)
  - 6 – 12 week online courses
  - Coursera, Udacity, Udemy, edX, iTunesU, Khan Academy

Informal Training

• Look for opportunities to try out your skills; your day job provides them
• Offer to help on projects, opportunistically… Every team is looking for people with these skills right now

Page 27: Big Data Analytics

Leverage The Wisdom of Crowds

• Social Media

• Volunteer to help

• Try Contests
  - Kaggle, InnoCentive

Page 28: Big Data Analytics

Key Takeaways

• Analyzing big data provides significant opportunity for deriving new value
• To do this, organizations will need to enrich the skill sets of their analysts and emerging data scientists
• Take advantage of EMC’s Data Science Associate course or other opportunities to grow your skills

Page 29: Big Data Analytics

Questions?

Additional Resources:

1. My Blog on Data Science & Big Data Analytics: http://infocus.emc.com/author/david_dietrich/
2. Blog on applying Data Analytics Lifecycle to measuring innovation data: http://stevetodd.typepad.com/my_weblog/data-science-and-big-data-curriculum/
3. EMC Education Services curriculum on Data Science & Big Data Analytics: http://education.emc.com/guest/campaign/data_science.aspx

David Dietrich

@imdaviddietrich

Page 30: Big Data Analytics

Thank You!
