big data use cases

Post on 04-Dec-2014

DESCRIPTION

Everyone is awash in the new buzzword, Big Data, and it seems as if you can’t escape it wherever you go. But there are real companies with real use cases creating real value for their businesses by using big data. This talk will discuss some of the more compelling current or recent projects, their architecture & systems used, and successful outcomes.

TRANSCRIPT

1

Big Data Use Cases*

DevNexus Conference, 2/18/2013

*Fully buzzword-compliant title

2

whoami
• Brad Anderson
• Solutions Architect at MapR (Atlanta)
• ATLHUG co-chair
• NoSQL East Conference 2009
• “boorad” most places (twitter, github)
• banderson@maprtech.com

3

Service Bureau

Client/Server

Application Service Provider

Cloud

B2B

Software-as-a-Service

Virtualization

Social Media

Mobile

Web 2.0

4

BIG DATA

5

6

Business Value

7

Business Value

8

Big Data is not new! But the tools are.

9

Ship the Function to the Data

[Diagram, Traditional Architecture: the data sits on a SAN/NAS, and a single function runs at the RDBMS.]

[Diagram, Distributed Computing: every node holds both data and function.]
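A minimal sketch of "ship the function to the data", in plain Python rather than Hadoop. The partitions, event names, and helper functions are hypothetical and only illustrate the shape of the idea: a small function runs against each partition where it lives, and only the small partial results are combined.

```python
from collections import Counter
from functools import reduce

# Hypothetical partitions: each inner list stands for the records held on one node.
partitions = [
    ["login", "purchase", "login"],
    ["purchase", "refund"],
    ["login"],
]

def map_partition(records):
    """Runs next to the data: count event types within one partition."""
    return Counter(records)

def merge_counts(left, right):
    """Reduce step: combine two partial results."""
    return left + right

# One small result per node; only these partial counts would cross the network.
partial_counts = [map_partition(p) for p in partitions]
total_counts = reduce(merge_counts, partial_counts, Counter())
```

In the traditional layout the raw records would travel to one central function instead; here only three small counters move.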

10

Variation: Multiple MapReduces

Example: Fraud Detection in User Transactions

LDA training

Transaction data

LDA scoring

HBase / MapR M7 Edition

G2 score

Candidate events for analyst review

95th-percentile LDA anomaly

MapReduce

http://en.wikipedia.org/wiki/Latent_Dirichlet_allocation
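The 95th-percentile cut in the diagram above could be sketched as follows. This assumes the anomaly scores have already been produced by the LDA scoring step; the function names, the nearest-rank percentile rule, and the "strictly above the cutoff" choice are assumptions for illustration, not details taken from the slides.

```python
import math

def percentile_threshold(scores, pct=95.0):
    """Nearest-rank percentile: smallest score with at least pct% of scores at or below it."""
    if not scores:
        raise ValueError("scores must not be empty")
    ordered = sorted(scores)
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]

def candidate_events(scored_events, pct=95.0):
    """Return (event_id, score) pairs whose score is strictly above the cutoff.

    scored_events is a list of (event_id, anomaly_score) pairs; higher means more anomalous.
    """
    cutoff = percentile_threshold([score for _, score in scored_events], pct)
    return [(event_id, score) for event_id, score in scored_events if score > cutoff]
```

With 100 scored transactions this would pass the five most anomalous on for analyst review.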

11

MapR Distribution for Apache Hadoop

Complete Hadoop distribution

Comprehensive management suite

Industry-standard interfaces

Enterprise-grade dependability

Higher performance

Pig

Hive

HBase

Mahout

Oozie

Whirr

MapReduce

Cascading

Nagios

Ganglia

MapR Control System

MapR Data Platform

Flume

Sqoop

HCatalog

Zookeeper

Avro

12

Big Data Ecosystem

13

• Use Case
• Company
• Data Source(s)
• Technique(s)
• Business Value

14

Proactive Monitoring

15

Data Sources

• Server Telemetry
• Monitoring Logs
• Network Flow

16

Techniques

• Pattern Recognition
• Proactive Monitoring
• Early Alert Delivery

17

Business Value

18

Telecommunications Giant

ETL Offload

19

Data Sources

• Customer Records
• Contract Data
• Purchase Orders
• Call Center

Telecommunications

20

Techniques

Analytics

ETL

Telecommunications

21

Techniques

ETL (Hadoop) + Analytics (Teradata)

Telecommunications

22

Business Value

Telecommunications

23

Data Sources

• Customer Purchase History
• Merchant Designations
• Merchant Special Offers

Credit Card Issuer

24

Techniques

[Diagram: Purchase History, Merchant Information, and Merchant Offers feed a Recommendation Engine (Mahout) on Hadoop. Results reach the apps through a Presentation Data Store (DB2): export 4 hrs, import 4 hrs.]

Credit Card Issuer

25

Techniques

[Diagram: Purchase History, Merchant Information, and Merchant Offers feed a Recommendation Engine (Mahout) on Hadoop. Results reach the apps through a Recommendation Search Index (Solr): index update 2 min.]

Credit Card Issuer

26

Business Value

Credit Card Issuer

27

Idle Alerts

Waste & Recycling Leader

28

Data Sources

• Truck Geolocation Data
  – 20,000 trucks
  – 5 sec interval
• Landfill Geographic Boundaries

29

Techniques

[Diagram: Truck Geolocation Data feeds Realtime Stream Computation (Storm), which produces Immediate Alerts, and Hadoop Storage, where Batch Computation (MapReduce) produces Tax Reduction Reporting. A Shortest Path Graph Algorithm supports Route Optimization.]
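One way the immediate idle alerts might be computed from the 5-second truck pings is sketched below in plain Python rather than Storm. The `Ping` record, the 15 m drift radius, and the 5-minute idle window are all hypothetical values chosen for illustration; the slides do not state the actual rule.

```python
import math
from dataclasses import dataclass

@dataclass
class Ping:
    truck_id: str
    timestamp: float  # seconds
    lat: float
    lon: float

def distance_m(a, b):
    """Approximate ground distance in metres (equirectangular, fine for short hops)."""
    earth_radius_m = 6_371_000.0
    mean_lat = math.radians((a.lat + b.lat) / 2.0)
    dx = math.radians(b.lon - a.lon) * math.cos(mean_lat)
    dy = math.radians(b.lat - a.lat)
    return earth_radius_m * math.hypot(dx, dy)

def idle_alerts(pings, max_drift_m=15.0, min_idle_s=300.0):
    """Return (truck_id, idle_start, alert_time), one per idle episode.

    Pings for each truck must arrive in timestamp order.
    """
    anchor = {}      # truck_id -> ping where the truck last came to rest
    alerted = set()  # trucks already alerted for the current episode
    alerts = []
    for ping in pings:
        start = anchor.get(ping.truck_id)
        if start is None or distance_m(start, ping) > max_drift_m:
            # Truck is new or has moved: restart its idle clock.
            anchor[ping.truck_id] = ping
            alerted.discard(ping.truck_id)
        elif (ping.timestamp - start.timestamp >= min_idle_s
              and ping.truck_id not in alerted):
            alerted.add(ping.truck_id)
            alerts.append((ping.truck_id, start.timestamp, ping.timestamp))
    return alerts
```

The state kept per truck is tiny (one anchor ping), which is what makes this kind of rule a natural fit for a stream processor. A check against the landfill boundaries would be a separate point-in-polygon step and is not shown.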

30

Business Value

31

Fraud Detection

Data Lake

32

Data Sources

• Anti-Money Laundering
• Consumer Transactions

33

Techniques

Anti-Money Laundering System

Consumer Transactions System

34

Techniques

AML

Consumer Transactions

Data Lake(Hadoop)

Suspicious Events

Latent Dirichlet Allocation, Bayesian Learning Neural Network, Peer Group Analysis

Analyst

35

Business Value

36

Machine Learning

Search Relevance

DNA Matching

37

Data Sources

• Birth, Death, Census, Military, Immigration records
• Search Behavior Activity
• DNA SNP (snips)

38

Techniques

• Record Linking
• Search Relevance
• Clickstream Behavior
• Security Forensics
• DNA Matching

39

Business Value

40

Traffic Analytics

41

Data Sources

• Inrix Road Segment Data
  – Avg Speed / minute / segment
  – Reference Speeds
• Road Segment Geolocation Data

42

Techniques

• Bottleneck Detection Algorithm
• Time Offset Correlations
  – Alternate Routes
• Predictive Congestion Analysis
  – Growth & Term Assumptions
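The deck does not spell out the bottleneck detection algorithm. As a rough sketch of one simple form it could take, the code below flags a road segment as a bottleneck when its per-minute average speed stays below a fraction of the reference speed for several consecutive minutes. The 0.6 ratio, the 5-minute minimum, and both function names are assumptions.

```python
def congested_minutes(avg_speeds, reference_speed, ratio=0.6):
    """Flag each minute whose average speed is below ratio * reference speed."""
    return [speed < ratio * reference_speed for speed in avg_speeds]

def find_bottlenecks(avg_speeds, reference_speed, ratio=0.6, min_minutes=5):
    """Return (start_minute, end_minute) index pairs, end exclusive, of sustained congestion.

    avg_speeds holds one average speed per minute for a single road segment.
    """
    episodes = []
    run_start = None
    flags = congested_minutes(avg_speeds, reference_speed, ratio)
    for minute, slow in enumerate(flags + [False]):  # sentinel closes a trailing run
        if slow and run_start is None:
            run_start = minute
        elif not slow and run_start is not None:
            if minute - run_start >= min_minutes:
                episodes.append((run_start, minute))
            run_start = None
    return episodes
```

Requiring a sustained run filters out single slow minutes, such as one reading taken behind a stopped bus.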

43

44

45

Business Value

46

Similar Characteristics

• Lots of Data
• Structured, Semi-Structured, Unstructured
• Varied Systems Interoperating
  – Hadoop, Storm, Solr, MPP, Visualizations
• Increase Revenue
• Decrease Costs

47

Thank You
