REAL-TIME NETWORK ANALYTICS WITH STORM

Mauricio Vacas, Fausto Inestroza, Sonali Parthasarathy

Upload: mandek

Post on 07-Jan-2016

TRANSCRIPT

Page 1: REAL-TIME NETWORK ANALYTICS WITH  STORM

REAL-TIME NETWORK ANALYTICS WITH STORM

Mauricio Vacas

Fausto Inestroza

Sonali Parthasarathy

Page 2: REAL-TIME NETWORK ANALYTICS WITH  STORM

The Team

Mauricio Vacas, Big Data Architect

Sonali Parthasarathy, Real-Time Processing

Fausto Inestroza, Big Data Architect

Anita Mehrotra, Data Scientist

Susie Lu, Visualization

Krista Schnell, Visualization

Rick Drushal, Engineering Lead

John Akred, Product Lead

Page 3: REAL-TIME NETWORK ANALYTICS WITH  STORM

WHY REAL-TIME?

Page 4: REAL-TIME NETWORK ANALYTICS WITH  STORM

Distributed Analytics

Real-Time Data Ingestion

Model Prototyping

Exploratory Analytics

Real-Time Rule Execution

PROCESS

UNDERSTAND

REACT

Page 5: REAL-TIME NETWORK ANALYTICS WITH  STORM

Accenture Cloud Platform

Recommender as a Service

……

Network Analytics Services

Big Data Platform

Page 6: REAL-TIME NETWORK ANALYTICS WITH  STORM

Drivers

• Consumer devices

• Video usage

Issues

• Operational costs

• Understanding service quality degradation

• Inefficient capacity planning

Page 7: REAL-TIME NETWORK ANALYTICS WITH  STORM

INGEST

PROCESS

VISUALIZE

ANALYZE

STORE

Page 8: REAL-TIME NETWORK ANALYTICS WITH  STORM

WHY STORM?

Page 9: REAL-TIME NETWORK ANALYTICS WITH  STORM

What do we need?

Scalability

Reliability

Data types, size, velocity

Mission critical data

Processing, computation, etc.

Time series / pattern analysis

Fault-tolerance

Multiple use cases

Page 10: REAL-TIME NETWORK ANALYTICS WITH  STORM

How do we get this from Storm?

Processing guarantees

Low-level Primitives

Parallelization

Robust fail-over strategies

Scalability

Reliability

Fault-tolerance

Processing, computation, etc.

Page 11: REAL-TIME NETWORK ANALYTICS WITH  STORM

PRIMITIVES

Page 12: REAL-TIME NETWORK ANALYTICS WITH  STORM

Stream: Request info (IP, user-agent, etc)

Spout: Pull messages from distributed queue

Bolt: Sessionization, speed calculation

Topology: Suboptimal network speed, geospatial analysis

Tuple Tuple
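To make the four primitives concrete, here is a minimal plain-Python sketch that mimics their roles with generators. It does not use the Storm API. The message format (`ip|user_agent`) and the names `queue_spout`, `count_by_ip_bolt`, and `run_topology` are assumptions made for illustration only.

```python
from typing import Dict, Iterable, Iterator, List


def queue_spout(messages: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Spout: pull raw messages from a queue and emit them as tuples."""
    for raw in messages:
        # Hypothetical wire format: "<ip>|<user agent>"
        ip, user_agent = raw.split("|", 1)
        yield {"ip": ip, "user_agent": user_agent}


def count_by_ip_bolt(stream: Iterable[Dict[str, str]]) -> Iterator[Dict[str, object]]:
    """Bolt: consume a stream of tuples and emit a running request count per IP."""
    counts: Dict[str, int] = {}
    for tup in stream:
        counts[tup["ip"]] = counts.get(tup["ip"], 0) + 1
        yield {"ip": tup["ip"], "requests": counts[tup["ip"]]}


def run_topology(messages: Iterable[str]) -> List[Dict[str, object]]:
    """Topology: wire the spout to the bolt and drain the resulting stream."""
    return list(count_by_ip_bolt(queue_spout(messages)))


results = run_topology(["10.0.0.1|curl", "10.0.0.2|firefox", "10.0.0.1|curl"])
```

In a real topology the spout and bolt would run as separate, parallel components connected by Storm; the generator chain only shows how tuples flow from one stage to the next.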

Page 13: REAL-TIME NETWORK ANALYTICS WITH  STORM

PARALLELISM

Page 14: REAL-TIME NETWORK ANALYTICS WITH  STORM

[Diagram: Nimbus and Zookeeper above two Supervisors; each Supervisor runs two Workers (W), each with two Tasks (T)]

Page 15: REAL-TIME NETWORK ANALYTICS WITH  STORM

[Diagram: a Topology running in a Worker Process that holds two Executors and four Tasks]

Page 16: REAL-TIME NETWORK ANALYTICS WITH  STORM

FAULT TOLERANCE

Page 17: REAL-TIME NETWORK ANALYTICS WITH  STORM

[Diagram: Nimbus above three Supervisors; two Supervisors each run two Workers (W) with two Tasks (T) each, and the third Supervisor's Workers carry additional Tasks]

Page 18: REAL-TIME NETWORK ANALYTICS WITH  STORM

RELIABILITY

Page 19: REAL-TIME NETWORK ANALYTICS WITH  STORM

[Diagram: tuples labeled IP1, IP2, IP2, and IP3, and a node labeled A]

Page 20: REAL-TIME NETWORK ANALYTICS WITH  STORM

[Diagram: tuples labeled IP1, IP2, IP2, and IP3, and a node labeled A]
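The slides do not spell out the reliability mechanism. Storm's documented approach tracks each tuple tree with a single value that is XORed with a tuple's id when the tuple is emitted and again when it is acked; the tree is complete when the value returns to zero. The sketch below illustrates that idea in plain Python. The class name `AckTracker` and the fixed ids are assumptions for the example (Storm uses random 64-bit ids).

```python
class AckTracker:
    """Tracks one tuple tree with a single XOR accumulator."""

    def __init__(self) -> None:
        self.ack_val = 0

    def emit(self, tuple_id: int) -> None:
        # A newly emitted tuple contributes its id once.
        self.ack_val ^= tuple_id

    def ack(self, tuple_id: int) -> None:
        # Acking XORs the same id again, cancelling the earlier contribution.
        self.ack_val ^= tuple_id

    def fully_processed(self) -> bool:
        return self.ack_val == 0


tracker = AckTracker()
# Hypothetical fixed ids, chosen only to keep the example deterministic.
ids = {"IP1": 0x1A2B, "IP2": 0x3C4D, "IP3": 0x5E6F}
for tuple_id in ids.values():
    tracker.emit(tuple_id)

tracker.ack(ids["IP1"])
tracker.ack(ids["IP2"])
pending = not tracker.fully_processed()  # IP3 is still outstanding

tracker.ack(ids["IP3"])
done = tracker.fully_processed()
```

The appeal of this design is that the tracking state stays constant in size no matter how many tuples the tree contains.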

Page 21: REAL-TIME NETWORK ANALYTICS WITH  STORM

SUBOPTIMAL NETWORK SPEED TOPOLOGY

AN EXAMPLE

Page 22: REAL-TIME NETWORK ANALYTICS WITH  STORM

KafkaSpout -> Pre-process -> Sessionize -> Calculate N/W Speed per Session -> Update Speed per IP -> Identify Suboptimal Speed -> Store in Cassandra -> Cassandra

[Diagram: a Tuple (ip 1) passes between each pair of stages]
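The stages named on this slide can be sketched as ordinary functions. This is an illustration under stated assumptions, not the presenters' implementation: the request fields (`ts`, `bytes`, `duration_s`), the 30-second session gap, the averaging of session speeds per IP, and the 1,000,000 bytes/s threshold are all hypothetical.

```python
from collections import defaultdict

SESSION_GAP_S = 30.0          # hypothetical idle gap that closes a session
MIN_SPEED_BPS = 1_000_000.0   # hypothetical "suboptimal" threshold, bytes/s


def sessionize(requests):
    """Group time-ordered requests into per-IP sessions split by idle gaps."""
    sessions = defaultdict(list)  # ip -> list of sessions (lists of requests)
    for req in sorted(requests, key=lambda r: r["ts"]):
        ip_sessions = sessions[req["ip"]]
        if ip_sessions and req["ts"] - ip_sessions[-1][-1]["ts"] <= SESSION_GAP_S:
            ip_sessions[-1].append(req)
        else:
            ip_sessions.append([req])
    return sessions


def session_speed(session):
    """Network speed for one session: total bytes over total transfer time."""
    total_bytes = sum(r["bytes"] for r in session)
    total_secs = sum(r["duration_s"] for r in session)
    return total_bytes / total_secs if total_secs > 0 else 0.0


def speed_per_ip(sessions):
    """Average the session speeds observed for each IP."""
    return {ip: sum(map(session_speed, ss)) / len(ss) for ip, ss in sessions.items()}


def suboptimal_ips(speeds):
    """Return the IPs whose speed falls below the threshold."""
    return sorted(ip for ip, speed in speeds.items() if speed < MIN_SPEED_BPS)


requests = [
    {"ip": "10.0.0.1", "ts": 0.0, "bytes": 4_000_000, "duration_s": 2.0},
    {"ip": "10.0.0.1", "ts": 10.0, "bytes": 2_000_000, "duration_s": 1.0},
    {"ip": "10.0.0.2", "ts": 5.0, "bytes": 500_000, "duration_s": 2.0},
    {"ip": "10.0.0.2", "ts": 100.0, "bytes": 300_000, "duration_s": 1.0},
]
speeds = speed_per_ip(sessionize(requests))
slow = suboptimal_ips(speeds)
```

In the topology each function would live in its own bolt and keep its state incrementally, since the stream never ends; the batch form here is only for readability.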

Page 23: REAL-TIME NETWORK ANALYTICS WITH  STORM

Parallelism

KafkaSpout -> Pre-process -> Sessionize -> Calculate N/W Speed per Session -> Update Speed per IP -> Identify Suboptimal Speed -> Store in Cassandra -> Cassandra

[Diagram: Tuple (ip 1) and Tuple (ip 2) flow through the stages in parallel]
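Per-IP state such as sessions only works in parallel if every tuple for a given IP reaches the same task. Storm offers a fields grouping for this, which partitions a stream by the value of chosen fields. The sketch below assumes the topology groups on the IP field (the slides do not say so explicitly) and imitates the routing with a stable hash; `fields_grouping` and `route` are hypothetical helper names.

```python
import zlib


def fields_grouping(ip: str, num_tasks: int) -> int:
    """Pick a task index from the IP; equal IPs always map to the same task."""
    # crc32 is used because Python's built-in hash() for str varies per process.
    return zlib.crc32(ip.encode("utf-8")) % num_tasks


def route(tuples, num_tasks):
    """Distribute tuples across task-local queues using the grouping above."""
    tasks = [[] for _ in range(num_tasks)]
    for tup in tuples:
        tasks[fields_grouping(tup["ip"], num_tasks)].append(tup)
    return tasks


tuples = [
    {"ip": "10.0.0.1", "bytes": 100},
    {"ip": "10.0.0.2", "bytes": 200},
    {"ip": "10.0.0.1", "bytes": 300},
    {"ip": "10.0.0.2", "bytes": 400},
    {"ip": "10.0.0.1", "bytes": 500},
]
tasks = route(tuples, num_tasks=3)
```

Two different IPs may still share a task; the guarantee is only that one IP never spans two tasks.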

Page 24: REAL-TIME NETWORK ANALYTICS WITH  STORM

Branching and Joins

[Diagram: two streams (Stream 1, Stream 2) fed by KafkaSpouts. Stages shown: Pre-process, Sessionize, Calculate N/W Speed per Session, Update Speed per IP, Speed by Location, Join, Compare Speed, Store in Cassandra, Cassandra. Tuples shown: Tuple (ip 1), Tuple (ip 1/NY), Tuple (NY)]
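One way to read the Join and Compare Speed stages is as a join on location between a per-IP speed stream and a per-location speed stream. The sketch below shows that reading in plain Python. The tuple fields and the function name `join_and_compare` are assumptions, and a streaming join would also need windowing or buffering, which is left out here.

```python
def join_and_compare(ip_speeds, location_speeds):
    """Join per-IP tuples with per-location tuples and compare the speeds."""
    by_location = {t["location"]: t["speed"] for t in location_speeds}
    joined = []
    for t in ip_speeds:
        baseline = by_location.get(t["location"])
        if baseline is None:
            continue  # no tuple for this location has arrived on the other stream
        joined.append({
            "ip": t["ip"],
            "location": t["location"],
            "speed": t["speed"],
            "location_speed": baseline,
            "below_location": t["speed"] < baseline,
        })
    return joined


# Hypothetical tuples: one IP in NY on the first stream, NY's speed on the second.
stream_1 = [{"ip": "10.0.0.1", "location": "NY", "speed": 800_000.0}]
stream_2 = [{"location": "NY", "speed": 1_500_000.0}]
joined = join_and_compare(stream_1, stream_2)
```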

Page 25: REAL-TIME NETWORK ANALYTICS WITH  STORM

RULE EXECUTION

Page 26: REAL-TIME NETWORK ANALYTICS WITH  STORM

Drools

METHOD 1: Storm

METHOD 2: Storm + Drools

Page 27: REAL-TIME NETWORK ANALYTICS WITH  STORM

Storm + Drools

KafkaSpout -> Pre-process -> Sessionize -> Calculate N/W Speed per Session -> Update Speed per IP -> Identify Suboptimal Speed -> Store in Cassandra -> Cassandra

Drools
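The difference between the two methods can be pictured as hard-coding a check inside a bolt versus handing facts to a rule engine that holds the rules as data. The sketch below is a plain-Python stand-in for the second idea and does not use Drools or its rule language. The `Rule` class, the rule names, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    name: str
    condition: Callable[[Dict], bool]  # evaluated against one fact (a tuple)
    action: str                        # label of the action to take when it fires


# Rules live outside the processing code, so they can change without touching it.
RULES: List[Rule] = [
    Rule("suboptimal-speed", lambda fact: fact["speed_bps"] < 1_000_000, "flag"),
    Rule("very-slow", lambda fact: fact["speed_bps"] < 100_000, "alert"),
]


def fire_rules(fact: Dict, rules: List[Rule]) -> List[str]:
    """Return the actions of every rule whose condition matches the fact."""
    return [rule.action for rule in rules if rule.condition(fact)]


slow_actions = fire_rules({"ip": "10.0.0.2", "speed_bps": 50_000}, RULES)
fast_actions = fire_rules({"ip": "10.0.0.1", "speed_bps": 5_000_000}, RULES)
```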

Page 28: REAL-TIME NETWORK ANALYTICS WITH  STORM

Copyright © 2012 Accenture All rights reserved. 28

Integration with Cassandra

Cassandra

• Optimal for time series data

• Near-linear scalable

• Low read/write latency

Custom Bolt

• Uses Hector API to access Cassandra

• Creates dynamic columns per request

• Stores relevant network data
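The "dynamic columns per request" pattern can be pictured as one wide row per key, with a time-based UUID as each column name so that entries sort by time. The sketch below models that layout with an in-memory dictionary. It does not call Cassandra or Hector; keying rows by IP is an assumption, `store_measurement` and `read_series` are hypothetical names, and Python's `uuid.uuid1()` stands in for a time UUID generator.

```python
import uuid
from collections import defaultdict

# In-memory stand-in for a column family: row key -> {time UUID column: value}
store = defaultdict(dict)


def store_measurement(ip: str, speed_bps: float) -> uuid.UUID:
    """Add one dynamically named column to the row for this IP."""
    column = uuid.uuid1()  # version 1 UUIDs embed a timestamp
    store[ip][column] = speed_bps
    return column


def read_series(ip: str):
    """Return the row's values ordered by the timestamp inside each column name."""
    return [value for _, value in sorted(store[ip].items(), key=lambda kv: kv[0].time)]


first = store_measurement("10.0.0.1", 1_200_000.0)
second = store_measurement("10.0.0.1", 900_000.0)
series = read_series("10.0.0.1")
```

Because the column name carries the timestamp, reading a time range becomes a slice over column names rather than a scan over rows.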

Page 29: REAL-TIME NETWORK ANALYTICS WITH  STORM

Lessons Learned

• Rebalance Topology

• Tweak Parallelism in bolt

• Isolation of Topologies

• Use TimeUUIDUtils

• Log4j level set to INFO by default

Page 30: REAL-TIME NETWORK ANALYTICS WITH  STORM

DEMO

Page 31: REAL-TIME NETWORK ANALYTICS WITH  STORM

Next Steps

• Trident

• Externalizing Rules

• Predictive Models

• Real-Time Notifications