Ignacio Mulas Viela: Massively Distributed Analytics (Dataconomy, Stockholm)

Upload: dataconomy

Post on 19-Jul-2015

TRANSCRIPT

Page 1: Ignacio Mulas Viela_Massively Distributed Analytics_Dataconomy_Stockholm

Ignacio Mulas Viela

Ericsson Research

Massively distributed analytics

Page 2: Ignacio Mulas Viela_Massively Distributed Analytics_Dataconomy_Stockholm

Big Data, Stockholm | Ericsson Internal | 2015-04-01 | Page 2

About myself…

Name: Ignacio Mulas Viela

Contact channels:

[email protected]

@immulvi

Experience @ Ericsson

Current: Researcher and project leader in Cloud Analytics in the Machine Learning group

Previous: Big Data, Cloud and Automation

Page 3: Ignacio Mulas Viela_Massively Distributed Analytics_Dataconomy_Stockholm

Agenda

› Industry evolution
› Data economics
› Architectures
› Conclusions
› Q&A

Page 4: Ignacio Mulas Viela_Massively Distributed Analytics_Dataconomy_Stockholm

Vision 2020

Networked society: 50 billion connected devices

Page 5: Ignacio Mulas Viela_Massively Distributed Analytics_Dataconomy_Stockholm

PACE OF CHANGE

1 billion connected places

50 billion connected things

PLACES

PEOPLE

THINGS

Page 6: Ignacio Mulas Viela_Massively Distributed Analytics_Dataconomy_Stockholm

Challenges

› Technological challenges
  – Data sources are geographically distributed and highly cross-referenced
  – No unique identifier
  – Timing and order carry meaning
  – Demanding delay constraints:
    › Radio optimization: ms
    › Core network optimization: ms to s
    › Software services: s to min

› Interpretation challenges
  – Low-level data is hard to associate with business-level meaning
  – IoT/M2M data is
    › Structured in the small picture: protocol level
    › Unstructured in the big picture: heterogeneous environments use different formats and protocols
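The delay constraints above can be read as a placement test: an analytics step can only run at a remote site if the round trip fits the use case's delay budget. The following is a minimal sketch of that check, not something taken from the talk. The orders of magnitude follow the slide, but the exact ceilings, the name `fits_budget`, and the 80 ms round trip are illustrative assumptions.

```python
# Upper delay bounds in seconds per use case. The orders of magnitude follow
# the slide; the exact numbers are illustrative assumptions.
MAX_DELAY_S = {
    "radio_optimization": 0.010,        # "ms": assumed 10 ms ceiling
    "core_network_optimization": 1.0,   # "ms to s": assumed 1 s ceiling
    "software_services": 60.0,          # "s to min": assumed 1 min ceiling
}


def fits_budget(use_case: str, round_trip_s: float) -> bool:
    """Return True if a processing round trip fits the use case's delay ceiling."""
    return round_trip_s <= MAX_DELAY_S[use_case]


# Hypothetical round trip to a remote central data centre.
CENTRAL_ROUND_TRIP_S = 0.080

if __name__ == "__main__":
    for use_case in MAX_DELAY_S:
        print(use_case, fits_budget(use_case, CENTRAL_ROUND_TRIP_S))
```

Under these assumed numbers, radio optimization cannot wait for a central site, while the other two use cases can, which is one way to motivate processing closer to the data source.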

Page 7: Ignacio Mulas Viela_Massively Distributed Analytics_Dataconomy_Stockholm

“IoT ties to big data as the massive amount of rapidly moving and freely available data that potentially serves a valuable and unique need in the marketplace, but is extremely expensive and difficult to mine by traditional means”

By TechRepublic

Source: http://www.techrepublic.com/blog/big-data-analytics/big-data-defined/

Page 8: Ignacio Mulas Viela_Massively Distributed Analytics_Dataconomy_Stockholm

Data economics

Source (year 2011): http://radar.oreilly.com/2011/08/building-data-startups.html

James Hamilton on network costs at Amazon (minutes 4 to 15): https://www.youtube.com/watch?x-yt-ts=1422579428&v=JIQETrFC_SQ&feature=player_embedded&x-yt-cl=85114404

Page 9: Ignacio Mulas Viela_Massively Distributed Analytics_Dataconomy_Stockholm

Wait a moment…

Huge data volumes, time constraints and distributed data…

Are you crazy?

Page 10: Ignacio Mulas Viela_Massively Distributed Analytics_Dataconomy_Stockholm

Centralized architecture

RAW DATA

PROCESSED DATA

TOOLS

Page 11: Ignacio Mulas Viela_Massively Distributed Analytics_Dataconomy_Stockholm

Decentralized architecture

RAW DATA

PROCESSED DATA

TOOLS

Page 12: Ignacio Mulas Viela_Massively Distributed Analytics_Dataconomy_Stockholm

Distributed architecture

RAW & PROC. DATA

PROCESSED DATA

TOOLS

Page 13: Ignacio Mulas Viela_Massively Distributed Analytics_Dataconomy_Stockholm

Analytics principles

› Centralized:
  – “Move everything there, I have capacity to analyze everything!”
  – Stream and compute

› Decentralized:
  – “You can do this task but when you know about that, let me handle it!”
  – Stream, compute and stream

› Distributed:
  – “We live in harmony with no bosses”
  – Compute and stream

[Figure labels: Bandwidth, CPU density; scale from High to Low]
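To make the difference between "stream and compute" and "compute and stream" concrete, here is a minimal sketch that computes one global mean both ways and counts how many values cross the network. It is an illustration under stated assumptions, not the speaker's implementation: the site names, the readings, and the functions `centralized` and `distributed` are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical readings held at three edge sites.
SITES: Dict[str, List[float]] = {
    "site_a": [1.0, 2.0, 3.0],
    "site_b": [4.0, 5.0],
    "site_c": [6.0, 7.0, 8.0, 9.0],
}


def centralized(sites: Dict[str, List[float]]) -> Tuple[float, int]:
    """Stream and compute: ship every raw reading, then aggregate centrally.

    Returns the mean and the number of values sent over the network.
    """
    shipped = [x for readings in sites.values() for x in readings]
    return sum(shipped) / len(shipped), len(shipped)


def distributed(sites: Dict[str, List[float]]) -> Tuple[float, int]:
    """Compute and stream: each site ships only a (sum, count) pair.

    Returns the mean and the number of values sent over the network.
    """
    partials = [(sum(readings), len(readings)) for readings in sites.values()]
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count, 2 * len(partials)


if __name__ == "__main__":
    print("centralized:", centralized(SITES))
    print("distributed:", distributed(SITES))
```

Both functions return the same mean, but the distributed variant ships two values per site regardless of how many readings each site holds, so its network cost stays flat as the raw data grows. The decentralized pattern sits between the two: raw data is streamed to an intermediate node, partially aggregated there, and the partial result is streamed onward.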

Page 14: Ignacio Mulas Viela_Massively Distributed Analytics_Dataconomy_Stockholm

Conclusions

› Moving data around is expensive

› Define concrete requirements
  – “I am doing analytics” is not enough…

› One approach does not rule them all
  – Go with the simplest solution that matches your needs

Page 15: Ignacio Mulas Viela_Massively Distributed Analytics_Dataconomy_Stockholm

Also, feel free to contact me if you have suggestions, comments, or questions at [email protected]

Thank you! Questions?
