Operational aspects of big data

Post on 25-Dec-2014

Category:

Technology

DESCRIPTION

The presentation discusses what big data is and best practices for running big data operations. The speaker is the CTO of myThings. It was presented on June 10, 2014 at the conference "Best practices for SaaS Operation", sponsored by MoovingON (www.moovingon.com).

TRANSCRIPT

Operational aspects of Big Data

• Yoav Chernobroda - CTO

All copyrights reserved to myThings LTD

A few facts about myThings

Online ad retargeting for desktop, mobile web and mobile apps

First on Fast 50 companies

150 employees

R&D in Ramat Hachayal + 15 regional offices

Big data at scale

20 TB / day

300M uniques / month
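
To put these two figures in perspective, here is a rough back-of-envelope calculation. It is only a sketch: decimal terabytes, a 30-day month and evenly spread traffic are assumptions made for illustration, not numbers from the talk.

```python
# Back-of-envelope rates implied by the slide's figures.
# Assumptions (not from the talk): decimal terabytes, a 30-day month,
# and traffic spread evenly over the day.
TB = 10**12
SECONDS_PER_DAY = 24 * 60 * 60

daily_bytes = 20 * TB
monthly_uniques = 300_000_000

# Average sustained ingest rate over a day.
avg_bytes_per_second = daily_bytes / SECONDS_PER_DAY

# Average data volume attributable to one unique user per month.
monthly_bytes = daily_bytes * 30
bytes_per_unique_per_month = monthly_bytes / monthly_uniques

print(f"average ingest: {avg_bytes_per_second / 10**6:.0f} MB/s")
print(f"per unique per month: {bytes_per_unique_per_month / 10**6:.1f} MB")
```

Real traffic is bursty, so peak rates would be higher than this daily average.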

Personalized retargeting

• User visits an e-commerce site but quits without converting

• She is tagged with myThings’ smart tag: she browses the site but leaves without completing a purchase

• When she later visits any desktop or mobile site on the ad network, she is targeted with an ad

• myThings creates, in real time, a personalized ad, custom-made based on consumer intent data, with product info and image

• A personalized ad is presented

• When the user clicks, she is taken back to the product page to complete the purchase
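
The flow on this slide can be sketched in a few lines of Python. This is a toy illustration, not myThings' implementation: `ConsumerStore`, `CATALOG`, `build_ad` and the example URL are all hypothetical names, and an in-memory dict stands in for a real consumer database.

```python
from dataclasses import dataclass, field


@dataclass
class ConsumerStore:
    """In-memory stand-in for a consumer database (hypothetical)."""
    viewed: dict = field(default_factory=dict)  # user_id -> list of product ids

    def record_view(self, user_id, product_id):
        # Called when the tag fires on a product page.
        self.viewed.setdefault(user_id, []).append(product_id)

    def last_viewed(self, user_id):
        products = self.viewed.get(user_id)
        return products[-1] if products else None


CATALOG = {  # hypothetical product feed
    "sku-42": {"title": "Running shoes", "image": "shoes.jpg",
               "url": "https://shop.example/p/sku-42"},
}


def build_ad(store, user_id):
    """Return a personalized ad for a known user, or None for an unknown one."""
    product_id = store.last_viewed(user_id)
    if product_id is None or product_id not in CATALOG:
        return None
    product = CATALOG[product_id]
    return {"title": product["title"], "image": product["image"],
            "click_url": product["url"]}


store = ConsumerStore()
store.record_view("user-1", "sku-42")  # she browses, then leaves
ad = build_ad(store, "user-1")         # later, on another site
print(ad["click_url"])                 # the click leads back to the product page
print(build_ad(store, "user-2"))       # unknown user: no personalized ad
```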

RTB retargeting

[Diagram. Labels: e-commerce site, content site, Visits, Reads, RTB Exchange, Google ad exchange, myThings platform, Tag Service, RTB service, Media Service, Consumer DB]

The big data challenge

The (sad) truth

Big data is not about large data volumes

Classic definition

The 3 V’s

• Volume (tera / peta / zetta / … bytes)

• Variety

– The relational model does not hold

• Velocity

– Traditional relational DBs are not scalable enough

• Technology is built around linear scalability

• Examples:

– Predictive analytics

– Recommendation engines

– Customer retention, churn analysis

– Social graph analysis

– Fraud detection

My definition

Big data

• Operational view

• Business intelligence view

• Predictive modeling view

• Real time decisions

The big data challenge

• Business value

– Do we solve the right problem?

– How does it help our business?

• Data quality

– Do we have the right data?

• Organization roles

– Collaboration

• Culture

– Process oriented vs. iterative exploratory

– Organizational fit

• Operational and infrastructure

– Will get to it in a moment …

myThings big data architecture

Operational challenges

• Cost effective architecture

• Real time vs. near RT vs. offline processing

• Linear scalability

• Data routing infrastructure

• Data retention and backup

• Open source components

– Hadoop, Kafka, Storm, Cassandra, …

• Cost monitoring

• Skillful devops – the human factor

Recommended reading

Nathan Marz

Originator of Storm and Cascalog

The lambda architecture
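
The slide only names the lambda architecture. As it is commonly described, a batch layer periodically recomputes views from an immutable master dataset, a speed layer covers the recent events the last batch run has not yet seen, and queries merge the two. The sketch below illustrates that idea with per-user event counts. The event data, function names and cutoff are assumptions made for illustration, and the speed layer is simplified to a filter rather than a true incremental stream processor.

```python
from collections import Counter

# Master dataset: immutable, append-only events as (timestamp, user_id).
events = [(1, "a"), (2, "b"), (3, "a"), (4, "a"), (5, "b")]


def batch_view(master, cutoff):
    """Batch layer: recompute the view from scratch over events up to cutoff."""
    return Counter(user for ts, user in master if ts <= cutoff)


def speed_view(master, cutoff):
    """Speed layer: cover only the events the last batch run has not seen.

    A real speed layer would update incrementally as events arrive.
    """
    return Counter(user for ts, user in master if ts > cutoff)


def query(user, batch, speed):
    """Serving side: merge the batch and real-time views at read time."""
    return batch[user] + speed[user]


cutoff = 3  # the last batch run processed events up to this timestamp
batch = batch_view(events, cutoff)
speed = speed_view(events, cutoff)
print(query("a", batch, speed))  # 3
print(query("b", batch, speed))  # 2
```

When the next batch run finishes, the cutoff advances and the speed view for the covered period can be discarded, which keeps the real-time part small.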
