"real time streaming of big data", shimon tolts, general manager, data solutions at...

Shimon Tolts General Manager, Data Solutions ironSource Atom Data Flow Management

Upload: dataconomy-media

Post on 16-Apr-2017


Category:

Data & Analytics



TRANSCRIPT

Page 1: "Real time streaming of BIG data", Shimon Tolts, General Manager, Data Solutions at IronSource

Shimon Tolts General Manager, Data Solutions

ironSource Atom

Data Flow Management

Page 2: About ironSource: ironSource in Numbers

● 700 Employees
● Established: Sep. 2010
● 50% R&D Employees
● 700 Advertisers
● 80K Partnered Apps
● 100M Devices using ironSource solutions shipping in 2016

Offices:

● Tel Aviv, Israel
● San Francisco, United States
● New York, United States
● London, United Kingdom
● Bangalore, India
● Hong Kong, China
● Kiev, Ukraine
● Beijing, China
● Shanghai, China

Page 3: About ironSource: ironSource Hypergrowth

● 800M People Reached Each Month
● 4200 Apps Installed Every Minute with the ironSource Platform
● 160B Registered & Analyzed Data Events Every Month

[Chart: monthly growth, Jun 2015 to May 2016; vertical axis from 0 to 200B]

Page 4: Our Business Challenge

We needed a way to manage this data:

Collect, Process, Store

Page 5
Page 6: Collection

● Multi-region layer - latency-based routing
● Low latency from client to Atom servers
● High availability - AWS regions do fail!
● Storing raw data + headers upon receiving
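The routing idea on this slide can be sketched roughly as follows. This is only an illustration, not the Atom implementation: the function name `pick_region`, the region names, and the latency numbers are all assumptions. It picks the healthy region with the lowest measured latency and falls back to the next best when a region is down.

```python
from typing import Mapping, Optional


def pick_region(latency_ms: Mapping[str, float],
                healthy: Mapping[str, bool]) -> Optional[str]:
    """Return the healthy region with the lowest latency, or None.

    Both mappings are hypothetical inputs: in practice they would come
    from latency measurements and health checks.
    """
    # Regions with no health record are treated as unhealthy.
    candidates = [r for r in latency_ms if healthy.get(r, False)]
    if not candidates:
        return None
    return min(candidates, key=lambda r: latency_ms[r])


if __name__ == "__main__":
    latencies = {"us-east-1": 120.0, "eu-west-1": 35.0}
    # Normal case: the closest region wins.
    print(pick_region(latencies, {"us-east-1": True, "eu-west-1": True}))
    # Failover case: the closest region is down, traffic goes elsewhere.
    print(pick_region(latencies, {"us-east-1": True, "eu-west-1": False}))
```

In a real deployment this decision is usually made by the DNS or load-balancing layer rather than by application code.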

Page 7: Data Enrichment

● Enrich data before storing in your Data Lake and/or Warehouse
  ○ IP to Country
  ○ Currency conversion
  ○ Decrypt data
  ○ User Agent parsing - OS, Browser, Device...
● Any custom logic you would like! - fully extensible
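One way to picture an extensible enrichment step is a chain of small functions applied to each event before it is stored. The sketch below is an assumption for illustration only: the lookup tables, field names, and the `enrich` helper are hypothetical, and a production system would use a real GeoIP database, live exchange rates, and a proper user-agent parser.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]
Enricher = Callable[[Event], Event]

# Hypothetical lookup tables, stand-ins for real GeoIP and FX data.
IP_PREFIX_TO_COUNTRY = {"81.2.": "GB", "8.8.": "US"}
USD_RATES = {"USD": 1.0, "EUR": 1.1}


def ip_to_country(event: Event) -> Event:
    """Add a 'country' field based on a toy IP-prefix table."""
    ip = event.get("ip", "")
    country = next((c for prefix, c in IP_PREFIX_TO_COUNTRY.items()
                    if ip.startswith(prefix)), None)
    return {**event, "country": country}


def to_usd(event: Event) -> Event:
    """Add 'amount_usd' when both amount and a known currency exist."""
    rate = USD_RATES.get(event.get("currency"))
    if rate is None or "amount" not in event:
        return event
    return {**event, "amount_usd": round(event["amount"] * rate, 2)}


def parse_user_agent(event: Event) -> Event:
    """Very rough OS detection from the User-Agent string."""
    ua = event.get("user_agent", "")
    if "Android" in ua:
        os_name = "Android"
    elif "iPhone" in ua:
        os_name = "iOS"
    else:
        os_name = "Other"
    return {**event, "os": os_name}


def enrich(event: Event, enrichers: List[Enricher]) -> Event:
    """Apply each enricher in order; custom logic is just another function."""
    for enricher in enrichers:
        event = enricher(event)
    return event
```

Adding custom logic then means appending another function to the list passed to `enrich`.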

Page 8: Data Targets

● Near real-time data insertion - 1 minute!
● Stream data to Google Storage and/or AWS S3
● Smart insertion of data into AWS Redshift
  ○ Set the number of parallel copies
  ○ Configure priority on tables
● BigQuery - Streaming data using batch files import (saves 20% cost)
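The Redshift bullets (a cap on parallel copies plus per-table priority) could be modeled along these lines. This is a minimal sketch under stated assumptions, not how Atom schedules its loads: `plan_copy_waves`, the table names, and the priority values are hypothetical, and the actual COPY execution is left out.

```python
from typing import Dict, List


def plan_copy_waves(pending_tables: List[str],
                    priority: Dict[str, int],
                    max_parallel: int) -> List[List[str]]:
    """Group pending table loads into waves of at most max_parallel.

    Higher priority values load first; tables without a configured
    priority default to 0. Ties are broken by table name so the plan
    is deterministic.
    """
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    ordered = sorted(pending_tables,
                     key=lambda t: (-priority.get(t, 0), t))
    return [ordered[i:i + max_parallel]
            for i in range(0, len(ordered), max_parallel)]


if __name__ == "__main__":
    waves = plan_copy_waves(
        ["clicks", "installs", "impressions"],
        {"installs": 10, "clicks": 5},
        max_parallel=2,
    )
    for n, wave in enumerate(waves, start=1):
        print(f"wave {n}: {wave}")
```

Each wave would then be issued as concurrent load jobs, with the next wave starting once the previous one finishes.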

Page 9
Page 10: Micro-Services Architecture

● Everything is a service
● Decoupling
● Distributed systems
● Separate lifecycle
● Communication using RESTful / Queue / Streams
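Decoupling through a queue can be illustrated with two tiny "services" that only share a queue and never call each other directly. This is a toy, in-process sketch: the names `collector`, `processor`, and `run_pipeline` are assumptions, and real services would run as separate processes communicating over a message broker or stream.

```python
import queue
import threading
from typing import Any, Dict, List

Event = Dict[str, Any]


def collector(out_q: "queue.Queue", events: List[Event]) -> None:
    """Producer service: pushes events, then a None sentinel to stop."""
    for event in events:
        out_q.put(event)
    out_q.put(None)


def processor(in_q: "queue.Queue", results: List[Event]) -> None:
    """Consumer service: reads until the sentinel arrives."""
    while True:
        event = in_q.get()
        if event is None:
            break
        results.append({**event, "processed": True})


def run_pipeline(events: List[Event]) -> List[Event]:
    """Run both services concurrently, connected only by the queue."""
    q: "queue.Queue" = queue.Queue(maxsize=100)
    results: List[Event] = []
    producer = threading.Thread(target=collector, args=(q, events))
    consumer = threading.Thread(target=processor, args=(q, results))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results
```

Because the two sides agree only on the message format, either one can be redeployed or scaled on its own lifecycle.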

Page 11: Docker

● Linux Container
● Save provisioning time
● Infrastructure as code
● Dev-Test-Production - identical container
● Ship easily

Page 12: Cloud infrastructure

● Pay as you go - (grow)
● SaaS services
● Auto-scaling-groups
● DynamoDB
● RDS *SQL
● Redshift data warehouse

Page 13: Continuous Integration

● From commit to production
● Jenkins commit hook
● Git branching model
● AWS dynamic slaves
● Unit tests
● Docker builds
● Updating live environment

Page 14: Diagram

Page 15: Partners

Page 16: Everybody needs a data pipeline

The AWS platform allowed us to build on top of it an infrastructure that is exactly tailored to our clients’ needs.

● Maximum Flexibility: Any data, from any source, in any format.
● Infinite Scalability: Adapt to your evolving needs with a pay-as-you-go model.
● Own Your Data: We manage the flow, the data is yours.

Page 17: The User Journey

David Fitcher, from London, United Kingdom

Touchpoint #1: Customizing a new device

David bought a new LG device. Installed 12 apps overall, 6 were games.

● Gender: Male (76% probability)
● Age: 25-35 (95% probability)
● User Profile: Casual Gamer (81% probability)

Touchpoint #2: Using a mobile app

David is now playing a game which uses our SDK.

● What we already know: Probably a ‘Casual Gamer’
● What we offer: Rewarded video ads for casual gaming apps

One month later… What we know now:

● User Profile: Casual Gamer (95% probability)
● Subcategory Interest: Simulation Games
● LTV Projection: High

Page 18

10 Million Free Monthly Events

Thank you!

[email protected] @shimontolts