Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian Bulkowski, co-founder...


Upload: the-hive

Post on 14-Jul-2015


Category: Technology



TRANSCRIPT

Page 1: Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian Bulkowski, co-founder & CTO at Aerospike

© 2014 Aerospike. All rights reserved. Confidential 1

Advanced Visual Analytics and

Real-time Analytics at

Platform scale

Kunal Umrigar

Senior Architect at Pubmatic

In conversation with Brian Bulkowski

CTO and co-founder

Aerospike

Page 2

Who am I?

■ Starting: TRS-80, PC, Apple II, Vax 11/70, Wang

■ First product: lightpen university teaching kiosk

■ Networks: computers without people are boring

■ Silicon Valley internet boom

  ■ 10B market cap in 1999, employee 32

■ 2003-2007 “time off” (startups)

■ Citrusleaf / Aerospike history

  ■ 42-year-old first-time CEO (me)

■ 2008 Prototype

■ 2010 First sale, get the band back together

■ 2011+ 3 rounds of funding (Draper, ALP, NEA, CNTP)

■ 2014 Open Source

■ 70 employees, 2 offices


@bbulkow

Page 3

MILLIONS OF CONSUMERS

BILLIONS OF DEVICES

APP SERVERS

DATA WAREHOUSE

INSIGHTS

Advertising Technology Stack

WRITE CONTEXT

In-memory NoSQL

WRITE REAL-TIME CONTEXT

READ RECENT CONTENT

PROFILE STORE

Cookies, email, deviceID, IP address, location, segments, clicks, likes, tweets, search terms...

REAL-TIME ANALYTICS

Best sellers, top scores, trending tweets

BATCH ANALYTICS

Discover patterns, segment data: location patterns, audience affinity
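The profile store and real-time analytics boxes on this slide can be illustrated with a minimal sketch. It is not Aerospike's API: `ProfileStore`, `write_context`, `read_recent`, and `top_n` are hypothetical names chosen for illustration, and plain Python containers stand in for the in-memory NoSQL layer.

```python
from collections import Counter, defaultdict, deque


class ProfileStore:
    """Toy in-memory stand-in for a profile store (hypothetical, not a real client API)."""

    def __init__(self, max_recent=3):
        # user key (cookie, device ID, ...) -> bounded list of recent context items
        self._recent = defaultdict(lambda: deque(maxlen=max_recent))
        # running counts used for real-time analytics such as "trending" items
        self._counts = Counter()

    def write_context(self, user_key, item):
        """WRITE REAL-TIME CONTEXT: record one event and update the running counts."""
        self._recent[user_key].append(item)
        self._counts[item] += 1

    def read_recent(self, user_key):
        """READ RECENT CONTENT: return the user's most recent items, oldest first."""
        return list(self._recent.get(user_key, ()))

    def top_n(self, n):
        """REAL-TIME ANALYTICS: return the n most frequent items seen so far."""
        return self._counts.most_common(n)


store = ProfileStore(max_recent=2)
store.write_context("cookie-1", "shoes")
store.write_context("cookie-1", "hats")
store.write_context("cookie-1", "shoes")
store.write_context("cookie-2", "shoes")

recent = store.read_recent("cookie-1")   # only the last two items are kept
trending = store.top_n(1)                # most frequent item across all users
```

Batch analytics (segments, audience affinity) would run elsewhere over the full event history; this sketch covers only the low-latency read and write path.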

Page 4

Introduction to Advertising: Real-time Bidding

Page 5

North American RTB speeds & feeds

■ 1 to 6 billion cookies tracked

■ Some companies track 200M, some track 20B

■ Each bidder has their own data pool

■ Data is your weapon

■ Recent searches, behavior, IP addresses

■ Audience clusters (K-cluster, K-means) from offline Hadoop

■ “Remnant” from Google, Yahoo is about 0.6 million / sec

■ Facebook exchange: about 0.6 million / sec

■ “other” is 0.5 million / sec

Currently about 3.0M / sec in North America
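To put the headline rate in perspective, the back-of-the-envelope calculation below converts the slide's figure of about 3.0M requests per second into a daily volume. It assumes, purely for illustration, that the rate is sustained evenly over 24 hours; the slide does not say whether 3.0M / sec is a peak or an average.

```python
SECONDS_PER_DAY = 24 * 60 * 60          # 86,400

requests_per_sec = 3_000_000            # "about 3.0M / sec" from the slide
requests_per_day = requests_per_sec * SECONDS_PER_DAY

print(f"{requests_per_day:,} requests per day")
```

Under that assumption the total comes to roughly 259 billion bid requests per day.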

Page 6

Old Architecture (scale out in 2000)

Request routing and sharding

APP SERVERS

CACHE

DATABASE

STORAGE

CONTENT DELIVERY NETWORK

LOAD BALANCER
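The "request routing and sharding" layer in this older design, with a cache in front of the database, might look roughly like the sketch below. This is an illustration under assumptions, not the architecture's actual code: `shard_for` and `ShardedStore` are hypothetical names, dicts stand in for the cache and database tiers, and simple hash-modulo routing is one common choice among several.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a key to a shard index with a stable hash.

    Python's built-in hash() is salted per process for strings, so a
    cryptographic digest is used to get the same answer on every app server.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Application-side routing: every caller must know the shard layout."""

    def __init__(self, num_shards):
        self.shards = [dict() for _ in range(num_shards)]  # stand-ins for databases
        self.cache = {}                                    # stand-in for the cache tier
        self.db_reads = 0                                  # counts reads that miss the cache

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value
        self.cache.pop(key, None)  # invalidate so readers do not see a stale copy

    def get(self, key):
        if key in self.cache:                 # cache hit: database is not touched
            return self.cache[key]
        self.db_reads += 1                    # cache miss: go to the owning shard
        value = self.shards[shard_for(key, len(self.shards))].get(key)
        if value is not None:
            self.cache[key] = value           # populate the cache for later reads
        return value


store = ShardedStore(num_shards=4)
store.put("user:1", {"segment": "sports"})
first = store.get("user:1")    # miss, read from the shard
second = store.get("user:1")   # hit, served from the cache
```

Note that routing, cache invalidation, and shard layout all live in application code here, which is part of what makes this style of scale-out laborious to operate.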

Page 7

Modern Scale Out Architecture

Load balancer

Simple stateless

APP SERVERS

IN-MEMORY NoSQL

RESEARCH

WAREHOUSE

CONTENT DELIVERY NETWORK

LOAD BALANCER

Long term cold storage

Fast stateless

HDFS BASED
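A minimal sketch of the "simple stateless app servers" idea follows. It assumes all per-user state lives in a shared tier, so any server behind the load balancer can handle any request. `AppServer` and `handle` are hypothetical names; a dict stands in for the in-memory NoSQL tier and a list stands in for the append-only feed to the HDFS-based warehouse.

```python
class AppServer:
    """Stateless server: holds references to shared tiers, but no per-user data."""

    def __init__(self, name, store, warehouse_log):
        self.name = name
        self.store = store                  # shared in-memory NoSQL stand-in
        self.warehouse_log = warehouse_log  # shared long-term cold storage stand-in

    def handle(self, user_key, event):
        # All state is read from and written back to the shared store.
        profile = self.store.setdefault(user_key, [])
        profile.append(event)
        # Every event is also appended for later batch analysis.
        self.warehouse_log.append((user_key, event))
        return {"server": self.name, "events_seen": len(profile)}


shared_store = {}
shared_log = []
server_a = AppServer("a", shared_store, shared_log)
server_b = AppServer("b", shared_store, shared_log)

# The load balancer may send consecutive requests to different servers.
r1 = server_a.handle("cookie-1", "page_view")
r2 = server_b.handle("cookie-1", "click")
```

Because neither server keeps anything locally, the second request sees the first request's write even though a different server handled it, and servers can be added or removed without migrating state.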