INFRASTRUCTURE FOR DECISION MAKERS Questions for a better architecture

Upload: eric-lubow

Post on 04-Aug-2015

TRANSCRIPT

Page 1: Infrastructure for Decision Makers

INFRASTRUCTURE FOR DECISION MAKERS

Questions for a better architecture

Page 2: Infrastructure for Decision Makers

Eric Lubow @elubow #ddsea15

PERSONAL VANITY

๏ CTO of SimpleReach

๏ Co-Author of Practical Cassandra

๏ Skydiver, Mixed Martial Artist, Motorcyclist, Dog Dad (IG: @charliedognyc), NY Giants fan

Page 3: Infrastructure for Decision Makers

SIMPLEREACH

๏ Identify the best content

๏ Use engagement metrics

๏ Stream processing ingest

๏ Many metrics, time sliced

๏ Multiple data stores

Page 4: Infrastructure for Decision Makers

What do you mean infrastructure?

Page 5: Infrastructure for Decision Makers

WHO IS MAKING THESE DECISIONS?

๏ Architects

๏ CTOs

๏ Lead Developers

๏ Developers

๏ Basically everyone

Page 6: Infrastructure for Decision Makers

YOU WOULDN’T BUILD SOFTWARE WITHOUT PLANNING FIRST, SO WHY WOULD YOU BUILD AN ARCHITECTURE WITHOUT PLANNING?

Page 7: Infrastructure for Decision Makers

REALITY OF THE SITUATION

๏ Architectures get built ad hoc

๏ Pieces tend to be built as needed and not always thought out

๏ Many lead developers don’t have a lot of architecture experience

๏ We don’t live in a perfect world and are usually time bound

๏ Product needs to be built and we’ll figure out the rest later (technical debt)

Page 8: Infrastructure for Decision Makers

What are we actually going to talk about today?

Page 9: Infrastructure for Decision Makers

FRAMEWORK FOR BUILDING

๏ Hardware

๏ Cloud

๏ Databases

๏ Message Systems

๏ Scale/Scaling

๏ Costs

๏ Compliance

๏ Development ease

๏ Authentication

๏ Developer / Operational Capabilities

๏ Available Support

๏ Monitoring / Instrumentation

๏ Testing / Staging / QA

๏ Repeatability of Systems

๏ Safety nets

๏ Pressure valves

๏ Administration ease

๏ Authorization

Page 10: Infrastructure for Decision Makers

WHY SHOULDN’T I LEAVE RIGHT NOW?

Page 11: Infrastructure for Decision Makers

REASONS TO LISTEN

๏ Unsexy talks can have good information

๏ Understanding these concepts can save lots of technical debt

๏ There are lessons learned from not knowing which questions to ask

๏ I’m kind of entertaining

๏ In case I’m not entertaining, I’ll use some entertaining pictures

๏ I’m going to tell you a story

Page 12: Infrastructure for Decision Makers

HOW DID SIMPLEREACH GET FROM …

Page 13: Infrastructure for Decision Makers

TO …

[Diagram: Platform. Labels: Router/Load Balancer/Config/Authentication; Business/Application/Translation/Data Access; eight SERVICE boxes; Redshift]

Page 14: Infrastructure for Decision Makers

WHY ARE FRAMEWORKS IMPORTANT?

๏ Allows people to use a common language when discussing or solving problems

๏ Allows a common toolset for solving problems

๏ Simplifies difficult tasks

๏ Every language has frameworks: Ruby/Rails, Python/Django, JavaScript/Ember.js

๏ Attempts to answer the questions:

  ๏ How should I do this?

  ๏ Is this a good idea?

  ๏ Is this the right tool?

Page 15: Infrastructure for Decision Makers

Page 16: Infrastructure for Decision Makers

BASIC QUESTIONS

๏ Where is this going to live?

๏ How do I get data in?

๏ How am I going to store the data?

๏ How do I move data around?

๏ How should data look coming out?

๏ How do I get data out?

๏ How do I know if something is wrong?

๏ How do I maintain/scale/build?

Page 17: Infrastructure for Decision Makers

WHERE IS THIS GOING TO LIVE?

๏ Is this going on the cloud? Amazon, Google, Azure, Rackspace?

๏ Do you need to be in a data center?

๏ Are APIs important?

๏ What kind of distribution of services / fault tolerance needs to be available?

๏ What kind of SLAs do you need to meet (100% uptime)?

Page 18: Infrastructure for Decision Makers

HOW DO I GET DATA IN?

๏ Build apps that follow the same paradigm

๏ POST data to an end point

๏ Consume off a queue

๏ Use message systems for queueing

๏ Message aggregation for efficiency

๏ Message sampling for throttles

๏ Try to avoid talking directly to a database from client facing applications

๏ Write your own client driver to talk to your architecture
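
The aggregation and sampling bullets above can be sketched in a few lines. This is only an illustration under stated assumptions: `BatchAggregator`, `sampled`, the batch size, and the sample rate are hypothetical names and values, not anything taken from the SimpleReach platform.

```python
import random
from typing import Callable, Iterable, Iterator, List


class BatchAggregator:
    """Buffers messages and hands them downstream in batches (hypothetical helper)."""

    def __init__(self, flush: Callable[[List[dict]], None], batch_size: int = 100):
        self.flush = flush
        self.batch_size = batch_size
        self.buffer: List[dict] = []

    def add(self, message: dict) -> None:
        self.buffer.append(message)
        if len(self.buffer) >= self.batch_size:
            self.drain()

    def drain(self) -> None:
        """Send whatever is buffered, e.g. on shutdown or from a timer."""
        if self.buffer:
            self.flush(self.buffer)
            self.buffer = []  # start a fresh list; the flushed batch is left intact


def sampled(messages: Iterable, rate: float, rng: random.Random) -> Iterator:
    """Yield roughly `rate` of the messages (0.0 drops all, 1.0 keeps all)."""
    for message in messages:
        if rng.random() < rate:
            yield message


# Usage: one downstream write per three messages instead of one per message.
batches: List[List[dict]] = []
aggregator = BatchAggregator(batches.append, batch_size=3)
for i in range(7):
    aggregator.add({"event": i})
aggregator.drain()
```

Batching trades a little latency for fewer downstream writes, while sampling deliberately drops data, so it suits throttling a non-critical stream rather than anything that must be counted exactly.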

Page 19: Infrastructure for Decision Makers

HOW AM I GOING TO STORE THE DATA?

Page 20: Infrastructure for Decision Makers

CHOOSING A DATABASE IS EASY, #AMIRITE

๏ What’s the latest cool technology?

๏ What is my data volume?

๏ What are my query patterns?

๏ Is my data (un)structured?

๏ Will data remain consistent?

๏ Am I read heavy or write heavy?

๏ Am I batch loading data?

๏ Is eventually consistent data ok?

๏ Can I have a DR plan?

๏ Legal/compliance requirements?

๏ Are there experts/enterprise support?

๏ What’s the community like?

๏ Easy to administer?

๏ Tooling, monitoring, language support?

๏ Cloud or iron?

๏ High volume ingestion or batch loading?

๏ Fault tolerance?

๏ Open source vs enterprise system?

๏ Employee learning curve vs. learning cost?

Page 21: Infrastructure for Decision Makers

HOW DO I MOVE DATA AROUND?

ROAD METAPHOR:

๏ Messages = Cars

๏ Message System = Highway / Roads

๏ Database = Parking Lot

๏ Cache = Cell Phone Lot

๏ Commerce/Industry = Worker/Consumer/Analyzer

๏ Enrichment = Gas Station

Page 22: Infrastructure for Decision Makers

MESSAGE SYSTEMS ARE MY FAV

๏ Only recently starting to become part of important discussions

๏ Provide consistent interfaces between disparate systems

๏ Clients can have minimal architecture knowledge

๏ Everyone can speak the same language (JSON, please not XML)

๏ Allow for high availability

๏ Help minimize the cost of downtime

๏ Control data flow patterns

๏ Make [horizontal] scaling easier

๏ Enrichment/in-stream modifications of data

๏ Instrument and monitor data states between systems

Page 23: Infrastructure for Decision Makers

NSQ

nsq.io

๏ Distributed and de-centralized topology

๏ At least once delivery guaranteed

๏ Multi-cast style message routing

๏ Simple to configure and deploy

๏ Allows for zero-downtime maintenance windows

๏ Ephemeral channels for testing data

๏ Channel sampling
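
At-least-once delivery means a message can arrive more than once, so consumers generally need to be idempotent. The sketch below shows one way to do that by remembering message ids. It does not use any NSQ client API; `IdempotentHandler` and the `id` and `value` fields are assumptions made purely for illustration.

```python
from typing import Callable, Set


class IdempotentHandler:
    """Wraps a handler so that redelivered messages are processed only once."""

    def __init__(self, handle: Callable[[dict], None]):
        self.handle = handle
        # Assumption: an in-memory set is enough for the sketch. A real
        # consumer would likely use a bounded or expiring store instead.
        self.seen: Set[str] = set()

    def on_message(self, message: dict) -> bool:
        """Return True if the message was processed, False if it was a duplicate."""
        message_id = message["id"]  # hypothetical unique id carried by each message
        if message_id in self.seen:
            return False
        self.handle(message)
        # Record the id only after the handler succeeds, so a failed attempt
        # can still be retried when the message is redelivered.
        self.seen.add(message_id)
        return True


# Usage: the second delivery of message "a" is ignored.
totals = []
handler = IdempotentHandler(lambda m: totals.append(m["value"]))
first = handler.on_message({"id": "a", "value": 1})
second = handler.on_message({"id": "a", "value": 1})
```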

Page 24: Infrastructure for Decision Makers

HOW SHOULD DATA LOOK COMING OUT?

๏ Agree on a data format?

๏ XML, JSON, AVROJSON

๏ Again, please don’t use XML

๏ HATEOAS - heavy lift but decent client support

๏ What metadata should be sent with the response?

๏ How can unnecessary calls to an API be mitigated?
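
One possible answer to the last two questions is a JSON envelope that carries paging metadata, plus an ETag so a client holding a current copy gets a 304 instead of the full body. This is a sketch only: the envelope fields (`data`, `meta`, `has_more`) and the function names are hypothetical, and only the ETag / If-None-Match / 304 mechanism is standard HTTP.

```python
import hashlib
import json
from typing import List, Optional, Tuple


def build_response(items: List[dict], page: int, per_page: int, total: int) -> Tuple[str, str]:
    """Serialize items with paging metadata and derive an ETag from the payload."""
    body = {
        "data": items,
        "meta": {
            "page": page,
            "per_page": per_page,
            "total": total,
            # Tells the client whether asking for the next page is worthwhile.
            "has_more": page * per_page < total,
        },
    }
    payload = json.dumps(body, sort_keys=True)
    etag = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return payload, etag


def respond(items: List[dict], page: int, per_page: int, total: int,
            if_none_match: Optional[str] = None) -> Tuple[int, str, str]:
    """Return (status, body, etag); the body is empty when the client copy is current."""
    payload, etag = build_response(items, page, per_page, total)
    if if_none_match == etag:
        return 304, "", etag
    return 200, payload, etag


# Usage: the first call returns the data, the repeat call with the ETag does not.
status, payload, etag = respond([{"url": "a"}], page=1, per_page=1, total=3)
repeat_status, repeat_body, _ = respond([{"url": "a"}], 1, 1, 3, if_none_match=etag)
```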

Page 25: Infrastructure for Decision Makers

HOW DO I GET DATA OUT?

๏ Monolithic service architecture

๏ REST interface through a single URL to ask for data?

๏ Many micro-service end points?

๏ HTTP / RPC / THRIFT

๏ JSON API / HATEOAS / CUSTOM

๏ How many libraries need to be written, tested and maintained?

Page 26: Infrastructure for Decision Makers

And now back to our story…

Page 27: Infrastructure for Decision Makers

SIMPLEREACH CONTEXT

๏ 100 million URLs

๏ 300 million Tweets

๏ 50k - 100k events per second (tens of billions of events per day)

๏ 200G new per hour

๏ 700T of total data (10T per month)

๏ 10T of hot data

๏ 2-3T of daily log data

๏ Excludes all monitoring data

Page 28: Infrastructure for Decision Makers

[Diagram labels: Solr, Vertica + Cassandra, Vertica, Mongo]

Page 29: Infrastructure for Decision Makers

STREAM-BASED DATA COLLECTION

[Diagram labels: Internet, Edge, Internal API, Solr, C*, Mongo, Redis, Vertica, API, Fire Hose, App, Consumers, Queue]

Page 30: Infrastructure for Decision Makers

NEED FOR SPEED

๏ Concurrency

๏ Compiled code is much faster

๏ Statically typed languages make for fewer unexpected error situations

๏ Still speaks every other interchange language

๏ Cleaner code

Page 31: Infrastructure for Decision Makers

MICROSERVICES: THE NEW HOTNESS!

Page 32: Infrastructure for Decision Makers

MICROSERVICES: THE NEW HOTNESS?

Pros

๏ Fine grained, clearly scoped services

๏ Break 1 thing != break #allthethings

๏ Better fault isolation

๏ Easier to create throttles/release valves

๏ Better able to monitor more granularly

๏ Made everyone more devopsy

Cons

๏ Strict micro-service setups have large database overheads

๏ Testing/deployments are more complex

๏ More general overhead

๏ Slows down developers

๏ Service discovery

Page 33: Infrastructure for Decision Makers

HYBRID MICRO-SERVICE / SHARED LIBRARY

[Diagram: Platform. Labels: Router/Load Balancer/Config/Authentication; Business/Application/Translation/Data Access; eight SERVICE boxes; Redshift]

Page 34: Infrastructure for Decision Makers

GENERIC SERVICE AND DATA FLOW

[Diagram labels: router, auth; four service stacks, each with Application Layer, Business Logic, Data Access Layer, and NSQ; Redshift; logstash]

Page 35: Infrastructure for Decision Makers

SMART ROUTER

๏ Handles service state and service registry/discovery information

๏ Canonical reference for all things platform

๏ Prevents older versions of services from re-appearing

๏ Highly available proxy application

๏ Has burst-able capacity to mitigate DoS

๏ Auto-scaling tier
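
The "prevents older versions of services from re-appearing" point can be illustrated with a registry that refuses registrations older than what it already knows. This is a minimal sketch, not the actual SimpleReach router: `ServiceRegistry`, the tuple-based version scheme, and the addresses are all assumptions.

```python
from typing import Dict, Optional, Tuple

Version = Tuple[int, int, int]


class ServiceRegistry:
    """Keeps the newest known version and address for each service name."""

    def __init__(self) -> None:
        self._services: Dict[str, Tuple[Version, str]] = {}

    def register(self, name: str, version: Version, address: str) -> bool:
        """Accept the registration unless it is older than the current entry."""
        current = self._services.get(name)
        if current is not None and version < current[0]:
            return False  # a stale instance tried to come back
        self._services[name] = (version, address)
        return True

    def lookup(self, name: str) -> Optional[str]:
        """Return the address to route to, or None for an unknown service."""
        entry = self._services.get(name)
        return entry[1] if entry else None


# Usage: the 1.1.0 instance is rejected because 1.2.0 is already registered.
registry = ServiceRegistry()
accepted_new = registry.register("dal", (1, 2, 0), "10.0.0.1:8000")
accepted_old = registry.register("dal", (1, 1, 0), "10.0.0.2:8000")
```

Comparing versions as integer tuples avoids the string-ordering trap where "1.10.0" sorts before "1.9.0".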

Page 36: Infrastructure for Decision Makers

BUSINESS LOGIC LAYER

๏ Contains thicker macro services

๏ Aggregates common features and functionality

๏ Permissioning/throttling/access restrictions

๏ Centrally handling trigger events

๏ Exposing various API end points

๏ Orchestrating calls to the DAL
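
For the throttling bullet, a token bucket is one common technique. The slides do not say which algorithm SimpleReach used, so treat this as a stand-in sketch; `TokenBucket` and its parameters are hypothetical, and the clock is injectable only to make the behaviour easy to check.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity` and a sustained `rate` of requests per second."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start full so an initial burst is allowed
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then spend `cost` tokens if available."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Usage with a fake clock: two requests pass, the third is throttled,
# and one more passes after a second has elapsed.
fake_now = [0.0]
bucket = TokenBucket(rate=1.0, capacity=2, clock=lambda: fake_now[0])
burst = [bucket.allow(), bucket.allow(), bucket.allow()]
fake_now[0] = 1.0
after_wait = [bucket.allow(), bucket.allow()]
```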

Page 37: Infrastructure for Decision Makers

DATA ACCESS LAYER

๏ Responsible for CRUD

๏ Houses many of the data models

๏ Responsible for balancing throughput of data in/out of databases

๏ Minimize the number of DB connections by using pooling
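
Connection pooling, as mentioned in the last bullet, can be sketched with a fixed-size queue of open connections. The example uses SQLite from the standard library purely so it runs anywhere. `ConnectionPool` is a hypothetical helper, and a real data access layer would more likely rely on the pooling built into its database driver.

```python
import queue
import sqlite3
from contextlib import contextmanager
from typing import Callable, Iterator, Optional


class ConnectionPool:
    """Opens `size` connections up front and lends them out one at a time."""

    def __init__(self, connect: Callable[[], sqlite3.Connection], size: int = 4):
        self._pool: "queue.Queue[sqlite3.Connection]" = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(connect())

    @contextmanager
    def connection(self, timeout: Optional[float] = None) -> Iterator[sqlite3.Connection]:
        # Blocks while every connection is in use; raises queue.Empty on timeout.
        conn = self._pool.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._pool.put(conn)  # always hand the connection back

    def available(self) -> int:
        """Number of idle connections (approximate under concurrency)."""
        return self._pool.qsize()


# Usage: at most two connections ever exist, however many callers there are.
pool = ConnectionPool(lambda: sqlite3.connect(":memory:"), size=2)
with pool.connection() as conn:
    row = conn.execute("SELECT 1").fetchone()
    idle_while_borrowed = pool.available()
idle_after_return = pool.available()
```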

Page 38: Infrastructure for Decision Makers

HYBRID MICRO-SERVICE / SHARED LIBRARY

[Diagram labels: WebApp 1, WebApp 2, Python App, Go App; Proxy/Router; Platform; three Ingestion Streams; Redshift]

Page 39: Infrastructure for Decision Makers

SMILEY HAPPY PEOPLE

Page 40: Infrastructure for Decision Makers

HOW DO I KNOW IF SOMETHING IS WRONG?

๏ Testing

๏ Monitoring

๏ Instrumentation

๏ No pull requests w/o instrumentation

๏ No pull requests w/o monitoring

๏ Build dashboards

Page 41: Infrastructure for Decision Makers

DASHBOARD #ALLTHETHINGS

Page 42: Infrastructure for Decision Makers

WHAT SHOULD I MONITOR/INSTRUMENT?

๏ Frequency

๏ Error rates

๏ Success rates

๏ Request Volume

๏ Message Counts
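
The metrics listed above can be collected with very little code. Below is a sketch of a decorator that records request volume, success and error counts, and call durations in process memory. The names `instrumented`, `metrics`, `timings`, and `error_rate` are hypothetical, and a real system would ship these numbers to a metrics backend rather than keep them in a dictionary.

```python
import functools
import time
from collections import Counter
from typing import Dict, List

metrics: Counter = Counter()           # e.g. "fetch.requests", "fetch.errors"
timings: Dict[str, List[float]] = {}   # per-call durations in seconds


def instrumented(name: str):
    """Count requests, successes, and errors for the wrapped function and time each call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            metrics[f"{name}.requests"] += 1
            start = time.perf_counter()
            try:
                result = func(*args, **kwargs)
            except Exception:
                metrics[f"{name}.errors"] += 1
                raise  # instrumentation must not swallow the failure
            else:
                metrics[f"{name}.successes"] += 1
                return result
            finally:
                timings.setdefault(name, []).append(time.perf_counter() - start)
        return wrapper
    return decorator


def error_rate(name: str) -> float:
    """Errors divided by requests, or 0.0 before any request has been seen."""
    requests = metrics[f"{name}.requests"]
    return metrics[f"{name}.errors"] / requests if requests else 0.0


@instrumented("fetch")
def fetch(ok: bool) -> str:
    if not ok:
        raise ValueError("upstream failed")
    return "ok"


# Usage: three successful calls and one failure.
for flag in (True, True, False, True):
    try:
        fetch(flag)
    except ValueError:
        pass
```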

Page 43: Infrastructure for Decision Makers

HOW DO I MAINTAIN/SCALE/BUILD?

๏ Already discussed monitoring/instrumentation

๏ Making sure you can maintain architecture is the same as ensuring you can maintain code

๏ Have easy to use, flexible deployment systems

๏ Keep an audit trail

๏ Make processes repeatable and systematic

๏ Configuration management

๏ Automation (event based when possible)

๏ Easy enough to add and maintain but difficult to break

Page 44: Infrastructure for Decision Makers

If you want to increase innovation, you need to lower the cost of failure.

Joi Ito, MIT Media Lab

Page 45: Infrastructure for Decision Makers

WHAT JUST HAPPENED

๏ A little architecture knowledge is a good thing

๏ Don’t start out with complexity

๏ Build what you need with growth in mind

๏ Make sure you have the basics covered

๏ Might be something to the micro-service hype

๏ Monitor everything

๏ Allow customizations and innovations

Page 46: Infrastructure for Decision Makers

QUESTIONS IN LIFE ARE GUARANTEED, ANSWERS AREN’T.

Eric Lubow

@elubow

Data Day Seattle

#ddsea15