membase east coast meetups

39
Boston

Upload: membase

Post on 17-Dec-2014

Category:

Technology


DESCRIPTION

These are the slides from the Membase meetups in NYC and Boston.

TRANSCRIPT

Page 1: Membase East Coast Meetups

Boston

Page 2: Membase East Coast Meetups

Tonight

• Membase Overview
• Use Cases and Deployment Examples
• Membase Architecture
• Demo!
• Developing with Membase
• A Glimpse into the Future

Page 3: Membase East Coast Meetups

What is Membase?

Page 4: Membase East Coast Meetups

Membase is a distributed database

[Diagram labels: Application user; Web application servers; Membase Servers (in the data center); administrator console]

Page 5: Membase East Coast Meetups

Membase is Simple, Fast, Elastic

Five minutes or less to a working cluster
• Downloads for Linux and Windows
• Start with a single node
• One button press joins nodes to a cluster

Easy to develop against
• Just SET and GET – no schema required
• Drop it in. 10,000+ existing applications already “speak membase” (via memcached)
• Practically every language and application framework is supported, out of the box

Easy to manage
• One-click failover and cluster rebalancing
• Graphical and programmatic interfaces
• Configurable alerting
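To make "just SET and GET" concrete, here is a minimal sketch of what a client sends when it speaks the memcached text protocol. The helper names (build_set, build_get, parse_get_response) are made up for this illustration; only the wire format follows the memcached protocol, and no network connection is opened.

```python
def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Encode a memcached text-protocol 'set' request."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key: str) -> bytes:
    """Encode a memcached text-protocol 'get' request for one key."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(data: bytes):
    """Parse a single-key 'get' reply; return (flags, value), or None on a miss."""
    if data == b"END\r\n":
        return None
    header, _, rest = data.partition(b"\r\n")
    parts = header.split()
    # Expected header: VALUE <key> <flags> <bytes>
    if len(parts) < 4 or parts[0] != b"VALUE":
        raise ValueError("unexpected reply")
    flags, length = int(parts[2]), int(parts[3])
    if rest[length:] != b"\r\nEND\r\n":
        raise ValueError("truncated or malformed reply")
    return flags, rest[:length]
```

Because the value is an opaque byte string with a length prefix, no schema is involved at any point, which is the sense in which existing memcached clients can be dropped in.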

Page 6: Membase East Coast Meetups

Membase is Simple, Fast, Elastic

Predictable
• “Never keep an application waiting”
• Quasi-deterministic latency and throughput

Low latency
• Built-in Memcached technology

High throughput
• Multi-threaded
• Low lock contention
• Asynchronous wherever possible
• Automatic write de-duplication
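One way to picture "automatic write de-duplication" is a disk write queue keyed by item key, so that several updates to the same key between flushes collapse into one disk write. The sketch below is an illustrative model under that assumption; the WriteQueue class is hypothetical and is not Membase's persistence code.

```python
class WriteQueue:
    """Toy write-behind queue that keeps only the latest pending value per key."""

    def __init__(self):
        self._pending = {}    # key -> latest value not yet written to disk
        self.disk = {}        # stands in for persistent storage
        self.disk_writes = 0  # number of physical writes performed

    def enqueue(self, key, value):
        # A later write to the same key overwrites the pending one.
        self._pending[key] = value

    def flush(self):
        """Write each pending key once and return how many keys were written."""
        flushed = len(self._pending)
        for key, value in self._pending.items():
            self.disk[key] = value
            self.disk_writes += 1
        self._pending.clear()
        return flushed
```

Four updates to two keys between flushes cost two disk writes here, which is why hot keys do not have to translate into proportional disk I/O.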

Page 7: Membase East Coast Meetups

Membase is Simple, Fast, Elastic

Zero-downtime elasticity
• Spread I/O and data across commodity servers (or VMs)
• Consistent performance with linear cost
• Dynamic rebalancing of a live cluster

All nodes are created equal
• No special case nodes
• Any node can replace any other node, online
• Clone to grow

Extensible
• Filtered TAP interface provides hook points for external systems (e.g. full-text search, backup, warehouse)
• Data bucket – engine API for specialized container types

Page 8: Membase East Coast Meetups

Built-in Memcached Caching Layer

[Diagram: two stacks. “Memcached Mode”: Memcached, Membase Database. “Membase Mode”: Membase Cache, Membase Database.]

Fact: The Membase development team has also contributed over half of the code to the Memcached project.

Page 9: Membase East Coast Meetups

Use Cases

Page 10: Membase East Coast Meetups

Leading Membase Deployments

Leading cloud service (PaaS) provider
Over 65,000 hosted applications
Membase Server serving over 1,200 Heroku customers (as of June 10, 2010)

Social game leader – FarmVille, Mafia Wars, Café World
Over 230 million monthly users
Membase Server is the 500,000 ops-per-second database behind FarmVille and Café World

Page 11: Membase East Coast Meetups

Ad targeting

[Diagram: a three-step flow (1, 2, 3) with data labeled “events”, “profiles, campaigns”, and “profiles, real time campaign statistics”.]

40 milliseconds to come up with an answer.

Page 12: Membase East Coast Meetups

Search and Gaming Portal

[Diagram label: Database]

Page 13: Membase East Coast Meetups

Membase Architecture

Page 14: Membase East Coast Meetups

Clustering

• Underlying cluster functionality based on Erlang/OTP
• A custom, vector clock based way of storing and propagating...
  – Cluster topology
  – vBucket mapping
• Collect statistics from many nodes of the cluster
  – Identify hot keys, resource utilization
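The slides do not show how the vector clocks are implemented, so the following is only a generic sketch of the technique: each node keeps a counter per node, and a configuration is newer than another when its clock has seen everything the other has. All function names here are illustrative.

```python
def vc_increment(clock, node):
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def vc_descends(a, b):
    """True if clock a has seen every event recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def vc_merge(a, b):
    """Combine two clocks by taking the per-node maximum."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def pick_newer(a_clock, a_config, b_clock, b_config):
    """Return the (clock, config) pair that supersedes the other, or None if concurrent."""
    if vc_descends(a_clock, b_clock):
        return a_clock, a_config
    if vc_descends(b_clock, a_clock):
        return b_clock, b_config
    return None  # concurrent edits: the caller has to reconcile them
```

When two nodes change the topology independently, neither clock descends from the other, which is how a receiver can tell a conflict apart from a plain update.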

Page 15: Membase East Coast Meetups
Page 16: Membase East Coast Meetups

TAP

• A generic, scalable method of streaming mutations from a given server
  – As data operations arrive, they can be sent to arbitrary TAP receivers
• Leverages the existing memcached engine interface and the non-blocking I/O interfaces to send data
• Three modes of operation

[Diagram: the three modes, labeled “Working set, Data, Mutations”, “Working set, Data, Mutations”, and “Working set”.]
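The idea of streaming mutations to arbitrary receivers can be sketched as a publish/subscribe hook on the write path. This is a simplified, synchronous model with made-up names (TapServer, register); the real TAP stream runs over the memcached engine interface with non-blocking I/O, which is not modeled here. The optional key filter stands in for the "filtered TAP interface" mentioned earlier in the deck.

```python
class TapServer:
    """Toy key-value server that forwards every mutation to registered receivers."""

    def __init__(self):
        self.data = {}
        self._receivers = []  # list of (callback, key_filter) pairs

    def register(self, callback, key_filter=None):
        """Subscribe a receiver; key_filter(key) -> bool limits what it sees."""
        self._receivers.append((callback, key_filter))

    def set(self, key, value):
        self.data[key] = value
        self._publish(("set", key, value))

    def delete(self, key):
        self.data.pop(key, None)
        self._publish(("delete", key, None))

    def _publish(self, mutation):
        key = mutation[1]
        for callback, key_filter in self._receivers:
            if key_filter is None or key_filter(key):
                callback(mutation)
```

A backup receiver would register without a filter, while a full-text indexer might register with a filter that matches only document keys.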

Page 17: Membase East Coast Meetups

Membase data flow – under the hood

[Diagram: a SET request arrives at KEY’s master server, and a SET acknowledgement is returned to the application. The master server for KEY and replica servers 1 and 2 for KEY each show a Listener-Sender, RAM, the membase storage engine, and disks. Steps in the diagram are numbered 1 to 4.]
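The data flow can be modeled roughly as follows. This sketch assumes that the master acknowledges a SET once the item is in its RAM, and that replication to the replica servers and writes to disk happen afterwards from queues; that ordering is an assumption made for illustration, and the Node and Master classes are hypothetical.

```python
from collections import deque


class Node:
    """A server holding items in RAM with a queue of keys waiting to be persisted."""

    def __init__(self, name):
        self.name = name
        self.ram = {}
        self.disk = {}
        self.disk_queue = deque()

    def store_in_ram(self, key, value):
        self.ram[key] = value
        self.disk_queue.append(key)

    def drain_disk_queue(self):
        # Persist the current RAM value of every queued key.
        while self.disk_queue:
            key = self.disk_queue.popleft()
            self.disk[key] = self.ram[key]


class Master(Node):
    """Master server for a key: accepts SETs and later forwards them to replicas."""

    def __init__(self, name, replicas):
        super().__init__(name)
        self.replicas = replicas
        self.replication_queue = deque()

    def set(self, key, value):
        self.store_in_ram(key, value)
        self.replication_queue.append((key, value))
        return "STORED"  # acknowledged before replication or disk I/O (assumption)

    def replicate(self):
        while self.replication_queue:
            key, value = self.replication_queue.popleft()
            for replica in self.replicas:
                replica.store_in_ram(key, value)
```

Under this model the application never waits on a disk or on another server, which matches the "never keep an application waiting" goal stated earlier in the deck.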

Page 18: Membase East Coast Meetups

Clients, nodes and other nodes

[Diagram components: ns_server; moxi; membase (memcached + membase engine); vbucketmigrator (TAP). Connections: Client to port 11211, memcached operations; moxi + Client to port 11210, memcached operations; REST/comet, cluster topology and vbucket map; between nodes, memcached operations with tap commands.]

Page 19: Membase East Coast Meetups

Data buckets are secure membase “slices”

[Diagram labels: Application user; Web application server; Membase data servers (in the data center); administrator console; Bucket 1 and Bucket 2 drawn across the aggregate cluster memory and disk capacity]

Page 20: Membase East Coast Meetups

vBucket mapping

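The slide itself carries only a diagram, so the following is a hedged sketch of the general vBucket idea: a key always hashes to the same vBucket, and a separate map says which server currently owns each vBucket. The CRC32-modulo hash, the vBucket count of 1024, and the round-robin map are assumptions chosen for illustration, not the exact scheme used by Membase.

```python
import zlib

NUM_VBUCKETS = 1024  # assumed value for this sketch


def vbucket_for_key(key: bytes, num_vbuckets: int = NUM_VBUCKETS) -> int:
    """Hash a key to a vBucket id (CRC32 modulo is an illustrative choice)."""
    return zlib.crc32(key) % num_vbuckets


def build_map(servers, num_vbuckets: int = NUM_VBUCKETS):
    """Assign vBuckets to servers round-robin; index = vBucket id, value = server."""
    return [servers[i % len(servers)] for i in range(num_vbuckets)]


def server_for_key(key: bytes, vbucket_map) -> str:
    """Look up the server that currently owns the key's vBucket."""
    return vbucket_map[vbucket_for_key(key, len(vbucket_map))]
```

The point of the indirection is that adding a server changes only the map: keys keep their vBucket ids, and a rebalance moves whole vBuckets between servers.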

Page 21: Membase East Coast Meetups

Disk > Memory

[Diagram: bucket configuration showing the memory quota with the mem_high_wat and mem_low_wat thresholds.]

A dataset may have many infrequently accessed items. However, memcached's LRU behavior is different from what is wanted with membase.

Still, most traditional RDBMS implementations are not 100% right for us either. The speed of a miss is very, very important.
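A rough model of the two watermarks, assuming this behavior: when RAM usage rises above mem_high_wat, the least recently used values are ejected from RAM until usage drops to mem_low_wat, while the data stays on disk and is read back on a later miss. The Bucket class, the ejection order, and the immediate persistence are simplifications made for this sketch.

```python
from collections import OrderedDict


class Bucket:
    """Toy bucket: everything is on disk, only part of it is resident in RAM."""

    def __init__(self, mem_low_wat, mem_high_wat):
        self.mem_low_wat = mem_low_wat
        self.mem_high_wat = mem_high_wat
        self.ram = OrderedDict()  # key -> value, least recently used first
        self.disk = {}

    def _ram_bytes(self):
        return sum(len(value) for value in self.ram.values())

    def set(self, key, value: bytes):
        self.disk[key] = value  # simplification: persisted immediately
        self.ram[key] = value
        self.ram.move_to_end(key)
        self._eject_if_needed()

    def get(self, key):
        if key in self.ram:
            self.ram.move_to_end(key)
            return self.ram[key]
        value = self.disk.get(key)  # RAM miss: fall back to disk
        if value is not None:
            self.ram[key] = value
            self._eject_if_needed()
        return value

    def _eject_if_needed(self):
        if self._ram_bytes() <= self.mem_high_wat:
            return
        # Above the high watermark: eject until at or below the low watermark.
        while self.ram and self._ram_bytes() > self.mem_low_wat:
            self.ram.popitem(last=False)
```

Unlike a pure LRU cache, ejecting a value here loses nothing: a miss in RAM becomes a disk read rather than a missing item.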

Page 22: Membase East Coast Meetups

Membase Demo

Page 23: Membase East Coast Meetups

Thanks!

Page 24: Membase East Coast Meetups

Key-Value Patterns

Page 25: Membase East Coast Meetups

Key-Value

Image courtesy http://www.flickr.com/photos/brenda-starr/3509344100/sizes/m/in/photostream/

(with a replica)

Items have:
• Key
• Value
• Expiration
• Flags
• CAS (more on this later)

Operations include:
• Get/Set
• Increment/Decrement
• Append/Prepend
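The item fields and a few of the operations listed above can be sketched with an in-memory stand-in. Item and KVStore are hypothetical names, the injectable clock exists only to make expiration easy to demonstrate, and nothing here talks to a real server.

```python
import time
from dataclasses import dataclass


@dataclass
class Item:
    key: str
    value: bytes
    expiration: float  # absolute expiry time in seconds; 0 means "never"
    flags: int
    cas: int


class KVStore:
    """Toy store covering get/set, append/prepend, and expiration."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._items = {}
        self._cas_counter = 0

    def _next_cas(self):
        self._cas_counter += 1
        return self._cas_counter

    def set(self, key, value, ttl=0, flags=0):
        expiration = self._clock() + ttl if ttl else 0
        self._items[key] = Item(key, value, expiration, flags, self._next_cas())

    def get(self, key):
        item = self._items.get(key)
        if item is not None and item.expiration and self._clock() >= item.expiration:
            del self._items[key]  # expired items behave like misses
            return None
        return item

    def append(self, key, suffix):
        return self._modify(key, lambda old: old + suffix)

    def prepend(self, key, prefix):
        return self._modify(key, lambda old: prefix + old)

    def _modify(self, key, change):
        item = self.get(key)
        if item is None:
            return False
        item.value = change(item.value)
        item.cas = self._next_cas()  # every mutation gets a new CAS value
        return True
```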

Page 26: Membase East Coast Meetups

Membase Datatypes

• byte[]
  – Does your data have 1s and 0s?
• Items do have flags
  – Many clients use flags
  – Data type options
    • Google protobuf
    • Thrift
    • Avro

“Any customer can have a car painted any colour that he wants so long as it is black.”
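Since the server stores only bytes, a client can use the flags field to remember how a value was serialized. The sketch below uses JSON from the standard library in place of protobuf, Thrift, or Avro, and the flag values FLAG_RAW and FLAG_JSON are invented for this example rather than taken from any particular client.

```python
import json

FLAG_RAW = 0   # hypothetical flag: value is an opaque byte string
FLAG_JSON = 1  # hypothetical flag: value is UTF-8 encoded JSON


def encode(obj):
    """Turn a Python object into (bytes, flags) ready to be stored."""
    if isinstance(obj, bytes):
        return obj, FLAG_RAW
    return json.dumps(obj, sort_keys=True).encode("utf-8"), FLAG_JSON


def decode(value: bytes, flags: int):
    """Reverse encode() using the flags stored alongside the value."""
    if flags == FLAG_JSON:
        return json.loads(value.decode("utf-8"))
    return value
```

Flag meanings are a client-side convention, so every client reading a given key has to agree on them.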

Page 27: Membase East Coast Meetups

Transactions

• Lock == slow me down
• CAS operations
  – Optimistic locking
• Very useful with complex datatypes
  – Imagine two clients trying to update a complex item
• You’re likely using CAS already... if you use a CPU

[Diagram: User 1 and User 2 update the same item; one update succeeds (“Success”) and the other fails (“Fail!”).]
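The two-client scenario can be sketched with an in-memory stand-in that mimics the memcached gets and cas commands. CasStore and update_with_retry are illustrative names; a real client library would expose the same idea through its own API.

```python
class CasStore:
    """Toy store where every write gives the item a new CAS value."""

    def __init__(self):
        self._data = {}  # key -> (value, cas_id)
        self._counter = 0

    def set(self, key, value):
        self._counter += 1
        self._data[key] = (value, self._counter)

    def gets(self, key):
        """Return (value, cas_id) for the key."""
        return self._data[key]

    def cas(self, key, value, cas_id):
        """Store value only if the item is unchanged since cas_id was read."""
        _, current = self._data[key]
        if current != cas_id:
            return False  # someone else updated the item first
        self.set(key, value)
        return True


def update_with_retry(store, key, change, max_attempts=5):
    """Optimistic update: re-read and retry when another writer wins the race."""
    for _ in range(max_attempts):
        value, cas_id = store.gets(key)
        if store.cas(key, change(value), cas_id):
            return True
    return False
```

No lock is held between the read and the write, so readers and other writers are never blocked; the cost is that a losing writer has to re-read and try again.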

Page 28: Membase East Coast Meetups

Common Use: Sessions

• Web user sessions
  – Heavily read, fewer writes in many cases
  – Protocol advantage of memcached
    • Options already for PHP, Ruby and Java
• Application state
  – Not necessarily “entity” style things
  – May be appropriate for a “cache” pool

Page 29: Membase East Coast Meetups

Common Use (cache): Rate Limiting

• Want to provide API calls into the system
  – Twitter search
  – Google search services
• Use the atomic increment
  – Set an item with a unique ID
  – Upon API request, increment and check
• HTTP 420: go away and come back later

[Diagram: Your Users sending requests to Your App. “¡Ouch!”]
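The increment-and-check pattern can be sketched as below. CounterStore is an in-memory stand-in whose incr creates the counter with a time-to-live when it is missing, which folds the "set an item" and "increment" steps into one call for brevity; the limit of 3 requests per 60-second window and the key prefix are arbitrary choices for the example.

```python
import time


class CounterStore:
    """Toy counter store with per-key expiry."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._counters = {}  # key -> [count, expires_at]

    def incr(self, key, ttl):
        """Increment the counter, starting a fresh window if absent or expired."""
        now = self._clock()
        entry = self._counters.get(key)
        if entry is None or now >= entry[1]:
            entry = [0, now + ttl]
            self._counters[key] = entry
        entry[0] += 1
        return entry[0]


def check_request(store, client_id, limit=3, window=60):
    """Return the HTTP status for one API call from client_id."""
    count = store.incr(f"rate:{client_id}", ttl=window)
    return 200 if count <= limit else 420
```

Because the increment is a single atomic operation on the server, concurrent application servers can share one counter per client without any locking of their own.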

Page 30: Membase East Coast Meetups

Looking Ahead: NodeCode

Frank Weigel, Membase

Page 31: Membase East Coast Meetups

NodeCode – Motivation

Beyond key-value
• Indexing/Range Queries
• Advanced Data Structures
• Sub-object direct manipulation

Validation and In-flight transformation
• Block mutations failing validation
• Enrich or transform objects

Connectors (Integrate easily with other systems)
• Solr
• Hadoop
• MySQL

Page 32: Membase East Coast Meetups

NodeCode - What is it?

Method for extending & customizing Membase

Separate code modules

Defined interface to datapath and cluster manager

Notification on events
• Synchronous
• Asynchronous
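The difference between synchronous and asynchronous notification can be illustrated with a language-neutral model. This is not the NodeCode API (which the slides describe only at a high level); Datapath, Rejected, validate, and enrich are hypothetical names. Synchronous listeners run inside the write path and may reject or transform a mutation, while asynchronous listeners are told about it afterwards.

```python
from collections import deque


class Rejected(Exception):
    """Raised by a synchronous listener to block a mutation."""


class Datapath:
    def __init__(self):
        self.data = {}
        self._sync = []          # run before the write; may transform or reject
        self._async = []         # notified later; cannot affect the write
        self._pending = deque()  # mutations waiting for asynchronous delivery

    def on_set_sync(self, listener):
        self._sync.append(listener)

    def on_set_async(self, listener):
        self._async.append(listener)

    def set(self, key, value):
        for listener in self._sync:
            value = listener(key, value)  # may raise Rejected
        self.data[key] = value
        self._pending.append((key, value))

    def deliver_async(self):
        while self._pending:
            key, value = self._pending.popleft()
            for listener in self._async:
                listener(key, value)


def validate(key, value):
    """Example synchronous listener: block values that are not dicts."""
    if not isinstance(value, dict):
        raise Rejected(key)
    return value


def enrich(key, value):
    """Example synchronous listener: add a field in flight."""
    return {**value, "validated": True}
```

The trade-off is latency: every synchronous listener adds to the time a client waits for its SET, whereas asynchronous listeners add none.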

Page 33: Membase East Coast Meetups

NodeCode – Drivers

Simple
• Packaged modules for easy install and enable
• Library of “off the shelf” modules
• Module monitoring
• Straightforward development and debugging

Fast
• Low latency/high-throughput
• Per-bucket process isolation
• Don’t break data manager performance/correctness

Elastic
• Automatically migrate and instantiate on rebalance
• Provide support for migration of internal data
• Leverage native Membase engine for internal data storage

Page 34: Membase East Coast Meetups

Block-level architecture

Page 35: Membase East Coast Meetups

NodeCode 1.0 Plans

Java only
– jar format

Must implement minimal module API
• Initial module startup
• Module removal
• Association with bucket

NodeCode library helper functions
• Register synchronous & asynchronous listeners/callbacks
• Register protocol extension/callbacks
• Register rebalance callback
• Register cluster manager event callbacks
• Membase data access

Page 36: Membase East Coast Meetups
Page 37: Membase East Coast Meetups

Q&A

Page 39: Membase East Coast Meetups