membase meetup - san diego
DESCRIPTION
The slides from the San Diego NoSQL presentation of Membase by Matt Ingenthron with Jason Sirota from The Knot.
TRANSCRIPT
San Diego
Thursday, January 6, 2011
What is Membase?
Membase is a distributed database
Membase Servers
In the data center
Web application server
Application user
On the administrator console
Web application server
Web application server
Five minutes or less to a working cluster
• Downloads for Linux and Windows
• Start with a single node
• One button press joins nodes to a cluster
Easy to develop against
• Just SET and GET – no schema required
• Drop it in. 10,000+ existing applications already “speak membase” (via memcached)
• Practically every language and application framework is supported, out of the box
Easy to manage
• One-click failover and cluster rebalancing
• Graphical and programmatic interfaces
• Configurable alerting
Membase is Simple, Fast, Elastic
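To make "Just SET and GET" concrete, here is a minimal sketch of the memcached text-protocol requests a client would send, for example to port 11211. The helper names (`build_set`, `build_get`, `parse_get_response`) are hypothetical and not part of any membase client library; only the wire format is taken from the memcached text protocol.

```python
def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Build a memcached text-protocol SET request (hypothetical helper)."""
    # Request line: set <key> <flags> <exptime> <bytes>, then the data block.
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key: str) -> bytes:
    """Build a memcached text-protocol GET request (hypothetical helper)."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(resp: bytes):
    """Parse a single-key GET response; return the value, or None on a miss."""
    if resp == b"END\r\n":
        return None
    header, rest = resp.split(b"\r\n", 1)
    # Header line looks like: VALUE <key> <flags> <bytes>
    _, _key, _flags, nbytes = header.split(b" ")[:4]
    return rest[: int(nbytes)]


if __name__ == "__main__":
    print(build_set("user:42", b"hello"))
    print(build_get("user:42"))
    print(parse_get_response(b"VALUE user:42 0 5\r\nhello\r\nEND\r\n"))
```

No schema is declared anywhere: the key is a string and the value is an opaque block of bytes.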
Membase is Simple, Fast, Elastic
Predictable
• “Never keep an application waiting”
• Quasi-deterministic latency and throughput
Low latency
• Built-in Memcached technology
High throughput
• Multi-threaded
• Low lock contention
• Asynchronous wherever possible
• Automatic write de-duplication
Membase is Simple, Fast, Elastic
Zero-downtime elasticity
• Spread I/O and data across commodity servers (or VMs)
• Consistent performance with linear cost
• Dynamic rebalancing of a live cluster
All nodes are created equal
• No special case nodes
• Any node can replace any other node, online
• Clone to grow
Extensible
• Filtered TAP interface provides hook points for external systems (e.g. full-text search, backup, warehouse)
• Data bucket – engine API for specialized container types
Built-in Memcached Caching Layer
Memcached
Membase Database
Membase Cache
Membase Database
Memcached Mode Membase Mode
Fact: The Membase development team has also contributed over half of the code to the Memcached project.
Use Cases
Ad targeting
events
profiles, campaigns
profiles, real time campaign statistics
40 milliseconds to come up with an answer.
Search and Gaming Portal
Database
Membase Caching at The Knot
Jason Sirota
Membase Architecture
Clustering
• Underlying cluster functionality based on Erlang OTP
• Have a custom, vector clock based way of storing and propagating...
– Cluster topology
– vBucket mapping
• Collect statistics from many nodes of the cluster
– Identify hot keys, resource utilization
TAP
• A generic, scalable method of streaming mutations from a given server
– As data operations arrive, they can be sent to arbitrary TAP receivers
• Leverages the existing memcached engine interface, and the non-blocking IO interfaces to send data
• Three modes of operation
[Diagram labels: Working set, Data, Mutations]
Membase data flow – under the hood
[Diagram: numbered steps 1 to 4 across three servers]
• SET request arrives at KEY’s master server
• SET acknowledgement returned to application
• Servers: Master server for KEY, Replica Server 1 for KEY, Replica Server 2 for KEY
• Components shown: Listener-Sender, RAM, membase storage engine, Disk
ns_server
membase (memcached + membase engine)
moxi
ns_server
vbucketmigrator
TAP
memcached operations with tap commands
memcached operations
Client
port 11211 memcached operations
moxi + Client
port 11210 memcached operations
REST/comet
cluster topology and vbucket map
Clients, nodes and other nodes
Data buckets are secure membase “slices”
Membase data servers
In the data center
Web application server
Application user
On the administrator console
Bucket 1
Bucket 2
Aggregate Cluster Memory and Disk Capacity
vBucket mapping
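The diagram for this slide is not preserved in the transcript. As a rough sketch of the general idea only: a key is hashed to a vBucket id, and a separate table maps each vBucket to a server, so rebalancing changes the table rather than the key hash. The CRC32-modulo hash, the vBucket count, and the server names below are all assumptions for illustration, not the exact algorithm membase uses.

```python
import zlib

NUM_VBUCKETS = 1024  # assumed count, for illustration only


def vbucket_for_key(key: str, num_vbuckets: int = NUM_VBUCKETS) -> int:
    """Hash a key to a vBucket id (simplified: CRC32 modulo the count)."""
    return zlib.crc32(key.encode("utf-8")) % num_vbuckets


def build_map(servers, num_vbuckets: int = NUM_VBUCKETS):
    """Assign vBuckets to servers round-robin (hypothetical placement)."""
    return [servers[i % len(servers)] for i in range(num_vbuckets)]


def server_for_key(key: str, vbucket_map) -> str:
    """Look up the server that currently owns the key's vBucket."""
    return vbucket_map[vbucket_for_key(key, len(vbucket_map))]


if __name__ == "__main__":
    # Hypothetical host names; 11210 is the port shown elsewhere in the deck.
    servers = ["node-a:11210", "node-b:11210", "node-c:11210"]
    vmap = build_map(servers)
    print(vbucket_for_key("user:42"), server_for_key("user:42", vmap))
```

Adding a node in this sketch means rewriting entries of `vmap`; the vBucket id computed for any given key stays the same.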
Disk > Memory
Bucket Configuration
mem_high_wat
mem_low_wat
memory quota
A dataset may have many infrequently accessed items. However, memcached's LRU eviction behaves differently from what we want with membase.
Still, most traditional RDBMS implementations are not 100% right for us either. The speed of a miss is very, very important.
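One way to picture the `mem_high_wat` and `mem_low_wat` settings is as a pair of thresholds: once memory use crosses the high mark, values are ejected from RAM until use falls to the low mark, while the data stays on disk. The sketch below is a toy model under that assumption. The class name is hypothetical, and ejecting in insertion order is chosen only for simplicity; it is not a claim about which items membase picks.

```python
from collections import OrderedDict


class ToyBucket:
    """Toy model of a bucket whose data set can exceed its memory."""

    def __init__(self, mem_high_wat: int, mem_low_wat: int):
        self.mem_high_wat = mem_high_wat
        self.mem_low_wat = mem_low_wat
        self.ram = OrderedDict()  # key -> value currently held in memory
        self.disk = {}            # key -> value; stands in for persistence
        self.mem_used = 0

    def set(self, key: str, value: bytes) -> None:
        # Replace any in-memory copy and account for the size change.
        self.mem_used -= len(self.ram.pop(key, b""))
        self.ram[key] = value
        self.disk[key] = value
        self.mem_used += len(value)
        if self.mem_used > self.mem_high_wat:
            self._eject()

    def _eject(self) -> None:
        # Drop in-memory copies until usage is at or below the low mark.
        # Insertion order is an arbitrary choice made for this sketch.
        while self.mem_used > self.mem_low_wat and self.ram:
            _, value = self.ram.popitem(last=False)
            self.mem_used -= len(value)

    def get(self, key: str):
        if key in self.ram:
            return self.ram[key]
        # Not resident: fall back to disk; None means a true miss.
        return self.disk.get(key)


if __name__ == "__main__":
    bucket = ToyBucket(mem_high_wat=10, mem_low_wat=4)
    for k in ("a", "b", "c"):
        bucket.set(k, b"xxxx")
    print(list(bucket.ram), bucket.mem_used, bucket.get("a"))
```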
Membase Demo
Key-Value Patterns
Key-Value
Items have:
• Key
• Value
• Expiration
• Flags
• CAS (more on this later)
Operations include:
• Get/Set
• Increment/Decrement
• Append/Prepend
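The item fields and operations listed above can be modelled with a small in-memory toy. This is a sketch only: `Item` and `ToyStore` are hypothetical names, expiration is stored but not enforced, and counters are kept as decimal ASCII bytes with decrement stopping at zero, which follows memcached's convention.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Item:
    value: bytes
    flags: int = 0
    expiration: int = 0  # stored but not enforced in this sketch
    cas: int = 0


class ToyStore:
    """In-memory stand-in for a key-value server (not a real client)."""

    def __init__(self):
        self._items = {}
        self._cas = count(1)  # every mutation gets a fresh CAS value

    def set(self, key, value, flags=0, expiration=0):
        self._items[key] = Item(value, flags, expiration, next(self._cas))

    def get(self, key):
        item = self._items.get(key)
        return None if item is None else item.value

    def incr(self, key, delta=1):
        item = self._items[key]
        new = max(0, int(item.value) + delta)  # never goes below zero
        item.value = str(new).encode("ascii")
        item.cas = next(self._cas)
        return new

    def decr(self, key, delta=1):
        return self.incr(key, -delta)

    def append(self, key, data):
        item = self._items[key]
        item.value += data
        item.cas = next(self._cas)

    def prepend(self, key, data):
        item = self._items[key]
        item.value = data + item.value
        item.cas = next(self._cas)


if __name__ == "__main__":
    store = ToyStore()
    store.set("hits", b"5")
    print(store.incr("hits", 2), store.get("hits"))
```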
Image courtesy http://www.flickr.com/photos/brenda-starr/3509344100/sizes/m/in/photostream/
(with a replica)
Membase Datatypes
• byte[]
– Does your data have 1s and 0s?
“Any customer can have a car painted any colour that he wants so long as it is black.”
• Items do have flags
– Many clients use flags
– Data type options
  • Google protobuf
  • Thrift
  • Avro
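Since the server stores only bytes, one common client-side pattern is to record the serialization format in the item's flags so a reader knows how to decode the value. The sketch below uses JSON from the standard library in place of protobuf, Thrift, or Avro. The flag values `FLAG_RAW` and `FLAG_JSON` are hypothetical; real clients each define their own.

```python
import json

FLAG_RAW = 0   # hypothetical flag value: value is opaque bytes
FLAG_JSON = 1  # hypothetical flag value: value is UTF-8 encoded JSON


def encode(obj):
    """Return (value_bytes, flags) ready to pass to a SET."""
    if isinstance(obj, bytes):
        return obj, FLAG_RAW
    return json.dumps(obj).encode("utf-8"), FLAG_JSON


def decode(value: bytes, flags: int):
    """Reverse encode() using the flags stored alongside the item."""
    if flags == FLAG_JSON:
        return json.loads(value.decode("utf-8"))
    return value


if __name__ == "__main__":
    value, flags = encode({"name": "widget", "qty": 3})
    print(value, flags, decode(value, flags))
```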
Transactions
• Lock == slow me down
• CAS operations
– Optimistic locking
• Very useful with complex datatypes
– Imagine two clients trying to update a complex item
• You’re likely using CAS already... if you use a CPU
User 1: Fail!
User 2: Success
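The two-user scenario can be sketched with a toy compare-and-swap store. `CasStore` is a hypothetical in-memory class, not a membase client. It shows only the optimistic-locking pattern: read the value together with its CAS id, then write only if the id has not changed in the meantime.

```python
class CasStore:
    """Toy store in which every write bumps a per-item CAS id."""

    def __init__(self):
        self._data = {}  # key -> (value, cas_id)
        self._next_cas = 1

    def set(self, key, value):
        self._data[key] = (value, self._next_cas)
        self._next_cas += 1

    def gets(self, key):
        """Return (value, cas_id) for the key."""
        return self._data[key]

    def cas(self, key, value, cas_id) -> bool:
        """Write only if the item is unchanged since cas_id was read."""
        _, current = self._data[key]
        if current != cas_id:
            return False  # someone else updated the item first
        self.set(key, value)
        return True


if __name__ == "__main__":
    store = CasStore()
    store.set("doc", "v1")
    _, cas_user1 = store.gets("doc")  # both users read the same version
    _, cas_user2 = store.gets("doc")
    print("User 2:", store.cas("doc", "edit by user 2", cas_user2))
    print("User 1:", store.cas("doc", "edit by user 1", cas_user1))
```

The losing client does not block; it re-reads the item, reapplies its change, and tries again.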
Common Use: Sessions
• Web user sessions
– Heavily read, with fewer writes in many cases
– Protocol advantage of memcached
  • Options already for PHP, Ruby and Java
• Application state
– Not necessarily “entity” style things
– May be appropriate for a “cache” pool
Common Use (cache): Rate Limiting
• Want to provide API calls into the system
– Twitter search
– Google search services
• Use the atomic increment
– Set an item with a unique ID
– Upon API request, increment and check
  • HTTP 420: go away and come back later
Your Users
Your App
¡Ouch!
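The increment-and-check pattern can be sketched as a fixed-window limiter. Everything named here is an assumption for illustration: `Counters` is an in-memory stand-in for the server's atomic increment, and the `rate:<client>:<window>` key layout and 60-second window are arbitrary choices.

```python
import time


class Counters:
    """In-memory stand-in for a server-side atomic increment."""

    def __init__(self):
        self._counts = {}

    def incr(self, key: str, delta: int = 1) -> int:
        # On a real memcached-style server the counter item has to exist
        # (created with an expiration) before it can be incremented.
        self._counts[key] = self._counts.get(key, 0) + delta
        return self._counts[key]


def check_rate(store, client_id: str, limit: int,
               window_seconds: int = 60, now=None) -> int:
    """Return an HTTP status: 200 if allowed, 420 if over the limit."""
    now = time.time() if now is None else now
    window = int(now // window_seconds)
    key = f"rate:{client_id}:{window}"  # one counter per client per window
    return 200 if store.incr(key) <= limit else 420


if __name__ == "__main__":
    store = Counters()
    print([check_rate(store, "client-1", limit=2, now=0) for _ in range(3)])
```

Because the counter key includes the window number, a new window starts from zero without any explicit reset.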
Looking Ahead: NodeCode
Beyond key-value
• Indexing/Range Queries
• Advanced Data Structures
• Sub-object direct manipulation
Validation and In-flight transformation
• Block mutations failing validation
• Enrich or transform objects
Connectors (Integrate easily with other systems)
• Solr
• Hadoop
• MySQL
NodeCode – Motivation
Q&A
Attributions
• http://commons.wikimedia.org/wiki/File:Flag_of_China.png
• http://commons.wikimedia.org/wiki/File:Flag_of_South_Korea.svg
• http://commons.wikimedia.org/wiki/File:Flag_of_Japan.svg