everything you always wanted to know about redis but were afraid to ask

Redis: Everything you always wanted to know about Redis but were afraid to ask. Carlos Abalde, March 2014

Upload: carlos-abalde

Post on 10-May-2015

Category:

Technology


DESCRIPTION

Yet another Redis presentation: - Introduction: context, popular Redis users, latest releases… - Redis 101: basics, scripting, some examples… - Mastering Redis: persistence, replication, performance, sharding…

TRANSCRIPT

Page 1: Everything you always wanted to know about Redis but were afraid to ask

Redis

Everything you always wanted to know about Redis but were afraid to ask

Carlos Abalde

March 2014

Page 2: Everything you always wanted to know about Redis but were afraid to ask

Introduction

Remote dictionary server

Page 3: Everything you always wanted to know about Redis but were afraid to ask

INSERT INTO…

$ redis-cli -n 0
127.0.0.1:6379> SET the-answer 42
OK
127.0.0.1:6379> QUIT
rulo:~$

Page 4: Everything you always wanted to know about Redis but were afraid to ask

SELECT * FROM…

$ redis-cli -n 0
127.0.0.1:6379> GET the-answer
"42"
127.0.0.1:6379> QUIT
rulo:~$

Page 5: Everything you always wanted to know about Redis but were afraid to ask

The end

Page 7: Everything you always wanted to know about Redis but were afraid to ask

Agenda

I. Introduction

‣ Context, popular Redis users, latest releases…

II. Redis 101

‣ Basics, scripting, some examples…

III. Mastering Redis

‣ Persistence, replication, performance, sharding…

Page 8: Everything you always wanted to know about Redis but were afraid to ask

I. Introduction

http://www.flickr.com/photos/verino77/5616332196/

Page 9: Everything you always wanted to know about Redis but were afraid to ask

NoSQL / NoREL mess

๏ Document DBs

‣ MongoDB, CouchDB, Riak…

๏ Graph DBs

‣ Neo4j, FlockDB…

๏ Column oriented DBs

‣ HBase, Cassandra, BigTable…

๏ Key-Value DBs

‣ Memcache, MemcacheDB, Redis, Voldemort, Dynamo…

Page 10: Everything you always wanted to know about Redis but were afraid to ask

Who’s behind Redis?

๏ Created by Salvatore Sanfilippo

‣ http://antirez.com

‣ @antirez at Twitter

๏ Currently sponsored by Pivotal

‣ Sponsored by VMware until May 2013

Page 11: Everything you always wanted to know about Redis but were afraid to ask

Who’s using Redis? I

Page 12: Everything you always wanted to know about Redis but were afraid to ask

Who’s using Redis? II

๏ The architecture Twitter uses to deal with 150M active users, 300K QPS, a 22 MB/S Firehose, and send tweets in under 5 seconds. High Scalability (2013)▸

๏ Storing hundreds of millions of simple key-value pairs in Redis. Instagram Engineering Blog (2012)▸

๏ The Instagram architecture Facebook bought for a cool billion dollars. High Scalability (2012)▸

๏ Facebook’s Instagram: making the switch to Cassandra from Redis, a 75% ‘insta’ savings. Planet Cassandra (2013)▸

Page 13: Everything you always wanted to know about Redis but were afraid to ask

Who’s using Redis? III

๏ Highly available real time push notifications and you. Flickr Engineering Blog (2012)▸

๏ Using Redis as a secondary index for MySQL. Flickr Engineering Blog (2013)▸

๏ How we made GitHub fast. The GitHub Blog (2009)▸

๏ Real world Redis. Agora Games (2012)▸

๏ Disqus discusses migration from Redis to Cassandra for horizontal Scalability. Planet Cassandra (2013)▸

Page 14: Everything you always wanted to know about Redis but were afraid to ask

Memory is the new disk

๏ BSD licensed in-memory data structure server

‣ Strings, hashes, lists, sets…

๏ Optional durability

๏ Bindings to almost all relevant languages

“Memory is the new disk, disk is the new tape”

— Jim Gray

Page 15: Everything you always wanted to know about Redis but were afraid to ask

A fight against complexity

๏ Simple & robust foundations

‣ Single threaded

‣ No map-reduce, no indexes, no vector clocks, no Paxos, no Merkle trees, no gossip protocols…

๏ Blazingly fast

‣ Implemented in C (20K LoC for the 2.2 release)

‣ No dependencies

Page 16: Everything you always wanted to know about Redis but were afraid to ask

A fight against complexity

5. We’re against complexity. We believe designing systems is a fight against complexity. […] Most of the time the best way to fight complexity is by not creating it at all.

The Redis Manifesto▸

Page 17: Everything you always wanted to know about Redis but were afraid to ask

Most popular K-V DB

๏ Currently most popular key-value DB▸

๏ Redis 1.0 (April’09) ↝ Redis 2.8.6 (March’14)

Google Trends▸

Page 18: Everything you always wanted to know about Redis but were afraid to ask

Latest releases I

๏ Redis 2.6 (October’12)

‣ Lua scripting

‣ New commands

‣ Millisecond-precision expires

‣ Unlimited number of clients

‣ Improved AOF generation

Page 19: Everything you always wanted to know about Redis but were afraid to ask

Latest releases II

๏ Redis 2.8 (November’13)

‣ Redis 2.7 with the clustering code removed

‣ Partial resynchronization with slaves

‣ IPv6 support

‣ Config rewriting

‣ Key-space change notifications via Pub/Sub

Page 20: Everything you always wanted to know about Redis but were afraid to ask

Latest releases III

๏ Redis 3.0

‣ Next beta release planned for March’14

‣ Redis Cluster

‣ Speed improvements under certain workloads

Page 21: Everything you always wanted to know about Redis but were afraid to ask

Commands

๏ redis-server

๏ redis-cli

‣ Command line interface

๏ redis-benchmark

‣ Benchmarking utility

๏ redis-check-dump & redis-check-aof

‣ Utilities for checking corrupted RDB/AOF files

Page 22: Everything you always wanted to know about Redis but were afraid to ask

Performance

๏ Redis 2.6.14

๏ Intel Xeon CPU E5520 @ 2.27GHz

๏ 50 simultaneous clients performing 2M requests

๏ Loopback interface

๏ Key space of 1M keys

Sample benchmark

Page 23: Everything you always wanted to know about Redis but were afraid to ask

Performance

No pipelining

$ redis-benchmark \
    -r 1000000 -n 2000000 \
    -t get,set,lpush,lpop -q
SET: 122556.53 requests per second
GET: 123601.76 requests per second
LPUSH: 136752.14 requests per second
LPOP: 132424.03 requests per second

Page 24: Everything you always wanted to know about Redis but were afraid to ask

Performance

16 commands per pipeline

$ redis-benchmark \
    -r 1000000 -n 2000000 \
    -t get,set,lpush,lpop -P 16 -q
SET: 552028.75 requests per second
GET: 707463.75 requests per second
LPUSH: 767459.75 requests per second
LPOP: 770119.38 requests per second
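To make the effect of pipelining easier to read, the two benchmark runs above can be compared command by command. This is only arithmetic on the figures reported in the slides; the variable names are illustrative.

```python
# Throughput figures copied from the two benchmark runs (requests per second).
no_pipeline = {"SET": 122556.53, "GET": 123601.76, "LPUSH": 136752.14, "LPOP": 132424.03}
pipeline_16 = {"SET": 552028.75, "GET": 707463.75, "LPUSH": 767459.75, "LPOP": 770119.38}

# Ratio of pipelined to non-pipelined throughput for each command.
speedup = {cmd: pipeline_16[cmd] / no_pipeline[cmd] for cmd in no_pipeline}

for cmd, factor in speedup.items():
    print(f"{cmd}: {factor:.1f}x")
```

On this particular machine and workload, batching 16 commands per pipeline gives roughly a 4.5x to 5.8x gain; other setups will differ.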

Page 25: Everything you always wanted to know about Redis but were afraid to ask

Summary

✓ Simple

✓ Predictable

✓ Reliable

✓ Fast

✓ Widely supported

✓ Lightweight

Page 26: Everything you always wanted to know about Redis but were afraid to ask

II. Redis 101

http://www.flickr.com/photos/caseycanada/2058552752/

Page 27: Everything you always wanted to know about Redis but were afraid to ask

Overview

๏ Family of fundamental data structures

‣ Strings and string containers

‣ Accessed / indexed by key

‣ Directly exposed — No abstraction layers

๏ Rich set of atomic operations over the structures

‣ Detailed reference using big-O notation for complexities

๏ Basic publish / subscribe infrastructure

Page 28: Everything you always wanted to know about Redis but were afraid to ask

Keys

๏ Arbitrary ASCII strings

‣ Define some format convention and adhere to it

‣ Key length matters!

๏ Multiple name spaces are available

‣ Separate DBs indexed by an integer value

- SELECT command

- Multiple DBs vs. single DB + key prefixes

๏ Keys can expire automatically
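One possible way to "define some format convention and adhere to it" is to build every key through a single helper. The sketch below assumes a colon-separated `type:id:field` convention; `make_key` is a hypothetical helper, not part of any Redis client.

```python
def make_key(*parts, sep=":"):
    """Join key components with a separator, e.g. ('user', 1000, 'followers')."""
    cleaned = [str(p) for p in parts]
    # Reject empty parts and parts containing the separator, so keys stay unambiguous.
    if any(not p or sep in p for p in cleaned):
        raise ValueError("key parts must be non-empty and must not contain the separator")
    return sep.join(cleaned)


print(make_key("user", 1000, "followers"))
```

Short, predictable prefixes keep key length under control, which matters when millions of keys are stored.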

Page 29: Everything you always wanted to know about Redis but were afraid to ask

Data structures I

๏ Strings

‣ Caching, counters, realtime metrics…

๏ Hashes

‣ “Object” storage…

๏ Lists

‣ Logs, queues, message passing…

Page 30: Everything you always wanted to know about Redis but were afraid to ask

Data structures II

๏ Sets

‣ Membership, tracking…

๏ Ordered sets

‣ Leaderboards, activity feeds…

RTFM, please :) ▸

Page 31: Everything you always wanted to know about Redis but were afraid to ask

Publish / Subscribe

๏ Classic pattern decoupling publishers & subscribers

‣ You can subscribe to channels; when someone publishes to a channel matching your interests, Redis delivers the message to you

‣ SUBSCRIBE, UNSUBSCRIBE & PUBLISH commands

๏ Fire and forget notifications

‣ Not suitable for reliable off-line notification of events

๏ Pattern-matching subscriptions

‣ PSUBSCRIBE & PUNSUBSCRIBE commands

Overview
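The fire-and-forget behaviour can be illustrated without a server. The sketch below is an in-process toy, not Redis itself: `TinyBroker` is a made-up class, and `fnmatchcase` only approximates Redis glob-style patterns.

```python
from fnmatch import fnmatchcase


class TinyBroker:
    """Toy publish/subscribe broker that keeps no message history."""

    def __init__(self):
        self.channels = {}  # channel name -> list of callbacks
        self.patterns = {}  # glob pattern -> list of callbacks

    def subscribe(self, channel, callback):
        self.channels.setdefault(channel, []).append(callback)

    def psubscribe(self, pattern, callback):
        self.patterns.setdefault(pattern, []).append(callback)

    def publish(self, channel, message):
        """Deliver to current subscribers only; return how many received it."""
        receivers = 0
        for callback in self.channels.get(channel, []):
            callback(channel, message)
            receivers += 1
        for pattern, callbacks in self.patterns.items():
            if fnmatchcase(channel, pattern):
                for callback in callbacks:
                    callback(channel, message)
                    receivers += 1
        return receivers


broker = TinyBroker()
inbox = []
lost = broker.publish("news.sports", "nobody is listening yet")  # dropped
broker.psubscribe("news.*", lambda ch, msg: inbox.append((ch, msg)))
delivered = broker.publish("news.sports", "goal!")
```

A message published while nobody is subscribed is simply dropped, which is why the pattern is not suitable for reliable off-line notification.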

Page 32: Everything you always wanted to know about Redis but were afraid to ask

Publish / Subscribe

๏ Available since Redis 2.8

‣ Disabled in the default configuration

‣ Key-space vs. key-event notifications

๏ Delay of key expiration events

‣ Expired events are generated when Redis deletes the key; not when the TTL is consumed

- Lazy (i.e. at access time) key expiration

- Background key expiration process

Key-space notifications
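The difference between the two notification kinds is easiest to see in the channel names. The helpers below are illustrative (they are not part of any client library) and only build the channel and message that Redis publishes once `notify-keyspace-events` is enabled in the configuration.

```python
def keyspace_channel(db, key):
    """Channel that receives the names of events affecting one key."""
    return f"__keyspace@{db}__:{key}"


def keyevent_channel(db, event):
    """Channel that receives the names of keys affected by one event type."""
    return f"__keyevent@{db}__:{event}"


def notifications_for(db, key, event):
    """Return the (channel, message) pairs published for a single operation."""
    return [
        (keyspace_channel(db, key), event),
        (keyevent_channel(db, event), key),
    ]


for channel, message in notifications_for(0, "mykey", "expired"):
    print(channel, message)
```

A client interested in every expiration in database 0 would subscribe to the key-event channel; a client tracking a single key would subscribe to its key-space channel.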

Page 33: Everything you always wanted to know about Redis but were afraid to ask

Pipelining

๏ Redis pipelines are just an RTT optimization

‣ Deliver multiple commands together without waiting for replies

‣ Fetch all replies in a single step

- Server needs to buffer all replies!

๏ Pipelines are NOT transactional or atomic

๏ Redis scripting FTW!

‣ Much more flexible alternative
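At the wire level a pipeline is nothing more than several encoded commands written to the socket in one go. The sketch below encodes commands in the Redis protocol (an array of bulk strings) and concatenates them; the function names are illustrative and no network I/O is performed.

```python
def encode_command(*args):
    """Encode one command as a Redis protocol array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def encode_pipeline(commands):
    """Concatenate several commands so they can be sent in a single write."""
    return b"".join(encode_command(*command) for command in commands)


payload = encode_pipeline([("SET", "foo", 42), ("INCR", "foo"), ("GET", "foo")])
print(payload)
```

The client would then read three replies back to back. Nothing here makes the three commands atomic: other clients' commands may be interleaved between them.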

Page 34: Everything you always wanted to know about Redis but were afraid to ask

Transactions

๏ Or, more precisely, “transactions”

‣ Commands are executed as a single atomic, isolated operation

- Partial execution is possible due to pre/post EXEC failures!

‣ Rollback is not supported!

๏ MULTI, EXEC & DISCARD commands

‣ Conditional EXEC with WATCH

๏ Redis scripting FTW!

‣ Redis transactions are complex and cumbersome

Page 35: Everything you always wanted to know about Redis but were afraid to ask

Scripting

๏ Added in Redis 2.6

๏ Uses the Lua 5.1 programming language▸

‣ Base, Table, String, Math & Debug libraries

‣ Built-in support for JSON and MessagePack

‣ No global variables

‣ redis.{call(), pcall()}

‣ redis.{error_reply(), status_reply(), log()}

Overview I

Page 36: Everything you always wanted to know about Redis but were afraid to ask

Scripting

๏ Scripts are atomic, like any other command

๏ Scripts add minimal overhead

‣ Single thread ⇒ Shared Lua context

๏ Scripts are replicated on slaves by sending the script (i.e. not the resulting commands)

‣ Scripts are required to be pure functions

‣ Maximum execution time vs. Atomic execution

Overview II

Page 37: Everything you always wanted to know about Redis but were afraid to ask

Scripting

๏ Server side manipulation of data

๏ Minimizes latency

‣ No round trip delay

๏ Maximizes CPU usage

‣ Less parsing

‣ Fewer OS system calls

๏ Simpler & faster alternative to WATCH

What is fixed with scripting?

Page 38: Everything you always wanted to know about Redis but were afraid to ask

Scripting

๏ Stored procedures are evil

๏ Backend logic should be 100% application side

‣ No hidden behaviors

‣ No crazy version management

๏ Redis keys are explicitly declared as parameters of the script

‣ Cluster friendly

‣ Hashed scripts

Scripts vs. Stored procedures

Page 39: Everything you always wanted to know about Redis but were afraid to ask

Scripting

Hello world!

> EVAL "return redis.call('SET', KEYS[1], ARGV[1])" 1 foo 42
OK
> GET foo
"42"

Page 40: Everything you always wanted to know about Redis but were afraid to ask

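Scripting

DECREMENT-IF-GREATER-THAN

EVAL "
  local res = redis.call('GET', KEYS[1])
  -- GET yields false (not nil) inside Lua when the key does not exist
  if res then
    res = tonumber(res)
    -- Decrement only if the stored value is a number above the threshold
    if res ~= nil and res > tonumber(ARGV[1]) then
      res = redis.call('DECR', KEYS[1])
    end
  end
  return res
" 1 foo 100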

Page 41: Everything you always wanted to know about Redis but were afraid to ask

Scripting

๏ EVALSHA sha1 nkeys key [key…] arg [arg…]

‣ Client libraries optimistically use EVALSHA

- On NOSCRIPT error, EVAL is used

‣ Automatic version management

๏ SCRIPT LOAD script

‣ Cached scripts are not flushed until server restart

‣ Ensures EVALSHA will not fail (e.g. MULTI/EXEC)

Some more commands
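The optimistic EVALSHA strategy can be sketched as follows. The SHA1 digest of the script body is computed exactly as shown; everything else is a stand-in: `FakeServer` is a hypothetical class that only remembers which scripts it has seen, and `LookupError` plays the role of the NOSCRIPT error reply.

```python
import hashlib

SCRIPT = "return redis.call('SET', KEYS[1], ARGV[1])"


def script_sha1(script):
    """Hex SHA1 digest of the script body, as used by EVALSHA."""
    return hashlib.sha1(script.encode("utf-8")).hexdigest()


class FakeServer:
    """Stand-in for a Redis connection; only tracks which scripts are cached."""

    def __init__(self):
        self.cache = set()
        self.calls = []

    def evalsha(self, sha, keys, args):
        self.calls.append("EVALSHA")
        if sha not in self.cache:
            raise LookupError("NOSCRIPT No matching script. Please use EVAL.")
        return "OK"

    def eval(self, script, keys, args):
        self.calls.append("EVAL")
        self.cache.add(script_sha1(script))  # EVAL also caches the script
        return "OK"


def run_script(server, script, keys, args):
    """Try the cheap EVALSHA first; fall back to EVAL when the script is unknown."""
    try:
        return server.evalsha(script_sha1(script), keys, args)
    except LookupError:
        return server.eval(script, keys, args)


server = FakeServer()
run_script(server, SCRIPT, ["foo"], [42])  # EVALSHA fails, EVAL succeeds
run_script(server, SCRIPT, ["foo"], [42])  # EVALSHA succeeds directly
print(server.calls)
```

Because the digest changes whenever the script text changes, deploying a new version of a script needs no explicit version bookkeeping.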

Page 42: Everything you always wanted to know about Redis but were afraid to ask

Dangerous commands

๏ KEYS pattern

๏ SAVE

๏ FLUSHALL & FLUSHDB

๏ CONFIG

Page 43: Everything you always wanted to know about Redis but were afraid to ask

Some examples I

๏ 11 common web use cases solved in Redis▸

๏ How to take advantage of Redis just adding it to your stack▸

๏ A case study: design and implementation of a simple Twitter clone using only PHP and Redis▸

๏ Scaling Crashlytics: building analytics on Redis 2.6▸

Page 44: Everything you always wanted to know about Redis but were afraid to ask

Some examples II

๏ Fast, easy, realtime metrics using Redis bitmaps▸

๏ Redis - NoSQL data store▸

๏ Auto complete with Redis▸

๏ Multi user high performance web chat▸

Page 45: Everything you always wanted to know about Redis but were afraid to ask

III. Mastering Redis

http://www.fotolia.com/id/19245921

Page 46: Everything you always wanted to know about Redis but were afraid to ask

Persistence

๏ The whole dataset needs to fit in memory

‣ Durability is optional

‣ Very high read & write rates

‣ Optimal & simple memory and disk representations

๏ What if Redis runs out of memory?

‣ Swapping ⇒ Performance degradation

‣ Hit maxmemory limit ⇒ Failed writes or eviction policy

Overview

Page 47: Everything you always wanted to know about Redis but were afraid to ask

Persistence

๏ Periodic asynchronous point-in-time dump to disk

‣ Every S seconds if at least C changes were made

‣ Fast service restarts

๏ Possible data loss during a crash

๏ Compact files

๏ Minimal overhead during operation

๏ Huge data sets may experience short delays during fork()

๏ Copy-on-write fork() semantics ⇒ 2x memory problem

Snapshotting — RDB
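The "S seconds and C changes" rule can be read as a list of save points, any of which may trigger a dump. The sketch below models that decision in isolation; the three save points are the ones shipped in the stock redis.conf of the 2.x series (`save 900 1`, `save 300 10`, `save 60 10000`), and `should_snapshot` is an illustrative name, not a Redis function.

```python
# (seconds, changes) pairs: dump if at least `changes` writes
# happened and at least `seconds` passed since the last dump.
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def should_snapshot(seconds_since_last_save, changes_since_last_save,
                    save_points=SAVE_POINTS):
    """Return True when any save point is satisfied."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )


print(should_snapshot(61, 10000))  # busy instance: dump after a minute
print(should_snapshot(899, 1))     # almost idle instance: not yet
```

Everything written after the last successful dump is what can be lost in a crash, so tighter save points trade fork() overhead for a smaller loss window.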

Page 48: Everything you always wanted to know about Redis but were afraid to ask

Persistence

๏ Journal file logging every write operation

‣ Configurable fsync frequency: speed vs. safety

‣ Commands replayed when server restarts

๏ Not as compact as RDB

‣ Safe background AOF file rewrite using fork()

๏ Overhead during operation depends on fsync behavior

๏ Recommended to use both RDB + AOF

‣ RDB is the way to go for backups & disaster recovery

Append only file — AOF

Page 49: Everything you always wanted to know about Redis but were afraid to ask

Security

๏ Designed for trusted clients in trusted environments

‣ No users, no access control, no connection filtering…

๏ Basic unencrypted AUTH command

‣ requirepass s3cr3t

๏ Command renaming

‣ rename-command FLUSHALL f1u5hc0mm4nd

‣ rename-command FLUSHALL ""

Page 50: Everything you always wanted to know about Redis but were afraid to ask

Replication

๏ One master — Multiple slaves

‣ Scalability & redundancy

- Client side failover, eviction, query routing…

‣ Lightweight master

๏ Slaves are able to accept other slave connections

๏ Non-blocking in the master, but blocking on the slaves

๏ Asynchronous but periodically acknowledged

Overview I

Page 51: Everything you always wanted to know about Redis but were afraid to ask

Replication

๏ Automatic slave reconnection

๏ Partial resynchronization: PSYNC vs. SYNC

‣ RDB snapshots are used during initial SYNC

๏ Read-write slaves

‣ slave-read-only no

‣ Ephemeral data storage

๏ Minimum replication factor

Overview II

Page 52: Everything you always wanted to know about Redis but were afraid to ask

Replication

๏ Trivial setup

‣ slaveof <host> <port>

‣ SLAVEOF [<host> <port> | NO ONE]

๏ Some more configuration tips

‣ slave-serve-stale-data [yes|no]

‣ repl-ping-slave-period <seconds>

‣ masterauth <password>

Some commands & configuration

Page 53: Everything you always wanted to know about Redis but were afraid to ask

Replication

๏ Inconsistencies are possible when using some eviction policy in a replicated setup

‣ Set slave’s maxmemory to 0

Final tips

Page 54: Everything you always wanted to know about Redis but were afraid to ask

Performance

๏ Fast CPUs with large caches and not many cores

๏ Do not invest in expensive fast memory modules

๏ Avoid virtual machines

๏ Use UNIX domain sockets when possible

๏ Aggregate commands when possible

๏ Keep the number of client connections low

General tips

Page 55: Everything you always wanted to know about Redis but were afraid to ask

Performance

๏ Special encoding of small aggregate data types

๏ 32 vs. 64 bit instances

๏ Consider using bit & byte level operations

๏ Use hashes when possible

๏ Always check big-O notation complexities

Advanced optimization

Page 56: Everything you always wanted to know about Redis but were afraid to ask

Performance

๏ redis-cli --latency

‣ Typical latency for a 1 Gbit/s network is 200 μs

‣ SLOWLOG GET

‣ Monitor number of client connections and consider using multiplexing proxy

‣ Improve memory management

Understanding metrics I

Page 57: Everything you always wanted to know about Redis but were afraid to ask

Performance

๏ redis-cli INFO | grep …

๏ used_memory

‣ Usually lower than used_memory_rss

- Used memory as seen by the OS

‣ Swapping risk when approaching 45% / 95%

‣ Reduce Redis footprint when possible

Understanding metrics II

Page 58: Everything you always wanted to know about Redis but were afraid to ask

Performance

๏ total_commands_processed

‣ Use multi-argument commands, scripts and pipelines when possible

๏ mem_fragmentation_ratio

‣ used_memory_rss ÷ used_memory

‣ Execute SHUTDOWN SAVE and restart the instance

‣ Consider alternative memory allocators

Understanding metrics III
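As a small worked example of the ratio above, the sketch below parses INFO-style `name:value` lines and derives the fragmentation ratio. The sample text and its numbers are made up for illustration; real output would come from `redis-cli INFO`.

```python
def parse_info(text):
    """Parse 'name:value' lines, skipping blank lines and '# Section' headers."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        name, _, value = line.partition(":")
        info[name] = value
    return info


# Hypothetical sample, not taken from a real instance.
SAMPLE = """# Memory
used_memory:1048576
used_memory_rss:1572864
"""

info = parse_info(SAMPLE)
ratio = int(info["used_memory_rss"]) / int(info["used_memory"])
print(f"mem_fragmentation_ratio = {ratio:.2f}")
```

A ratio well above 1 suggests fragmentation (the OS holds more memory than Redis is using); a ratio below 1 suggests part of the dataset has been swapped out.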

Page 59: Everything you always wanted to know about Redis but were afraid to ask

Performance

๏ evicted_keys

‣ Keys removed when hitting maxmemory limit

‣ Increase maxmemory when possible

‣ Reduce Redis footprint when possible

‣ Consider sharding

Understanding metrics IV

Page 60: Everything you always wanted to know about Redis but were afraid to ask

Redis pools

๏ Redis has an extremely small footprint and is lightweight

๏ Multiple Redis instances per node

‣ Full CPU usage

‣ Mitigated RDB 2x memory problem

‣ Fine tuned instances

๏ How to use multiple instances?

‣ Sharding

‣ Specialized instances

Page 61: Everything you always wanted to know about Redis but were afraid to ask

Redis Sentinel

๏ Official Redis HA / failover solution

‣ Periodically check liveness of Redis instances

‣ On master failure, choose slave & promote to master

‣ Notify clients & slaves about the new master

๏ Multiple Sentinels

‣ Complex distributed system

‣ Gossip, quorum & leader election algorithms

Overview I

Page 62: Everything you always wanted to know about Redis but were afraid to ask

Redis Sentinel

๏ Work in progress, not ready for production

๏ Master pub/sub capabilities

‣ Auto discovery of other sentinels & slaves

‣ Notification of master failover

๏ Explicit client support required

๏ Redis Sentinel is a monitoring system with support for automatic failover. It does not turn Redis into a distributed data store. CAP discussions do not apply▸

Overview II

Page 63: Everything you always wanted to know about Redis but were afraid to ask

Redis Sentinel

๏ Set of primitives to ease building distributed systems

‣ http://zookeeper.apache.org

‣ Handling of network partitions, leader election, quorum management…

‣ Replicated, highly available, well-known…

๏ Ad-hoc Redis HA alternative to Sentinel

‣ Explicit client implementation required

Apache Zookeeper

Page 64: Everything you always wanted to know about Redis but were afraid to ask

Redis Cluster

๏ Long term project to be released in Redis 3.0

๏ High performance & linearly scalable complex distributed DB

‣ Sharding across multiple nodes

‣ Graceful handling of network partitions

๏ Implemented subset

‣ Commands dealing with multiple keys, etc. not supported

‣ Multiple databases are not supported

๏ Key hash tags
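Hash tags let related keys land on the same node, which is what makes some multi-key operations possible again. The sketch below follows the rule described in the Redis Cluster specification (an assumption here, since the slides give no details): only the text between the first `{` and the next `}` is hashed, using CRC16 modulo 16384 slots. Function names are illustrative.

```python
def crc16(data):
    """CRC16 (XMODEM variant: polynomial 0x1021, initial value 0)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def hashed_part(key):
    """Return the hash tag content if the key has a non-empty one, else the key."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:  # at least one character between the braces
            return key[start + 1:end]
    return key


def key_slot(key, slots=16384):
    return crc16(hashed_part(key).encode()) % slots


print(key_slot("{user1000}.following") == key_slot("{user1000}.followers"))
```

Both keys share the tag `user1000`, so they map to the same slot and therefore to the same node.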

Page 65: Everything you always wanted to know about Redis but were afraid to ask

Sharding

๏ Distribute data into multiple Redis instances

‣ Allows much larger databases

‣ Allows scaling the computational power

๏ Data distribution strategies

‣ Directory based

‣ Ranges

‣ Hash + Modulo

‣ Consistent hashing

Overview I
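The practical difference between "hash + modulo" and consistent hashing shows up when a node is added. The sketch below is a simplified model under stated assumptions: MD5 as the hash, 100 virtual points per node, and made-up node names. Real clients and proxies use their own hash functions and ring layouts.

```python
import bisect
import hashlib


def h(value):
    """64-bit integer hash derived from MD5 (an arbitrary choice for this sketch)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


def modulo_node(key, nodes):
    """Hash + modulo: the node index depends on the number of nodes."""
    return nodes[h(key) % len(nodes)]


class Ring:
    """Consistent hashing ring with several virtual points per node."""

    def __init__(self, nodes, replicas=100):
        self.points = sorted((h(f"{node}#{i}"), node)
                             for node in nodes for i in range(replicas))
        self.hashes = [point for point, _ in self.points]

    def node(self, key):
        # First point clockwise from the key's hash, wrapping around at the end.
        index = bisect.bisect_right(self.hashes, h(key)) % len(self.hashes)
        return self.points[index][1]


keys = [f"user:{i}" for i in range(1000)]
old_nodes = ["redis1", "redis2", "redis3", "redis4"]
new_nodes = old_nodes + ["redis5"]

moved_modulo = sum(modulo_node(k, old_nodes) != modulo_node(k, new_nodes) for k in keys)
old_ring, new_ring = Ring(old_nodes), Ring(new_nodes)
moved_ring = [k for k in keys if old_ring.node(k) != new_ring.node(k)]

print(f"modulo: {moved_modulo} keys moved, ring: {len(moved_ring)} keys moved")
```

With modulo, most keys change node when the node count changes; with the ring, only the keys claimed by the new node move, and they all move to that node.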

Page 66: Everything you always wanted to know about Redis but were afraid to ask

Sharding

๏ Data distribution responsibility

‣ Client side

‣ Proxy assisted

‣ Query routing

๏ Do I really need sharding?

‣ CPU is very unlikely to become the bottleneck with Redis

‣ 500K requests per second!

Overview II

Page 67: Everything you always wanted to know about Redis but were afraid to ask

Sharding

๏ Multi-key commands are not supported

๏ Multi-key transactions are not supported

๏ Sharding unit is the key

๏ More complex client logic

๏ Complex to scale up/down when used as a store

Disadvantages

Page 68: Everything you always wanted to know about Redis but were afraid to ask

Sharding

๏ Hard to scale up/down sharded databases

‣ But data storage needs may vary over time

๏ Take advantage of small Redis footprint

‣ Think big!

๏ Redis replication allows moving instances with minimal downtime

Presharding

Page 69: Everything you always wanted to know about Redis but were afraid to ask

Sharding

๏ Redis Cluster is currently not production ready

‣ Mix between query routing & client side partitioning

๏ Not all Redis clients support sharding

๏ Twemproxy: automatic sharding proxy for Redis & Memcache (ASCII protocol)

‣ Developed by Twitter & Apache 2.0 licensed

‣ https://github.com/twitter/twemproxy/

‣ Single threaded & extremely fast

Twemproxy overview I

Page 70: Everything you always wanted to know about Redis but were afraid to ask

Sharding

๏ Also known as nutcracker

๏ Connection multiplexer pipelining requests and responses

‣ Original motivation

๏ No bottleneck or single point of failure

๏ Optional node ejection

‣ Only useful when using Redis as a cache

Twemproxy overview II

Page 71: Everything you always wanted to know about Redis but were afraid to ask

Sharding

๏ Multiplexed persistent server connections

๏ Automatic sharding and protocol pipelining

๏ Multiple distribution algorithms, with support for node nicknames

๏ Simple dumb clients

๏ Automatic fault tolerance capabilities

๏ Zero copy

Why Twemproxy?

Page 72: Everything you always wanted to know about Redis but were afraid to ask

Sharding

๏ Extra network hop

‣ Pipelining is your friend

๏ Not all commands supported

‣ Transactions

‣ Pub / Sub

๏ HA not supported

‣ Redis Sentinel Twemproxy agent

Why not Twemproxy?

Page 73: Everything you always wanted to know about Redis but were afraid to ask

Thanks!

http://www.flickr.com/photos/62337512@N00/3958637561/