Couchbase 101: Couchbase Connect 2014

Couchbase 101 Dipti Borkar | Sr. Director | Solutions Engineering Couchbase

Upload: couchbase

Post on 09-Jul-2015


Category:

Data & Analytics



TRANSCRIPT

Page 1: Couchbase 101: Couchbase Connect 2014

Couchbase 101

Dipti Borkar | Sr. Director | Solutions Engineering

Couchbase

Page 2: Couchbase 101: Couchbase Connect 2014

What is Couchbase?

Page 3: Couchbase 101: Couchbase Connect 2014

Couchbase Server

©2014 Couchbase, Inc. 3

The most complete, scalable and highest-performing NoSQL database

• General purpose
• Elastic scalability
• Consistent high performance
• Always available
• Flexible, global deployment
• Enterprise-grade administration
• Real-time big data
• Data mobility
• Developer focused

Page 4: Couchbase 101: Couchbase Connect 2014

Couchbase Server

Couchbase offers a full range of data management solutions:

• High Availability Cache
• Key Value
• Document
• Mobile device

Page 5: Couchbase 101: Couchbase Connect 2014

Key Capabilities

• Developer focused
  - JSON support
  - Indexing/Querying
  - Incremental Map-Reduce
• Elastic Scalability
  - Shared-nothing architecture with single node type
  - Cross-data center replication (XDCR)
  - Push button scale out
• Consistent High Performance
  - Built-in object-level cache
  - Fine-grained locking
  - Hash partitioning
• Always available
  - Zero-downtime administration and upgrades
  - Streaming and rack-aware replication
  - Comprehensive cluster-wide monitoring

Page 6: Couchbase 101: Couchbase Connect 2014

Couchbase Server 3.0


Page 7: Couchbase 101: Couchbase Connect 2014

Cluster-wide Diagnostics Tool

Improved Crash Reporting

Better Serviceability

Improved Monitoring for Warm-up

Stats Enhancements

Improved Checkpointing with XDCR

Expanded Options with Couchbase-cli

Parallelized Warm-up

Database Change Protocol (DCP)

Increased Connection Limits

Extended Documentation

Enhanced Event Logging

Extended XDCR Resiliency

Better failover resiliency with DCP

Enhanced SSD Performance

Streamlined Build

Side by side support for DCP and TAP

Improved Error Reporting for Apps

Built-in OS Tuning for Linux Flavors

… and More

Improved Resume-ability with Intra Cluster Replication

View Engine rewrite in C

Web.Config & App.Config support with .NET in SDK 2.0

Unified New App Model with 2.0 SDK

Java SDK 2.0 built on top of RxJava

N1QL Preview Support in the 2.0 SDKs

Faster Warm-up time under Metadata Ejection

Access log for monitoring port 8091

CRAM & MD5 Support in .Net SDK

Client Side Log4Net Integration in .Net SDK

New Cluster object for Cluster Operations in SDK 2.0

Replica Read in SDK 2.0 with .Net and PHP

Added Support for Debian v7

Faster View indexing

Faster XDCR Synchronization

Faster ReplicateTo

Faster PersistTo

Faster Rebalance

Encrypted Data Access

Encrypted Admin Access

Encrypted View Access

Graceful Failover

Delta Node Recovery

Incremental Backup and Restore

Lower Latency XDCR

Auto tuning IO subsystem

Additional Configuration on .Net pooling interval

New repository for Yum for Ubuntu

Community Edition of 3.0 Release with Enterprise Edition

Page 8: Couchbase 101: Couchbase Connect 2014

Key Concepts


Page 9: Couchbase 101: Couchbase Connect 2014

Couchbase can act as a Key-Value Store or a Document Store

Key-Value Store:

2014-06-23-10:15am : 75F
2014-06-23-11:30am : 77F
2014-06-23-02:00pm : 82F

Document Store:

0001:
{
  "firstname": "Dipti",
  "lastname": "Borkar",
  "language": "English",
  "time_zone": "PST",
  "zip": 94403
}

Key - UTF-8 string up to 250 bytes

Value - can be 0 bytes to 20 MB (best practice < 1 MB)
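The key and value limits on this slide can be checked on the client before a write is attempted. The following is a minimal sketch, not part of any Couchbase SDK: the function name `check_item` is hypothetical, and treating 1 MB as 2**20 bytes is an assumption.

```python
MAX_KEY_BYTES = 250                      # key limit from the slide
MAX_VALUE_BYTES = 20 * 1024 * 1024       # 20 MB limit (assuming 1 MB = 2**20 bytes)
RECOMMENDED_VALUE_BYTES = 1024 * 1024    # best practice from the slide: < 1 MB


def check_item(key, value):
    """Return a list of problems for a key/value pair; empty means it fits."""
    problems = []
    key_len = len(key.encode("utf-8"))   # the limit is in bytes, not characters
    if key_len > MAX_KEY_BYTES:
        problems.append("key is %d bytes, limit is %d" % (key_len, MAX_KEY_BYTES))
    if len(value) > MAX_VALUE_BYTES:
        problems.append("value exceeds 20 MB")
    elif len(value) >= RECOMMENDED_VALUE_BYTES:
        problems.append("warning: value is 1 MB or larger")
    return problems


ok = check_item("0001", b'{"firstname": "Dipti"}')
too_long = check_item("\u00e9" * 126, b"")   # 126 characters, but 252 UTF-8 bytes
```

Measuring the encoded key matters because a key with non-ASCII characters can exceed 250 bytes while staying under 250 characters.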

Page 10: Couchbase 101: Couchbase Connect 2014

Fundamentals

Document ID or Key
• Similar to primary keys in relational databases
• Documents are sharded based on the document ID
• ID-based document lookup is extremely fast
• Must be unique

Value
• JSON
• Binary - integers, strings, booleans
• Common binary values include serialized objects, compressed XML, compressed text

Metadata
• CAS value (unique identifier for concurrency)
• TTL
• Flags (optional client library metadata)
• Revision #
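The CAS value in the metadata is what makes optimistic concurrency possible: a writer passes back the CAS it read, and the write is rejected if the document changed in between. The sketch below illustrates the idea with an in-memory stand-in; `TinyStore` and `CasMismatch` are hypothetical names, not the Couchbase SDK API.

```python
import itertools


class CasMismatch(Exception):
    """Raised when the caller's CAS value no longer matches the stored one."""


class TinyStore:
    """In-memory stand-in for a bucket, for illustration only."""

    def __init__(self):
        self._items = {}                      # key -> (value, cas)
        self._next_cas = itertools.count(1)   # assumption: CAS is a simple counter

    def get(self, key):
        return self._items[key]               # returns (value, cas)

    def set(self, key, value, cas=None):
        current = self._items.get(key)
        if cas is not None and (current is None or current[1] != cas):
            raise CasMismatch(key)
        new_cas = next(self._next_cas)        # every successful write gets a new CAS
        self._items[key] = (value, new_cas)
        return new_cas


store = TinyStore()
store.set("0001", {"visits": 1})
value, cas = store.get("0001")
store.set("0001", {"visits": 2}, cas=cas)        # succeeds, CAS changes
try:
    store.set("0001", {"visits": 99}, cas=cas)   # stale CAS from the earlier read
    stale_rejected = False
except CasMismatch:
    stale_rejected = True
```

A real application would typically re-read the document and retry when the CAS check fails.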

Page 11: Couchbase 101: Couchbase Connect 2014

Benefits of JSON

• Can represent complex objects and data structures
• Very simple notation: lightweight, compact, readable
• The most common API return type for integrations
  - Facebook, Twitter, you name it, return JSON
• Native to JavaScript (can be useful)
• Can be inserted straight into Couchbase (faster development)
• Serialization and deserialization are very fast

Page 12: Couchbase 101: Couchbase Connect 2014

Storing and retrieving documents

• Clients read from / write to Documents (user/application data)
• Documents are stored in Data Buckets
• Data Buckets live on Server Nodes
• Server Nodes form a Couchbase Cluster

Servers
• Dynamically scalable
• Based on hash partitioning

Page 13: Couchbase 101: Couchbase Connect 2014

Objects Serialized to JSON and Back

User Object
  string uid
  string firstname
  string lastname
  int age
  array favorite_colors
  string email

add() stores the object as a JSON document; get() reads it back:

u::[email protected]
{
  "uid": 123456,
  "firstname": "John",
  "lastname": "Smith",
  "age": 22,
  "favorite_colors": ["blue", "black"],
  "email": "[email protected]"
}
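The round trip on this slide can be sketched in a few lines. This is an illustration only: the `bucket` dictionary stands in for a data bucket, `add` and `get` are local helper functions rather than SDK calls, and the email address `john@example.com` is a made-up placeholder because the address is obscured in the transcript. The sketch follows the field list and stores `uid` as a string.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class User:
    uid: str
    firstname: str
    lastname: str
    age: int
    favorite_colors: list
    email: str


bucket = {}   # hypothetical in-memory stand-in for a data bucket


def add(key, user):
    """Serialize the object to JSON and store it; fail if the key exists."""
    if key in bucket:
        raise KeyError("%s already exists" % key)
    bucket[key] = json.dumps(asdict(user))


def get(key):
    """Fetch the JSON document and rebuild the object."""
    return User(**json.loads(bucket[key]))


john = User("123456", "John", "Smith", 22, ["blue", "black"], "john@example.com")
add("u::" + john.email, john)
restored = get("u::john@example.com")
```

Prefixing the key with `u::` mirrors the key pattern shown on the slide and keeps user documents distinguishable from other document types in the same bucket.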

Page 14: Couchbase 101: Couchbase Connect 2014

Within each server – Single Node Type

[Diagram: components inside a single Couchbase Server node]

Cluster Manager (Erlang/OTP)
• On each node: Heartbeat, Process monitor, Global singleton supervisor, Configuration manager
• One per cluster: Rebalance orchestrator, Node health monitor, vBucket state and replication manager
• HTTP REST management API / Web UI
• Ports: HTTP 8091, Erlang port mapper 4369, Distributed Erlang 21100 - 21199

Data Manager
• Query Engine: Query API on port 8092
• Moxi: port 11211 (Memcapable 1.0)
• Memcached: port 11210 (Memcapable 2.0)
• Couchbase EP Engine
• Storage interface
• Persistence Layer

Page 15: Couchbase 101: Couchbase Connect 2014

Couchbase Basic Operations


Page 16: Couchbase 101: Couchbase Connect 2014

Single Node Operations - Write

[Diagram: the App Server writes a Doc into the node's Managed Cache. From there the Doc goes to the Replication Queue (memory-to-memory replication to another node) and to the Disk Queue, which persists it to Disk.]
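The write path in this diagram can be modeled with two queues behind an in-memory cache. The sketch below is a toy model under stated assumptions: the `Node` class and its method names are invented for illustration, and acknowledging the client right after the in-memory write is an assumption of the sketch, not something the slide states.

```python
from collections import deque


class Node:
    """Toy model of one node's write path; all names are illustrative."""

    def __init__(self):
        self.cache = {}                    # managed cache (RAM)
        self.disk = {}                     # persisted documents
        self.disk_queue = deque()
        self.replication_queue = deque()

    def write(self, key, doc):
        self.cache[key] = doc                  # the write lands in memory first
        self.replication_queue.append(key)     # queued for replication to another node
        self.disk_queue.append(key)            # queued for persistence to disk
        return "ack"                           # assumed: acknowledged after the RAM write

    def drain(self, replica):
        """Process both queues, as background workers would."""
        while self.replication_queue:
            key = self.replication_queue.popleft()
            replica.cache[key] = self.cache[key]
        while self.disk_queue:
            key = self.disk_queue.popleft()
            self.disk[key] = self.cache[key]


primary, replica = Node(), Node()
status = primary.write("doc1", {"n": 1})
pending = (len(primary.disk_queue), len(primary.replication_queue))
primary.drain(replica)
```

Separating the acknowledgement from the queue processing is what keeps write latency independent of disk and network speed in this model.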

Page 17: Couchbase 101: Couchbase Connect 2014

Single Node Operations - Read

[Diagram: the App Server issues "Get Doc 1"; Doc 1 is already in the Managed Cache and is returned from memory. The Disk Queue and Replication Queue (memory-to-memory replication to another node) are not involved in the read.]

Page 18: Couchbase 101: Couchbase Connect 2014

Single Node Operations – Cache Ejection

[Diagram: the App Server writes Doc 2 through Doc 6, which flow through the Managed Cache to Disk. Doc 1, already on Disk, is ejected from the Managed Cache to make room.]

Page 19: Couchbase 101: Couchbase Connect 2014

Single Node Operations – Cache Miss

[Diagram: the App Server issues "Get Doc 1", which is no longer in the Managed Cache. Doc 1 is fetched from Disk, loaded back into the Managed Cache, and returned to the App Server.]
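Cache ejection and cache misses can be illustrated together with a small bounded cache in front of a dictionary that plays the role of the disk. This is a sketch only: the `ManagedCache` class is hypothetical, it persists synchronously for brevity, and it ejects the least recently used document, which is an assumption and not necessarily the policy the server uses.

```python
from collections import OrderedDict


class ManagedCache:
    """Bounded in-memory cache in front of a dict that stands in for disk."""

    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk
        self.ram = OrderedDict()
        self.misses = 0

    def put(self, key, doc):
        self.disk[key] = doc          # simplification: persist immediately
        self._load(key, doc)

    def get(self, key):
        if key in self.ram:           # cache hit: served from memory
            self.ram.move_to_end(key)
            return self.ram[key]
        self.misses += 1              # cache miss: fetch from disk and reload
        doc = self.disk[key]
        self._load(key, doc)
        return doc

    def _load(self, key, doc):
        self.ram[key] = doc
        self.ram.move_to_end(key)
        while len(self.ram) > self.capacity:
            self.ram.popitem(last=False)   # eject the least recently used doc


cache = ManagedCache(capacity=5, disk={})
for i in range(1, 7):
    cache.put("doc%d" % i, {"id": i})
ejected = "doc1" not in cache.ram      # doc1 was pushed out by docs 2 to 6
doc1 = cache.get("doc1")               # miss: read from disk, back into RAM
```

An ejected document is never lost in this model because it is only removed from memory after it exists on disk.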

Page 20: Couchbase 101: Couchbase Connect 2014

Cluster-wide Operations


Page 21: Couchbase 101: Couchbase Connect 2014

Auto sharding – Buckets and vBuckets

• Each bucket has active and replica data sets
• Each data set has 1024 virtual buckets (vBuckets)
• Documents get logically mapped to vBuckets
• A document ID always hashes to the same vBucket
• vBuckets do not have a fixed physical server location
• The mapping between vBuckets and physical servers is called the cluster map
• Each vBucket contains 1/1024th of the data set

[Diagram: data buckets, each divided into virtual buckets vB 1 … 1024]
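The two-step lookup described above (document ID to vBucket, then vBucket to server through the cluster map) can be sketched as follows. CRC32 modulo 1024 is an assumption chosen for illustration; the exact hash used by the client libraries may differ. The server names and the round-robin initial layout are also made up.

```python
import zlib

NUM_VBUCKETS = 1024


def vbucket_for(key):
    """Hash a document ID to a vBucket number (CRC32 is an illustrative choice)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


# Hypothetical cluster map: index = vBucket number, value = owning server.
servers = ["A", "B", "C"]
cluster_map = [servers[vb % len(servers)] for vb in range(NUM_VBUCKETS)]


def server_for(key):
    """Resolve a document ID to the server that currently owns its vBucket."""
    return cluster_map[vbucket_for(key)]


vb = vbucket_for("0001")
owner = server_for("0001")
```

Because the key-to-vBucket step never changes, moving data between servers only requires updating `cluster_map`; no document ever needs to be rehashed.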

Page 22: Couchbase 101: Couchbase Connect 2014

Cluster Map

[Diagram: a hash function applied to the KEY selects one of the logical partitions vB1 … vB6. The Cluster Map assigns each logical partition to one of the physical servers A, B, C. When more scalability is required, a node is added and a New Cluster Map redistributes the partitions.]

Page 23: Couchbase 101: Couchbase Connect 2014

Multi-Node Operations

[Diagram: App Server 1 and App Server 2, each with a Couchbase client library holding the cluster map, read/write/update Shards on Servers 1 to 3. Each server holds three active Shards and three replica Shards; the replica of a Shard lives on a different server than its active copy.]

• Docs distributed evenly across servers
• Each server stores both active and replica docs
  - Only one server is active for a given doc at a time
• Client library provides app with simple interface to database
• Cluster map provides map to which server doc is on
  - App never needs to know
• App reads, writes, updates docs
• Multiple app servers can access same document at same time

Page 24: Couchbase 101: Couchbase Connect 2014

Adding Nodes

[Diagram: Servers 4 and 5 join Servers 1 to 3. Some active and replica Shards move to the new servers, and both app servers read/write/update using the updated cluster map.]

• Two servers added with one-click operation
• Docs automatically rebalance across cluster
  - Even distribution of docs
  - Minimum doc movement
• Cluster map updated
• App database calls now distributed over larger number of servers
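The two rebalance goals on this slide, even distribution and little data movement, can be illustrated with a simple algorithm that only moves vBuckets away from servers holding more than their fair share. This is a sketch of the idea, not the rebalance algorithm Couchbase Server uses; the function name and the server labels are hypothetical.

```python
from collections import Counter


def rebalance(cluster_map, servers):
    """Return a new vBucket-to-server map that is even across `servers`,
    moving vBuckets only off servers that hold more than their target."""
    n, k = len(cluster_map), len(servers)
    # Target share: floor(n / k), with the first n % k servers taking one extra.
    target = {s: n // k + (1 if i < n % k else 0) for i, s in enumerate(servers)}
    counts = Counter(cluster_map)
    # One slot per vBucket that an under-target server still needs.
    receivers = [s for s in servers for _ in range(max(0, target[s] - counts[s]))]
    new_map = list(cluster_map)
    for vb, owner in enumerate(cluster_map):
        if counts[owner] > target.get(owner, 0) and receivers:
            dest = receivers.pop()
            new_map[vb] = dest
            counts[owner] -= 1
            counts[dest] += 1
    return new_map


old_servers = ["A", "B", "C"]
old_map = [old_servers[vb % 3] for vb in range(1024)]
new_map = rebalance(old_map, ["A", "B", "C", "D", "E"])
moved = sum(1 for a, b in zip(old_map, new_map) if a != b)
spread = Counter(new_map)
```

In this example only the vBuckets needed to fill the two new servers change owner; everything else stays where it was, which is the behavior the "minimum doc movement" bullet describes.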

Page 25: Couchbase 101: Couchbase Connect 2014

Failover

[Diagram: Server 3 in a five-server cluster fails. The replica copies of its active Shards, held on the surviving servers, are promoted to active, and both app servers receive the updated cluster map.]

• App servers accessing Shards
• Requests to Server 3 fail
• Cluster detects server failed
  - Promotes replicas of Shards to active
  - Updates cluster map
• Requests for docs now go to appropriate server
• Typically rebalance would follow
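The failover steps listed above amount to rewriting the cluster map so that each shard whose active copy was on the failed server now points at its replica. The sketch below uses the three-server shard layout from the Multi-Node Operations slide; the function name and the `S1`/`S2`/`S3` labels are illustrative.

```python
def fail_over(active_map, replica_map, failed):
    """Promote the replica of every shard whose active copy was on `failed`."""
    new_active = dict(active_map)
    new_replica = dict(replica_map)
    for shard, owner in active_map.items():
        if owner == failed:
            new_active[shard] = replica_map[shard]   # replica becomes active
            new_replica[shard] = None                # no replica until a rebalance
    for shard, owner in replica_map.items():
        if owner == failed:
            new_replica[shard] = None                # replica copy was lost too
    return new_active, new_replica


# Shard number -> server holding the active / replica copy.
active = {5: "S1", 2: "S1", 9: "S1", 4: "S2", 7: "S2", 8: "S2", 1: "S3", 3: "S3", 6: "S3"}
replica = {4: "S1", 1: "S1", 8: "S1", 6: "S2", 3: "S2", 2: "S2", 7: "S3", 9: "S3", 5: "S3"}

new_active, new_replica = fail_over(active, replica, "S3")
```

After the call, six shards have no replica at all, which is one way to see why the slide notes that a rebalance typically follows a failover.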

Page 26: Couchbase 101: Couchbase Connect 2014

Cross Datacenter Replication (XDCR)

[Diagram: world map showing a US data center, a Europe data center, and an Asia data center. Image source: http://blog.groosy.com/wp-content/uploads/2011/10/internet-map.jpg]

Page 27: Couchbase 101: Couchbase Connect 2014

Cross Datacenter Replication (XDCR)


• Replicates data continuously FROM source cluster TO remote clusters

• Supports unidirectional and bidirectional operation

• Application can read and write from both clusters (active – active replication)

• Replication throughput scales out linearly

• Simplified Administration via console, REST, and CLI

Page 28: Couchbase 101: Couchbase Connect 2014

Cross Datacenter Replication (XDCR)


Unidirectional Replication

• Hot spare / Disaster Recovery

• Development/Testing copies

• Replicate to indexing cluster

• Integrate with a connector, e.g. Solr, ElasticSearch

• Integrate with a custom consumer

Page 29: Couchbase 101: Couchbase Connect 2014

Cross Datacenter Replication (XDCR)


Bidirectional Replication

• Multiple Active Masters

• Data locality

• Disaster Recovery

Page 30: Couchbase 101: Couchbase Connect 2014

XDCR after Write

[Diagram: on a Couchbase Server node, the App Server writes Doc 1 into the Managed Cache. Doc 1 is then placed on the Replication Queue (memory-to-memory replication to another node), the Disk Queue (persistence to Disk), and the XDCR Queue (memory-to-memory replication to the remote cluster, new in 3.0).]

Page 31: Couchbase 101: Couchbase Connect 2014

Cross Data Center Replication (XDCR)

[Diagram: two Couchbase Server clusters, one in the NYC data center and one in the SF data center. Each cluster has three active servers holding Docs in RAM and on DISK, and documents are replicated between the two clusters.]

Page 32: Couchbase 101: Couchbase Connect 2014

Indexing and Querying Features

Index and Query
• Distributed indexing and querying
• Secondary indexes of JSON document content
• Flexible querying of indexes

Incremental Map-Reduce
• Distributed simple real-time analytics
• Only considers changes due to updated data

Full Text Search
• Robust integration with ElasticSearch / Solr cluster
• Flexible full text search and faceted search
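The phrase "only considers changes due to updated data" can be made concrete with a toy incremental map-reduce. When a document changes, the sketch below removes that document's previous contribution and adds its new one, without touching any other document. The `IncrementalCount` class and the `colors` map function are hypothetical; Couchbase views define their map and reduce functions in JavaScript, which is not shown here.

```python
from collections import Counter


class IncrementalCount:
    """Toy incremental map-reduce that counts documents per emitted key."""

    def __init__(self, map_fn):
        self.map_fn = map_fn
        self.emitted = {}          # doc_id -> keys emitted the last time it was mapped
        self.totals = Counter()    # the reduced result
        self.map_calls = 0

    def update(self, doc_id, doc):
        # Undo the old contribution of this one document...
        for key in self.emitted.get(doc_id, []):
            self.totals[key] -= 1
        # ...then map only the changed document and add its new contribution.
        keys = list(self.map_fn(doc))
        self.map_calls += 1
        for key in keys:
            self.totals[key] += 1
        self.emitted[doc_id] = keys


def colors(doc):
    """Map function: emit one key per favorite color."""
    for color in doc.get("favorite_colors", []):
        yield color


view = IncrementalCount(colors)
view.update("u1", {"favorite_colors": ["blue", "black"]})
view.update("u2", {"favorite_colors": ["blue"]})
view.update("u1", {"favorite_colors": ["red"]})   # only u1 is re-mapped
```

Three updates cause exactly three map calls, regardless of how many documents are already indexed.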

Page 33: Couchbase 101: Couchbase Connect 2014

View processing after write

[Diagram: on a Couchbase Server node, the App Server writes Doc 1 into the Managed Cache. Doc 1 goes to the Replication Queue (to other node) and through the Disk Queue to Disk, and is also passed to the View engine for indexing.]

Page 34: Couchbase 101: Couchbase Connect 2014

Couchbase Server Architecture - Views

[Diagram: App Server 1 and App Server 2, each with a Couchbase client library and cluster map, query Servers 1 to 3; each server holds active and replica Shards.]

• Indexing work is distributed amongst nodes
• Large data set possible
• Parallelize the effort
• Each node has index for data stored on it
• Queries combine the results from required nodes
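Since each node indexes only its own data, a view query is a scatter-gather: every node returns its sorted partial result and the partials are merged. The sketch below shows that merge step with made-up per-node index rows; the node names and the `query` helper are hypothetical.

```python
import heapq

# Hypothetical per-node indexes: sorted (key, doc_id) rows for locally stored docs.
node_indexes = {
    "server1": [("black", "u1"), ("red", "u7")],
    "server2": [("blue", "u2"), ("green", "u5")],
    "server3": [("blue", "u3")],
}


def query(start, end):
    """Ask every node for rows in [start, end], then merge the sorted partials."""
    partials = [
        [row for row in rows if start <= row[0] <= end]
        for rows in node_indexes.values()
    ]
    return list(heapq.merge(*partials))


result = query("blue", "green")
```

`heapq.merge` relies on each partial result already being sorted, which holds here because every node keeps its own index in key order.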

Page 35: Couchbase 101: Couchbase Connect 2014

Live cluster


Page 36: Couchbase 101: Couchbase Connect 2014

Q & A


Page 37: Couchbase 101: Couchbase Connect 2014

[email protected]

@dborkar
