Openstack AWS Cassandra - Cassandra Boston Users Group

Thomas Vachon, September 2012: Cassandra on Openstack and AWS, presenting to the Boston Cassandra Users Group

Upload: tomvachon

Post on 04-Jun-2015

DESCRIPTION

Presentation given on 9/19/2012 to the Cassandra Boston Users Group

TRANSCRIPT

Page 1: Openstack AWS Cassandra - Cassandra Boston Users Group

Cassandra on Openstack and AWS

Thomas Vachon • September 2012

Presenting to the Boston Cassandra Users Group

Page 2: Openstack AWS Cassandra - Cassandra Boston Users Group

Agenda

• Current Openstack Implementation
• Running Cassandra on Openstack
• Lessons Learned about Cassandra on AWS
• Connecting Openstack and AWS
• Connecting Cassandra on Openstack and AWS
• Questions
• Show & Tell

Page 3: Openstack AWS Cassandra - Cassandra Boston Users Group

Current Cassandra Implementation (AWS)

• 9 Cassandra nodes (3 per AZ)
• Cassandra 1.0.10
• AWS m1.xlarge – 4-drive RAID-0 array
• Ec2Snitch
• RF = 3
• Network topology aware
• Statistics
  • Peak traffic: 724 reads/s and 1,308 writes/s across the cluster
  • 3.5 ms average read latency
  • 1.7 ms average write latency
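As a rough back-of-the-envelope check on the peak figures above, the sketch below estimates per-node load. It assumes perfectly even balance across the 9 nodes and that every client write is applied on all RF = 3 replicas; the function names are hypothetical. Replica-level read load is left out because it depends on the read consistency level, which the slides do not state.

```python
# Figures taken from the slide; the even-balance assumption is ours.
NODES = 9
RF = 3
PEAK_READS_PER_SEC = 724
PEAK_WRITES_PER_SEC = 1308


def replica_writes_per_node(client_writes, rf, nodes):
    """Each client write lands on `rf` replicas, spread over `nodes` nodes."""
    return client_writes * rf / nodes


def client_ops_per_node(client_ops, nodes):
    """Client requests each node coordinates if clients spread load evenly."""
    return client_ops / nodes


if __name__ == "__main__":
    print(replica_writes_per_node(PEAK_WRITES_PER_SEC, RF, NODES))  # 436.0
    print(round(client_ops_per_node(PEAK_READS_PER_SEC, NODES), 1))
```

Under these assumptions each node applies about 436 replica writes per second at peak, which is a useful baseline when comparing against the Openstack test numbers.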

Page 4: Openstack AWS Cassandra - Cassandra Boston Users Group

Running Cassandra on Openstack

• Ec2Snitch doesn’t work (looks to the wrong endpoint)
• It’s hard to guarantee you keep your instances on separate machines with a single zone
• CPU contention/steal happens more easily because of KVM and the lack of CPU throttling
• As always, the faster the hardware, the better the performance
• Perf test: 5 Cassandra nodes with RF=3 (cassandra-stress)
  • Reads/s: 1,562
  • Writes/s: 3,846
  • Avg latency per op: 7.2 ms
  • Seems to hurt the testing server more than the Cassandra cluster
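To make the Ec2Snitch bullet concrete: the snitch asks the EC2 instance metadata service for the availability zone and treats the region part as the datacenter and the zone suffix as the rack (for example, `us-east-1a` gives datacenter `us-east` and rack `1a`). The sketch below only illustrates that naming convention; it is not Cassandra's code, the function name is hypothetical, and `nova` is just an example of a zone name that does not follow the AWS pattern.

```python
# Metadata path that returns the zone name on EC2, e.g. "us-east-1a".
# Shown for reference only; this sketch never performs the HTTP call.
AZ_METADATA_URL = (
    "http://169.254.169.254/latest/meta-data/placement/availability-zone"
)


def dc_and_rack_from_az(az):
    """Split an AWS-style zone name into (datacenter, rack).

    Simplified illustration of the naming convention, not the snitch itself.
    """
    region, sep, zone = az.rpartition("-")
    if not sep or not region or not zone:
        raise ValueError("zone name %r is not in AWS 'region-zone' form" % az)
    return region, zone


if __name__ == "__main__":
    print(dc_and_rack_from_az("us-east-1a"))  # ('us-east', '1a')
    try:
        dc_and_rack_from_az("nova")  # hypothetical non-AWS zone name
    except ValueError as err:
        print("cannot infer topology:", err)
```

A cloud whose metadata endpoint is missing, or returns names outside this pattern, leaves the snitch with no usable datacenter or rack, which is why a different snitch is needed there.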

Page 5: Openstack AWS Cassandra - Cassandra Boston Users Group

Lessons Learned with Cassandra and AWS

• Be proactive in adjusting your caches
  • Row cache is a great thing (keep it out of heap)
  • Key cache hit rates dictate whether the key cache is worth the memory
  • KNOW your data and access patterns
• A slow node is worse than a dead node
  • CPU steal is your mortal enemy
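One way to keep an eye on CPU steal from inside a Linux guest is to compare two samples of the aggregate `cpu` line in `/proc/stat`, where the eighth counter is steal time. The sketch below is a minimal illustration under that assumption; the function names and sample values are hypothetical, and it takes the file contents as strings so it can be tried anywhere.

```python
# Counter order on the aggregate "cpu" line of /proc/stat:
# user nice system idle iowait irq softirq steal [guest guest_nice]
STEAL_INDEX = 7
COUNTED_FIELDS = 8  # guest time is already included in user/nice


def cpu_times(stat_text):
    """Return the counters from the aggregate 'cpu' line as integers."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def steal_percent(before_text, after_text):
    """Percentage of CPU time stolen between two /proc/stat samples."""
    before = cpu_times(before_text)[:COUNTED_FIELDS]
    after = cpu_times(after_text)[:COUNTED_FIELDS]
    if len(before) <= STEAL_INDEX or len(after) <= STEAL_INDEX:
        raise ValueError("kernel does not report steal time")
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[STEAL_INDEX] / total


if __name__ == "__main__":
    # Made-up samples; on a real guest, read /proc/stat twice a few
    # seconds apart and pass both strings in.
    first = "cpu 100 0 50 800 10 0 0 40\n"
    second = "cpu 200 0 100 1500 20 0 0 180\n"
    print(steal_percent(first, second))  # 14.0
```

Sampling this on every node makes it easier to spot the slow node before it drags down the rest of the cluster.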

Page 6: Openstack AWS Cassandra - Cassandra Boston Users Group

Connecting Openstack and AWS

• Two Options
  • Public Internet Replication (SSL highly recommended)
    • HUGE transfer costs, risky
  • VPC Tunnel
    • Static Tunnel with ASA – ASAs can only connect to one tunnel at a time, even in an HA pair
    • BGP Tunnel with Routing – Each router connects to two endpoints, HSRP between, extremely redundant
• Openstack Complexity – VLAN Tagging
  • If using VLAN tagging in Openstack, your tunnel device needs to participate in the VLAN which is used for VMs (300 by default)

Page 7: Openstack AWS Cassandra - Cassandra Boston Users Group

Connecting Cassandra

• Since Ec2Snitch doesn’t work in Openstack, RackInferringSnitch must be used
• Standard multi-datacenter tokenization strategies are required
• Replication lag is dependent on connectivity and latency
  • Tests from VPC IPSec tunnels in NJ show 8ms to Ashburn
  • Tests from Ashburn DC datacenters are about 4ms
• The biggest problem is the volume of data and a hard cutover
• We started in EC2, but are migrating to VPC
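The first two bullets above can be sketched in a few lines. RackInferringSnitch reads the datacenter from the second octet of a node's IP address and the rack from the third. A common multi-datacenter token scheme computes evenly spaced tokens for each datacenter as if it were its own ring, then shifts one datacenter by a small offset so that no two nodes share a token. The sketch assumes RandomPartitioner (token space 0 to 2^127), which the slides do not confirm; the node counts reuse the 9-node and 5-node clusters mentioned earlier purely as an illustration, and the IP address and function names are hypothetical.

```python
RING_SIZE = 2 ** 127  # RandomPartitioner token space (assumed partitioner)


def infer_dc_and_rack(ip):
    """Mimic the RackInferringSnitch convention: 2nd octet = DC, 3rd = rack."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted IPv4 address, got %r" % ip)
    return octets[1], octets[2]


def initial_tokens(node_count, offset=0):
    """Evenly spaced tokens for one datacenter, shifted by `offset`."""
    return [i * RING_SIZE // node_count + offset for i in range(node_count)]


if __name__ == "__main__":
    print(infer_dc_and_rack("10.20.30.40"))  # ('20', '30')

    aws_tokens = initial_tokens(9)                  # illustrative node count
    openstack_tokens = initial_tokens(5, offset=1)  # shifted to avoid clashes
    for token in sorted(aws_tokens + openstack_tokens):
        print(token)
```

Without the offset both datacenters would claim token 0, so the shift is what keeps the combined ring valid. Because the snitch relies on IP octets, the address plan on both sides of the tunnel has to be laid out with this convention in mind.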

Page 8: Openstack AWS Cassandra - Cassandra Boston Users Group

Questions/Suggestions?

Page 9: Openstack AWS Cassandra - Cassandra Boston Users Group

P.S. - We are HIRING!

Page 10: Openstack AWS Cassandra - Cassandra Boston Users Group

Show & Tell

Come see our Openstack cluster