Cassandra Meetup - Choosing the Right Cloud Instances for Success
TRANSCRIPT
© DataStax, All Rights Reserved.
Apache Cassandra™ Choosing instances for success
Erick Ramirez DataStax Engineering
@flightc
Welcome
• Your app in focus - reads vs writes, CPU vs RAM
• What IOPS? How much is enough?
• Are ephemeral disks evil?
• False economy - cheaper instances can cost you more
• A time to kill - they’re not your pets
https://academy.datastax.com
ONE SIZE DOES NOT FIT ALL
Tailor to workload
• intimately understand your app
• reads vs writes
• CPU vs memory
• OLTP vs OLAP
• use case will dictate requirements
STORAGE OPTIONS
EBS gp2 SSDs
• general purpose EBS option
• persistent (durable)
• default volume for EC2 instances
• single-digit millisecond latency 99% of the time
• pay per GB only (IOPS included)
• minimum 10K IOPS for production workloads
3 IOPS/GB (3K IOPS/TB)
Max 10K IOPS/vol
Max 160MB/s throughput/vol
1TB = $122/mo, $1474/yr
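The gp2 figures above imply that IOPS scale with volume size up to a per-volume cap. A minimal Python sketch of that arithmetic follows, assuming 1TB = 1000GB (as the slide's "3K IOPS/TB" suggests) and ignoring any burst behaviour; the function names are hypothetical.

```python
import math

# Figures taken from the slide; burst behaviour is deliberately ignored.
GP2_IOPS_PER_GB = 3
GP2_MAX_IOPS_PER_VOLUME = 10_000


def gp2_baseline_iops(size_gb: int) -> int:
    """Baseline IOPS of one gp2 volume, capped at the per-volume maximum."""
    return min(size_gb * GP2_IOPS_PER_GB, GP2_MAX_IOPS_PER_VOLUME)


def gp2_size_for_iops(target_iops: int) -> int:
    """Smallest volume size in GB whose baseline reaches target_iops."""
    if target_iops > GP2_MAX_IOPS_PER_VOLUME:
        raise ValueError("target exceeds what a single gp2 volume can deliver")
    return math.ceil(target_iops / GP2_IOPS_PER_GB)
```

Under these assumptions a 1000GB volume gets 3,000 baseline IOPS, and reaching the 10K IOPS production minimum on a single gp2 volume takes roughly 3.3TB (3,334GB).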
EBS io1 SSDs
• fastest available EBS option
• persistent (durable)
• for latency-sensitive OLTP workloads
• single-digit millisecond latency 99.9%* of the time
• provisioned IOPS are charged extra
• minimum 10K IOPS for production workloads
* read the fine print
Up to 50 IOPS/GB
Max 20K IOPS/vol
Max 320MB/s throughput/vol
1TB = $141/mo, $1695/yr
1K IOPS = $72/mo, $864/yr
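Because io1 charges for storage and provisioned IOPS separately, the annual bill is the sum of both. The sketch below uses only the per-year prices and limits quoted on the slide, again assuming 1TB = 1000GB; the function name is hypothetical and real pricing varies by region and over time.

```python
# Annual prices and limits as quoted on the slide (not current AWS pricing).
IO1_ANNUAL_USD_PER_TB = 1695
IO1_ANNUAL_USD_PER_1K_PIOPS = 864
IO1_MAX_IOPS_PER_GB = 50
IO1_MAX_IOPS_PER_VOLUME = 20_000


def io1_annual_cost(size_tb: float, piops: int) -> float:
    """Annual cost in USD of one io1 volume: storage plus provisioned IOPS."""
    size_gb = size_tb * 1000  # assumption: 1TB = 1000GB
    if piops > IO1_MAX_IOPS_PER_VOLUME:
        raise ValueError("exceeds the per-volume IOPS maximum")
    if piops > size_gb * IO1_MAX_IOPS_PER_GB:
        raise ValueError("exceeds the IOPS-to-size ratio for this volume")
    storage = size_tb * IO1_ANNUAL_USD_PER_TB
    iops = piops / 1000 * IO1_ANNUAL_USD_PER_1K_PIOPS
    return storage + iops
```

With these figures, a 1TB volume provisioned at the 10K IOPS production minimum comes to $10,335 a year, most of it for the IOPS rather than the storage.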
#spoileralert
EPHEMERAL IS YOUR FRIEND
Ephemeral storage
• performance orders of magnitude better than EBS
• already included in instance costs, e.g. m3, c3, i3
• “physically” attached
• not durable across instance stop or termination but…
HELLO, CASSANDRA
What is Cassandra
• massively scalable NoSQL database
• fully distributed, no single point of failure
• linear horizontal scaling
Why Cassandra
• all nodes are the same - no SPOF
• real-time, durable writes
• linear scaling on commodity servers
• real-time replication across data centres
• always on - no offline operation
• because you have a scale problem
Replication across DCs
CHEAP INSTANCES MAY BE COSTING YOU
Real example
• deployed on c4.4xlarge
• using EBS io1 with 3K PIOPS
• nodes dropping writes
• high read latencies
16 vCPU, 30GB RAM
Instance       $ 5443
EBS io1 1TB    $ 1695
PIOPS 3K       $ 2592
---------------------
Annual cost    $ 9730
Recommendation
• swap to i3.2xlarge
• 1.9TB NVMe SSDs included
• 3M IOPS, 16GB/s
• 60-70% cheaper than the i2.2xlarge it replaces
8 vCPU, 61GB RAM
Instance       $ 4174
---------------------
Annual cost    $ 4174
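The two annual cost breakdowns can be compared directly. This is a small Python sketch using only the per-node figures from the slides; the dictionary keys and function names are hypothetical labels for illustration.

```python
# Per-node annual costs in USD, as quoted on the slides.
CURRENT = {
    "c4.4xlarge instance": 5443,
    "EBS io1 1TB": 1695,
    "3K PIOPS": 2592,
}
RECOMMENDED = {
    "i3.2xlarge instance": 4174,  # NVMe storage is included in the price
}


def annual_cost(line_items: dict) -> int:
    """Total annual cost of one node from its line items."""
    return sum(line_items.values())


def saving_percent(before: int, after: int) -> float:
    """Relative saving of 'after' over 'before', as a percentage."""
    return (before - after) / before * 100


before = annual_cost(CURRENT)
after = annual_cost(RECOMMENDED)
print(f"per node: ${before} -> ${after}, "
      f"saving {saving_percent(before, after):.1f}%")
```

On these numbers the "cheaper" c4.4xlarge setup costs more than twice as much per node once storage and provisioned IOPS are added, while delivering far fewer IOPS.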
HORSES FOR COURSES
Use case - dev, light prod
• m3.large suitable
• entry-level load, testing the waters
• minimum 3 C* nodes with RF=3
• use CMS GC with 2GB heap
2 vCPU, 7.5GB RAM, 1 x 32GB SSD
$ 962/yr
Use case - low prod volume
• m3.xlarge suitable
• JVM will perform better with the extra RAM
• min 3 C* nodes with RF=3
• use CMS GC with 8GB heap
4 vCPU, 15GB RAM, 1 x 40GB SSD
$ 1924/yr
Use case - moderate prod volume
• c3.2xlarge recommended
• more disk space, extra cores a bonus
• costs 50% more for 2x CPU and 4x disk space
• min 3 C* nodes with RF=3
• use CMS GC with 8GB heap
8 vCPU, 15GB RAM, 2 x 80GB SSD
$ 2916/yr
Use case - real prod volume
• i3.2xlarge recommended
• will handle all kinds of workloads including Analytics, Graph and Search (Solr)
• min 3 C* nodes with RF=3
• use G1 GC with 24GB heap (32GB for Search nodes)
8 vCPU, 61GB RAM, 1.9TB NVMe SSD
$ 4174/yr
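The four use cases can be collected into one lookup, which also makes the minimum cluster cost (3 nodes at RF=3) easy to work out. This is a rough Python sketch built only from the slide figures; the `Tier` class and helper names are hypothetical.

```python
from dataclasses import dataclass

MIN_NODES = 3  # every use case on the slides calls for at least 3 C* nodes


@dataclass(frozen=True)
class Tier:
    use_case: str
    instance: str
    vcpu: int
    ram_gb: float
    gc: str
    heap_gb: int
    annual_usd: int  # per node, as quoted on the slides


TIERS = [
    Tier("dev, light prod", "m3.large", 2, 7.5, "CMS", 2, 962),
    Tier("low prod volume", "m3.xlarge", 4, 15, "CMS", 8, 1924),
    Tier("moderate prod volume", "c3.2xlarge", 8, 15, "CMS", 8, 2916),
    Tier("real prod volume", "i3.2xlarge", 8, 61, "G1", 24, 4174),
]


def tier_for(use_case: str) -> Tier:
    """Look up the suggested tier for a use case named on the slides."""
    for tier in TIERS:
        if tier.use_case == use_case:
            return tier
    raise KeyError(use_case)


def min_cluster_annual_cost(tier: Tier) -> int:
    """Annual cost in USD of the smallest suggested cluster for a tier."""
    return MIN_NODES * tier.annual_usd


for t in TIERS:
    print(f"{t.use_case:22} {t.instance:11} {t.gc} {t.heap_gb}GB heap  "
          f"${min_cluster_annual_cost(t)}/yr for {MIN_NODES} nodes")
```

On these figures a minimum three-node cluster ranges from $2,886 a year on m3.large to $12,522 a year on i3.2xlarge.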
https://datastaxacademy.slack.com
Thank you