TechTalk v2.0 - Performance tuning Cassandra + AWS

Eddie Garcia, VP of InfoSec and Services Gazzang, Inc. I/O Performance tuning for Cassandra running on AWS with Gazzang

Upload: pythian

Post on 20-Jun-2015


Page 1: TechTalk v2.0 - Performance tuning Cassandra + AWS

Eddie Garcia, VP of InfoSec and Services, Gazzang, Inc.

I/O Performance tuning for Cassandra running on AWS with Gazzang

Page 2: TechTalk v2.0 - Performance tuning Cassandra + AWS

Today’s Agenda

• Tips and tricks to achieve high performance when running Cassandra on AWS

• Configuration tuning for Cassandra

• Tools to benchmark raw file system I/O

• AWS available AMIs to boost performance

• Stress testing on AWS i2 HVM instances

• Configuring AWS EC2 instances with SSDs and EBS storage with PIOPS

4/24/14 © Gazzang, Inc. -- CONFIDENTIAL -- 2

Page 3: TechTalk v2.0 - Performance tuning Cassandra + AWS

Performance tuning

• Tuning at every layer

– Tune the AWS layer

– Tune the Cassandra layer

– Tune the file system / security layer

Page 4: TechTalk v2.0 - Performance tuning Cassandra + AWS

Tune the AWS layer

Page 5: TechTalk v2.0 - Performance tuning Cassandra + AWS

Tune the AWS layer

• i2 HVM instances provide better I/O than other instance types

• i2 instances support SSD TRIM for better SSD health and performance over time

• Use the Amazon Linux distribution AMI, or kernel version 3.8 or greater, for higher I/O performance

• Use the Amazon Linux distribution AMI for its built-in SR-IOV (single root I/O virtualization) drivers, which enable higher-performance AWS Enhanced Networking when running in a VPC
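The kernel recommendation above is easy to check before any benchmarking. The following is a minimal sketch, assuming the release string has the usual `major.minor...` shape printed by `uname -r`; the helper name `kernel_at_least` is made up for illustration. The sample string is the kernel version listed in this talk's test environment specifications.

```python
import re


def kernel_at_least(release: str, minimum=(3, 8)) -> bool:
    """Return True if a `uname -r` style release string is >= (major, minor)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    found = (int(match.group(1)), int(match.group(2)))
    return found >= minimum


# Kernel string taken from the test environment slide of this deck.
print(kernel_at_least("3.4.73-64.112.amzn1.x86_64"))  # prints False
```

On a live host the same function could be fed `platform.release()` from the standard library.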


Page 6: TechTalk v2.0 - Performance tuning Cassandra + AWS

Amazon Linux AMI Instance Types and Sizes

http://aws.amazon.com/amazon-linux-ami/

Page 7: TechTalk v2.0 - Performance tuning Cassandra + AWS

Amazon Linux AMI Instance Types and Cost on-demand in US East

http://aws.amazon.com/ec2/pricing/

Page 8: TechTalk v2.0 - Performance tuning Cassandra + AWS

Tune the Cassandra layer

Page 9: TechTalk v2.0 - Performance tuning Cassandra + AWS

Tune the Cassandra layer

• Follow the Cassandra best practices published by DataStax: http://www.datastax.com/documentation/cassandra/2.0/cassandra/install/installRecommendSettings.html

• Put the data directory on the mounted ephemeral instance storage; avoid EBS storage for maximum I/O performance

• IMPORTANT: You must have a backup strategy when using ephemeral storage, for example using S3 for backups

• RAID-0 (striping) of SSDs is supported, but Cassandra also does a great job of using all mounted drives without RAID

• Scale by adding smaller instances rather than increasing instance size
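To illustrate the no-RAID option above: Cassandra reads its data locations from the `data_file_directories` list in `cassandra.yaml`, so each mounted SSD can be listed separately. This is a small sketch that renders such a fragment; the mount point `/mount/ssd2` and the `cassandra/data` subdirectory are hypothetical, and the function name is made up for illustration.

```python
def data_file_directories_yaml(mount_points):
    """Render a cassandra.yaml fragment with one data directory per mounted drive."""
    lines = ["data_file_directories:"]
    for mount in mount_points:
        # The "cassandra/data" subdirectory is an assumed layout, not from the talk.
        lines.append(f"    - {mount.rstrip('/')}/cassandra/data")
    return "\n".join(lines)


# /mount/ssd1 appears in the talk's drive layout; /mount/ssd2 is hypothetical.
print(data_file_directories_yaml(["/mount/ssd1", "/mount/ssd2"]))
```

The output is meant to be pasted into `cassandra.yaml` in place of the existing `data_file_directories` entry.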


Page 10: TechTalk v2.0 - Performance tuning Cassandra + AWS

Tune the Cassandra layer

• Cassandra writes immutable sstable files to disk. It then compacts multiple sstables into one larger sstable, with some cleanup occurring along the way, which also helps TRIM

• The more OS memory the better: on read, the sstables are cached as normal memory-mapped files loaded into OS memory

• Increasing the JVM heap size can cause performance issues for Cassandra during garbage collection (“Death by Garbage Collection”)
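The heap advice above can be made concrete with the default sizing rule that, as far as I know, ships as a comment in `cassandra-env.sh` for Cassandra releases of this era: `max(min(1/2 ram, 1024MB), min(1/4 ram, 8GB))`. The sketch below assumes that rule; check the script in your own version before relying on it. The 61 GiB figure used for i2.2xlarge is also an assumption not stated in the talk.

```python
def default_max_heap_mb(system_memory_mb: int) -> int:
    """Heap size in MB under the assumed rule max(min(ram/2, 1024), min(ram/4, 8192))."""
    half = system_memory_mb // 2
    quarter = system_memory_mb // 4
    return max(min(half, 1024), min(quarter, 8192))


# Assumed memory size for an i2.2xlarge instance: 61 GiB.
print(default_max_heap_mb(61 * 1024))  # prints 8192
```

Under this rule the heap is capped at 8 GB, so the remaining memory stays available to the OS page cache that serves sstable reads.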


Page 11: TechTalk v2.0 - Performance tuning Cassandra + AWS

Tune the file system / security layer

Page 12: TechTalk v2.0 - Performance tuning Cassandra + AWS

Tune the file system layer

• Format the file system with ext4 rather than ext3, or with xfs if your chosen Linux distribution supports it

• Use the most current Linux version for your distribution; many performance fixes are available only in newer kernels

• Run IOZone or other file system tests before and after configuration changes to benchmark raw file I/O before loading your Cassandra data

Page 13: TechTalk v2.0 - Performance tuning Cassandra + AWS

Tune the file security layer

• Use block-level encryption, dedicating an entire SSD volume

• Encrypt the cluster before loading data whenever possible

• Use systems that support hardware encryption acceleration like Intel AES-NI http://aws.amazon.com/ec2/instance-types
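Whether an instance exposes AES-NI can be checked from inside the guest. On x86 Linux the CPU capability list in `/proc/cpuinfo` includes the flag `aes` when the instructions are available. A minimal sketch, with hypothetical helper names, that parses that text:

```python
def cpu_has_aes(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line of /proc/cpuinfo text lists the 'aes' flag."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "aes" in value.split():
            return True
    return False


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the kernel's CPU description (x86 Linux only)."""
    with open(path) as handle:
        return handle.read()
```

On a Linux host, `cpu_has_aes(read_cpuinfo())` gives the answer for the running machine.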


Page 14: TechTalk v2.0 - Performance tuning Cassandra + AWS

Test and measure

Page 15: TechTalk v2.0 - Performance tuning Cassandra + AWS

Performance Testing

• When testing performance, reduce the number of variables that can affect the test

– Stopping and starting a server can move your instance to a different host with different performance

– The time of day when you run tests can affect performance

– Eliminate cached in-memory data from prior tests, which may contaminate your results

– Avoid testing on systems with unknown state and size of data
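One way to clear cached data between runs on Linux is to flush dirty pages and then write `3` to `/proc/sys/vm/drop_caches`, which asks the kernel to drop the page cache plus dentries and inodes. The sketch below assumes root privileges on the node under test; the function name and the `control_file` parameter are made up for illustration. It clears only the OS cache, not any caches held inside the Cassandra JVM.

```python
import os


def drop_page_cache(control_file="/proc/sys/vm/drop_caches"):
    """Flush dirty pages, then ask the kernel to drop clean caches (needs root)."""
    os.sync()  # write dirty pages out first so they become droppable
    with open(control_file, "w") as handle:
        handle.write("3\n")  # 3 = page cache + dentries and inodes
```

Calling `drop_page_cache()` before each IOZone or stress run gives every run the same cold-cache starting point.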


Page 16: TechTalk v2.0 - Performance tuning Cassandra + AWS

Cassandra Test Environment

[Diagram: a Cassandra stress client and Cassandra nodes 1 to 6; storage under test: EBS clear text, EBS 4K PIOPS, SSD clear text, SSD encrypted; IOZone tests and Cassandra stress tests; S3 backups]

Page 17: TechTalk v2.0 - Performance tuning Cassandra + AWS

Test Environment Specifications

Instance: i2.2xlarge
AZ: us-east-1a
AMI Information: amzn-ami-hvm-2013.09.2.x86_64-ebs (ami-e9a18d80)
Linux Distribution: Amazon Linux AMI release 2013.09
Kernel Version: 3.4.73-64.112.amzn1.x86_64

Drive Layout:

Filesystem             Size  Used  Avail  Use%  Mounted on
/dev/xvda1             7.9G  1.8G  6.1G   23%   /            (EBS backed for tests, ephemeral is better)
tmpfs                  30G   0     30G    0%    /dev/shm
/dev/xvdb              734G  197M  697G   1%    /mount/ssd1  (Cleartext test SSD)
/dev/mapper/encrypted  734G  36G   662G   6%    /encrypted   (Encrypted test SSD)

Cassandra Stress Client – m1.medium
Cassandra Cluster: 6 Nodes
DataStax enterprise: dse-libcassandra-3.2.2-1.noarch
Cassandra: version 1.2.12.2
Java HotSpot(TM) 64-Bit Server VM/1.6.0_45

Page 18: TechTalk v2.0 - Performance tuning Cassandra + AWS

IOZone SSD vs. Non-SSD

IOZone test configuration

time iozone -ORa -s 163840 -r 16384

Iozone: Performance Test of File I/O
Version $Revision: 3.420 $
Compiled for 64 bit mode.
Build: linux-AMD64

OPS Mode. Output is in operations per second.
Excel chart generation enabled
Auto Mode
File size set to 163840 KB
Record Size 16384 KB
Command line used: iozone -ORa -s 163840 -r 16384
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
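The banner above maps directly onto the command-line flags: `-O` reports operations per second, `-R` enables Excel chart output, `-a` is auto mode, and `-s` and `-r` give the file and record sizes in KB. A small sketch, with hypothetical helper names, that rebuilds the argument list and shows how many records fit in the test file:

```python
def iozone_command(file_size_kb: int, record_size_kb: int):
    """Build the iozone argument list used in the talk for given sizes (in KB)."""
    if file_size_kb % record_size_kb != 0:
        raise ValueError("file size should be a whole number of records")
    return ["iozone", "-ORa", "-s", str(file_size_kb), "-r", str(record_size_kb)]


def records_per_file(file_size_kb: int, record_size_kb: int) -> int:
    """Number of whole records that make up the test file."""
    return file_size_kb // record_size_kb


print(" ".join(iozone_command(163840, 16384)))
print(records_per_file(163840, 16384))  # prints 10
```

The list form can be passed to `subprocess.run` unchanged when the benchmark is scripted.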


http://www.iozone.org/

Page 19: TechTalk v2.0 - Performance tuning Cassandra + AWS

Cassandra Test Environment

[Diagram: a Cassandra node running IOZone tests against EBS clear text, EBS 4K PIOPS encrypted, SSD, and SSD encrypted storage]

real 1m6.360s user 0m0.084s sys 0m0.911s

real 0m15.223s user 0m0.115s sys 0m1.391s

real 0m9.951s user 0m0.291s sys 0m3.595s
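The three `time` results above are easier to compare once converted to seconds. The transcript does not preserve which storage configuration produced which timing, so the sketch below only parses the lines and compares wall-clock times without labeling them. The function name is made up for illustration.

```python
import re

# Matches fields such as "real 1m6.360s" in bash `time` output.
_TIME_FIELD = re.compile(r"(real|user|sys)\s+(\d+)m([\d.]+)s")


def parse_time_output(text: str) -> dict:
    """Convert bash `time` output into a dict of seconds keyed by real/user/sys."""
    return {
        name: int(minutes) * 60 + float(seconds)
        for name, minutes, seconds in _TIME_FIELD.findall(text)
    }


runs = [
    "real 1m6.360s user 0m0.084s sys 0m0.911s",
    "real 0m15.223s user 0m0.115s sys 0m1.391s",
    "real 0m9.951s user 0m0.291s sys 0m3.595s",
]

parsed = [parse_time_output(run) for run in runs]
fastest = min(p["real"] for p in parsed)
for p in parsed:
    print(f"real={p['real']:.3f}s  ratio to fastest={p['real'] / fastest:.2f}")
```

Wall-clock (`real`) time dominates in all three runs, which is what one would expect from an I/O-bound benchmark.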

Page 20: TechTalk v2.0 - Performance tuning Cassandra + AWS

Cassandra stress

The cassandra-stress tool

• A Java-based stress testing utility for benchmarking and load testing a Cassandra cluster.

• The binary installation of the tool also includes a daemon, which in larger-scale testing can prevent potential skews in the test results by keeping the JVM warm.

• Modes of operation:

– Inserting: Loads test data.

– Reading: Reads test data.

– Indexed range slicing: Works with RandomPartitioner on indexed tables.

http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsCStress_t.html

Page 21: TechTalk v2.0 - Performance tuning Cassandra + AWS

Current Cassandra stress test configuration

• Cassandra stress test command

– <cassandra home>/tools/bin/cassandra-stress -l 3 -o insert -n 100000000 -i 1 -e ONE -c 10 -d <Cassandra Node IPs> -t 150 -f T1.csv &

• In the stress test, stress clients 1 to 3 each target their own pair of Cassandra nodes, and stress client 4 targets all Cassandra nodes.

– Client#1 -> CAS 1, 2

– Client#2 -> CAS 3, 4

– Client#3 -> CAS 5, 6

– Client#4 -> CAS 1, 2, 3, 4, 5, 6
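The command and the client-to-node mapping above can be combined into one small generator. This is a sketch under assumptions: the node IP addresses are hypothetical, the helper names are made up, and the flag meanings in the comments reflect my understanding of the legacy cassandra-stress options rather than anything stated in the talk.

```python
# Which Cassandra nodes each stress client targets (from the slide).
CLIENT_TARGETS = {1: [1, 2], 2: [3, 4], 3: [5, 6], 4: [1, 2, 3, 4, 5, 6]}

# Hypothetical addresses for Cassandra nodes 1 to 6.
NODE_IPS = {n: f"10.0.0.{n}" for n in range(1, 7)}


def stress_command(node_ips, cassandra_home="<cassandra home>", log_file="T1.csv"):
    """Build the cassandra-stress argument list from the talk for the given nodes."""
    return [
        f"{cassandra_home}/tools/bin/cassandra-stress",
        "-l", "3",            # replication factor
        "-o", "insert",       # operation: insert (write) test data
        "-n", "100000000",    # number of keys
        "-i", "1",            # progress report interval in seconds
        "-e", "ONE",          # consistency level
        "-c", "10",           # columns per key
        "-d", ",".join(node_ips),  # comma-separated target nodes
        "-t", "150",          # client threads
        "-f", log_file,       # file to write results to
    ]


for client, nodes in CLIENT_TARGETS.items():
    ips = [NODE_IPS[n] for n in nodes]
    print(f"Client#{client}:", " ".join(stress_command(ips)))
```

Clients 1 to 3 together cover every node exactly once, while client 4 adds load spread across the whole cluster.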


Page 22: TechTalk v2.0 - Performance tuning Cassandra + AWS

Cassandra Test Environment

[Diagram: stress clients 1 to 4 running Cassandra stress tests against Cassandra nodes 1 to 6, on SSD clear text and SSD encrypted storage]

Page 23: TechTalk v2.0 - Performance tuning Cassandra + AWS

Benchmark clear text vs encrypted inserts (write)

Page 24: TechTalk v2.0 - Performance tuning Cassandra + AWS

Summary

• Test in your environment with your data; results will vary greatly with OS, hardware, and application configurations

– Baseline before you tune

– Tune

– Test after tuning

– Measure

– Rinse and repeat twice

• Security and performance are not mutually exclusive; encryption can coexist with high I/O performance

• Do your homework: configure and run tests that map to your use case

Page 25: TechTalk v2.0 - Performance tuning Cassandra + AWS

About Gazzang

• Headquartered in Austin, Texas

• Focus on securing sensitive data in cloud and big data environments

• Enable customers to meet compliance requirements like HIPAA, PCI, FIPS and FERPA

• Satisfy internal security mandates

• Protect valuable client information

Page 26: TechTalk v2.0 - Performance tuning Cassandra + AWS

Gazzang is focused on data at-rest encryption

Security in the cloud is a layered approach

26 4/24/14 Gazzang - All rights reserved 2013

Data in process (in application)

Data at rest (storage)

Data in transit (SSL)

Page 27: TechTalk v2.0 - Performance tuning Cassandra + AWS

Gazzang is focused on data at-rest encryption and key management

Page 28: TechTalk v2.0 - Performance tuning Cassandra + AWS

Thank you!

Gazzang, Inc
www.gazzang.com

Eddie Garcia
VP of InfoSec and Services
[email protected]