Cassandra@Coursera: AWS deploy and MySQL transition

Post on 19-Aug-2014

Category:

Engineering

DESCRIPTION

Touches on what Coursera aims to get out of Cassandra, what goes into a good deployment, and our experience so far transitioning off MySQL.

TRANSCRIPT

  • Cassandra @ Coursera: Deploying in AWS, MySQL Transition. Daniel Chia (@DanielJHChia), Software Engineer, Infrastructure
  • Overview: Why Cassandra; what goes into a good deployment; MySQL to Cassandra transition experience
  • 110 partners, 698 courses, 8.5 million learners
  • A Coursera Course
  • Your Final Project This is your chance to apply the course concepts to real-world situations
  • Identity Verified Certificates
  • Technical: 100% hosted on AWS; service-oriented architecture; mix of MySQL and Cassandra for persistence
  • What do we care about?
  • We care about: Availability; Scalability; Operational Ease; Latency; (Bonus) Multi-region writes
  • Availability matters
  • EBS Outage (2012): Master in us-east-1a, Slave in us-east-1c
  • Scalability
  • Sharded by class. Machine 1: class1, class2, class3, class4, class5. Machine 2: class6, class7, class8, class9, class10. Machine 3: class11, class12, class13, class14, class15
  • New use-case: Uh-oh, doesn't fit in existing sharding
  • We care about: Availability; Scalability; Operational Ease; Performance; (Bonus) Multi-region
  • So we decided to try Cassandra!
  • Cassandra [database XYZ]
  • "But if you judge a fish by its ability to climb a tree, it will live its whole life believing that it is stupid." (Albert Einstein)
  • Time to deploy Cassandra! sudo apt-get install dse-full
  • A good deployment: Machine-level; Cluster-level
  • Picking a machine: Disk. IOPS, IOPS, IOPS; latency. (Image author: D-Kuru/Wikimedia Commons; licence: CC-BY-SA-3.0-AT)
  • Picking a machine: CPU. (Image author: Mark Sze; licence: CC BY-NC-ND 2.0)
  • Picking a machine: Memory. Save some for page cache! (Image author: brutalSoCal; licence: CC BY-NC-ND 2.0)
  • On AWS: Ephemeral disks. Please don't use EBS. Really. IOPS usually the problem. Instance sizes: spinning disk: m1.large, m1.xlarge, m2.4xlarge; SSD: m3.xlarge, c3.2xlarge, i2.*
  • Set up the machine: Lots of documentation / talks about this. Recommended reading: Datastax guide [1]. [1] http://www.datastax.com/documentation/cassandra/2.0/cassandra/install/installRecommendSettings.html
  • Cluster configuration (diagram with nodes A, B, C)
  • Priam: care and feeding of Cassandra on AWS. https://github.com/Netflix/Priam
  • Cluster Topology: We use RF=3. Ring balanced within datacenter. Nodes alternate racks (or AZs)
  • Cluster Topology (Priam): Token assignments stored in a database. Can take over a token in the event of node failure
  • Cluster Topology (Priam): Priam assigns tokens evenly per region. Alternates AZs within region (ring diagram labels: az1 az3 az2 az1 az2 az3)
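
The topology described on these slides (tokens spread evenly, AZs alternating around the ring, RF=3) can be illustrated with a minimal sketch. This is not Priam's actual code: the function names are hypothetical, the token range is that of Cassandra's Murmur3Partitioner, and replica placement is simplified to "the next three nodes on the ring".

```python
# Illustrative sketch only; not Priam's implementation.
MIN_TOKEN = -2**63   # lower bound of the Murmur3Partitioner token range
RING_SIZE = 2**64    # number of distinct tokens in that range


def assign_tokens(num_nodes, azs):
    """Return (token, az) pairs: tokens evenly spaced, AZs in round-robin order."""
    step = RING_SIZE // num_nodes
    return [(MIN_TOKEN + i * step, azs[i % len(azs)]) for i in range(num_nodes)]


def replica_azs(ring, start, rf=3):
    """AZs of the rf consecutive nodes starting at ring position `start`.

    Simplification: real placement is decided by the replication strategy.
    """
    return [ring[(start + k) % len(ring)][1] for k in range(rf)]


ring = assign_tokens(6, ["az1", "az2", "az3"])
for i, (token, az) in enumerate(ring):
    print(token, az, replica_azs(ring, i))
```

Because the AZs alternate, any three consecutive nodes sit in three different AZs, so losing one AZ leaves two of the three replicas available.
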
  • Autoscaling groups: Recover from lost instance. We don't use it for scaling with traffic
  • Important: Need one ASG per AZ. ASG size: 9. Nodes: east-1a, east-1a, east-1a, east-1b, east-1b, east-1b, east-1c, east-1c, east-1c
  • Important: Need one ASG per AZ. ASG size: 9. Nodes: east-1a, east-1a, east-1a, east-1b, east-1b, east-1b, east-1c, east-1c, east-1b
  • Important: Need one ASG per AZ. ASG-1a size: 3 (east-1a, east-1a, east-1a). ASG-1b size: 3 (east-1b, east-1b, east-1b). ASG-1c size: 3 (east-1c, east-1c, east-1c)
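
The three slides above can be read as a before/after comparison. The sketch below is a toy model of that reading, not AWS API code: `replace_node` is a hypothetical helper, and the premise that a single multi-AZ ASG may bring up a replacement in a different AZ is taken from the second slide.

```python
from collections import Counter


def replace_node(nodes, lost_index, launch_az):
    """Return a new node list with the lost node replaced by one launched in launch_az."""
    return nodes[:lost_index] + [launch_az] + nodes[lost_index + 1:]


balanced = ["east-1a"] * 3 + ["east-1b"] * 3 + ["east-1c"] * 3

# Single ASG of size 9: only the total count is enforced, so the replacement
# for a lost east-1c node may come up in east-1b (as on the second slide).
single_asg = replace_node(balanced, 8, "east-1b")

# One ASG of size 3 per AZ: the east-1c group can only launch in east-1c.
per_az_asg = replace_node(balanced, 8, "east-1c")

print(Counter(single_asg))
print(Counter(per_az_asg))
```

With the single ASG the cluster still has nine nodes but the AZ alternation is broken, which is what the per-AZ groups prevent.
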
  • Backups: Data on ephemeral disks. Guard against application errors. SSTables immutable -> ship to S3. Priam does this
  • Restore: Have to be able to use your backup. Also useful for QA / test. Priam handles this rather nicely
  • Deployed! Time to chill? (Image: https://www.flickr.com/photos/spunkinator/2394514059, Creative Commons)
  • Monitoring: "working / not working" doesn't count.
  • We have our own custom reporter agent for Datadog. There's pluggable reporter support in 2.0.2 now.
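
Why immutability makes shipping SSTables simple can be shown with a small sketch. It is not how Priam is implemented: `ship_new_sstables` and the `upload` callback are hypothetical, the file names are made up for the demo, and in-progress temporary files are ignored for brevity.

```python
import tempfile
from pathlib import Path


def ship_new_sstables(data_dir, already_shipped, upload):
    """Upload files in data_dir that were not shipped yet; return the updated set.

    Because an SSTable file is never modified after it is written, a file name
    that has been shipped once never needs to be uploaded again.
    """
    shipped = set(already_shipped)
    for path in sorted(Path(data_dir).iterdir()):
        if path.is_file() and path.name not in shipped:
            upload(path)  # hypothetical callback, e.g. a put to an S3 bucket
            shipped.add(path.name)
    return shipped


# Demo with a temporary directory and a fake uploader that records file names.
uploaded = []
with tempfile.TemporaryDirectory() as d:
    (Path(d) / "ks-tbl-jb-1-Data.db").write_text("a")
    shipped = ship_new_sstables(d, set(), lambda p: uploaded.append(p.name))
    (Path(d) / "ks-tbl-jb-2-Data.db").write_text("b")
    shipped = ship_new_sstables(d, shipped, lambda p: uploaded.append(p.name))
print(uploaded)
```

The second pass uploads only the new file, so each backup run costs roughly as much as the data written since the previous one.
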
  • JVM GC woes
  • JVM GC woes All happy now
  • SSTables Read Histogram
  • Questions? before we carry on
  • Transition takes: time; mindset shift; expertise; (some) risk
  • Our experience: Pick one feature first; Mindset shift; Data modeling consulting; Libraries / Patterns / Data-as-a-service
  • Pick one feature: Don't go all in on Cassandra with something important right away. Work closely with that team
  • You will probably make mistakes. Oops!
  • Mindset shift: Everyone knows SQL. Not everyone knows Cassandra / NoSQL. Need to know queries beforehand
  • Enrollment Example: Learners enroll in a course. learner (many-to-many) course. Need to keep track of this membership
  • MySQL Model
      CREATE TABLE `courses_learners` (
        `id` INT(11) NOT NULL AUTO_INCREMENT,
        `course_id` INT(11) NOT NULL,
        `learner_id` INT(11) NOT NULL,
        PRIMARY KEY (`id`),
        UNIQUE KEY `c_l` (`learner_id`, `course_id`),
        -- REFERENCES targets are not shown on the slide
        CONSTRAINT `ref1` FOREIGN KEY (`course_id`),
        CONSTRAINT `ref2` FOREIGN KEY (`learner_id`)
      )
  • Cassandra Style
      CREATE TABLE courses_by_learner (
        learner_id uuid,
        course_id uuid,
        PRIMARY KEY (learner_id, course_id)
      )
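
The "need to know queries beforehand" point can be made concrete with an in-memory model of the courses_by_learner table: rows are grouped by the partition key (learner_id), so "which courses is this learner in?" is a single lookup. The second table, `learners_by_course`, is an assumption added here to serve the reverse query; it does not appear on the slides.

```python
import uuid
from collections import defaultdict

# One dict per table, keyed by partition key; the set holds the clustering column.
courses_by_learner = defaultdict(set)  # partition key: learner_id
learners_by_course = defaultdict(set)  # hypothetical second table, partition key: course_id


def enroll(learner_id, course_id):
    """Write the membership to both tables, one per query pattern."""
    courses_by_learner[learner_id].add(course_id)
    learners_by_course[course_id].add(learner_id)


alice, bob = uuid.uuid4(), uuid.uuid4()
ml, algo = uuid.uuid4(), uuid.uuid4()

enroll(alice, ml)
enroll(alice, algo)
enroll(bob, ml)
enroll(alice, ml)  # same primary key again: overwrites, like a Cassandra upsert

print(len(courses_by_learner[alice]), len(learners_by_course[ml]))
```

Unlike the MySQL model, there is no surrogate id or unique index: the primary key (learner_id, course_id) already makes a repeated enrollment a no-op, and each query pattern gets its own table instead of a join.
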
  • Data modeling consulting: Build core team proficient at C* data modeling. Available to consult for trickier use cases
  • Libraries / Patterns: Abstract away simple (but common) use-cases: key-value storage, simple time series. Maybe every developer won't need deep C* knowledge? More radical: data as a service (e.g. STAASH). STAASH: https://github.com/Netflix/staash
  • It's a long road but we'll get there. (Image author: Carissa Rogers; license: CC BY 2.0)
  • Conclusion: Know Cassandra. Know what makes a good deployment. Know that new skills have to be acquired
  • Questions? We're hiring! coursera.org/jobs