Running Cassandra in AWS
DESCRIPTION
For this upcoming meetup, we welcome Patrick Eaton, PhD, Systems Architect at Stackdriver, and Joey Imbasciano, Cloud Platform Engineer at Stackdriver.
What You'll Learn At This Meetup:
• Why Stackdriver chose Cassandra over other DB offerings
• Stackdriver's data pipeline that runs into Cassandra
• Operating Cassandra on AWS
• Stackdriver's approach to disaster recovery
Patrick and Joey will present their use of Apache Cassandra at Stackdriver, some lessons learned, and technical tips, with a Q&A to end the evening.
TRANSCRIPT
Running Cassandra in AWS
Patrick Eaton, [email protected], @PatrickREaton
Joey Imbasciano, [email protected], @_joeyi
Stackdriver at a Glance
Stackdriver's hosted intelligent monitoring service helps SaaS companies innovate more by reducing the burden of day-to-day operations
● Cloud-native and cloud-aware
● Designed for complex distributed applications
● Founded by cloud/infrastructure industry veterans (Microsoft, VMware, EMC, Endeca, Red Hat) with deep systems and DevOps expertise
● Team of ~25, based in Downtown Boston
Intelligent Monitoring
Discover customer's cloud-hosted applications
● Infrastructure inventory
● Logical units, like groups/clusters
● Services, hosted and self-managed
● Elastic resources
Monitor
● Various data sources
● Provider metrics
● Host metrics
● Custom metrics
● Endpoints
● Events
● Health
● Rich visualizations
Analyze
● Integrate data sources
● Aggregate metrics
● Report utilization, cost, etc.
● Detect policy violations
● Recommend actions
Lambda Architecture
● Typical of modern architectures for on-line applications
● Formalized by Nathan Marz
● Composed of "batch", "speed", and "serving" layers
● Batch layer
  ○ Store of record
  ○ Compute arbitrary views
● Speed layer
  ○ Low latency updates
  ○ Streaming algorithms
● Serving layer
  ○ Combine data from batch and speed layers to answer queries
[Diagram labels: Data, Speed, Batch, Serving]
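As a rough illustration of the serving layer described above, the following Python sketch answers a query by combining a precomputed batch view with a speed-layer view of recent updates. The function name, the dictionary-based views, and the sample keys are assumptions made for illustration; they do not come from the talk.

```python
def serve_query(batch_view, speed_view, key):
    """Answer a query by combining the batch view with recent speed-layer updates."""
    # Hypothetical model: each view maps a key to a running count.
    return batch_view.get(key, 0) + speed_view.get(key, 0)


# The batch view covers data up to the last batch run;
# the speed view covers only what has arrived since then.
batch_view = {"requests:web-1": 1000}
speed_view = {"requests:web-1": 42, "requests:web-2": 7}

total_web1 = serve_query(batch_view, speed_view, "requests:web-1")
total_web2 = serve_query(batch_view, speed_view, "requests:web-2")
```

Summation is only one possible merge rule; other views may need a different combination, such as taking the most recent value.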
Stackdriver Architecture
● Shares characteristics of lambda architecture
● Indexing (speed) path
  ○ Make "live" data available "pre-analysis"
● Analysis (batch) path
  ○ Compute aggregations
  ○ Create recommendations
● Query (serving) layer
  ○ Combine "live" and analyzed data to answer queries
  ○ May require on-the-fly analysis
● Alerting (speed) path (not discussed here)
  ○ Stream processing to detect policy-based anomalies
[Diagram labels: Data, Database, Query (Serving), Analysis (Batch), Indexing (Speed), Alerting (Speed), Notification (Serving)]
Database Options
● We chose Cassandra!
  ○ True P2P architecture
  ○ Good support for write-heavy workloads
  ○ Compatible data model for time series data
    ■ Column per metric type, timestamps as columns
● Why not MySQL?
  ○ Experience with operating large, sharded deployments
  ○ Relational data model not a good match
● Why not HBase?
  ○ Operational complexity - zk, hadoop, hdfs, ...
  ○ Special "Master" role
● Why not Dynamo?
  ○ Avoid vendor lock-in and high cost
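To make the "timestamps as columns" idea concrete, here is a toy Python model of one common wide-row layout for time series: each row keeps its columns sorted by timestamp, so a time window becomes a contiguous slice. The `WideRow` class, the `(resource, metric)` row key, and the sample values are assumptions for illustration, not Stackdriver's actual schema.

```python
import bisect


class WideRow:
    """Toy model of one wide row: column names are timestamps, kept sorted."""

    def __init__(self):
        self._timestamps = []
        self._values = []

    def append(self, ts, value):
        # Insert while keeping the columns ordered by timestamp.
        i = bisect.bisect_left(self._timestamps, ts)
        self._timestamps.insert(i, ts)
        self._values.insert(i, value)

    def slice(self, start, end):
        """Return (timestamp, value) pairs with start <= timestamp < end."""
        lo = bisect.bisect_left(self._timestamps, start)
        hi = bisect.bisect_left(self._timestamps, end)
        return list(zip(self._timestamps[lo:hi], self._values[lo:hi]))


# Hypothetical row key: (resource id, metric name).
table = {}
row = table.setdefault(("i-abc123", "cpu"), WideRow())
for ts, value in [(60, 0.5), (0, 0.1), (120, 0.9)]:
    row.append(ts, value)

window = row.slice(0, 120)
```

Because the columns stay ordered, reading a time range touches one row and one contiguous run of columns, which is what makes this kind of layout a reasonable fit for time series.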
Stackdriver Architecture ++
● Archival pipeline stores all data
● Very small surface area, battle-tested
● Critical for disaster recovery
● S3 considered durable enough
● Replicated for availability
● Archive means Cassandra is "soft state"
● C* consolidates analysis and indexing results
● Properties of data in C*
● Immutable data
● Append-only
● Read-1, write-1 consistency
● Scales out easily
● Indexers, archivers, analyzers, query servers
[Diagram labels: Data, Archive, Index, Analyze, S3, Roll-ups, Analysis, Recs, Inventory, Data Series, Query, Cassandra]
Cassandra at Stackdriver
Cluster Configuration
● Version: DataStax Community Edition 1.2.10
● Replication Factor: 3
● Vnodes
● Murmur3Partitioner
● Ec2Snitch
  ○ Aids in request efficiency
  ○ Enables Cassandra to ensure replicas are in different Availability Zones
● phi_convict_threshold: 8 -> 12
  ○ Used to determine when nodes are down
  ○ AWS network can be spotty
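To give a feel for what raising phi_convict_threshold from 8 to 12 means, the sketch below uses a simplified phi accrual model in which heartbeat gaps are exponentially distributed. This is an illustrative approximation, not Cassandra's exact implementation, and the 1-second mean heartbeat interval is an assumption.

```python
import math


def phi(seconds_since_last_heartbeat, mean_heartbeat_interval):
    """Suspicion level under an exponential model of heartbeat gaps."""
    # P(gap longer than t) = exp(-t / mean), and phi = -log10(P).
    return seconds_since_last_heartbeat / (mean_heartbeat_interval * math.log(10))


def silence_before_conviction(threshold, mean_heartbeat_interval):
    """Seconds of silence needed for phi to reach the given threshold."""
    return threshold * mean_heartbeat_interval * math.log(10)


# Assumed mean heartbeat interval of 1 second.
at_8 = silence_before_conviction(8, 1.0)    # roughly 18.4 seconds
at_12 = silence_before_conviction(12, 1.0)  # roughly 27.6 seconds
```

Under these assumptions, the higher threshold tolerates about nine more seconds of silence before a node is marked down, which trades slower failure detection for fewer false convictions on a spotty network.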
Cassandra Topology in AWS
[Diagram: three nodes, 1, 2, and 3, in us-east-1a, us-east-1b, and us-east-1c]
Where we started...
Keep it balanced!
[Diagram: nodes spread across us-east-1a, us-east-1b, and us-east-1c]
Where we are...
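One way to follow the "keep it balanced" advice when growing a cluster is to place each new node in the Availability Zone that currently has the fewest nodes. The Python sketch below shows that rule; the function names and the example starting layout are hypothetical, and this is not the tooling described in the talk.

```python
from collections import Counter


def next_zone(current_zones, all_zones):
    """Pick the zone with the fewest nodes; ties go to the earliest zone listed."""
    counts = Counter(current_zones)
    return min(all_zones, key=lambda zone: counts[zone])


def plan_upsize(current_zones, all_zones, new_nodes):
    """Return the zone for each new node, keeping the cluster balanced."""
    zones = list(current_zones)
    plan = []
    for _ in range(new_nodes):
        zone = next_zone(zones, all_zones)
        plan.append(zone)
        zones.append(zone)
    return plan


ZONES = ["us-east-1a", "us-east-1b", "us-east-1c"]
# Hypothetical starting point: four nodes, with one extra in us-east-1a.
existing = ["us-east-1a", "us-east-1b", "us-east-1c", "us-east-1a"]
plan = plan_upsize(existing, ZONES, 5)
final_counts = Counter(existing + plan)
```

With a replication factor of 3 and replicas spread over three zones, equal node counts per zone help keep the load per node roughly even.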
Cassandra EC2 Node Configuration
● m1.xlarge
  ○ 4 cores
  ○ 15 GB RAM
  ○ 4 ephemeral disks available
● 4 disks RAID-0 for Data Volume and CommitLog
  ○ ext4 - defaults,noatime
  ○ mdadm RAID-0
  ○ Compactions
  ○ Heavy Read/Write IO
Cassandra Automation and Operations
● Combination of Boto, Fabric, & Puppet
  ○ Boto for AWS API
  ○ Fabric + Puppet for Bootstrapping
  ○ Fabric for Operations
● One command to:
  ○ Launch a new cluster
  ○ Upsize a cluster
  ○ Replace a dead node
  ○ Remove existing nodes
  ○ List nodes in a cluster
Our (Internal) Slogan
Cassandra Backups using S3
● No Cassandra Powered Backups
● Restore from S3
● Useful for major version upgrades
[Diagram: Data -> S3 -> Bulk Loader -> Map Reduce -> Cassandra]
1. Data is archived when it is received
2. Bulk loader reads from S3
3. M/R re-analyzes data
4. Cassandra is repopulated
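The four restore steps can be sketched end to end in a few lines of Python. In this toy version a list stands in for S3, a dictionary stands in for Cassandra, and the "analysis" is a per-metric sum; all function names and the record shape `(metric, timestamp, value)` are assumptions for illustration only.

```python
def archive(store, record):
    """Step 1: append every received record to the archive (stand-in for S3)."""
    store.append(record)


def bulk_load(store):
    """Step 2: read every archived record back."""
    return list(store)


def reanalyze(records):
    """Step 3: recompute an aggregate; here, a per-metric sum as a toy example."""
    totals = {}
    for metric, _ts, value in records:
        totals[metric] = totals.get(metric, 0) + value
    return totals


def repopulate(database, records, aggregates):
    """Step 4: write raw points and aggregates into a fresh database."""
    for metric, ts, value in records:
        database[(metric, ts)] = value  # points are immutable, so rewrites are harmless
    for metric, total in aggregates.items():
        database[(metric, "sum")] = total


s3 = []
for rec in [("cpu", 0, 1), ("cpu", 60, 2), ("mem", 0, 5)]:
    archive(s3, rec)

new_cluster = {}
records = bulk_load(s3)
repopulate(new_cluster, records, reanalyze(records))
```

Because the archive holds every record, the database can be treated as rebuildable state: dropping `new_cluster` and rerunning the last three lines yields the same contents.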
Disaster Recovery in the Wild
● October 23, Stackdriver suffered a total loss of our C* cluster
● Exhausted memory due to number of open file descriptors (see graph)
● We did not notice the problem until it was too late
● Nodes began crashing, resulted in inconsistent view of the ring
● Attempted to restart the cluster unsuccessfully for ~2 hours
● Provisioned new 36 node cluster in ~2 hours
● Directed “live” data to new cluster
● Started bulk restore operation from archive
● Full-fidelity data and aggregations
● No data loss due to archival pipeline
● See http://www.stackdriver.com/post-mortem-october-23-stackdriver-outage/
Cluster Restoration Process
[Diagram labels: UI, API, Gateway, S3, Bulk Loader, Map Reduce, Historical Data, New Data, New Cluster, Old Cluster]
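During restoration, the new cluster receives two write streams at once: new data and replayed historical data. A minimal Python sketch of one reason this can work when data points are immutable and append-only is shown below: each key is written with the same value no matter when it arrives, so neither the order of writes nor duplicate replays changes the final state. The key shape and sample values are hypothetical.

```python
import itertools


def apply_writes(writes):
    """Apply (key, value) writes to an empty store and return it."""
    store = {}
    for key, value in writes:
        store[key] = value
    return store


# Hypothetical points keyed by (metric, timestamp); a key's value never changes.
historical = [(("cpu", 0), 1), (("cpu", 60), 2)]
live = [(("cpu", 120), 3)]

# Every interleaving of the two streams ends in the same state.
states = [apply_writes(order) for order in itertools.permutations(historical + live)]

# Replaying the historical data a second time changes nothing.
replayed_twice = apply_writes(historical + live + historical)
```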
Thank you!
Yes, we are hiring!
Patrick Eaton - [email protected] - @PatrickREaton
Joey Imbasciano - [email protected] - @_joeyi