Scalr: Setting up automated scaling
DESCRIPTION
Despite all the buzz about it, building a horizontally scalable application for cloud deployment isn't all that different from building one for a physical deployment, except in its ability to change size on-the-fly. Bigger applications have been using commodity hardware and fault-tolerant design to achieve high availability and scalability for a while, but provisioning capacity remains troublesome there. The real addition the cloud brings architecturally is the ability to add new resources instantly, and even change your provisioning profile algorithmically.
TRANSCRIPT
A brief history of auto-scaling
Lessons learned from 5 years of it
First, some context
About me
● Sebastian Stadil
● Founder of meetup.com/cloudcomputing
● Founder of Scalr
● Slashdotted at 14
About Scalr
● Simple, powerful cloud management suite
● Helps you design & manage resilient, scalable infrastructure
● For apps deployed in public & private clouds
● Over 2,000,000 instances launched
● Applications vary from 1 to 10,000 instances
● Started out as simple auto-scaling system
And now, our feature presentation
A brief history of auto-scaling
Lessons learned from 5 years of it
Load Average
● Combination of CPU, disk IO, number of processes running
● Represents system utilization.
● Good for most applications.
● Most widely used.
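A minimal sketch of a load-average trigger, assuming a Unix-like host where Python's os.getloadavg() is available. Dividing by the CPU count and the 0.8 / 0.3 thresholds are illustrative assumptions, not values from the talk or Scalr defaults.

```python
import os

def load_decision(load_1min, cpu_count, up_at=0.8, down_at=0.3):
    """Return 'up', 'down' or 'hold' from a per-core load average.

    up_at and down_at are hypothetical thresholds chosen for illustration.
    """
    per_core = load_1min / max(cpu_count, 1)
    if per_core > up_at:
        return "up"
    if per_core < down_at:
        return "down"
    return "hold"

def current_decision():
    """Sample this host's 1-minute load average (Unix only) and decide."""
    one_min, _five_min, _fifteen_min = os.getloadavg()
    return load_decision(one_min, os.cpu_count() or 1)
```

In a farm, the per-server values would normally be averaged across instances before comparing against the thresholds, so one busy server does not trigger a scale-up on its own.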
CPU
● Good for services with dominant CPU consumption (duh)
● Data processing, video processing, etc.
Response times
● Rarely used metric
● Many factors screw it up (network throughput, system resources, different application queues)
● Can work when response time depends only on hardware
● Downscaling is problematic
RAM
● Good for RAM based databases and caches
● Beware of invalidating keys
● Memcached, Redis, etc.
Schedule
● Good for services with predictable traffic
● Advertising campaigns, product launches
● When you know that you will get extra traffic at a specific time or day
● When traffic changes throughout the day
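Schedule-based scaling can be sketched as a lookup from the current time to a desired instance count. The schedule table, the window hours, and the baseline below are hypothetical values for illustration only.

```python
from datetime import datetime

# Hypothetical schedule: (weekdays, start_hour, end_hour, instances).
# Weekdays follow datetime.weekday(): Monday is 0, Sunday is 6.
SCHEDULE = [
    (range(0, 5), 9, 18, 10),   # weekday business hours
    (range(5, 7), 12, 22, 6),   # weekend afternoons and evenings
]
BASELINE = 2  # instances kept running outside any scheduled window

def scheduled_instances(now, schedule=SCHEDULE, baseline=BASELINE):
    """Return the instance count the schedule asks for at time `now`."""
    wanted = baseline
    for days, start_hour, end_hour, count in schedule:
        if now.weekday() in days and start_hour <= now.hour < end_hour:
            wanted = max(wanted, count)
    return wanted
```

Calling scheduled_instances(datetime.now()) from a periodic job gives the floor for the farm size; a metric-based rule can still add servers on top of it.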
Queue size
● Maintain processing rate, esp. to meet an SLA*
*Backlog per server = queue size / servers. Given that each server can process X tasks per hour, the queue drains in queue size / (servers × X) hours.
● Good for processing services such as video encoding or sending messages
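The footnote's relation can be turned around to ask how many servers are needed to drain the current queue within an SLA window. This is a sketch under the footnote's own assumption that each server processes a fixed X tasks per hour; the function and parameter names are hypothetical.

```python
import math

def servers_needed(queue_size, tasks_per_server_per_hour, sla_hours, min_servers=1):
    """Servers required to drain `queue_size` tasks within `sla_hours`.

    Assumes every server sustains the same fixed hourly throughput.
    """
    if tasks_per_server_per_hour <= 0 or sla_hours <= 0:
        raise ValueError("throughput and SLA window must be positive")
    capacity_per_server = tasks_per_server_per_hour * sla_hours
    needed = math.ceil(queue_size / capacity_per_server)
    return max(needed, min_servers)
```

For example, 1,200 queued tasks at 100 tasks per server per hour with a 2-hour SLA works out to 1200 / (100 × 2) = 6 servers.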
Bandwidth
● Limited channel per server (1Gbit anyone?)
● Need higher download capacity
● Known traffic per user
Disk IO
● Cassandra
● Certain Hadoop jobs
● Stuff that hits the disk
Fuck it, build your own algorithms
Custom metrics
● Read a file
● Execute a script
● Example: # of threads / connections
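The two custom-metric sources above, reading a file and executing a script, can be sketched with the Python standard library. The function names are hypothetical, and the convention that the file or the script's first output line holds a single number is an assumption for illustration, not Scalr's documented interface.

```python
import subprocess

def metric_from_file(path):
    """Read a single numeric value from a text file."""
    with open(path) as handle:
        return float(handle.read().strip())

def metric_from_script(argv, timeout=10):
    """Run a command and parse the first line of its stdout as a number.

    Raises CalledProcessError on a non-zero exit and ValueError when
    the command prints nothing or prints something non-numeric.
    """
    result = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=True
    )
    lines = result.stdout.strip().splitlines()
    if not lines:
        raise ValueError("script produced no output")
    return float(lines[0])
```

A script that prints the current number of threads or open connections would fit the example on the slide.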
Custom algorithms
● OR for upscaling
● AND for downscaling
● Configurable cooldowns
● Configurable steps
● Example: scale up early, scale down slowly
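The rules above can be sketched as a small decision class: any metric may trigger a scale-up (OR), all metrics must agree before a scale-down (AND), and separate cooldowns and step sizes make the farm grow early and shrink slowly. The class name, parameter names, and default values are hypothetical.

```python
import time

class Scaler:
    """Illustrative OR-up / AND-down policy with cooldowns and steps."""

    def __init__(self, up_step=2, down_step=1, up_cooldown=60,
                 down_cooldown=600, min_size=1, max_size=20):
        self.up_step = up_step
        self.down_step = down_step
        self.up_cooldown = up_cooldown      # seconds
        self.down_cooldown = down_cooldown  # seconds
        self.min_size = min_size
        self.max_size = max_size
        self._last_change = None            # time of the last resize

    def decide(self, size, wants_up, wants_down, now=None):
        """Return the new farm size.

        wants_up / wants_down hold one boolean per metric.
        """
        now = time.monotonic() if now is None else now
        if self._last_change is None:
            since = float("inf")
        else:
            since = now - self._last_change
        new_size = size
        if any(wants_up):                      # OR for upscaling
            if since >= self.up_cooldown:
                new_size = min(size + self.up_step, self.max_size)
        elif wants_down and all(wants_down):   # AND for downscaling
            if since >= self.down_cooldown:
                new_size = max(size - self.down_step, self.min_size)
        if new_size != size:
            self._last_change = now
        return new_size
```

With these defaults the farm adds two servers at most once a minute but removes one server at most every ten minutes, which matches the "scale up early, scale down slowly" example.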
Examples
● Social gaming
● Enterprise services
Another example (MySQL)
● Take the master out
● Take the slave that is running backups out
Summary
Started with general, went specific, then went custom