Optimizing Costs with Spot Instances
Run large-scale workloads cost-effectively in the cloud
EC2 Spot Instances
#ec2spot
Jafar Shameem
Agenda
• What is Spot?
• Use Cases
• Best Practices
• Spot Fleet
What is Spot?
Name your own price for EC2 Compute
– A market where the price of compute changes based on supply and demand
– When your bid price exceeds the Spot market price, an instance is launched
– The instance is terminated (with a 2-minute warning) if the market price exceeds your bid price
– Capacity comes from unused On-Demand Instances
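The launch and termination rules above can be sketched as a small replay over a price series. This is a simplified model, not the actual market mechanism: the function name `spot_events`, the sample prices, and the assumptions that the request is re-fulfilled when the bid covers the price again and that a bid equal to the price counts as covering it are all illustrative.

```python
def spot_events(bid, market_prices):
    """Replay a price series and list (step, event) pairs for one Spot request.

    Assumptions (not from the slides): the request is re-fulfilled whenever
    the bid covers the price again, and a bid equal to the price counts as
    covering it.
    """
    events = []
    running = False
    for step, price in enumerate(market_prices):
        if not running and bid >= price:
            running = True
            events.append((step, "launch"))
        elif running and price > bid:
            # In the real service this follows a 2-minute warning.
            running = False
            events.append((step, "terminate"))
    return events


# Hypothetical hourly market prices in $/hr, with a bid of $0.05/hr.
prices = [0.03, 0.04, 0.06, 0.05, 0.02]
print(spot_events(0.05, prices))
# [(0, 'launch'), (2, 'terminate'), (3, 'launch')]
```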
On average, AWS adds enough new server capacity every day to support Amazon’s global infrastructure when it was a $7B business.
Pools of capacity
Each combination of Type, Size, OS, and AZ is a separate pool:
• Type
– General-purpose: M1, M3, M4, T2
– Compute-optimized: C1, CC2, C3, C4
– Memory-optimized: M2, CR1, R3
– Dense-storage: HS1, D2
– I/O-optimized: HI1, I2
– GPU: CG1, G2
– Micro: T1, T2
• Size: .micro, .medium, .large, .xlarge, .2xlarge, .4xlarge, .8xlarge
• OS: Windows, Linux
• AZ: -1a, -1b, -1c, …
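Reading Type, Size, OS, and AZ as the dimensions of a pool, the number of pools grows multiplicatively. A minimal sketch, assuming a hypothetical helper `capacity_pools` and an arbitrary selection of instance types and zones:

```python
from itertools import product


def capacity_pools(instance_types, operating_systems, availability_zones):
    """Enumerate every (instance type, OS, AZ) combination as one pool."""
    return list(product(instance_types, operating_systems, availability_zones))


# Illustrative inputs only; any real account sees its own set of zones.
pools = capacity_pools(
    ["c3.xlarge", "c3.2xlarge", "r3.xlarge"],
    ["Linux", "Windows"],
    ["us-east-1a", "us-east-1b", "us-east-1c"],
)
print(len(pools))  # 18
print(pools[0])    # ('c3.xlarge', 'Linux', 'us-east-1a')
```

Each of these pools has its own supply, demand, and Spot price, which is why the later "Diversify" advice works.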
c3.2xlarge: On-Demand Price $0.420/hr
cc2.8xlarge (32 cores, 60.5 GB memory): On-Demand Price $2.00/hr
Some Spot Use Cases
• Stateless Web/App Server Fleets
• Hadoop Workloads
• Continuous Integration (CI)
• High Performance Computing (HPC)
• Grid Computing
• Media Rendering / Transcoding
Web-based Architecture with Spot
[Diagram: Elastic Load Balancing in front of stateless web servers in an On-Demand Auto Scaling group and stateless web servers (Spot) in Spot Auto Scaling groups, spread across Availability Zone A and Availability Zone B, with a separate store for session state data]
Scaling Hadoop Jobs with Spot
• Bloomreach launches 1,500 to 2,000 Amazon EMR clusters and runs 6,000 Hadoop jobs every day.
• http://engineering.bloomreach.com/strategies-for-reducing-your-amazon-emr-costs/
Continuous Integration & Testing with Spot
• Tapjoy - Premier Mobile Ad Network Across iOS & Android
• Global Network (435 Million Monthly Reach)
• Jenkins + Spot Instances
• https://github.com/bwall/ec2-plugin (thanks to an RIT senior project)
• Go wide during business hours, scale back in the evenings. Automatically comes online at 06:00 ET
• Workers scale horizontally to support dozens of simultaneous regression tests spread out over dozens of workers
• Jenkins automatically guards against Spot termination
Queue-based media transcoding
Ooyala
- Video technology platform that serves ESPN, Bloomberg, ...
- Uses a combination of OD/RI/Spot to ensure it can cover predicted volumes while keeping costs low
- http://aws.amazon.com/solutions/case-studies/ooyala/
Vevo
- Library of over 75,000 HD videos
- Must be able to rapidly transcode the library to a new screen format
- Can spin up 100s of Spot instances to transcode the entire library in a matter of days (instead of weeks)
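A queue-based design tolerates interruptions because an unfinished job can simply go back on the queue. The following is a minimal in-process sketch of that pattern, not a description of either company's pipeline; `Interrupted`, `run_worker`, and the file names are hypothetical, and a real deployment would use a durable queue service rather than `queue.Queue`.

```python
import queue


class Interrupted(Exception):
    """Raised by a job when the (simulated) Spot instance is reclaimed."""


def run_worker(jobs, transcode):
    """Process jobs until the queue is empty or the worker is interrupted."""
    done = []
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            return done
        try:
            transcode(job)
        except Interrupted:
            jobs.put(job)  # hand the job back so another worker can retry it
            return done
        done.append(job)


jobs = queue.Queue()
for name in ["a.mp4", "b.mp4", "c.mp4"]:
    jobs.put(name)


def flaky_transcode(job):
    # Simulate losing the instance while working on b.mp4.
    if job == "b.mp4":
        raise Interrupted()


first = run_worker(jobs, flaky_transcode)    # interrupted on b.mp4
second = run_worker(jobs, lambda job: None)  # replacement worker finishes up
print(first, second)  # ['a.mp4'] ['c.mp4', 'b.mp4']
```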
Some example frameworks with built-in Spot support
• Hadoop
– Elastic MapReduce (EMR)
• High Performance Computing
– CFNCluster (https://github.com/awslabs/cfncluster)
– Cycle Computing
• Web Applications / Microservices
– AutoScaling
• Continuous Integration
– Jenkins
– Bamboo
Using Spot Effectively – Normalize
- CPU Generation
- Memory/core
- Networking
- VPC or Classic EC2
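One possible reading of "normalize" is to express instance types in comparable units, such as memory per core, so that interchangeable types become obvious. The sketch below uses the cc2.8xlarge figures and both On-Demand prices from the earlier slide. The c3.2xlarge specification (8 vCPUs, 15 GB) is an assumption not stated in the slides, and `normalize` is a hypothetical helper.

```python
def normalize(price_per_hour, cores, memory_gb):
    """Express an instance type in per-core terms for comparison."""
    return {
        "memory_per_core": memory_gb / cores,
        "price_per_core": price_per_hour / cores,
    }


cc2 = normalize(2.00, cores=32, memory_gb=60.5)  # figures from the slide
c3 = normalize(0.420, cores=8, memory_gb=15)     # 8 vCPUs / 15 GB is an assumption

print(cc2["memory_per_core"], c3["memory_per_core"])  # 1.890625 1.875
```

With nearly the same memory per core, a job sized per core could run on either type, which widens the set of Spot pools it can draw from.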
Using Spot Effectively – Diversify
- Regions (us-east-1, eu-west-2, us-west-2, …)
- Availability Zones (us-east-1a, us-east-1b, us-east-1c, …)
- Instance Families (c3, r3, …)
- Instance Types (c3.xlarge, r3.xlarge, m3.xlarge, …)
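Diversifying means a price spike in any single pool costs only a fraction of the fleet. A minimal sketch of spreading a target capacity evenly across pools, assuming a hypothetical helper `spread_capacity` and illustrative pool names:

```python
def spread_capacity(target, pools):
    """Split `target` instances across pools as evenly as possible."""
    base, extra = divmod(target, len(pools))
    # The first `extra` pools each take one additional instance.
    return {pool: base + (1 if i < extra else 0) for i, pool in enumerate(pools)}


pools = [
    ("c3.xlarge", "us-east-1a"),
    ("c3.xlarge", "us-east-1b"),
    ("r3.xlarge", "us-east-1a"),
    ("m3.xlarge", "us-east-1c"),
]
plan = spread_capacity(10, pools)
print(plan)
```

With ten instances over four pools, no pool holds more than three, so losing one pool leaves at least seven instances running.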
Using Spot Effectively – Bidding Strategies
• You pay the Spot market price, not your bid
• So bid what you are willing to pay
• The price you pay is set as you enter each instance hour
• And you are charged for it at the end of the hour
• If AWS interrupts your instance, you don’t pay for that partial hour
Bid only what you are willing to pay.
(by default, bids are limited to 10 × the On-Demand price)
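The hourly billing rules listed above can be worked through with a small calculation. This is a sketch of those rules only; `spot_cost` and the sample prices are hypothetical.

```python
def spot_cost(bid, hourly_start_prices, interrupted_by_aws=False):
    """Total charge for one instance under the hourly rules in the slide.

    hourly_start_prices: market price ($/hr) at the start of each instance hour.
    interrupted_by_aws:  True if AWS reclaimed the instance during the last hour.
    """
    if any(price > bid for price in hourly_start_prices):
        raise ValueError("instance cannot start an hour priced above the bid")
    # The bid never appears in the bill; only market prices do.
    billable = hourly_start_prices[:-1] if interrupted_by_aws else hourly_start_prices
    return sum(billable)


prices = [0.10, 0.12, 0.11]  # hypothetical market prices, bid is $0.50/hr
print(round(spot_cost(0.50, prices), 2))                           # 0.33
print(round(spot_cost(0.50, prices, interrupted_by_aws=True), 2))  # 0.22
```

A bid of $0.50 costs the same as a bid of $0.12 here; the higher bid only lowers the chance of interruption.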
Using Spot Effectively – Mix and Match
• Example: Frontend Applications on On-Demand/Reserved Instances + Backend Applications* on Spot Instances
* e.g., batch video transcoding
Using Spot Effectively – let AWS do the heavy lifting
Auto Scaling
as-create-launch-config spotlc-5cents \
  --image-id ami-e565ba8c \
  --instance-type m1.small \
  --spot-price "0.05"
. . .
as-create-auto-scaling-group spotasg \
  --launch-configuration spotlc-5cents \
  --availability-zones "us-east-1a,us-east-1b" \
  --max-size 16 \
  --min-size 1 \
  --desired-capacity 3
Introducing Spot Fleet
• Instead of writing all that code to manage Spot Instances, simply specify:
– Target Capacity – The number of EC2 instances that you want in your fleet.
– Maximum Bid Price – The maximum bid price that you are willing to pay.
– Launch Specifications – Number and types of instances, AMI ID, VPC, subnets or AZs, etc.
– IAM Fleet Role – The name of an IAM role. It must allow EC2 to terminate instances on your behalf.
Introducing Spot Fleet
• Optionally also specify:
– Client Token – A unique, case-sensitive identifier for the request.
– Valid From – The start date and time of the request.
– Valid Until – The end date and time of the request.
– Terminate on Expiration – If set to TRUE, all Spot instances in the fleet will be terminated when the Valid Until time is reached.
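The required and optional fields above map onto a JSON request configuration. A minimal sketch of assembling one, assuming the field names of the `request-spot-fleet` configuration (`ClientToken`, `ValidFrom`, `ValidUntil`, `TerminateInstancesWithExpiration`); the helper `build_fleet_config`, the account ID, and the role name are hypothetical placeholders.

```python
import json
import uuid
from datetime import datetime, timedelta, timezone

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def build_fleet_config(target_capacity, max_bid, fleet_role_arn, launch_specs,
                       valid_for_hours=None, terminate_on_expiration=False):
    """Assemble a Spot Fleet request configuration as a plain dict."""
    config = {
        "TargetCapacity": target_capacity,
        "SpotPrice": max_bid,               # maximum bid, passed as a string
        "IamFleetRole": fleet_role_arn,
        "LaunchSpecifications": launch_specs,
        "ClientToken": str(uuid.uuid4()),   # unique identifier for the request
    }
    if valid_for_hours is not None:
        start = datetime.now(timezone.utc).replace(microsecond=0)
        end = start + timedelta(hours=valid_for_hours)
        config["ValidFrom"] = start.strftime(TIME_FORMAT)
        config["ValidUntil"] = end.strftime(TIME_FORMAT)
        config["TerminateInstancesWithExpiration"] = terminate_on_expiration
    return config


config = build_fleet_config(
    5, "1.00",
    "arn:aws:iam::123456789012:role/fleetRole",  # placeholder account and role
    [{"ImageId": "ami-ff527ecf", "InstanceType": "m1.small"}],
    valid_for_hours=24, terminate_on_expiration=True,
)
print(json.dumps(config, indent=2))
```

The printed JSON could be saved to a file and passed to the CLI with `--spot-fleet-request-config file://...`.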
Spot Fleet
• Will attempt to reach the desired target capacity given the choices you provided
• Manages the capacity even as Spot prices change
• Launches instances using the launch specifications provided
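To make the behavior above concrete, here is a toy model of how a fleet might place capacity as prices move. It is not the service's actual algorithm: it assumes the fleet simply picks the cheapest launch specification whose price is within the maximum bid. `place_fleet` and all prices are hypothetical.

```python
def place_fleet(target_capacity, max_bid, pool_prices):
    """Toy placement: put all capacity in the cheapest pool within the bid."""
    eligible = {pool: price for pool, price in pool_prices.items() if price <= max_bid}
    if not eligible:
        return {}  # nothing can launch until a price drops below the bid
    cheapest = min(eligible, key=eligible.get)
    return {cheapest: target_capacity}


# Hypothetical Spot prices ($/hr) for three launch specifications.
prices = {"m1.small": 0.01, "m1.medium": 0.02, "m1.large": 0.04}
print(place_fleet(5, 1.00, prices))  # {'m1.small': 5}

prices["m1.small"] = 1.20            # price spike above the maximum bid
print(place_fleet(5, 1.00, prices))  # {'m1.medium': 5}
```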
Using Spot Fleet
• Create EC2 Spot Fleet IAM Role
• Requesting a fleet:
– aws ec2 request-spot-fleet --spot-fleet-request-config file://mySmallFleet.json
• Describe fleet:
– aws ec2 describe-spot-fleet-requests
– aws ec2 describe-spot-fleet-requests --spot-fleet-request-ids <sfr-………..>
• Describe instances within the fleet:
– aws ec2 describe-spot-fleet-instances --spot-fleet-request-id <sfr-…………>
• Cancel Spot Fleet (with termination):
– aws ec2 cancel-spot-fleet-requests --spot-fleet-request-ids <sfr-…………..> --terminate-instances
mySpotFleet.json
{
  "TargetCapacity": 5,
  "SpotPrice": "1.00",
  "IamFleetRole": "arn:aws:iam::962872214910:role/fleetRole",
  "LaunchSpecifications": [
    {
      "ImageId": "ami-ff527ecf",
      "InstanceType": "m1.small"
    },
    {
      "ImageId": "ami-ff527ecf",
      "InstanceType": "m1.medium"
    },
    {
      "ImageId": "ami-ff527ecf",
      "InstanceType": "m1.large"
    }
  ]
}
Further reading
• EC2 Spot Instances:
– http://aws.amazon.com/ec2/purchasing-options/spot-instances/
• EC2 Spot Fleet API:
– https://aws.amazon.com/blogs/aws/amazon-ec2-spot-fleet-api-manage-thousands-of-instances-with-one-request/
• Spot Best Practices:
– https://aws.amazon.com/blogs/aws/focusing-on-spot-instances-lets-talk-about-best-practices/