Exploiting latency bounds for energy-efficient load balancing
Cruz Monrreal, Daniel Jones, Michael May and Mohit Taneja
The University of Texas at Austin
Overview
● Problem
● State of server power (lack of power proportionality)
● Inspiration
● Current solutions
● Assumptions
● Our solution
● Comparison of results
● Limitations of our solution
● Future prospects
● Q & A
Description of Problem
Current server implementations are power-inefficient during low-load hours.
Many requests do not need to be serviced as fast as possible and can therefore tolerate an acceptable stall period.
System Power Consumption
Source - http://static.usenix.org/events/hotpower08/tech/full_papers/rivoire/rivoire_html/
System Power Consumption
Table 1 - Number of cores utilized vs. power usage
# of Cores Power (W)
0 0
1 270
2 300
3 320
4 330
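The key feature of Table 1 is that the first active core is by far the most expensive, which is why consolidating jobs onto already-running machines saves power. A minimal sketch of that observation (the dictionary and `marginal_watts` helper are illustrative names, not part of the original work):

```python
# Power model from Table 1: measured wall power (W) as a function
# of the number of active cores on a 4-core server.
POWER_W = {0: 0, 1: 270, 2: 300, 3: 320, 4: 330}

def marginal_watts(cores: int) -> int:
    """Extra power drawn by activating one more core."""
    return POWER_W[cores] - POWER_W[cores - 1]

# Waking an idle machine costs 270 W for its first core,
# while each additional core on that machine costs <= 30 W.
print([marginal_watts(c) for c in range(1, 5)])  # [270, 30, 20, 10]
```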
Inspiration
Inspiration (Cont)
Assumptions - Model
4-core machine
3 servers total
1 job saturates a core
Instant on/off
No background tasks
Assumptions - Model (Cont)
Load generator simulates sending variable-time jobs (service requests) to the load balancer.
Load Scheduler distributes jobs to servers.
Server simulates running job by sleeping the given time.
Server sends number of cores running back to load balancer with its own timestamp.
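The simulated server described above can be sketched as follows. This is an illustrative reconstruction under the stated model assumptions (4 cores, one job per core, jobs simulated by sleeping); the `Server` class and `submit` method are hypothetical names, not the authors' code:

```python
import threading
import time

CORES_PER_SERVER = 4  # model assumption: one job saturates one core

class Server:
    """Simulated server: 'runs' a job by sleeping for its service time."""
    def __init__(self):
        self.active = 0                  # cores currently busy
        self._lock = threading.Lock()

    def submit(self, service_time_s):
        """Start a simulated job; returns the worker thread."""
        with self._lock:
            assert self.active < CORES_PER_SERVER, "server is full"
            self.active += 1

        def run():
            time.sleep(service_time_s)   # simulate doing the work
            with self._lock:
                self.active -= 1         # report the freed core on completion

        t = threading.Thread(target=run)
        t.start()
        return t
```

In the real setup the server would report `active` back to the load balancer with a timestamp; here the balancer would simply read the attribute.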
Current Solutions
No Power management (Round Robin)
Basic Power management (Round Robin)
Advanced Power management
Current Solutions - No Power Management
Load Scheduler uniformly schedules jobs to each server in sequence
Problems:
● Lots of time spent idle
● Few cores used = low efficiency
Current Solutions - Round Robin w/ Basic Power Management
Load Scheduler uniformly schedules jobs to each server in sequence.
Turns off unused machines.
Problems:
● Few cores used = low efficiency
Current Solutions - Round Robin w/ Server Toggling
Load Scheduler uniformly schedules jobs to each server by sending 4 jobs at a time sequentially. Turns off unused servers.
Problems:
● Does not fully exploit the latency bound
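The batched round-robin policy above can be sketched in a few lines. This is an illustrative assignment function under the stated model (3 servers, 4 cores each); `batched_round_robin` is a hypothetical name:

```python
from itertools import cycle

NUM_SERVERS = 3
CORES = 4

def batched_round_robin(jobs):
    """Assign jobs in batches of CORES to servers in sequence.
    Returns a list of server indices, one per job."""
    assignment = []
    servers = cycle(range(NUM_SERVERS))
    target = next(servers)
    for i, _ in enumerate(jobs):
        if i and i % CORES == 0:   # batch full: move to the next server
            target = next(servers)
        assignment.append(target)
    return assignment

# Six jobs fill server 0 first, then spill onto server 1;
# server 2 receives nothing and can stay powered off.
print(batched_round_robin(range(6)))  # [0, 0, 0, 0, 1, 1]
```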
Overview of Solution
If any servers are running but not full, the load balancer will send a job to the server with the most jobs running.
If all powered-on servers are full, the load balancer will wait up to the given stall time before sending a job to an off server (thus turning it on).
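The two rules above can be sketched as a single dispatch function. This is a simplified illustration, not the authors' implementation: the `Server` class and `dispatch` function are hypothetical names, and for brevity the sketch sleeps for the full stall time rather than re-checking for a freed core during the wait:

```python
import time

CORES = 4        # cores per server (model assumption)
STALL_S = 0.5    # latency bound (the 500 ms stall time)

class Server:
    def __init__(self, powered=False):
        self.powered = powered
        self.active = 0      # cores currently busy

def dispatch(servers, stall_s=STALL_S):
    """Pick a server for one job: pack the fullest running server,
    and only wake a sleeping server after waiting out the stall time."""
    candidates = [s for s in servers if s.powered and s.active < CORES]
    if candidates:
        # Consolidate: loading the busiest server keeps the others idle/off.
        target = max(candidates, key=lambda s: s.active)
    else:
        # Every powered-on server is full: exploit the latency bound
        # before paying the large cost of waking another machine.
        time.sleep(stall_s)
        target = next(s for s in servers if not s.powered)
        target.powered = True
    target.active += 1
    return target
```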
Comparisons
Run at average load = 25%, 50%, 75%, 100%
Job times are varied around the average job time for each load level.
Example: 25% load
Job time = 8-12 milliseconds
Stall time = 500 milliseconds
Time between jobs = 2-8 milliseconds
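A workload with those parameters could be generated as follows. The uniform draws and the `workload` name are assumptions for illustration; the original experiments only state the ranges:

```python
import random

def workload(n, job_ms=(8, 12), gap_ms=(2, 8)):
    """Yield (arrival_ms, service_ms) pairs for the 25%-load scenario:
    job times of 8-12 ms with 2-8 ms between job arrivals."""
    t = 0.0
    for _ in range(n):
        t += random.uniform(*gap_ms)      # time between job arrivals
        yield t, random.uniform(*job_ms)  # job service time
```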
Conclusion
Limitations
With large core counts, the advantages start to diminish.
At 100% load there are no gains.
Future Prospects
Storage systems (SANs)
Latency exploitation for networks
Q&A
Resources
[1] http://web.eecs.umich.edu/~twenisch/papers/asplos12.pdf
[2] http://static.usenix.org/events/hotpower08/tech/full_papers/rivoire/rivoire_html/