Thermal Aware Data Management in Cloud based Data Centers
Ling Liu
College of Computing
Georgia Institute of Technology
NSF SEEDM workshop, May 2-3, 2011
Thermal aware Computing Era
• Power density increases
– Circuit density increases by a factor of 3 every 2 years
– Energy efficiency increases by a factor of 2 every 2 years
– Effective power density increases by a factor of 1.5 every 2 years
[Kenneth Brill: The Invisible Crisis in the Data Center]
• Maintenance/TCO rising
– Data Center TCO doubles every three years
– Three-year cost of electricity exceeds the purchase cost of the server
– Virtualization/Consolidation is a 1-time/short-term solution
[Uptime Institute]
• Thermal management corresponds to an increasing portion of expenses
– Thermal-aware computing and management solutions becoming prominent
– Increasing need for thermal awareness
[VarsamopoulosGupta 2008]
Thermal aware Task Scheduling in Data Centers
• Given a total task C, how should it be divided among N server nodes so that the computation finishes with minimal cooling energy cost?
• Self-interference and cross-interference raise the inlet air temperature and should be minimized
• Environment interference (room temperature) is not critical
• Task scheduling in the spatial domain
[VarsamopoulosGupta 2008]
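The effect of cross-interference on a task split can be illustrated with a small sketch. It assumes a linear recirculation model in which each node's inlet temperature is the supply temperature plus a weighted sum of all node powers; the matrix `D`, the supply temperature, and the two load splits are hypothetical values chosen for illustration, not data from the slides.

```python
# Hypothetical linear model: inlet[i] = t_supply + sum_j d[i][j] * power[j].
# d[i][j] (K per kW) is an assumed cross-interference coefficient.

def inlet_temps(t_supply, d, power):
    """Predicted inlet temperature of every node for a given power split."""
    return [t_supply + sum(dij * pj for dij, pj in zip(row, power)) for row in d]

def peak_inlet(t_supply, d, power):
    """Hottest inlet temperature, which bounds how warm the supply air may be."""
    return max(inlet_temps(t_supply, d, power))

# Three nodes; the last column is largest, so node 2 recirculates the most heat.
D = [
    [0.10, 0.05, 0.20],
    [0.05, 0.10, 0.30],
    [0.02, 0.05, 0.40],
]
T_SUPPLY = 15.0  # degrees C, illustrative

uniform = [4.0, 4.0, 4.0]  # 12 kW split evenly
skewed = [6.0, 5.0, 1.0]   # same 12 kW, shifted away from node 2

peak_uniform = peak_inlet(T_SUPPLY, D, uniform)
peak_skewed = peak_inlet(T_SUPPLY, D, skewed)
print(peak_uniform, peak_skewed)
```

Under these assumed coefficients the skewed split has the lower peak inlet temperature for the same total load, which is the kind of spatial scheduling gain the slide refers to.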
Cooling Cost aware Scheduling
[VarsamopoulosGupta-2008]
Energy Saving by Dynamic Load Distribution
Increasing the range of changes in the rack heat load
• Heat load distribution of [30 kW, 5 kW, 5 kW, 20 kW] in the case study only needs 1.7 m/s (9,726 CFM) cooling air flow
• It is 19% less than the uniform distribution needs
• This could save ~$189,000 annually in typical real-world data centers
Temperature Contours Around Racks: [15,15,15,15] kW with 2.1 m/s; [30,5,5,20] kW with 1.7 m/s
[Yogendra Joshi, Georgia Tech/CERCS]
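The 19% figure follows directly from the two air velocities quoted for the case study. A short check, using only the numbers on the slide (the variable names are illustrative):

```python
# Air velocities and rack heat loads quoted for the two case-study layouts.
uniform_loads = [15, 15, 15, 15]  # kW per rack
skewed_loads = [30, 5, 5, 20]     # kW per rack
uniform_velocity = 2.1            # m/s
skewed_velocity = 1.7             # m/s

# Both layouts dissipate the same total heat.
total_uniform = sum(uniform_loads)
total_skewed = sum(skewed_loads)

# Relative reduction in required cooling air velocity.
reduction = (uniform_velocity - skewed_velocity) / uniform_velocity
print(f"{total_uniform} kW vs {total_skewed} kW, airflow reduction {reduction:.0%}")
```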
Think Globally, Act Locally
Numerically
Run simulations for a range of velocities
Make a server heat load-inlet T variation matrix over servers S1, S2, ..., Sn: each entry is the change in max. inlet T of servers for a unit change in server loads
Experimentally
Vary the heat loads sequentially at servers for a chosen unit cell and monitor the max. server inlet T
Advantage: The simulations run for different velocities are not required for the experimental approach.
Modifications: Blocks of servers can be identified with the same effect or no effect on the inlet T.
• This will give insights on the sparsity of this matrix.
• Reduce the computational work.
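The matrix construction described above can be sketched as a one-at-a-time perturbation loop. This is a minimal sketch under stated assumptions: `max_inlet_temps` is a hypothetical callback standing in for either a simulation run or a measurement, `toy_room` is an invented linear stand-in for a real room, and the row/column convention (row = affected server, column = perturbed server) is an assumption rather than something the slides specify.

```python
def build_sensitivity_matrix(max_inlet_temps, base_loads, delta=1.0):
    """matrix[i][j] = change in server i's max inlet T per unit change in server j's load."""
    n = len(base_loads)
    baseline = max_inlet_temps(base_loads)
    matrix = [[0.0] * n for _ in range(n)]
    for j in range(n):
        perturbed = list(base_loads)
        perturbed[j] += delta  # vary one server's heat load at a time
        temps = max_inlet_temps(perturbed)
        for i in range(n):
            matrix[i][j] = (temps[i] - baseline[i]) / delta
    return matrix

# Invented linear room used only to exercise the loop (temperatures in K).
TRUE_A = [[0.5, 0.1, 0.0], [0.1, 0.5, 0.1], [0.0, 0.1, 0.5]]

def toy_room(loads):
    return [290.0 + sum(a * l for a, l in zip(row, loads)) for row in TRUE_A]

A = build_sensitivity_matrix(toy_room, [1.0, 1.0, 1.0])

# Fraction of (near-)zero entries: servers with no effect on each other.
sparsity = sum(abs(v) < 1e-9 for row in A for v in row) / (len(A) ** 2)
print(A, sparsity)
```

Entries that come out near zero correspond to server pairs with no measurable thermal effect on each other, which is where the sparsity mentioned above would show up.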
A Matrix
\[ \max \sum_{i=1}^{n} l_i \quad \text{s.t.} \quad A\,l^{T} \le T_{cr}, \quad l_{min} \le l \le l_{max} \]
Where,
$l_i$: server i load
$l_{min}$: minimum load (startup)
$l_{max}$: max. load (full utilization)
$T_{cr}$: max. inlet T allowed by ASHRAE
[Yogendra Joshi, Georgia Tech/CERCS]
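For a very small number of servers, the load-maximization problem on this slide can be explored by brute force. The sketch below is not a linear-programming solver: it enumerates loads on a coarse grid and keeps the feasible combination with the largest total. The function name, the 2x2 matrix, the limit, and the load bounds are all hypothetical illustration values in arbitrary units; a realistic instance would use an LP solver.

```python
from itertools import product

def max_total_load(a, t_cr, l_min, l_max, step=0.5):
    """Grid search for loads maximizing sum(l) subject to a @ l <= t_cr, l_min <= l <= l_max."""
    n = len(a)
    steps = int(round((l_max - l_min) / step))
    levels = [l_min + k * step for k in range(steps + 1)]
    best, best_total = None, float("-inf")
    for loads in product(levels, repeat=n):
        # Every server's predicted inlet value must stay at or below the limit.
        feasible = all(
            sum(aij * lj for aij, lj in zip(row, loads)) <= t_cr + 1e-9
            for row in a
        )
        if feasible and sum(loads) > best_total:
            best, best_total = loads, sum(loads)
    return best, best_total

# Illustrative 2-server case: server 1 is thermally much more sensitive than server 0.
A_EXAMPLE = [[1.0, 0.2], [0.2, 2.0]]
best_loads, best_total = max_total_load(A_EXAMPLE, t_cr=8.0, l_min=1.0, l_max=7.0)
print(best_loads, best_total)
```

With these assumed numbers the search loads the less sensitive server fully and holds the sensitive one back, which mirrors the non-uniform distributions discussed on the earlier slides.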
68% increase in allowed heat dissipation (for the same CRAC velocity)
37.5% decrease in facilities energy consumption (for the same heat dissipation)
An Example
[Figure: max. inlet T at servers (K), axis from 288 to 328, plotted by server position for the A and B racks. Series: AILM with a 0.8-7.5 kW server load range (A rack, B rack); uniform 5 kW server load (A rack, B rack); safe temperature limit. Total data center load dissipation: 298 kW and 297 kW. V_CRAC = 5 m/s.]
[Yogendra Joshi, Georgia Tech/CERCS]
Pertinence of Thermal Maps in Data Center Management
• Given an equipment utilization layout, find the temperature around the room
• Create a collection of thermal maps or a function to “predict” thermal behavior of a task assignment
• Use collection to decide on job placement (temporally and spatially)
[VarsamopoulosGupta 2008]
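One simple way to use such a collection for spatial job placement is a lookup over precomputed maps. The sketch below assumes each candidate placement already has a predicted thermal map, represented here as a plain list of inlet temperatures in K; the function name, the server names, the temperatures, and the limit are hypothetical.

```python
def choose_placement(thermal_maps, limit):
    """Pick the candidate whose predicted map has the lowest peak not exceeding limit."""
    safe = {
        name: max(temps)
        for name, temps in thermal_maps.items()
        if max(temps) <= limit
    }
    if not safe:
        return None  # no candidate keeps every inlet within the limit
    return min(safe, key=safe.get)

# Predicted inlet temperatures (K) if the job were placed on each candidate server.
maps = {
    "server_1": [296.0, 299.5, 301.0],
    "server_2": [297.0, 298.0, 298.5],
    "server_3": [295.0, 306.0, 297.0],
}
placement = choose_placement(maps, limit=303.0)
print(placement)
```

Temporal placement could be handled the same way by keying the maps on a time slot as well as a server, though that extension is not shown here.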
Thermal-aware Data Management
[Adapted from VarsamopoulosGupta 2008]
Thermal aware data management
• Task profiling
– CPU utilization, I/O activity, etc.
• Equipment power profiling
– CPU consumption, disk consumption, etc.
• Heat recirculation modeling
• Task management technologies
Need for a comprehensive research framework