Matt Warner, Future Facilities
Proactive Airflow Management in Data Centre Operation
- using CFD simulation to improve resilience, energy efficiency and utilisation
Design Intent
[Figure: utilisation against time, with 100 % design capacity, mid life and expected end of life marked; curve: design intent]
Operational Reality…
[Figure: utilisation against time, comparing typical operation with design intent; marked: 65 %, 100 %, mid life, expected end of life, lost lifespan, stranded capacity]
“The biggest challenge for 80%+ of Owner/Operators is obtaining the right balance between Space, Power and Cooling.” Gartner
The Physics of Cooling through the Data Centre Supply Chain
Chip Manufacturer, Device Manufacturer, IT Deployment, Data Centre Manager
Lack of communication between the equipment suppliers and the data centre industry causes inefficiency in operation.
Efficient Data Centre Management is all about Airflow Management
• ACU to floor grilles
• Grilles to equipment inlets
• Equipment exhaust to ACU
Typical changes in a Data Centre
Infrastructure, IT Equipment, Cabinets
Introducing the Virtual Facility
Infrastructure, IT Equipment, Cabinets
The Virtual Facility
The Virtual Facility enables the data centre designer and operator to understand the consequence of any physical change before committing to it.
What is the Virtual Facility?
The Virtual Facility is a full 3D mathematical representation of the data centre that simulates and visualises the physical impact of any change made to it.
Resilience at cabinet level
Data centre management often monitors temperatures and reacts to problems as they appear.
Starting point: supply temperature 15 °C, max inlet temperature 32 °C
The Virtual Facility illustrates 3 ways of achieving resilience:
1. Restack: supply temperature 15 °C, max inlet temperature 19 °C (expensive IT operation)
2. Turn down ACU set points by 4 °C: supply temperature 11 °C, max inlet temperature 28 °C (higher energy costs)
3. Block gap under cabinet: supply temperature 15 °C, max inlet temperature 16 °C (cheap and simple: brush or foam)
Resilience at cabinet level
Problem:
• Each IT device has its own airflow and heat characteristics.
• Each IT device has the potential to affect the resilience of every other device in the rack.
• Therefore the stacking of the IT equipment determines resilience.
Symptom:
• Some rack configurations will cause IT equipment to overheat.
Reaction:
• A typical reaction to overheating IT devices is to reduce the cooling set points in the area of the devices.
• This reduces the efficiency of the ACUs and reduces cooling capacity.
Solution:
• A simpler and cheaper solution can usually be found by using simulation to visualise internal cabinet problems and test potential fixes…
• but the ideal solution is to be proactive: simulate the deployment first and avoid the problems and their knock-on energy costs.
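The three fixes on the cabinet-level slide can be compared by how far the hottest inlet sits above the supply air. The following is a minimal Python sketch that uses only the four temperature pairs quoted on that slide. It assumes the first pair (15 °C supply, 32 °C max inlet) is the starting condition; the names `cases` and `inlet_rise` are illustrative and not part of the Virtual Facility.

```python
# Supply and maximum inlet temperatures (deg C) as quoted on the cabinet-level slide.
# The labels are illustrative; "as found" assumes the first pair is the starting condition.
cases = {
    "as found": (15, 32),
    "1. restack": (15, 19),
    "2. turn down ACU set points": (11, 28),
    "3. block gap under cabinet": (15, 16),
}


def inlet_rise(supply_c, max_inlet_c):
    """Rise of the hottest inlet above the supply air, in kelvin."""
    return max_inlet_c - supply_c


rises = {name: inlet_rise(*temps) for name, temps in cases.items()}
for name, rise in rises.items():
    print(f"{name}: hottest inlet is {rise} K above supply")
```

On these figures, turning down the set points leaves the 17 K rise unchanged and only shifts both temperatures down by 4 °C, whereas restacking or blocking the gap reduces the rise itself.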
Resilience at room level – Deploying IT devices in a room
Scenario: Small room at 65% capacity; space needs to be allocated for two Sun SPARC Enterprise M5000 servers:
• 10U
• 3738W (nameplate)
• 4x power supplies
• 2x RJ45 LAN ports (+ 1 management port?)
• 283 l/s
After checking available space, power, networking and rack inlet temperatures there are at least 3 options:
• Option 1, like for like: Install one in each of the two racks that already have Sun M5000s installed.
• Option 2, mix server types: Install one in each of the server racks with the most available space.
• Option 3: Install both in the empty rack in the row allocated to blades.
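The power and airflow figures in the scenario give a feel for how much each server heats the air passing through it. The sketch below applies the standard sensible-heat balance, ΔT = P / (ρ · c_p · V̇). The air density and specific heat are assumed typical values, not figures from the slides, and nameplate power usually overstates real dissipation, so the result is an upper-end estimate.

```python
# Assumed air properties near room temperature (not taken from the slides).
AIR_DENSITY = 1.2           # kg/m^3
AIR_SPECIFIC_HEAT = 1005.0  # J/(kg*K)


def exhaust_temperature_rise(power_w, airflow_l_per_s):
    """Steady-state air temperature rise (K) across a device: dT = P / (rho * cp * V)."""
    airflow_m3_per_s = airflow_l_per_s / 1000.0
    return power_w / (AIR_DENSITY * AIR_SPECIFIC_HEAT * airflow_m3_per_s)


# Figures from the scenario: 3738 W nameplate power, 283 l/s airflow.
delta_t = exhaust_temperature_rise(3738, 283)
print(f"Temperature rise at nameplate power: {delta_t:.1f} K")
```

Under these assumptions the exhaust leaves roughly 11 K warmer than the inlet air, which is why the position of such a server relative to its neighbours matters.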
Resilience at room level – Laying out cabinets in a group
Servers, Switch, Storage
Different designs, different technologies, different power densities and different cooling requirements disrupt airflow and lead to hotspots…
Same equipment, 2 layouts: one layout is resilient, another layout overheats.
Resilience at room level
Problem:
• Each IT device has its own airflow and heat characteristics.
• Each IT device has the potential to affect the resilience of every other device in the room.
• Therefore the physical configuration of the IT equipment determines resilience.
Symptom:
• Some configurations of the IT equipment will cause hotspots.
Reaction:
• A typical reaction to thermal hotspots is to reduce the cooling set points for the entire room.
• This reduces the efficiency of the ACUs and increases the energy costs of the chiller units.
Solution:
• The ideal solution is to be proactive: simulate cabinet deployments first and avoid the problems and their knock-on energy costs.
Thermal Resilience and Efficiency
• The purpose of a data centre is to provide space, power, cooling and networking for every IT device.
• The challenge is to provide these in the most energy efficient manner without giving up the required resilience.
• To reduce energy costs of cooling, air temperatures in the data halls must be raised.
• Hotspots at rack or room level prevent air temperatures from being raised (and in practice often force them below the design values).
• Data centres typically supply air at about 15°C
• IT devices are typically resilient to 30°C
• There are many hotspots caused by poor airflow management that are masked by low supply air temperatures.
• These must be fixed before air temperatures can be raised.
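The way a single hotspot caps the supply temperature for the whole room can be put in numbers. This is a minimal sketch, assuming each rack's hottest inlet tracks the supply temperature one-for-one (consistent with the cabinet-level example, where a 4 °C lower set point gave a 4 °C lower max inlet). The 30 °C limit comes from the bullets above; the list of per-rack rises is a hypothetical room, not measured data.

```python
INLET_LIMIT_C = 30.0  # "IT devices are typically resilient to 30°C" (from the slide)


def max_supply_temperature(inlet_rises_k, inlet_limit_c=INLET_LIMIT_C):
    """Highest supply temperature that keeps every rack inlet at or below the limit.

    Assumes each inlet temperature equals supply temperature plus a fixed rise.
    """
    return inlet_limit_c - max(inlet_rises_k)


# Hypothetical room: most racks see a 1-4 K rise, one poorly stacked rack sees 17 K.
before_fix = [1.0, 4.0, 17.0]
after_fix = [1.0, 4.0, 4.0]  # assumes the hot rack is restacked to a 4 K rise

print(max_supply_temperature(before_fix))  # 13.0
print(max_supply_temperature(after_fix))   # 26.0
```

In this hypothetical room the one hot rack holds the supply temperature at 13 °C, below the typical 15 °C; once it is fixed, the same limit would allow 26 °C.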
Maximising utilisation of the data centre
The original design assumes hot aisle/cold aisle and front-to-back breathing equipment, with 2 medium density zones and 1 higher density zone.
[Figure: room layout with zone loads of 60kW, 60kW and 80kW, 200kW total; chart of utilisation (to 100 %) against time]
Maximising utilisation of the data centre
The original design assumes hot aisle/cold aisle and front-to-back breathing equipment, with 2 medium density zones and 1 higher density zone.
Option 1 is to locate the new cabinets at one end of the room, but after 55% power load hotspots will start to develop.
[Figure: room layout with load labels 80kW, 10kW, 20kW, 110kW, 60kW, 60kW and 200kW total; chart of utilisation against time marking 55 %, 100 %, 45% stranded capacity and lost lifespan]
Maximising utilisation of the data centre
The original design assumes hot aisle/cold aisle and front-to-back breathing equipment, with 2 medium density zones and 1 higher density zone.
Option 2 is to locate the new cabinets in the centre of the room, but after 75% power load hotspots will start to develop.
[Figure: room layout with load labels 60kW, 35kW, 35kW, 110kW, 80kW, 60kW and 200kW total; chart of utilisation against time marking 75 %, 100 %, 25% stranded capacity and lost lifespan]
Utilisation and Stranded Capacity
Problem:
• Any IT deployed in a data centre will disrupt the airflow and cooling, even in empty zones of the room.
Symptom:
• The simple example we just looked at illustrates a common feature of data centres that is not well understood:
• If your data centre is running at 40% of design load, you do not have 60% capacity left!
• Thermal hotspots will occur before the data centre reaches capacity.
Reaction:
• Resilience concerns prevent further installation of IT devices.
• Confusion between Facilities, IT and Management.
• More data centres are built to gain more capacity.
Solution:
• Alternative configurations of IT load can be pre-tested using CFD simulation to evaluate whether they will result in stranded capacity in a data centre.
• Proactive airflow management is required to maximise the utilisation of a data centre.
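The stranded-capacity arithmetic behind the two layout options can be sketched in a few lines of Python. The 200 kW design load and the 55% and 75% hotspot thresholds come from the slides; combining the 40% load from the bullet above with those thresholds is an illustration only, and the function names are hypothetical.

```python
def stranded_capacity(design_kw, hotspot_onset_percent):
    """Return (usable_kw, stranded_kw) when hotspots appear at the given % of design load."""
    usable_kw = design_kw * hotspot_onset_percent / 100
    return usable_kw, design_kw - usable_kw


def remaining_headroom_kw(design_kw, current_percent, hotspot_onset_percent):
    """Load that can still be added before hotspots develop (never negative)."""
    return max(0.0, design_kw * (hotspot_onset_percent - current_percent) / 100)


DESIGN_KW = 200  # total design load from the slides

for label, onset in (("Option 1 (end of room)", 55), ("Option 2 (centre of room)", 75)):
    usable, stranded = stranded_capacity(DESIGN_KW, onset)
    print(f"{label}: usable {usable:.0f} kW, stranded {stranded:.0f} kW")

# Illustration: at 40% load with the Option 1 layout, headroom is 30 kW, not 120 kW.
print(remaining_headroom_kw(DESIGN_KW, 40, 55))
```

The same 200 kW room strands 90 kW with one layout and 50 kW with the other, so the remaining capacity depends on where the load is placed and not only on how much is installed.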
The benefits of the Virtual Facility
The Virtual Facility enables data centre operators to reclaim stranded capacity and extend the life of their existing data centres.
“Data centers seldom meet the operational and capacity requirements of their initial design.” Gartner
[Figure: utilisation against time from year 1 through mid life to expected end of life, comparing typical operation with design intent; using a Virtual Facility turns stranded capacity and lost lifespan into reclaimed capacity and reclaimed lifespan]
The choice: reclaim stranded cooling capacity or add to estate
Option 1: Add to estate and continue with typical operation
Option 2: Reclaim lost capacity in the existing estate
[Figure: utilisation-against-time charts for the existing estate and for expanding the estate; marked: +60%, +20%, =60%]
The greenest data centre is the one you don’t need to build...
Matt Warner, Future Facilities
“Continuous improvement of tools and processes has enabled us to see where to push beyond perceived limits and how to reclaim capacity in our existing data centres” – Ashley Davis, JPMChase