aurora hpc solutions value


Upload: eurotechhpc

Post on 01-Nov-2014


Page 1: Aurora hpc solutions value

Aurora solutions

Value creation

Page 2: Aurora hpc solutions value

AURORA

Aurora solution benefits

Page 3: Aurora hpc solutions value

Aurora supercomputers from Eurotech

Aurora is the name of Eurotech's liquid cooled supercomputers, which excel in:
• Computing power
• Density
• Energy efficiency
• Reliability
• Availability
• Compatibility
• Cost effectiveness

Page 4: Aurora hpc solutions value

Why Aurora solutions?

• Scalability
– Linear scalability to users from Gigaflops to Exaflops

• Compatibility and flexibility
– Choice of interconnects
– x86 based systems

• Green
– Aurora data centers consume 50% less power than data centers based on standard air cooled technology

• Competence
– HPC division expertise
– End to end solution deployment
– Flexibility to work with final customers, SIs and as an OEM

• Reliability
– High quality
– No moving parts and reduced hot spots

• High performance and high density
– Fastest available technology for high density computational power

Page 5: Aurora hpc solutions value

High Performance and High Density

• Aurora uses the fastest Intel technology available
– Intel Xeon E5 (Sandy Bridge) on the latest Aurora HPC 10-10
– Optional GPU accelerators as part of the solution

• Fast InfiniBand interconnects
– Water cooled InfiniBand switches included in the systems (one per chassis)

• 3D torus
– High speed (up to 60 GB/s) 3D torus network based on FPGA
– In collaboration with research institutions like I.N.F.N, TNW and FBK, Eurotech has developed one of the fastest and most reliable 3D torus networks on the market

• FPGA accelerators
– Aurora nodes have an on-board FPGA that can be programmed as an accelerator

• High density
– Aurora systems can pack 2 PetaFlops in just 30 m², the size of a studio flat
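The density figure above can be turned into a quick footprint estimate. The sketch below uses only the 2 PetaFlops in 30 m² figure from the slide. It assumes that floor area scales linearly with performance, which the slides do not state. The 500 TFlops target is taken from the TCO example later in the deck, and the function names are illustrative.

```python
# Figures quoted on the slide: 2 PetaFlops (2000 TFlops) in 30 m2.
PEAK_TFLOPS = 2000.0
FLOOR_AREA_M2 = 30.0


def density_tflops_per_m2(peak_tflops=PEAK_TFLOPS, area_m2=FLOOR_AREA_M2):
    """Compute density in TFlops per square metre."""
    return peak_tflops / area_m2


def floor_area_for(target_tflops, density=None):
    """Floor area needed for a target performance.

    Assumption (not from the slides): footprint scales linearly
    with performance.
    """
    if density is None:
        density = density_tflops_per_m2()
    return target_tflops / density


if __name__ == "__main__":
    print(f"Density: {density_tflops_per_m2():.1f} TFlops/m2")
    print(f"Area for 500 TFlops: {floor_area_for(500.0):.1f} m2")
```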

Page 6: Aurora hpc solutions value

Scalability

• Hot pluggable modular system
– All components of the Aurora system are hot pluggable. Aurora can be scaled from a single chassis with 16 nodes to multiple chassis in a rack and to a system with multiple racks

• 3D torus network
– Nearest-neighbor network with no switches and no bottlenecks facilitates scalability

• InfiniBand
– Fast interconnections allow low latency communication

• Synchronization network
– Very fast channel and global commands with subdomain manageability
– Low/high level synchronization
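To illustrate what a nearest-neighbor 3D torus means in practice, the sketch below lists the six direct neighbors of a node and counts the hops between two nodes when every axis wraps around. It is a generic illustration of the topology. The 4 x 4 x 4 dimensions are hypothetical, and the slides do not describe Aurora's actual torus dimensions or routing.

```python
def torus_neighbors(node, dims):
    """Return the six direct neighbors of `node` in a 3D torus.

    `node` is an (x, y, z) coordinate and `dims` the torus size along
    each axis. Each axis wraps around, so edge nodes also have six
    neighbors and no central switch is involved.
    """
    x, y, z = node
    nx, ny, nz = dims
    return [
        ((x + 1) % nx, y, z), ((x - 1) % nx, y, z),
        (x, (y + 1) % ny, z), (x, (y - 1) % ny, z),
        (x, y, (z + 1) % nz), (x, y, (z - 1) % nz),
    ]


def torus_hops(a, b, dims):
    """Minimum number of hops between nodes `a` and `b`.

    Along each axis the shorter of the two directions around the ring
    is taken.
    """
    return sum(min((p - q) % n, (q - p) % n) for p, q, n in zip(a, b, dims))


if __name__ == "__main__":
    dims = (4, 4, 4)  # hypothetical torus size
    print(torus_neighbors((0, 0, 0), dims))
    print(torus_hops((0, 0, 0), (3, 0, 0), dims))
```

Because of the wraparound links, node (0, 0, 0) and node (3, 0, 0) are direct neighbors in a ring of four.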

Page 7: Aurora hpc solutions value

Reliability

• Aurora has no moving parts:
– There are no spinning discs and no fans for cooling on-board heat generating components
– As a result, there are no vibrations that can harm memory and rotating discs, which increases the longevity of components and, as a consequence, of the whole system

• Aurora cooling limits hot spots
– Improved cooling infrastructure and direct-to-component heat removal make heat production and removal more uniform, limiting hot spots and hence another cause of failures

• Monitoring and resilience
– Independent sensor networks
– Redundancy of all components, including networks
– Optional proactive support (preventive maintenance)

• Eurotech commitment to quality
– Eurotech produces its boards in its Japanese plant, following high quality standards
– Eurotech HPC selects best-in-class suppliers to set up the Aurora solutions

Page 8: Aurora hpc solutions value

Flexibility and compatibility

• Choice of interconnects
– Aurora computational units offer both an InfiniBand network and a 3D torus interconnection
– The customer can choose the technology best suited to the nature of the computational problem to be solved

• x86 based solution
– Aurora supercomputers are based on x86 processors
– Intel Cluster Ready certified

• A choice of software
– Aurora can run a wide variety of software, both open source and commercial

• A flexible solution approach
– Eurotech HPC can design solutions involving accelerators, storage and software, according to customer requirements

Page 9: Aurora hpc solutions value

Green and environmental

Everyone nowadays claims to be «green» – but are they?

Aurora Green proposition:

• Energy efficiency (achievable datacenter PUE of 1.05)
• Direct on-component water cooling
• High reliability means fewer spare parts and hence less waste
• 230 V AC to 10 V DC in 2 steps, for a power conversion efficiency between 93% and 97%
• Free cooling (heat exchangers rather than chillers and AHUs)
• Thermal energy recovery
• High density for floor space savings
• Noiseless operation means a better work environment
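The two-step power conversion mentioned above can be reasoned about with a small sketch: the overall efficiency of a conversion chain is the product of its stage efficiencies. The per-stage values below (97.5% each) are hypothetical, chosen only so that the result falls inside the 93% to 97% range quoted on the slide. The slides do not give per-stage figures.

```python
def chain_efficiency(stage_efficiencies):
    """Overall efficiency of conversion stages connected in series."""
    overall = 1.0
    for eff in stage_efficiencies:
        if not 0.0 < eff <= 1.0:
            raise ValueError("each stage efficiency must be in (0, 1]")
        overall *= eff
    return overall


def conversion_loss_kw(delivered_kw, efficiency):
    """Power lost in conversion for a given delivered DC load."""
    return delivered_kw / efficiency - delivered_kw


if __name__ == "__main__":
    stages = [0.975, 0.975]  # hypothetical per-stage efficiencies
    overall = chain_efficiency(stages)
    print(f"Overall efficiency: {overall:.4f}")
    print(f"Loss for 100 kW delivered: {conversion_loss_kw(100.0, overall):.2f} kW")
```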

Page 10: Aurora hpc solutions value

Competence

• 10+ years of HPC experience:
– Extensive experience in top HPC projects
– Collaboration with best-in-class research centers to develop advanced supercomputer prototypes
– Experience in delivering large systems (15 M$+)

• Structure and agility:
– The Eurotech HPC division benefits from the structure of the Eurotech group, while keeping the typical agility of a start-up
– This means prompt adaptability to customer needs. Eurotech can deliver end to end HPC solutions inclusive of supercomputers, storage, software and services. It can also work with system integrators, delivering parts of larger systems, and as an OEM for larger vendors

• Financial solidity
– Unlike the pure HPC players in the market, Eurotech's income statement relies on multiple lines of business, helping the HPC division to smooth revenues and to rely on abundant resources

Page 11: Aurora hpc solutions value

AURORA

Aurora total cost of ownership

Page 12: Aurora hpc solutions value

Data center TCO drivers

Driver | Cost components
IT CAPEX | Initial SW and HW capital expenditures
Space occupancy (footprint) | Cost of the occupied space and auxiliary infrastructure: rent, opportunity cost, civil, structural and engineering, permits and taxes
Data center infrastructure | Electrical (UPS, generator, cables…); cooling (chillers, AHUs, heat exchangers, pumps…)
Installation | Delivery costs, installation, test and tuning of IT, electrical and cooling equipment
Energy | Cost of energy: IT, cooling, lighting and waste
Maintenance and additional operation costs | Warranty extensions, support, software licenses, IT maintenance, electrical and cooling maintenance, facilities maintenance, costs of outages, heating, security
Other: disposal, green | Costs of end of life, carbon footprint, (missed) incentives, fines…

Page 13: Aurora hpc solutions value

Main Areas of Aurora systems impact on TCO

• Area 1: energy savings
– Lower costs due to energy cost optimization and better PUE

• Area 2: density (FLOPS/m²)
– Aurora density allows for space, rack, electrical, cooling and network savings

• Area 3: reliability
– Aurora reliability contributes to lower maintenance costs and lower business costs of outages

• Area 4: liquid cooling
– Apart from the energy savings that water cooling implies, it also saves on the capital costs of cooling infrastructure

Page 14: Aurora hpc solutions value

TCO - energy

“Typical” power breakdown in datacenters

[Figure: power breakdown chart. Data and image from APC]

Page 15: Aurora hpc solutions value

TCO - energy

Power breakdown in an Aurora datacenter leading to a PUE of 1.05

Page 16: Aurora hpc solutions value

TCO - energy

Energy savings

• Example of energy savings compared to 2 alternative air cooled solutions
• Rationale: less energy spent in cooling and less energy wasted in power conversion

Total energy savings in 5 years compared to alternative solutions:

• Compared to an air cooled data centre based on 1U servers (PUE = 2.13): € 2,820,000
• Compared to an air cooled data centre based on blade servers (PUE = 1.6): € 1,200,000
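The mechanism behind these savings can be sketched with the definition of PUE (total facility energy divided by IT energy). The PUE values below are the ones quoted in the slides. The IT load (500 kW) and the energy price (0.10 €/kWh) are hypothetical, since the slides do not state the assumptions behind the savings figures. The sketch also ignores the power conversion savings mentioned in the rationale, so it is not expected to reproduce the quoted amounts.

```python
HOURS_PER_YEAR = 8760


def energy_cost(it_load_kw, pue, price_per_kwh, years):
    """Total facility energy cost: IT load scaled by PUE over the period."""
    return it_load_kw * pue * HOURS_PER_YEAR * years * price_per_kwh


def energy_savings(it_load_kw, pue_alternative, pue_reference,
                   price_per_kwh, years=5):
    """Savings of the reference data center over an alternative one.

    Assumes the same IT load in both cases, so only PUE differs.
    """
    return (energy_cost(it_load_kw, pue_alternative, price_per_kwh, years)
            - energy_cost(it_load_kw, pue_reference, price_per_kwh, years))


if __name__ == "__main__":
    it_load_kw = 500.0      # hypothetical IT load
    price_per_kwh = 0.10    # hypothetical energy price in EUR
    for label, pue in [("1U servers", 2.13), ("blade servers", 1.6)]:
        saved = energy_savings(it_load_kw, pue, 1.05, price_per_kwh)
        print(f"5-year savings vs air cooled {label}: EUR {saved:,.0f}")
```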

Page 17: Aurora hpc solutions value

TCO - density

Less FLOPS/m² means:

+ servers
+ space occupancy (m²)
+ volume occupancy (m³)
+ IT hardware costs, like racks
+ network, electrical and cooling hardware costs
+ raised floor costs
+ civil, structural and engineering costs
+ electrical
+ cooling
+ energy costs
+ maintenance costs

Page 18: Aurora hpc solutions value

TCO - reliability

Reliability impacts TCO in 2 ways:
- Direct costs, associated with spare parts, extended warranties and support personnel
- Indirect costs, related to the business cost associated with an outage

The direct costs depend on the number of components and their estimated FIT (failures in time) rate, as shown by the MTBF equation, where $\lambda$ is the failure rate of a single component and $N$ the number of components:

$\mathrm{MTBF} = \dfrac{1}{N \lambda}$

• $N$ is related to density
• $\lambda$ is related to quality, operating conditions, monitoring and preventive maintenance of components

The indirect impact depends on the organisation and could range from thousands to millions of € per hour of outage. So the impact of low reliability on the business could offset any saving achieved during the purchase and installation of IT solutions!
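The relationship between component count, FIT rate and MTBF can be sketched numerically. The sketch assumes the usual series-system form MTBF = 1 / (N λ), with identical components, constant failure rates and any single failure counted as a system failure. One FIT is one failure per 10^9 device-hours. The component counts and the FIT value are hypothetical and not taken from the slides.

```python
FIT_HOURS = 1e9  # one FIT = one failure per 10^9 device-hours


def mtbf_hours(n_components, fit_per_component):
    """System MTBF in hours for N identical components in series.

    Assumes constant, independent failure rates, so that the system
    failure rate is N * lambda.
    """
    if n_components <= 0 or fit_per_component <= 0:
        raise ValueError("component count and FIT rate must be positive")
    failure_rate = fit_per_component / FIT_HOURS  # lambda, failures per hour
    return 1.0 / (n_components * failure_rate)


if __name__ == "__main__":
    # Hypothetical comparison: fewer components (higher density) at the
    # same per-component FIT rate gives a proportionally longer MTBF.
    for n in (10_000, 5_000):
        print(f"N = {n}: MTBF = {mtbf_hours(n, 100.0):,.0f} hours")
```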

Page 19: Aurora hpc solutions value

TCO – liquid cooling

By adopting liquid cooling technology, it is possible to avoid most of the air conditioning used to cool the IT equipment.

This brings savings from the avoidance of chillers, AHUs, CFD (computational fluid dynamics) studies, raised floors and air conditioning tuning.

Liquid cooling infrastructure is generally cheaper, relying on components like piping, pumps and free coolers.

If we take a 1 MW installation and assume an air cooling infrastructure cost of 3,000 $/kW, the total cooling cost would be 3 M$. With liquid cooling, the same 1 MW data center would probably require roughly 10% of that expenditure.
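The arithmetic of this example is simple enough to write down. The sketch below uses only the figures quoted on the slide (1 MW, 3,000 $/kW, roughly 10%). The function names are illustrative, and the 10% ratio is the slide's rough estimate rather than a measured value.

```python
def air_cooling_capex_usd(it_load_kw, cost_per_kw):
    """Capital cost of air cooling infrastructure for a given IT load."""
    return it_load_kw * cost_per_kw


def liquid_cooling_capex_usd(air_capex_usd, fraction=0.10):
    """Liquid cooling capital cost as a fraction of the air cooled case."""
    return air_capex_usd * fraction


if __name__ == "__main__":
    air = air_cooling_capex_usd(1000, 3000)   # 1 MW at 3,000 $/kW
    liquid = liquid_cooling_capex_usd(air)    # roughly 10% of that
    print(f"Air cooling:    ${air:,.0f}")
    print(f"Liquid cooling: ${liquid:,.0f}")
    print(f"Capex avoided:  ${air - liquid:,.0f}")
```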

Page 20: Aurora hpc solutions value

TCO example: comparison among 3 systems (500 Tflops)*

Different processors are used

Item | Air cooled 1U servers | Air cooled blade server | Liquid cooled blade server
Processor | Xeon 5600 | Xeon 5600 | Xeon E5
Cost of energy | $2,720 | $1,560 | $510
Retuning and additional CFD | $17 | $6 | $0
Total outage cost | $500 | $390 | $160
Preventive maintenance | $150 | $150 | $150
Annual facility and infrastructure maintenance | $670 | $380 | $130
Lighting | $14 | $5 | $2
Annualized 3 years capital costs | $3,520 | $3,250 | $3,040
Annualized 10 years capital costs | $1,770 | $1,100 | $440
Annualized 15 years capital costs | $380 | $120 | $30
ANNUALIZED TCO (thousands of USD) | $9,741 | $6,961 | $4,462

The best air cooled competitive solution has to be 100% cheaper to match TCO!

The standard air cooled competitive solution has to be 75% cheaper to match TCO!

*Calculations assume that the cost of hardware and software is the same in the 3 cases
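The annualized TCO row of this comparison is the sum of the cost rows, which can be checked with a few lines. The numbers below are copied from the table (thousands of USD). The dictionary layout and function names are illustrative.

```python
# Cost rows in table order: energy, retuning/CFD, outage, preventive
# maintenance, facility maintenance, lighting, and annualized capital
# costs over 3, 10 and 15 years (thousands of USD).
COSTS = {
    "Air cooled 1U servers": [2720, 17, 500, 150, 670, 14, 3520, 1770, 380],
    "Air cooled blade server": [1560, 6, 390, 150, 380, 5, 3250, 1100, 120],
    "Liquid cooled blade server": [510, 0, 160, 150, 130, 2, 3040, 440, 30],
}


def annualized_tco(cost_rows):
    """Annualized TCO as the sum of all annual cost rows."""
    return sum(cost_rows)


def relative_saving(tco_alternative, tco_reference):
    """Fraction of the alternative's TCO saved by the reference system."""
    return (tco_alternative - tco_reference) / tco_alternative


if __name__ == "__main__":
    reference = annualized_tco(COSTS["Liquid cooled blade server"])
    for system, rows in COSTS.items():
        total = annualized_tco(rows)
        print(f"{system}: {total} K USD, "
              f"liquid cooled saves {relative_saving(total, reference):.1%}")
```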

Page 21: Aurora hpc solutions value

TCO example: comparison among 3 systems (500 Tflops)*

Same processors are used

Item | Air cooled 1U servers | Air cooled blade server | Liquid cooled blade server
Processor | Xeon 5600 | Xeon 5600 | Xeon 5600
Cost of energy | $1,690 | $1,560 | $910
Retuning and additional CFD | $14 | $6 | $0
Total outage cost | $390 | $390 | $330
Preventive maintenance | $150 | $150 | $150
Annual facility and infrastructure maintenance | $450 | $390 | $240
Lighting | $11 | $5 | $2
Annualized 3 years capital costs | $3,390 | $3,270 | $3,220
Annualized 10 years capital costs | $1,100 | $1,100 | $820
Annualized 15 years capital costs | $300 | $130 | $40
ANNUALIZED TCO (thousands of USD) | $7,495 | $7,001 | $5,712

The best air cooled competitive solution has to be 55% cheaper to match TCO!

The standard air cooled competitive solution has to be 40% cheaper to match TCO!

*Calculations assume that the cost of hardware and software is the same in the 3 cases
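The slides do not explain how the "has to be X% cheaper" percentages were derived. One possible reading, offered here only as an assumption, is a break-even discount: the reduction in an air cooled system's 3-year annualized capital costs that would bring its annualized TCO down to that of the liquid cooled system. With the same-processor figures (thousands of USD), this gives roughly 53% and 39%, in the same range as the 55% and 40% quoted.

```python
def breakeven_capex_discount(tco_alternative, tco_reference, capex_alternative):
    """Discount on the alternative's capital costs needed to match TCO.

    Assumption (not stated in the slides): the discount is applied to
    the 3-year annualized capital costs and all other rows are unchanged.
    A result above 1.0 means even free hardware would not close the gap.
    """
    return (tco_alternative - tco_reference) / capex_alternative


if __name__ == "__main__":
    liquid_tco = 5712  # liquid cooled blade server, same-processor case
    cases = {
        "Air cooled 1U servers": (7495, 3390),
        "Air cooled blade server": (7001, 3270),
    }
    for system, (tco, capex_3y) in cases.items():
        discount = breakeven_capex_discount(tco, liquid_tco, capex_3y)
        print(f"{system}: {discount:.1%} cheaper to match TCO")
```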

Page 22: Aurora hpc solutions value