hp - hpc-29mai2012

HPC SOLUTIONS AT HP - Andrei Balcu, Technical Consultant, HP Romania

Upload: agora-group

Post on 10-May-2015


Page 1: HP - HPC-29mai2012

HPC SOLUTIONS AT HP

Andrei Balcu, Technical Consultant, HP Romania

Page 2: HP - HPC-29mai2012

HPC is built on a converged infrastructure

• Purpose Built HPC Servers
• Purpose Built HPC Storage
• HPC Fabrics
• HPC Software Infrastructure
• Datacenter Power & Cooling

Page 3: HP - HPC-29mai2012

SERVERS IN HPC

Page 4: HP - HPC-29mai2012

SERVERS IN HPC

CPU’s : what’s new ?

Page 5: HP - HPC-29mai2012

CPU roadmap, 2010 to 2012 (ProLiant G6/G7, G7 and Gen8)

AMD

• Opteron 4000 series
  – Lisbon: 6 cores, 6M L3, 2 mem channels
  – Valencia: Bulldozer core (8 cores), 2 mem channels
• Opteron 6100 series
  – Magny-Cours: 12 cores, 12M L3, 4 mem channels
• Opteron 6200 series
  – Interlagos: Bulldozer core (16 cores), 4 mem channels

Intel

• Xeon 5600 series
  – Westmere-EP: 6 cores, 12M L3, 130/95/80/60/40W, 1333 MHz DDR3, 3 mem channels
• Xeon 6500 & 7500 series
  – Nehalem-EX: 8 cores, 24 MB L3, up to 130W
• Westmere-EX: 10 cores, 30 MB L3, up to 130W
• Sandy Bridge: 8 cores, 20M L3, 1600+ DDR3, Sockets R and B2
  – Xeon E5-2600 series, 4 mem channels

Page 6: HP - HPC-29mai2012

SERVERS IN HPC

Next generation servers

Page 7: HP - HPC-29mai2012

Intel® Westmere-EP vs. Intel® Sandy Bridge-EP/EN

Feature | Westmere-EP | E5-2600 (EP) | E5-2400 (EN)
Cores | Up to 6 cores / 12 threads | Up to 8 cores / 16 threads | Up to 8 cores / 16 threads
Cache Size | 12 MB | Up to 20 MB | Up to 20 MB
Max Memory Channels per Socket | 3 | 4 | 3
Max Memory Speed | 1333 MHz | 1600 MHz | 1600 MHz
New Instructions | AES-NI | Adds AVX | Adds AVX
QPI Frequency | 6.4 GT/s | Up to 8.0 GT/s | Up to 8.0 GT/s
Inter-Socket QPI Links | 1 | 2 | 1
PCI Express | 36 lanes PCIe2* on chipset | 40 lanes/socket, integrated PCIe3 | 24 lanes/socket, integrated PCIe3
Server/Workstation Power (TDP, W) | 130, 95, 80, LV (low power) | 150 (workstation only), 130, 115, 95, 80, 70, 60 (low power) | 95, 80, 70, 60, 50 (low power)
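To put the memory rows of this comparison in perspective, the sketch below estimates peak theoretical memory bandwidth per socket as channels x transfer rate x 8 bytes. It assumes the "MHz" figures on the slide are DDR3 transfer rates (MT/s) and that each DDR3 channel is 64 bits wide; sustained bandwidth in real applications will be lower.

```python
# Peak theoretical DDR3 bandwidth per socket.
# Assumptions: slide speeds are MT/s; one DDR3 channel moves 8 bytes per transfer.
BYTES_PER_TRANSFER = 8


def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int) -> float:
    """Return peak memory bandwidth per socket in GB/s (decimal gigabytes)."""
    return channels * mega_transfers_per_s * BYTES_PER_TRANSFER / 1000


# (memory channels per socket, max memory speed) taken from the comparison
platforms = {
    "Westmere-EP": (3, 1333),
    "E5-2600 (EP)": (4, 1600),
    "E5-2400 (EN)": (3, 1600),
}

for name, (channels, speed) in platforms.items():
    print(f"{name}: {peak_bandwidth_gbs(channels, speed):.1f} GB/s per socket")
```

Under these assumptions the E5-2600 gains from both the extra channel and the faster DIMMs, which is why the move from three to four channels matters for bandwidth-bound HPC codes.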

Page 8: HP - HPC-29mai2012

HP ProLiant Gen8 Marquee Features

Innovation beyond industry standards

• ProLiant System Architecture
• iLO Management Engine
• HP Smart Storage
• HP FlexNet Adapters
• Sea of Sensors 3D
• Insight Online

Page 9: HP - HPC-29mai2012

iLO Management Engine

Core lifecycle management functions built in for instant availability

• INTELLIGENT PROVISIONING: Ready to deploy and update without the need for HP discs or downloads
• AGENTLESS MANAGEMENT: Base hardware health monitoring and alerting without OS agents
• ACTIVE HEALTH SYSTEM: Continuously running diagnostics to minimize downtime
• REMOTE SUPPORT: Built-in phone-home function to ease setup and configuration

Cloud-enabled embedded management throughout all ProLiant Gen8 platforms

Page 10: HP - HPC-29mai2012

HP FlexLOM - Grow Your Environment Without Complexity

Change ready for future proofing and adaptable infrastructure

Provides choice
• Upgrade options of 1Gb and 10Gb

Choose your fabric
• Ethernet, FlexFabric, Flex-10, InfiniBand

Universal
• Available on all BL, SL and select DL servers

Flexible
• Supports shared iLO port like the traditional LOM (1)

(1) LOM is short for LAN on motherboard. The term refers to a chip or chipset capable of network connections that has been embedded directly on the motherboard of a server.

Page 11: HP - HPC-29mai2012

Gen8 Smart Array Innovations

Increased performance, data availability and storage capacity

Faster access to data
• Up to 2X performance improvement*
• 2X cache (up to 2 GB)

Address explosive data growth
• 2X drives supported (up to 227)

Minimize data loss
• Long term data retention with Flash Backed Write Cache standard

Reduce initial setup time
• 95% reduction in parity initialization, from several days to 5 hours**

* 256KiB, sequential write, RAID 5 with 15K SAS drives; performance will vary based on configuration
** HP R&D, validation information TBD

Over 5 million SAS Smart Array controllers sold! Continuing the legacy of innovation with Gen8

Page 12: HP - HPC-29mai2012

Lower power, faster and more reliable

HP SmartMemory

• 15 - 20% less power than 3rd party memory at 3DPC for DDR3-1333 1.35V RDIMM and DDR3-1333 LRDIMM

• 25% greater throughput at either 1DPC or 2DPC versus 3rd party memory for DDR3-1333 UDIMM

• Genuine HP Qualified memory reliability assured by unique electronic signature

Page 13: HP - HPC-29mai2012

Industry’s most complete portfolio for HPC

Workload optimized, engineered for any demand

• ProLiant DL Family: Versatile, rack-optimized servers with a balance of efficiency, performance and management
• ProLiant BL Family: Cloud-ready converged infrastructure engineered to maximize every hour, watt and dollar
• ProLiant SL Family: Purpose built for the world’s most extreme data centers

Page 15: HP - HPC-29mai2012

HP ProLiant BL400c Series Positioning

• ProLiant BL460c Gen8: The world's leading server blade
• ProLiant BL465c Gen8: The first server blade to deliver over 2,000 cores per rack
• ProLiant BL420c Gen8: Breakthrough server blade economics for essential enterprise workloads

Page 16: HP - HPC-29mai2012

HP ProLiant BL460c Gen8 Overview

• As the world's leading server blade, the ProLiant BL460c Gen8 offers the ideal balance of performance, scalability, and expandability.
• This makes it ideal for:
  – Heterogeneous datacenters and a wide variety of mainstream businesses
  – HPC scale-out applications for small, medium, and enterprise data centers
• Key workloads include:
  – Virtualization/consolidation
  – IT infrastructure (file & print, networking, security, systems management, etc.)
  – Web infrastructure (web serving, streaming media, etc.)
  – Collaborative (e-mail, workgroup, etc.)

Page 17: HP - HPC-29mai2012

HP ProLiant BL420c Gen8 Overview

• The BL420c Gen8 delivers breakthrough server blade economics for essential enterprise workloads. It provides the perfect balance of price, performance, and high availability in the enterprise space.
• This makes it ideal for:
  – Mid-market and cost-sensitive enterprise customers
  – Service providers who prefer the manageability of blades
  – Scale-out
• Key workloads include:
  – Web hosting/services in the enterprise space
  – Single application on a single server
  – IT infrastructure (file & print, networking, security & systems management)

Page 18: HP - HPC-29mai2012

Feature | BL420c Gen8 | BL460c Gen8
Processor | Intel® Xeon® E5-2400 Series | Intel® Xeon® E5-2600 Series
Memory | (12) DDR3, RDIMM/UDIMM, up to 1333MHz | (16) DDR3, RDIMM/UDIMM/LRDIMM/LVDIMM
Max Memory | 384GB (12 DIMMs x 32GB) | 512GB (16 DIMMs x 32GB)
Internal Storage | 2 SFF HP HDD (SAS, SATA, SSD); Dynamic Smart Array B320i RAID controller | 2 SFF HP HDD (SAS, SATA, SSD); Smart Array P220i controller

Common to both models:
• Chipset: Intel® C600
• Max Internal Storage: 2TB SAS; 2TB SATA; 1.6TB SSD
• Networking: (1) dual port networking daughter card: 1GbE, 10GbE, Flex-10, or FlexFabric
• I/O Slots: (2) PCIe Gen3: (1) x8 Type A mezzanine; (1) x16 Type B mezzanine
• Integrated Management: HP iLO Management Engine, SIM, IRS. Optional: HP Insight Control, iLO Advanced
• Form Factor: half-height c-Class server blade; 16 blades per c7000 (10U) enclosure; 8 blades per c3000 (6U) enclosure
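The density and capacity figures in this part of the deck can be checked with simple arithmetic. The sketch below reproduces the maximum-memory values and the "over 2,000 cores per rack" claim made for the BL465c Gen8. The 42U rack, the two sockets per blade, and the BL465c sharing the half-height form factor (16 blades per c7000) are assumptions, not statements from the slides; the 16 cores per socket come from the Interlagos entry in the CPU roadmap.

```python
RACK_UNITS = 42            # assumption: standard 42U rack
ENCLOSURE_UNITS = 10       # c7000 enclosure height, from the form factor row
BLADES_PER_ENCLOSURE = 16  # half-height blades per c7000, from the form factor row
SOCKETS_PER_BLADE = 2      # assumption: two-socket blade
CORES_PER_SOCKET = 16      # Interlagos (Opteron 6200), from the CPU roadmap


def max_memory_gb(dimm_slots: int, dimm_size_gb: int = 32) -> int:
    """Maximum memory when every slot holds the largest listed DIMM."""
    return dimm_slots * dimm_size_gb


def cores_per_rack() -> int:
    """Cores in one rack filled with as many whole enclosures as fit."""
    enclosures = RACK_UNITS // ENCLOSURE_UNITS
    blades = enclosures * BLADES_PER_ENCLOSURE
    return blades * SOCKETS_PER_BLADE * CORES_PER_SOCKET


print("BL420c Gen8 max memory:", max_memory_gb(12), "GB")
print("BL460c Gen8 max memory:", max_memory_gb(16), "GB")
print("Cores per rack (16-core CPUs):", cores_per_rack())
```

With four enclosures per rack, the result lands just above 2,000 cores, which is consistent with the positioning statement.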

Page 20: HP - HPC-29mai2012

Integrated accelerator solutions for the SL200s family

Driving new levels of performance/$/watt/ft2

• Next generation NVIDIA Tesla performance
  – Up to 30% higher performance with M2090, combined computation and visualization with M2070Q
• Optional HP PCIe IO Accelerator
  – Integrated solid state storage device to accelerate I/O bound applications
• Future: Intel® Many Integrated Core (MIC)
  – Accelerate highly parallel applications, using the standard IA instruction set

Page 21: HP - HPC-29mai2012

• Shared power & fans for reduced component quantity and increased power efficiency
• Ability to mix and match SL half-width nodes
• Front cabling for increased rear air-flow and ease of serviceability
• Individually serviceable nodes

*Needs 1200mm deep racks

• SL140 – Socket-B, cost-effective, power-efficient and ultra-dense solution (1U)
• SL230 – Socket-R, ultra-dense server for virtualization and HPC applications (1U)
• SL250 – Socket-R, hybrid-compute node for GPU computing and database applications in HPC (2U)
• SL270 – Socket-R, high-performance GPU solution, optimized for extreme GPU density (4U)

Pictured: SL140, SL230, SL270, SL250

Page 22: HP - HPC-29mai2012

SL140s Gen8, SL230s Gen8, SL250s Gen8 and SL270s Gen8 compared
(HP = hot plug, NHP = non-hot plug, HW = half width)

Processor
• SL140s: E5-2400, 4/6/8 cores
• SL230s, SL250s, SL270s: E5-2600, 4/6/8 cores

Chipset
• All models: Intel® C600

Memory
• SL140s: 12x DDR3, RDIMM/UDIMM, up to 1333MHz, ECC
• SL230s, SL250s, SL270s: 16x DDR3, RDIMM/UDIMM, up to 1600MHz, ECC

Max Memory
• SL140s: 256GB
• SL230s, SL250s, SL270s: 512GB

Internal Storage
• SL140s: 2 LFF NHP; 4 SFF NHP; opt: 2 SFF HP
• SL230s: 2 LFF NHP; 4 SFF NHP; opt: 2 SFF HP
• SL250s: 4 SFF HP; 2 LFF NHP
• SL270s: 8 SFF HP

Max Internal Storage
• SL140s: 4TB 3.5” SAS; 1.2TB 2.5” SAS; 6TB SATA; 480GB 2.5” SSD
• SL230s: 4TB 3.5” SAS; 1.2TB 2.5” SAS; 6TB 3.5” SATA; 2TB 2.5” SATA; 480GB 2.5” SSD
• SL250s: 2TB 2.5” hot plug SAS; 1.2TB 2.5” non-hot plug SAS; 2TB 2.5” hot plug SATA; 2TB 2.5” SATA; 480GB 2.5” SSD
• SL270s: 4TB SAS; 4TB SATA; 960GB SSD

Networking
• SL140s: 1x integrated NC366i Dual Port Gigabit Server Adapter
• SL230s, SL250s, SL270s: 1x integrated NC366i Dual Port GbE; 1x dual port networking daughter card: QDR IB, 10GbE

I/O Slots
• SL140s: 1x PCIe Gen3: 1x16 HL/LP
• SL230s: 1x PCIe Gen3: 1x16 HL/LP
• SL250s: 4x PCIe Gen3: 1x8 HL/LP; 3x16 HL/LP
• SL270s: 9x PCIe Gen3: 1x8 HL/LP; 8x16 HL/LP

Integrated Management
• All models: HP iLO Mgt Engine, SIM, IRS. Opt: HP Insight Control, iLO Adv

Form Factor
• SL140s: 1U HW, 8 trays per s6500 (4U)
• SL230s: 1U HW, 8 trays per s6500 (4U)
• SL250s: 2U HW, 4 trays per s6500 (4U)
• SL270s: 4U HW, 2 trays per s6500 (4U)

Page 23: HP - HPC-29mai2012

HP ProLiant SL250s Gen8 2U Half Width Tray

Labelled components:
• 16 DIMM slots (below GPU tray)
• 2 Socket-R CPUs (below GPU tray)
• PCIe expansion slot
• Flex Fabric slot
• Management port (iLO4)
• 2 x 1GbE ports
• Rear GPU or NHP HDDs
• 2 GPU tray
• 4 HP SFF

Per 4U chassis:
• 4 nodes
• 8 CPUs
• 12 GPUs
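The per-chassis totals follow directly from the per-tray figures. The short sketch below makes that relationship explicit; the value of 3 GPUs per tray is inferred from 12 GPUs across 4 nodes (the 2-GPU tray plus the rear GPU position) rather than stated on the slide, and the `Tray` type is only an illustrative helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tray:
    """One half-width compute tray (illustrative helper, not an HP data model)."""
    cpus: int
    gpus: int


def chassis_totals(tray: Tray, trays_per_chassis: int) -> dict:
    """Scale per-tray resources up to a full chassis."""
    return {
        "nodes": trays_per_chassis,
        "cpus": tray.cpus * trays_per_chassis,
        "gpus": tray.gpus * trays_per_chassis,
    }


# SL250s Gen8: 2 Socket-R CPUs per tray (slide); 3 GPUs per tray (inferred)
sl250s = Tray(cpus=2, gpus=3)
totals = chassis_totals(sl250s, trays_per_chassis=4)
print(totals)
```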

Page 24: HP - HPC-29mai2012

INTERCONNECTS IN HPC

Page 25: HP - HPC-29mai2012

HPC Interconnects

• Bandwidth (large data exchanges)
• Latency (microseconds)
• Scalability: stay efficient even for a high number of links
• Can also accommodate I/O traffic
• Two HPC interconnects:
  – Ethernet (1 GigE, 10 GigE, 40 GigE)
  – InfiniBand
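Why both bandwidth and latency matter can be seen with the usual first-order model: transfer time = latency + message size / bandwidth. The sketch below applies it to two hypothetical links; the latency and bandwidth numbers are illustrative placeholders, not measurements of any Ethernet or InfiniBand product.

```python
def transfer_time_us(message_bytes: int, latency_us: float, bandwidth_gbit_s: float) -> float:
    """First-order transfer time in microseconds: latency + size / bandwidth."""
    bits = message_bytes * 8
    bits_per_us = bandwidth_gbit_s * 1_000  # 1 Gbit/s equals 1,000 bits per microsecond
    return latency_us + bits / bits_per_us


# Hypothetical links, not measurements: (latency in us, bandwidth in Gbit/s)
links = {
    "low-latency link": (1.0, 40.0),
    "high-latency link": (50.0, 10.0),
}

for size in (8, 1_048_576):  # a tiny message and a 1 MiB message
    for name, (latency, bandwidth) in links.items():
        time_us = transfer_time_us(size, latency, bandwidth)
        print(f"{size:>9} B over {name}: {time_us:.1f} us")
```

In this model small messages are dominated by latency and large ones by bandwidth, which is why the slide lists the two as separate requirements.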

Page 26: HP - HPC-29mai2012

IBTA specification

Page 27: HP - HPC-29mai2012

HP InfiniBand strategy

• Focus on partnership
  – Work with technology providers.
• Focus on qualification, integration, efficient supply chain
  – Rigorous quality testing and control
  – Efficient supply chain management
• IB products have one basic element: the ASIC. Two providers: Mellanox and QLogic.
• HP integrates IB switches from two providers: Mellanox and QLogic (used to be three providers, with Voltaire).
• We run benchmarks and tests for all HCAs and components.
• We qualify HCAs, switches and cables on our platforms.
• We verify the interoperability of Mellanox (MLX) and QLogic.

Page 28: HP - HPC-29mai2012

HP 56Gbps FDR InfiniBand Portfolio

• ConnectX-3 HCAs in servers
• QSFP FDR cables
• FDR 36-port edge switch
• In 2012: FDR chassis aggregation switches
• Installed-base QDR switches, e.g. 4036E
• Unified Fabric Manager (UFM)
• Acceleration software
• HP Systems Integration
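The "56Gbps" figure is a signalling rate. As a rough illustration, the sketch below derives the usable data rate of a 4x link from the per-lane rate and the line encoding. The per-lane rates and encodings (QDR: 10 Gbit/s with 8b/10b; FDR: 14.0625 Gbit/s with 64b/66b) come from the InfiniBand specification rather than from the slides, and protocol overhead above the encoding layer is ignored.

```python
def effective_gbit_s(lanes: int, lane_gbit_s: float, payload_bits: int, coded_bits: int) -> float:
    """Data rate after line encoding, for a link made of several lanes."""
    return lanes * lane_gbit_s * payload_bits / coded_bits


# 4x links, the common width for HCAs and switch ports
qdr = effective_gbit_s(lanes=4, lane_gbit_s=10.0, payload_bits=8, coded_bits=10)
fdr = effective_gbit_s(lanes=4, lane_gbit_s=14.0625, payload_bits=64, coded_bits=66)

print(f"QDR 4x: {qdr:.1f} Gbit/s of data on a 40 Gbit/s link")
print(f"FDR 4x: {fdr:.1f} Gbit/s of data on a 56 Gbit/s link")
```

The more efficient 64b/66b encoding means the step from QDR to FDR delivers more usable bandwidth than the raw 40-to-56 comparison suggests.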

Page 29: HP - HPC-29mai2012

STORAGE IN HPC

Page 30: HP - HPC-29mai2012

Mix HP Storage in HPC cluster

• HP X9000 Network Storage System
  – Small files
  – High metadata operation rates
  – Wide access
  – /home typically...
• Lustre file system with optimized HPC focused hardware
  – Extreme sequential bandwidth
  – “True” parallel I/O: several writers to same file
  – Or high single stream throughput
  – /scratch, /work typically…

Page 31: HP - HPC-29mai2012

HP Storage (X9000, P4000, MDS600)

• Many applications, or instances of the same application
• Each one of many servers is running a single application instance, up to one instance per core or VM
• Each has its own file/data set
• Many metadata operations (IOPS)
• Metadata is distributed across multiple servers
• Datasets are distributed across multiple servers to balance performance
• Typical applications: HLS (next generation sequencing, biosciences/genomics NGS), media (animation render farms), public sector (content depots), financial services

Lustre / DDN SFA10K

• “One” parallelized application
• Parallelized applications are spread across multiple servers. May use MPI to communicate
• Reading and writing to a single file
• Few metadata operations (IOPS)
• A single server for metadata is enough
• Dataset is striped across multiple storage servers, for maximum read/write bandwidth
• Typical applications: computer-aided engineering, molecular modeling, high-energy physics
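The idea of striping a single dataset across multiple storage servers can be illustrated with a simplified round-robin layout. This is a sketch of the general technique, not Lustre's actual implementation; the 1 MiB stripe size and the target count of four are arbitrary values chosen for the example.

```python
MIB = 1 << 20


def stripe_target(offset: int, stripe_size: int, stripe_count: int) -> int:
    """Index of the storage target holding a byte offset under round-robin striping."""
    return (offset // stripe_size) % stripe_count


def bytes_per_target(file_size: int, stripe_size: int, stripe_count: int) -> list:
    """How many bytes of a file land on each storage target."""
    totals = [0] * stripe_count
    offset = 0
    while offset < file_size:
        chunk = min(stripe_size, file_size - offset)
        totals[stripe_target(offset, stripe_size, stripe_count)] += chunk
        offset += chunk
    return totals


# A 10 MiB file striped in 1 MiB chunks over 4 targets (illustrative values)
layout = bytes_per_target(10 * MIB, MIB, 4)
print([size // MIB for size in layout])
```

Because consecutive chunks live on different targets, a large sequential read or write keeps all targets busy at once, which is where the aggregate bandwidth comes from.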

Page 32: HP - HPC-29mai2012

MANAGEMENT SW IN HPC

CMU : Cluster Management Utility

Page 33: HP - HPC-29mai2012

Insight CMU v7.0 (February 2012)

Page 34: HP - HPC-29mai2012

HP Insight CMU

Hyperscale cluster lifecycle management software

Proven
• 10+ years in deployment, Top500 sites included, with 1000s of nodes

Built for Linux, with support for multiple Linux distributions
• Including hybrid support with Windows

Provision
• Simplified discovery, firmware audits
• Fast and scalable cloning

Monitor
• ‘At a glance’ view of entire system; zoom to component
• Customizable
• Lightweight

Control
• GUI and CLI options
• Easy, frictionless control of remote servers

Page 36: HP - HPC-29mai2012


CMU main functionalities

Deployment
• Imaging (cloning)
• Autoinstall (kickstart | autoyast | preseed)
• Diskless

Scalable live monitoring
• Scalable non-intrusive monitoring engine (+ collectl)
• Monitoring GUI / monitoring API

Day to day administration
• Interactive CLI (+ cmu_* Linux commands)
• cmudiff, command broadcast
• Multiple window broadcast (one window per host)
• Single window PDSH, one command on all the hosts
• GUI (Java based, for the desktop)
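When one command is broadcast to all hosts, the practical problem is spotting the nodes whose answer differs. The sketch below shows one generic way to do that, by grouping hosts that returned identical output. It is not how cmudiff or PDSH are implemented; the host names and outputs are made up, and in practice the dictionary would be filled from whatever broadcast tool is in use.

```python
from collections import defaultdict


def group_by_output(outputs: dict) -> dict:
    """Map each distinct command output to the sorted list of hosts that produced it."""
    groups = defaultdict(list)
    for host in sorted(outputs):
        groups[outputs[host].strip()].append(host)
    return dict(groups)


# Made-up result of running the same command on three hosts
sample = {
    "node1": "3.0.13\n",
    "node2": "3.0.13\n",
    "node3": "2.6.32\n",
}

for output, hosts in group_by_output(sample).items():
    print(f"{', '.join(hosts)}: {output}")
```

On a large cluster this collapses thousands of identical answers into one line and leaves the outliers visible.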

Page 37: HP - HPC-29mai2012

Time View

Page 38: HP - HPC-29mai2012
Page 39: HP - HPC-29mai2012

CMU Backup / Cloning Feature

Needs:
• Setup of cluster is painful.
• System management of HPC clusters is difficult due to the large number of nodes.

Cloning goals:
• Avoid ‘one by one’ system installation on compute nodes
• Fast cluster installation with an optimised cloning mechanism

How:
• Install one compute node
• Back up that compute node (golden image)
• Duplicate that golden image to all compute nodes
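The slides do not describe how the cloning mechanism is optimised. Purely as an illustration of why avoiding "one by one" installation matters, the sketch below compares a sequential copy with a hypothetical scheme in which every node that already holds the golden image forwards it to one more node in each round. Both functions are assumptions for the sake of the comparison, not CMU's algorithm.

```python
def sequential_rounds(nodes: int) -> int:
    """One image server copies the image to one node at a time."""
    return nodes


def doubling_rounds(nodes: int) -> int:
    """Every holder of the image (server included) forwards it to one new node per round."""
    holders = 1  # the image server
    rounds = 0
    while holders - 1 < nodes:
        holders *= 2
        rounds += 1
    return rounds


for count in (16, 256, 1000):
    print(f"{count} nodes: {sequential_rounds(count)} rounds sequential, "
          f"{doubling_rounds(count)} rounds with forwarding")
```

The round count grows logarithmically with forwarding, so cluster size has far less impact on total installation time than in the sequential case.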

Page 40: HP - HPC-29mai2012

Diskless Installation

• Large Scale Diskless Support
  – When diskless nodes are installed, the file system of the compute nodes runs entirely over NFS, while the OS is loaded in RAM.
  – Existing NFS-root based diskless support expanded to allow for multiple NFS servers
  – Up to 4k diskless compute nodes
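Spreading diskless nodes over several NFS servers keeps the client count per server bounded. The sketch below shows a simple round-robin assignment as one possible policy; the slides do not say how CMU assigns nodes, the host names are hypothetical, "4k" is read here as 4096, and the choice of eight servers is arbitrary.

```python
def assign_nfs_servers(nodes: list, servers: list) -> dict:
    """Assign each node to an NFS server in round-robin order."""
    if not servers:
        raise ValueError("at least one NFS server is required")
    mapping = {server: [] for server in servers}
    for index, node in enumerate(nodes):
        mapping[servers[index % len(servers)]].append(node)
    return mapping


# Hypothetical names: 4096 compute nodes served by 8 NFS-root servers
nodes = [f"node{i:04d}" for i in range(4096)]
servers = [f"nfs{i}" for i in range(8)]
mapping = assign_nfs_servers(nodes, servers)

for server, clients in mapping.items():
    print(f"{server}: {len(clients)} clients")
```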

Page 41: HP - HPC-29mai2012

CMU GPGPU Support

• CMU provides new binary for extracting GPU metric data from GPU driver
  – /opt/cmu/tools/cmu_get_nvidia_gpu
• New command cmu_config_nvidia to configure GPU monitoring
  – Configures load, mem_util, mem_alloc, power_state, and ECC_double_bit alerts by default
  – Power_usage, various clock speeds, fan speeds, and temperature also configured but commented out by default
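The output format of cmu_get_nvidia_gpu is not shown in the slides. For readers who want to look at comparable metrics outside CMU, the sketch below parses CSV output of the kind produced by NVIDIA's own nvidia-smi tool. The field list is a selection made for this example, the sample text is invented, and the parser is a generic illustration rather than anything CMU ships.

```python
import csv
import io

# Query fields chosen for this example; they roughly mirror load, memory use,
# power usage and temperature from the slide.
FIELDS = ["utilization.gpu", "utilization.memory", "memory.used", "power.draw", "temperature.gpu"]

# Command that would produce the CSV on a machine with the NVIDIA driver installed
# (not executed here).
QUERY_COMMAND = ["nvidia-smi", f"--query-gpu={','.join(FIELDS)}", "--format=csv,noheader,nounits"]


def to_number(value: str):
    """Convert a CSV cell to float, or None when the GPU does not report the metric."""
    try:
        return float(value)
    except ValueError:
        return None


def parse_gpu_metrics(csv_text: str) -> list:
    """Turn one CSV line per GPU into a dictionary keyed by field name."""
    rows = []
    for record in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if record:
            rows.append({name: to_number(cell) for name, cell in zip(FIELDS, record)})
    return rows


# Invented sample output for two GPUs
sample_output = "87, 45, 2048, 180.50, 71\n12, 3, 512, [Not Supported], 48\n"
metrics = parse_gpu_metrics(sample_output)
print(metrics)
```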

Page 42: HP - HPC-29mai2012

THANK YOU