www.maxeler.com
1 Down Place, Hammersmith, London, UK
530 Lytton Ave., Palo Alto, CA, USA
Deployed Maximum Performance Computing customers comparing 1 box from Maxeler (in a deployed system) with 1 box from Intel
• Customer 1: App1 19x and App2 25x
• Customer 2: 1.2GB/s per card
• Customer 3: App1 22x and App2 22x
• Customer 4: App1 32x and App2 29x
• Customer 5: 30x
• Customer 6: App1 26x and App2 30x
What Maxeler Does
• Maxeler delivers bespoke dataflow HPC solutions => An HPC Computing Appliance for “structured Big Data”
• Building the HPC compute fabric based on the application in a multi-disciplinary, data-centric approach
Hardware
• Building 1U boxes, workstations and the cards inside
• Building custom large memory systems to deal with Big Data
• Integrating rack systems with networking and storage
Software
• Integrated environment brings bespoke dataflow computing to high-end HPC users
• Dataflow programming in Java and Eclipse IDE
Consulting
• HPC System Performance Architecture
• Algorithms and Numerical Optimization
• Integration into business and technical processes
Dataflow Computing
Technology
One result per clock cycle
MAXELER DATAFLOW COMPUTING
Dynamic (switching) Power Consumption:
$P_{avg} = C_{load} \cdot V_{DD}^2 \cdot f$
Minimal frequency f achieves maximal performance, thus for a given power budget, we get Maximum Performance Computing (MPC)!
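As a rough numerical illustration of the switching-power relation P_avg = C_load · V_DD² · f, the sketch below evaluates it at two clock frequencies. The capacitance, voltage and frequency values are hypothetical and chosen only to show the linear dependence on f; they are not figures from the slides.

```python
def dynamic_power(c_load_farads, v_dd_volts, freq_hz):
    """Average dynamic (switching) power in watts: P = C_load * V_DD^2 * f."""
    return c_load_farads * v_dd_volts ** 2 * freq_hz


# Hypothetical values, for illustration only (not taken from the slides).
C_LOAD = 1e-9  # switched capacitance in farads
V_DD = 1.0     # supply voltage in volts

p_fast = dynamic_power(C_LOAD, V_DD, 3e9)    # a 3 GHz clock
p_slow = dynamic_power(C_LOAD, V_DD, 200e6)  # a 200 MHz clock

print(f"3 GHz:   {p_fast:.2f} W")
print(f"200 MHz: {p_slow:.2f} W")
print(f"ratio:   {p_fast / p_slow:.1f}x")
```

With capacitance and voltage held fixed, power scales in direct proportion to frequency, so a design that reaches its throughput at a lower clock rate spends less switching power for the same work.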
Explaining Control Flow versus Data Flow
Analogy 1: The Ford Production Line
• Experts are expensive and slow (control flow)
• Many specialized workers are more efficient (data flow)
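The production-line analogy can be made concrete with a toy cycle count. The sketch below is an illustration in plain Python, not Maxeler's programming model: `run_sequential` and `run_pipelined` are hypothetical helper names, and each stage is assumed to take exactly one cycle. One worker doing every step needs (items × stages) cycles, while a line of specialized workers delivers one result per cycle once the line is full.

```python
from collections import deque

_EMPTY = object()  # marks a pipeline register that holds no data


def run_sequential(items, stages):
    """Control-flow analogy: one worker applies every stage to each item in turn."""
    results, cycles = [], 0
    for item in items:
        for stage in stages:
            item = stage(item)
            cycles += 1  # one cycle per stage per item
        results.append(item)
    return results, cycles


def run_pipelined(items, stages):
    """Dataflow analogy: one specialized worker per stage, all active every cycle."""
    if not stages:
        return list(items), 0
    pending = deque(items)
    regs = [_EMPTY] * len(stages)  # regs[i] holds the output of stage i
    results, cycles = [], 0
    while pending or any(r is not _EMPTY for r in regs):
        cycles += 1
        # Walk back to front so each stage reads last cycle's upstream value.
        for i in reversed(range(len(stages))):
            if i == 0:
                src = pending.popleft() if pending else _EMPTY
            else:
                src = regs[i - 1]
            out = stages[i](src) if src is not _EMPTY else _EMPTY
            if i == len(stages) - 1:
                if out is not _EMPTY:
                    results.append(out)  # a finished item leaves the line
            else:
                regs[i] = out
    return results, cycles


# Three hypothetical one-cycle stages.
stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
seq_results, seq_cycles = run_sequential(range(5), stages)
pipe_results, pipe_cycles = run_pipelined(range(5), stages)
print(seq_results == pipe_results, seq_cycles, pipe_cycles)
```

Both versions compute the same results; the pipelined count is (stages + items − 1) cycles, which approaches one result per clock cycle as the number of items grows.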
Example Accelerated Applications
JP Morgan Credit Derivatives Pricing
O. Mencer and S. Weston, 2010
• Compute value of complex financial derivatives (CDOs)
• Typically run overnight, but beneficial to compute in real-time
• Many independent jobs
• Speedup: 220-270x
• Power consumption per node drops from 250W to 235W/node
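The speedup and power figures for the credit derivatives workload can be combined into an energy-per-job estimate. The sketch below assumes that energy is node power multiplied by runtime and that the quoted 220-270x speedup applies per node; both are assumptions made for this back-of-envelope calculation, and `energy_ratio` is a hypothetical helper name.

```python
def energy_ratio(speedup, baseline_watts, accelerated_watts):
    """Factor by which energy per job shrinks, assuming energy = power * runtime."""
    return speedup * baseline_watts / accelerated_watts


# Figures quoted on the slide: 220-270x speedup, 250W -> 235W per node.
low = energy_ratio(220, 250, 235)
high = energy_ratio(270, 250, 235)
print(f"energy per job reduced by about {low:.0f}x to {high:.0f}x")
```

Under these assumptions nearly all of the energy saving comes from the shorter runtime; the drop in node power contributes only a few percent.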
3000³ Modeling
[Figure: equivalent CPU cores (0 to 2,000) versus number of MAX2 cards (1, 4, 8), with one series each for 15Hz, 30Hz, 45Hz and 70Hz peak frequency]
*presented at SEG 2010.
Compared to 32 3GHz x86 cores parallelized using MPI
8 Full Intel Racks ~100kWatts => Single 3U Maxeler System <1kWatt
Sparse Matrix Solving with Maxeler
O. Lindtjorn et al, HotChips 2010
Given matrix A, vector b, find vector x in Ax = b.
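To make the problem statement Ax = b concrete, here is a minimal software sketch of a sparse solve. The slides do not say which solver or storage format is used, so the compressed sparse row (CSR) layout and the conjugate gradient method below are assumptions chosen for illustration (conjugate gradient also requires A to be symmetric positive definite). The small matrix is a made-up example.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Multiply a CSR-stored sparse matrix by a dense vector x."""
    y = []
    for row in range(len(row_ptr) - 1):
        total = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            total += values[k] * x[col_idx[k]]
        y.append(total)
    return y


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(values, col_idx, row_ptr, b, tol=1e-10, max_iter=1000):
    """Solve Ax = b for symmetric positive definite A stored in CSR form."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - Ax, with x starting at zero
    p = list(r)  # search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        ap = csr_matvec(values, col_idx, row_ptr, p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Made-up 3x3 example: A = [[4,-1,0],[-1,4,-1],[0,-1,4]], b = [3,2,3].
values = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]
col_idx = [0, 1, 0, 1, 2, 1, 2]
row_ptr = [0, 2, 5, 7]
b = [3.0, 2.0, 3.0]

x = conjugate_gradient(values, col_idx, row_ptr, b)
print([round(xi, 6) for xi in x])
```

Only the nonzero entries and their column indices are stored and streamed, which is why the encoding of addresses and data matters for sparse solver throughput.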
[Figure: speedup per 1U node (0 to 60) versus compression ratio (0 to 10); series label: GREE0A1new01]
Domain Specific Address and Data Encoding (*Patent Pending)
MAXELER SOLUTION: 20-40x in 1U
DOES NOT SCALE BEYOND 6 x86 CPU CORES