navid azizi, ian kuon, aaron egier, ahmad darabiha and...
TRANSCRIPT
![Page 1: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/1.jpg)
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabihaand Paul ChowUniversity of Toronto
![Page 2: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/2.jpg)
2
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
Why is Molecular Dynamics interesting? Simulates interaction of atoms over time
Many possible applications– Biomolecules
Computationally intensive to handle >1000 atoms
Large computer clusters used in the past
Can a reconfigurable simulation system do this better?
![Page 3: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/3.jpg)
3
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
What is Molecular Dynamics?
Simulate using classical Newtonian mechanicsF = m a
Integrate acceleration to get position and velocity changes
Use a very small timestep ~ 1 femtosecond
![Page 4: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/4.jpg)
4
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
Molecular Dynamics Background
Simulation Procedure per timestep– Sum force over all interacting atoms
– Calculate acceleration
– Integrate acceleration to update atom position and velocity
– Repeat for all atoms
![Page 5: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/5.jpg)
5
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
Background - Forces
Two types of forces–Bonded – O(n)
–No hardware acceleration required–Non Bonded – O(n2)
–Needs hardware acceleration
![Page 6: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/6.jpg)
6
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
-2.5E-11
0.0E+00
2.5E-11
5.0E-11
7.5E-11
1.0E-10
Distance
Pote
ntia
lBackground – Force Calculation
Lennard-Jones (LJ) potential models interaction
Force on a atom is the gradient of potential
( )⎥⎥⎦
⎤
⎢⎢⎣
⎡⎟⎠⎞
⎜⎝⎛−⎟
⎠⎞
⎜⎝⎛=
612
4rr
rLJσσεφ
![Page 7: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/7.jpg)
7
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
Background – Simulating Large VolumesAny interesting volume has far too many atoms to simulate
Solution – Periodic Boundary Conditions
ReplicatedBox
ReplicatedBox
ReplicatedBox
ReplicatedBox
ReplicatedBox
ReplicatedBox
ReplicatedBox
ReplicatedBox
Box beingsimulated
BA
B’A’
B’A’
B’A’
B’A’
B’A’
B’A’
B’A’
B’A’
![Page 8: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/8.jpg)
8
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
Architectural Design – System Overview
PairGen
VerletUpdate
ForceComputer
AccelerationUpdate
SunWorkstation
SystemControl
ParticleMemory
FunctionValue
Memory
SlopeMemory
![Page 9: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/9.jpg)
9
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
Architectural Design – Force Computer
r2 from PG used for function lookup
Interpolate to obtain a more accurate force magnitude
PairG
enForce
Computer
AccelerationUpdate
FunctionValue
Memory
SlopeMemory
![Page 10: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/10.jpg)
10
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
Architectural Design – Force Computerr2 is larger than 18-bits
–Look up table has a 18-bit address
-1.00E+09
0.00E+00
1.00E+09
2.00E+09
3.00E+09
4.00E+09
5.00E+09
6.00E+09
0 5E-19 1E-18 1.5E-18 2E-18
Separation2
Psue
do-A
ccel
erat
ion
![Page 11: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/11.jpg)
11
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
Architectural Design – Force Computer
r2
FF...FF
low bits
middlebits
highbits
Residual forinterpolation
Address tolookup tables
![Page 12: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/12.jpg)
12
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
Precision and Scaling Factors
Architecture uses integer operations to reduce complexity
Precision: number of bits used to represent a valueScaling Factor: the weight of the least significant bit of the value
Precision
ScalingFactor
![Page 13: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/13.jpg)
13
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
Calculating the Precision and Scaling FactorsCalculations made with atoms at varying distances
Scaling Factor = the minimum valuePrecision = log2 of the difference between minimum and maximum
372-64Acceleration
512-15Velocity
382-64Position
PrecisionScaling Factor
Quantity
![Page 14: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/14.jpg)
14
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
Simulation Environment is ConfigurableSimulation reconfigurability
– Change precision, scaling factors, number of atoms. forces
– No wasted hardware
– No time overhead when precision is reduced
Entire process is automated– One input file controls entire process
– Hardware – C program creates appropriate VHDL
– Software interface, Software initialization– Always match the hardware
![Page 15: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/15.jpg)
15
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
ImplementationUsed the Transmogrifier 3
–4 interconnected Virtex-E 2000’s
–2MB memories connected to each Virtex-E
–Slow by today’s standards
![Page 16: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/16.jpg)
16
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
VerificationTested accuracy of implementation
–Compared TM3 results with software
0
500
1000
1500
2000
2500
0 100 200 300 400 500 600 700
Timestep
Ener
gy (k
J/m
ol)
TM3 -Total EnergySoftware - Total EnergyTM3 - Kinetic EnergySoftware - Kinetic EnergyTM3 - Potential EnergySoftware - Potential Energy
![Page 17: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/17.jpg)
17
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
System Performance
For a 8192 atom MD system running on the TM3
–Frequency: 26 MHz–Timestep Duration: 37 sec
For a 8192 atom software system running on a 2.4GHz Pentium 4
–Timestep Duration: 10.8 sec
MD system is 3.4X slower than software
![Page 18: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/18.jpg)
18
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
How to improve this?Memory
New FPGA
Parallelism
![Page 19: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/19.jpg)
19
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
Memory Requirements of MD System
Acceleration Array (8192 atom system uses 0.17 MB)
Velocity Array (8192 atom system uses 0.17 MB)
Position Array (8192 atom system uses 0.34 MB)
Lookup Tables: 2 MB
![Page 20: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/20.jpg)
20
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
Improving Performance – Memory Organization On TM3 there is only one external SRAM per FPGA
Single SRAM for all atom information causes large slowdowns
– Handshaking
– Serial reads for x, y and z
– Hardware issues
Better memory system
2.1 seconds/timestep (5X faster than software)
![Page 21: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/21.jpg)
21
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
Improving Performance – Clock Speed
Run on modern FPGA
All possible improvements for clock speed not explored
Expect a factor of 4 increase to a 100MHz
Better memory system + Faster Clock Speed
0.51 seconds/timestep (21X faster than software)
![Page 22: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/22.jpg)
22
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
Improving Performance – Parallel Architecture
Better memory system + Faster Clock Speed + Parallelize
0.51/n seconds/timestep (21n X faster than software)
PairGen
AU(1)AA
AU(0)AA
AU(n-1)AA
VU(1)VA
VU(0)VA
VU(n-1)VA
PAi
PA
iP
AiPA
jPA
jPA
j
LJFC(0)
LJFC(1)
LJFC(n-1)
Slope Mem
Value Mem
Value Mem
Slope Mem
Value Mem
Slope Mem
![Page 23: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/23.jpg)
23
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
Cost, Power Benefits
Performance– MD system can deliver a 21X performance benefit over
software
– Assume a conservative 10X performance advantage
Cost– Microprocessor motherboard-sized board can fit 4
FPGAs
– 4 FPGAs (each $200) + board + SRAM + misc. ~ $1500
– Microprocessor Motherboard + CPU + DRAM ~ $1500
![Page 24: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/24.jpg)
24
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
Comparison of MD Simulator and Supercomputers
40W106WPower~$1500~$1500Cost40X1XPerformance
4 FPGAs + 24 MB SRAM
1 Pentium + 1 GB DRAM
Per Board
40X ImprovementPerformance/Space40X ImprovementPerformance/Cost
100X ImprovementPerformance/Power
Improvement of MD System
Metric
![Page 25: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/25.jpg)
25
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
ConclusionsEasily reconfigurable MD System designed
Molecular dynamics simulation can be done on FPGAs
Simple enhancements will improve speed– Power, Cost and Space savings over software
![Page 26: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL](https://reader031.vdocuments.net/reader031/viewer/2022030501/5aad79187f8b9aa9488e50cd/html5/thumbnails/26.jpg)
26
FCCM 2004
Reconfigurable Molecular Dynamics Simulator
Future WorkImprove accuracy
Target newer FPGA platform
Support new forces
Funding for the TM3 Project was provided by Micronet and Xilinx
Acknowledgements