Gaj 1 MAPLD 2005/1016
Development and Maintenance of User Libraries for
SRC Reconfigurable Computers
Kris Gaj¹, Tarek El-Ghazawi², Paul Gage³, Dan Poznanovic³,
Chang Shu¹, Deapesh Misra¹,
Miaoqing Huang², Esam El-Araby²,
Mohamed Taher²
¹ George Mason University
² The George Washington University
³ SRC Computers, Inc.
Reconfigurable Computers
What is a reconfigurable computer?
[Figure: a microprocessor system (several μPs, each with its own μP memory, plus I/O and an interface) shown next to an FPGA system (several FPGAs, each with its own FPGA memory, plus I/O and an interface).]
Examples of High-End Reconfigurable Computers
• SRC-6E and SRC Hi-Bar Based Systems from SRC Computers, Inc.
• Cray XD1 (formerly OctigaBay 12K) from Cray Inc.
• SGI Altix 3000 from Silicon Graphics
• Star Bridge Hypercomputer from Star Bridge Systems
SRC MAP™ Reconfigurable Processor
Source: [SRC, MAPLD04]
SRC-6E Hardware Architecture
[Figure: block diagram. μP board: two P4 (2.8 GHz) processors with L2 caches, MIOC, computer memory (8 GB), PCI-X, DDR interface, and SNAP. ½ MAP board: control FPGA (XC2V6000), on-board memory (24 MB), FPGA 1 and FPGA 2 (both XC2V6000), and chain ports. Bandwidth labels in the figure include 1064 MB/s, 2128 MB/s, 2400 MB/s (192 bits), 2400 MB/s (chain ports, 108 bits), 4256 MB/s, and 4800 MB/s (6 x 64 bits).]
SRC Hi-Bar Based Systems
• Hi-Bar sustains 1.4 GB/s per port with 180 ns latency per tier
• Up to 256 input and 256 output ports
• Common Memory (CM) has controller with DMA capability
• Up to 8 GB DDR SDRAM supported per CM node
[Figure: SRC-6 system. μP nodes (memory, SNAP™, PCI-X), MAP® processors (with chaining and GPIO), and Common Memory nodes attach to the SRC Hi-Bar switch. Gig Ethernet etc. connects to the customers' existing networks: storage area network, local area network, wide area network, and disk.]
Source: [SRC, MAPLD04]
SRC Programming
[Figure: the application programmer works in an HLL (C); the library developer works in an HDL (VHDL). The two target systems shown are the μP system and the FPGA system.]
SRC Program Partitioning
[Figure: a program is partitioned into a C function for the μP (HLL, μP system), a C function for FPGAs (HLL, FPGA system), and a VHDL macro for FPGAs (HDL, FPGA system).]
Run Time Reconfiguration in SRC
Program in C or Fortran:
Main program
    Function_1(a, d, e)
    Function_2(d, e, f)
Function_1
    Macro_1(a, b, c)
    Macro_2(b, d)
    Macro_2(c, e)
Function_2
    Macro_3(s, t)
    Macro_1(n, b)
    Macro_4(t, k)
[Figure: FPGA contents after the Function_1 call. Macro_1 takes a and feeds b and c to two instances of Macro_2, which produce d and e.]
SRC Development Environment
[Figure: build flow. Application sources (.c or .f files) are processed by the μP compiler and the MAP compiler, both of which produce object files (.o files); the MAP compiler also emits .v files. These, together with user macro sources (HDL sources, .vhd or .v files), go through logic synthesis (.edf files) and place & route (.bin files, the configuration bitstreams). The linker produces the application executable.]
Advantages of reconfigurable computers
• can be programmed by mathematicians themselves using traditional programming languages or GUI environments
• encourage innovation and experimentation
• general-purpose: cost distributed among multiple users with different needs
• behave like hardware:
    - parallel processing
    - distributed memory
    - specialized functional units, etc.
Conditions necessary for the success of reconfigurable computers
• ease of use of library macros and functions
• existence of comprehensive libraries of user macros and functions capable of running on FPGAs
• significant speed-ups (on the order of 100x) of basic functions running on FPGAs compared to state-of-the-art microprocessors
Development and Maintenance of SRC Libraries
Structure of the macro repository
< top of repository >
    < macros >
        <lib # 1 >   <lib # 2 >   <lib # 3 >
            common   rev_d   rev_e   rev_f
                macro1   macro2   macro3
                    hdlfile   InfoFile   BlkBoxFile   DebugCodeFile   DataSheet
Macro Types
common:
• These macros have no connections to external pins and do not depend on any feature specific to an FPGA type. This type of macro can be used on any MAP.
rev_d:
• These macros have a specific dependency on the dual MAP.
rev_e:
• These macros have a specific dependency on the single MAP.
rev_f:
• These macros have a specific dependency on the compact MAP.
Files describing the macro
Platform independent HDL file: macro.v or macro.vhd
• Verilog or VHDL code defining the macro
Debug Code File: macro.c
• Provides the equivalent C functionality for the macro
Platform dependent Blk Box File: blackbox.v
• Interface (black box) definition for the macro in Verilog
Data sheet file: datasheet
• Contains the documentation for the macro
Info File: info
• Info file entry for the given macro, containing macro type, latency, names of input/output/control signals, etc.
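As an illustration of what a debug code file might look like, here is a minimal sketch for a hypothetical macro that adds two 64-bit words with a carry. The macro name `add64_carry`, its argument order, and the pointer-based outputs are assumptions made for this example; they are not taken from the SRC libraries.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical debug-code model (macro.c) for an illustrative macro that
   adds two 64-bit words plus a carry-in. Outputs are returned through
   pointers, in the style of the macro calls shown in these slides. */
void add64_carry(uint64_t a, uint64_t b, uint64_t carry_in,
                 uint64_t *sum, uint64_t *carry_out)
{
    assert(carry_in <= 1);                 /* carry is a single bit */
    uint64_t partial = a + b;              /* wraps modulo 2^64 */
    uint64_t c1 = partial < a;             /* carry out of a + b */
    uint64_t total = partial + carry_in;
    uint64_t c2 = total < partial;         /* carry out of adding carry_in */
    *sum = total;
    *carry_out = c1 | c2;                  /* at most one of c1, c2 is set */
}
```

A model like this lets an application be debugged on the microprocessor alone, with the C function standing in for the hardware macro.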
CVS repository
To properly manage a distribution of macros, a CVS repository must be set up.
This keeps source code changes under control and lets multiple developers work on the code.
The Installed Macro Library Structure
<xxx lib>
    map 3 (built for the Xilinx Virtex2)   map 4 (built for the Xilinx Virtex2Pro)
        common   rev_d   rev_e
            ngo   blkbox.v (single blackbox file)   macros.info (single info file)
                macro1   macro2   macro3   ......
Obtained by running a special script developed by SRC
Library Script
Usage:
build_libs [OPTION]
    [-b, --branch br]              Specify CVS branch
    [-c, --checkout]               Checkout only
    [-d, --CVSROOT cvsroot]        Specify CVSROOT
    [-M, --MAP maptype]            Build for MAP maptype
    [-m, --module mod]             Build mod only
    [-r, --restart mmddyy-hhmm]    Restart previous build
    [-s, --step target]            Run build step target
    [-v, --version N.n]            Package as version N.n
    [-V, --vendor vend]            Specify distribution vendor
    [-w, --workspace path]         Create workspace in path
Building libraries
• build_libs checks out the library and performs a build in /var/tmp/builds, in a folder named with a time stamp (e.g., 080405-1705)
• If there is an error, check the file called ‘output’ in /var/tmp/builds. Fix the error and restart the build with:
    • build_libs --restart 080405-1705
• You can also do a partial build, for example building only the library and not the CD:
    • build_libs --step lib
• To build only a particular subset of a library, use a command such as:
    • build_libs --module crypto
Structure for the repository of MAP C functions
< top of repository >
    < userlib >
        <lib # 1 >   <lib # 2 >   <lib # 3 >
            common   rev_d   rev_e   rev_f
                routine1   routine2   routine3
Files describing the MAP C routine
Source file:
• This is the .mc or .mf file defining the MAP routine.
proto.h:
• This file provides a prototype of the MAP routine.
Makefile:
• This is a standard Carte Makefile, with the exception that no BIN environment variable is provided.
Docfile:
• This file provides man-page-format documentation of the MAP routine.
The Installed MAP Routine Library Structure
<userlib >
    map 3   map 4
        common   rev_d   rev_e
            lib1.a   lib1.so   lib2.a   lib2.so   ......
Known problems: No support for variable size of operands
Problem statement
We would like to be able to create and maintain a library of generic components that work for various operand sizes.
Example:
Basic arithmetic operations (addition, subtraction, multiplication, division) of multiprecision (n-bit) integers.
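To make the operand-size problem concrete, the following is a minimal software sketch of multiprecision addition over an arbitrary number of 64-bit words. The function name `mp_add` and the least-significant-word-first layout are assumptions chosen for illustration; the slides do not specify how the library stores multiprecision integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Adds two n-word integers stored least-significant word first
   (an assumed layout). Writes n words to sum and returns the final
   carry, which is 0 or 1. */
uint64_t mp_add(const uint64_t *a, const uint64_t *b, uint64_t *sum, size_t n)
{
    assert(a != NULL && b != NULL && sum != NULL);
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t partial = a[i] + b[i];      /* wraps modulo 2^64 */
        uint64_t c1 = partial < a[i];        /* carry out of a[i] + b[i] */
        uint64_t total = partial + carry;
        uint64_t c2 = total < partial;       /* carry out of adding carry-in */
        sum[i] = total;
        carry = c1 | c2;
    }
    return carry;
}
```

In software the word count n is just a loop bound. In a hardware macro it determines the width of the interface, which is what makes a generic library component hard to express.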
Possible solutions
1. Fixed-size interface to a macro
• using streams
• without using streams
2. Variable-size interface to a macro cell
[Figure labels: Input (64 bits), Output (64 bits), Process, Process]
Passing variable-size operands without streams
for (i = 0; i < 3*N+1; i++) {
    if (i < N) {
        A_in = c[i];
        B_in = d[i];
    } else {
        A_in = 0;
        B_in = 0;
    }
    mul (i, A_in, B_in, &C_out);
    if (i > N)
        e[i-N] = C_out;
}
Passing variable-size operands using streams
#pragma src section
{
    for (i=0; i<N; i++) {
        put_stream (&S0, A[i], 1);    // put A[i] to S0
        put_stream (&S1, B[i], 1);    // put B[i] to S1
    }
}
#pragma src section
{
    mul (&S0, &S1, &S2);              // read from S0 and S1, write to S2
}
#pragma src section
{
    for (i=0; i<2*N; i++)
        get_stream (&S2, &C[i]);      // take from S2 and write to C[i]
}
Multiprecision Integer Library Generator
[Figure: the Multiprecision Integer Library Generator (C engine) takes the size of operands, N, and generates a C/VHDL wrapper, a black box file, an info file, and an in-line MAP C function.]
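Inline MAP C function for N=2
int mul (int64_t *A, int64_t *B, int64_t *C, int N)
{
    int64_t A0, A1;
    int64_t B0, B1;
    int64_t C0, C1, C2, C3;

    /* unpack the two 64-bit words of each 128-bit operand */
    A0 = A[0];
    A1 = A[1];
    B0 = B[0];
    B1 = B[1];
    /* fixed-size macro call: 128-bit x 128-bit -> 256-bit product */
    Mul_128(A0, A1, B0, B1, &C0, &C1, &C2, &C3);
    C[0] = C0;
    C[1] = C1;
    C[2] = C2;
    C[3] = C3;
    return 0;
}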
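For reference, here is a software sketch of what a macro such as Mul_128 is expected to compute: a schoolbook product of two 2-word operands into a 4-word result. The names `mul64`, `mp_mul`, and `mul_128_model`, the use of unsigned words, and the least-significant-word-first ordering are assumptions for illustration. This is not the hardware algorithm used by the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64 x 64 -> 128-bit product built from 32-bit halves (portable C). */
static void mul64(uint64_t x, uint64_t y, uint64_t *lo, uint64_t *hi)
{
    uint64_t x0 = x & 0xFFFFFFFFu, x1 = x >> 32;
    uint64_t y0 = y & 0xFFFFFFFFu, y1 = y >> 32;
    uint64_t p00 = x0 * y0, p01 = x0 * y1, p10 = x1 * y0, p11 = x1 * y1;
    uint64_t mid = (p00 >> 32) + (p01 & 0xFFFFFFFFu) + (p10 & 0xFFFFFFFFu);
    *lo = (p00 & 0xFFFFFFFFu) | (mid << 32);
    *hi = p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
}

/* Schoolbook multiplication: c (2n words) = a (n words) * b (n words).
   Words are stored least-significant first (an assumed layout). */
void mp_mul(const uint64_t *a, const uint64_t *b, uint64_t *c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < 2 * n; i++)
        c[i] = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t carry = 0;
        for (size_t j = 0; j < n; j++) {
            uint64_t lo, hi;
            mul64(a[i], b[j], &lo, &hi);
            lo += c[i + j];
            hi += lo < c[i + j];     /* carry from adding the old partial sum */
            lo += carry;
            hi += lo < carry;        /* carry from adding the incoming carry */
            c[i + j] = lo;
            carry = hi;
        }
        c[i + n] = carry;            /* this word is still zero at this point */
    }
}

/* Software stand-in with the same argument shape as the Mul_128 call. */
void mul_128_model(uint64_t A0, uint64_t A1, uint64_t B0, uint64_t B1,
                   uint64_t *C0, uint64_t *C1, uint64_t *C2, uint64_t *C3)
{
    uint64_t a[2] = { A0, A1 }, b[2] = { B0, B1 }, c[4];
    mp_mul(a, b, c, 2);
    *C0 = c[0]; *C1 = c[1]; *C2 = c[2]; *C3 = c[3];
}
```

The generic `mp_mul` takes n as a run-time argument, whereas the generated in-line MAP C function fixes the operand size when the library is generated. That is the trade-off behind the two interface options.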
Pros and cons of both methods
1. Fixed-size interface to a macro
Pros: Interface independent of the operand size
Cons: input/output overhead
2. Variable-size interface to a macro cell
Pros: minimum overhead
Cons: need to generate automatically several macro files,
need for changes in the compiler
GMU/GWU Libraries
Cryptographic Libraries
Secret Key Ciphers
Secret key ciphers encryption and breaking – SecCiph
Public Key Ciphers
• Elliptic Curve Cryptosystems arithmetic - ECC
• Binary Galois Field GF(2^m) arithmetic in Polynomial Basis - GF2n_PB
• Binary Galois Field GF(2^m) arithmetic in Normal Basis - GF2n_NB
• Multiprecision integer arithmetic (in collaboration with University of South Carolina) - Long_Int
• Operations supporting factorization of large integers using Number Field Sieve - NFS
Digital Image Processing Libraries
Image Enhancement / Restoration
    Single-Resolution
        Noise Reduction (Convolution Filtering)
            Smoothing (Lowpass)
            Gaussian (Lowpass)
            Blurring (Lowpass)
            Sharpening (Highpass)
        Edge Detection (Derivative Filters)
            Prewitt
            Sobel
    Multi-Resolution
        Discrete Wavelet Transform (DWT)
        Inverse Discrete Wavelet Transform (IDWT)
Similarity Measures
    Correlation
Miscellaneous Libraries
Sorting
Stream-searching
BMM - Bit Matrix Multiply
DARPA benchmarks
Performance of selected applications based on GMU/GWU libraries
Classes of applications
1. input/output intensive applications
• bulk data encryption (DES, IDEA, and RC5 encryption)
• image processing (Sobel Edge Detection, Median Filter, Wavelet Hyperspectral Dimension Reduction)
2. computationally intensive applications
• secret-key cipher breaking based on the exhaustive key search (DES, IDEA, RC5 breakers)
• public-key cipher breaking based on factoring
3. latency-critical applications
• cipher key agreement and signature (ECC schemes, RSA)
Platform used in experiments
SRC-6E from SRC Computers, Inc.
Reference Platform
PC based on Pentium IV, 2.4 GHz clock, 512 MB of RAM, 512 KB of cache
Treated as a basic building block of a cluster of microprocessor boards.
Timing Measurements
[Figure: timeline of a MAP function call made from the .c file into the .mc file. Phases shown: MAP allocation, FPGA configuration, DMA data in, FPGA computation, DMA data out, and MAP free. Marked intervals: MAP allocation time, configuration time, End-to-End time (HW), End-to-End time (SW), and MAP release time.]
MAP – SRC Reconfigurable Processor based on two User FPGAs
Input/Output Intensive Applications: P3 version of SRC-6E

| Application | Computational throughput, SRC 6E (Mbits/s) | Data transfer in throughput, SRC 6E (Mbits/s) | Data transfer out throughput, SRC 6E (Mbits/s) | End-to-end throughput, SRC 6E (Mbits/s) | End-to-end throughput, Pentium IV (Mbits/s) | Speed-up |
|---|---|---|---|---|---|---|
| DES Encryption | 6,398 | 2,488 | 1,705 | 863 | 58 | 14.9 |
| IDEA Encryption | 12,788 | 2,487 | 1,799 | 938 | 165 | 5.7 |
| RC5 Encryption | 6,398 | 2,505 | 1,590 | 836 | 366 | 2.3 |
| Sobel Edge Detection | 5,680 | 2,493 | 1,701 | 849 | 76 | 11.0 |
| Median Filter | 5,681 | 2,484 | 1,710 | 850 | 5 | 170 |
| Wavelet Hyperspectral Dimension Reduction | 6,395 | 2,573 | 1,477 | 818 | 67 to 159 (5 levels to 1 level) | 5 to 12 (1 level to 5 levels) |
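The relationship between the columns can be approximated with a simple model. The sketch below assumes that data transfer in, computation, and data transfer out run strictly one after another, so their per-bit times add up; the function names are illustrative. For the DES row (2,488, 6,398, and 1,705 Mbits/s) this model gives about 874 Mbits/s, slightly above the measured 863 Mbits/s, which suggests the measurement includes some overhead the model ignores. The speed-up column is the ratio of the two end-to-end throughputs.

```c
#include <assert.h>

/* Rough model, assuming the three phases do not overlap: the time per bit
   is the sum of the per-phase times, so the combined throughput is the
   reciprocal of the summed reciprocals. Units follow the inputs. */
double serial_throughput(double t_in, double t_comp, double t_out)
{
    assert(t_in > 0 && t_comp > 0 && t_out > 0);
    return 1.0 / (1.0 / t_in + 1.0 / t_comp + 1.0 / t_out);
}

/* Speed-up as reported in the table: hardware end-to-end throughput
   divided by the software (Pentium IV) end-to-end throughput. */
double speedup(double hw_throughput, double sw_throughput)
{
    assert(sw_throughput > 0);
    return hw_throughput / sw_throughput;
}
```

For example, `speedup(863.0, 58.0)` reproduces the reported DES speed-up of roughly 14.9.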
Wavelet Hyperspectral Dimension Reduction: Time contributions
P3 version of SRC-6E vs. Pentium IV PC
Input/Output Intensive Applications: P4 version of SRC-6E

| Application | Computational throughput, SRC 6E (Mbits/s) | Data transfer in throughput, SRC 6E (Mbits/s) | Data transfer out throughput, SRC 6E (Mbits/s) | End-to-end throughput, SRC 6E (Mbits/s) | End-to-end throughput, Pentium IV (Mbits/s) | Speed-up |
|---|---|---|---|---|---|---|
| IDEA Encryption | 12,790 | 10,627 | 10,583 | 3,479 | 165 | 21 |
| RC5 Encryption | 6,398 | 6,371 | 6,373 | 2,098 | 366 | 5.7 |
| Sobel Edge Detection | 5,683 | 6,384 | 6,380 | 2,044 | 76 | 27 |
| Median Filter | 5,684 | 6,384 | 6,383 | 2,044 | 5 | 409 |
| Wavelet Hyperspectral Dimension Reduction | 6,394 | 6,349 | 3,185 | 1,626 | 67 to 159 (5 levels to 1 level) | 10 to 24 (1 level to 5 levels) |
Wavelet Hyperspectral Dimension Reduction: Time contributions
P4 version of SRC-6E vs. Pentium IV PC
Input/Output Intensive Applications: P4 version of SRC-6E, without and with overlapping computations and data transfers

| Application | Computational throughput, SRC 6E (Mbits/s) | Data transfer in throughput, SRC 6E (Mbits/s) | Data transfer out throughput, SRC 6E (Mbits/s) | End-to-end throughput, SRC 6E (Mbits/s) | End-to-end throughput, Pentium IV (Mbits/s) | Speed-up |
|---|---|---|---|---|---|---|
| IDEA Encryption (no overlapping) | 12,790 | 10,627 | 10,583 | 3,479 | 165 | 21 |
| IDEA Encryption (with overlapping) | 10,857 | 9,792 | 10,564 | 4,887 | 165 | 30 |
| RC5 Encryption (no overlapping) | 6,398 | 6,371 | 6,373 | 2,098 | 366 | 5.7 |
| RC5 Encryption (with overlapping) | 6,398 | 6,372 | 6,349 | 3,110 | 366 | 8.5 |
Input/Output Intensive Applications: SRC Hi-Bar Based System

| Application | Computational throughput, SRC 6 (Mbits/s) | Data transfer in throughput, SRC 6 (Mbits/s) | Data transfer out throughput, SRC 6 (Mbits/s) | End-to-end throughput, SRC 6 (Mbits/s) | End-to-end throughput, Pentium IV (Mbits/s) | Speed-up |
|---|---|---|---|---|---|---|
| DES Encryption (no overlapping) | 19,200 | 11,350 | 10,760 | 4,240 | 58 | 73 |
| IDEA Encryption (no overlapping) | 19,200 | 11,350 | 10,760 | 4,240 | 165 | 26 |
| RC5 Encryption (no overlapping) | 19,200 | 11,350 | 10,760 | 4,240 | 366 | 12 |
Computationally Intensive Applications: P3 version of SRC-6E

| Application | Computational throughput, SRC 6E (mln keys/s) | Data transfer in throughput, SRC 6E (mln keys/s) | Data transfer out throughput, SRC 6E (mln keys/s) | End-to-end throughput, SRC 6E (mln keys/s) | End-to-end throughput, Pentium IV (mln keys/s) | Speedup |
|---|---|---|---|---|---|---|
| DES Breaker | 800 | N/A | N/A | 800 | 0.469 | 1706 |
| IDEA Breaker | 1000 | N/A | N/A | 500 | 1.701 | 294 |
| RC5 Breaker | 100 | N/A | N/A | 100 | 0.516 | 194 |
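To put the key-search rates in perspective, the sketch below estimates how long a full exhaustive search would take at a given rate. It assumes the worst case (every key is tried) on a single machine, and the function name is illustrative. DES uses a 56-bit key, so at the 800 mln keys/s listed for the SRC-6E a full search takes roughly 1,042 days.

```c
#include <assert.h>

/* Worst-case duration, in days, of an exhaustive search over a key space
   of 2^key_bits keys at a rate given in millions of keys per second. */
double full_search_days(unsigned key_bits, double mln_keys_per_s)
{
    assert(mln_keys_per_s > 0);
    double keys = 1.0;
    for (unsigned i = 0; i < key_bits; i++)
        keys *= 2.0;                          /* 2^key_bits, exact in double */
    double seconds = keys / (mln_keys_per_s * 1e6);
    return seconds / 86400.0;                 /* 86,400 seconds per day */
}
```

Because the search time is inversely proportional to the rate, the ratio of the two times for DES, `full_search_days(56, 0.469) / full_search_days(56, 800.0)`, matches the reported speedup of about 1706.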
Latency-Critical Applications

| Application | Computational latency, SRC 6E (μs) | Data transfer in latency, SRC 6E (μs) | Data transfer out latency, SRC 6E (μs) | End-to-end latency, SRC 6E (μs) | End-to-end latency, Pentium IV (μs) | Speedup |
|---|---|---|---|---|---|---|
| ECC DH Key Agreement over GF(2^233), Optimal Normal Basis | 201 | 39 | 17 | 592 | 364,000 | 615 |
| ECC DH Key Agreement over GF(2^233), Polynomial Basis | 560 | 66 | 7 | 943 | 31,050 | 33 |
RSA: SRC vs. OpenSSL Software Comparison

| Data Size | SW Function Time (ms) | SW Speedup vs. MAP |
|---|---|---|
| 1024 | 47.248 | 4.821 x |
| 1536 | 138.466 | 3.642 x |
| 2048 | 269.948 | 3.321 x |
| 3072 | 853.050 | 3.468 x |
| 4096 | 1755.266 | 3.624 x |
Sparse matrix by vector multiplication

| Matrix Size | K | One Multiplication Time in SW (ns) | One Multiplication Time in HW (ns) | Speedup |
|---|---|---|---|---|
| 144x144 (Mesh 12x12) | 70 | 3440 | 12 | 282 |

Reference Optimized SW Implementation: PC, Pentium IV, 2.768 GHz, 1 GB RAM
Summary & Conclusions
Summary

| Type of application | End-to-end speed-up of SRC vs. P4 |
|---|---|
| Computationally intensive (cipher breaking) | 200-1700 |
| Latency critical: RSA | 0.2-0.3 |
| Latency critical: ECC polynomial bases, general fields | 33 |
| Latency critical: ECC polynomial bases, special fields | 12-27 |
| Latency critical: ECC optimal normal bases | 600 |
| Input/output intensive (secret key encryption/decryption) | 3-30 |
Summary & conclusions (1)
General methodology for the design and maintenance of SRC user libraries developed and tested
Existing libraries evaluated for three wide classes of applications in terms of
- performance
- ease of use
- flexibility
Initial results very encouraging
Summary & conclusions (2)
Selected files from the SRC libraries can be used for development of comparable libraries for other reconfigurable computers
Full compatibility with other reconfigurable computers difficult to achieve because of the technical differences and intellectual property constraints