0839-118-1
MIT Lincoln Laboratory
Radar Pulse Compression Using the NVIDIA CUDA SDK
Stephen Bash, David Carpman, and David Holl
HPEC 2008
September 23-25, 2008
This work is sponsored by the Air Force Research Laboratory under Air Force contract FA8721-05-C-0002. Opinions, interpretations, conclusions and recommendations are those of the author and not necessarily endorsed by the United States Government.
NVIDIA Compute Unified Device Architecture SDK
• Create custom kernels that run on GPU
• Extension of C language
• Provides driver- and runtime-level APIs
• Includes numerical libraries
– CUFFT
– CUBLAS
• Cost per GFLOP ($/GFLOP): GPU = $1.27, CPU = $29.18
HPEC 07: Benchmarking the NVIDIA 8800GTX with the CUDA Development Platform
Michael McGraw-Herdeg, MIT
Douglas P. Enright, The Aerospace Corporation
B. Scott Michel, The Aerospace Corporation
©2007 The Aerospace Corporation
Radar Pulse Compression
• Waveform design and processing to achieve higher range resolution and sensitivity*
• Processing consists of convolution with FIR filter
– Doppler tolerant (top): traditional frequency domain convolution
– Doppler intolerant (bottom): additional FFT and Doppler correction required
* Skolnik, Radar Handbook, Second Edition. McGraw Hill Publishing, Boston, MA, 1990.
[Figure: block diagrams of the Doppler tolerant (top) and Doppler intolerant (bottom) processing chains. Block labels: Replica, Fast Time FFT, Fast Time IFFT, Slow Time FFT, Doppler Correction.]
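The frequency-domain convolution used in the Doppler tolerant path can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' CUDA implementation: the names `fft`, `ifft`, `chirp`, and `pulse_compress`, the linear FM replica, and all sizes are hypothetical choices for the example, and a small radix-2 FFT stands in for CUFFT.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(spectrum):
    """Inverse FFT with the 1/N scaling applied."""
    n = len(spectrum)
    return [v / n for v in fft(spectrum, inverse=True)]


def chirp(num_samples, bandwidth=0.5):
    """Assumed replica: unit-amplitude linear FM pulse.

    The instantaneous frequency sweeps from -bandwidth/2 to +bandwidth/2
    cycles per sample over the pulse.
    """
    half = num_samples / 2
    return [cmath.exp(1j * cmath.pi * bandwidth * (i - half) ** 2 / num_samples)
            for i in range(num_samples)]


def pulse_compress(received, replica):
    """Fast-time FFT, multiply by the replica filter, fast-time IFFT.

    Multiplying by the conjugate of the replica spectrum is a circular
    convolution with the time-reversed, conjugated replica (a matched FIR
    filter). len(received) must be a power of two and >= len(replica).
    """
    n = len(received)
    padded = list(replica) + [0j] * (n - len(replica))
    rx_spectrum = fft(received)
    filt = [v.conjugate() for v in fft(padded)]
    return ifft([r * h for r, h in zip(rx_spectrum, filt)])


# Usage: one noise-free echo of the replica, delayed by 100 samples.
replica = chirp(64)
received = [0j] * 256
received[100:164] = replica
compressed = pulse_compress(received, replica)
peak = max(range(len(compressed)), key=lambda i: abs(compressed[i]))
print("peak at sample", peak)
```

The long echo collapses to a narrow peak at the sample where the echo begins, which is the range-resolution gain the slide refers to. The Doppler intolerant path would add a slow-time FFT and a Doppler correction step, which are not shown here.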
GPU vs. CPU Comparison
• CPU vs GPU comparison in real-world conditions
– 2 GHz dual quad-core AMD Opterons vs EVGA GeForce 8800 Ultra
– Memory transfer to and from GPU included in timing
[Figure: "Processing Time Per Stage", CPU vs GPU. Axis: Stage Time (ms), ticks 0 to 60 plus 480 to 500. Stage labels: Extract Range Region, Fast Time IFFT, Multiply Replica 3, Extract Range Region, Fast Time IFFT, Multiply Replica 2, Extract Range Region, Doppler Correction, Doppler FFT, Doppler Window, Fast Time IFFT, Multiply Replica 1, Fast Time FFT.]
[Figure: "1D FFT", GPU Speedup (scale 0.3 to 3) versus Batch Size (1, 4, 16, 64, 256) and FFT Size (1024, 2048, 4096, 8192, 16384, 32768, 65536, 1000, 1960, 4725, 10368, 27000).]
Backups
Reference: $/GFLOP
As of July 2007, these products represent the top-of-the-line consumer CPU and graphics card by floating-point computational power:
1. Kentsfield Core 2 Extreme QX6800
37.7 GFLOPS – fastest CPU as of 7/16/2007
http://www.tomshardware.com/2007/07/16/cpu_charts_2007/page36.html
$1100 – price as of March 10, 2008
http://www.google.com/products?q=Kentsfield+Core+2+Extreme+QX6800
$/GFLOPS = $29.18
Notes: Price excludes motherboard + power supply + memory + GPU
2. EVGA GeForce 8800 Ultra Superclocked (NVIDIA)
576 GFLOPS – theoretical peak
http://en.wikipedia.org/wiki/GeForce_8_Series
$730 – price as of March 10, 2008
http://www.google.com/products?q=768-P2-N887-AR&scoring=p
$/GFLOPS = $1.27
Notes: Price includes 768 MB GDDR3 memory, but excludes motherboard + power supply + CPU
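The two $/GFLOPS figures follow directly from the prices and throughput numbers quoted above. A short check, where the function name `dollars_per_gflops` is a hypothetical helper introduced for illustration:

```python
def dollars_per_gflops(price_usd, gflops):
    """Price in US dollars divided by floating-point throughput in GFLOPS."""
    return price_usd / gflops


# Figures quoted on this slide.
cpu = dollars_per_gflops(1100, 37.7)   # Core 2 Extreme QX6800
gpu = dollars_per_gflops(730, 576)     # GeForce 8800 Ultra, theoretical peak

print(f"CPU: ${cpu:.2f}/GFLOPS")
print(f"GPU: ${gpu:.2f}/GFLOPS")
print(f"CPU/GPU cost ratio: {cpu / gpu:.1f}")
```

By these numbers the CPU costs roughly 23 times more per GFLOPS than the GPU. The GPU figure is labeled a theoretical peak, so the ratio is best read as an upper bound on the GPU's cost advantage.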