Accelerating Real-time Processing of the ATST Adaptive Optics System
Vivek Venugopal
• Atmospheric turbulence distorts the wavefront of incoming light by introducing phase variations, which limits the resolution of large solar telescopes such as the four-meter Advanced Technology Solar Telescope (ATST), now beginning construction on Haleakala, Maui.
HOAO Real-time system
[1] S. L. Keil, T. R. Rimmele, J. Wagner, and ATST team. Advanced Technology Solar Telescope: A status report. Astronomische Nachrichten, 331:609–615, 2010.
[2] V. Venugopal et al. Accelerating Real-time Processing of the ATST Adaptive Optics System using Coarse-grained Parallel Hardware Architectures. In International Conference on Engineering of Reconfigurable Systems and Algorithms (ERSA 2011), pages 296–301, Las Vegas, USA, July 2011.
[3] Nvidia Inc. Nvidia Tesla C2050 GPU Computing Processor. [Online]. Available: http://www.nvidia.com/object/product_tesla_c2050_us.html (last accessed February 2012).
Implementation platforms and Results
• The high-speed camera sends 1750 raw sub-aperture images of 20x20 pixels each to the processing system.
• The sub-apertures undergo a dark-field correction followed by a flat-field correction, which is equivalent to correcting the images for zero level and equalizing gain.
• The 2D cross-correlation step determines the shift of each sub-aperture in the x and y directions relative to the reference image.
• The wavefront reconstruction step multiplies a precomputed 1900x3500 reconstruction matrix by the 3500 x and y shifts, giving accumulated values for 1900 actuators.
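The dark-field and flat-field corrections can be sketched in a few lines. This is a minimal illustration and not the poster's FPGA or GPU code: the function name `calibrate`, the per-pixel gain map, and the formula `(raw - dark) * gain` are assumptions based on the usual zero-level and gain-equalization convention.

```python
def calibrate(raw, dark, gain):
    """Dark-field then flat-field correction for one sub-aperture image.

    raw, dark, gain: equal-sized lists of pixel rows (20x20 on the poster).
    Assumption: the flat field has already been reduced to a per-pixel
    gain map, so the correction is (raw - dark) * gain.
    """
    corrected = []
    for raw_row, dark_row, gain_row in zip(raw, dark, gain):
        # Subtract the zero level, then equalize the pixel gain.
        corrected.append(
            [(r - d) * g for r, d, g in zip(raw_row, dark_row, gain_row)]
        )
    return corrected


if __name__ == "__main__":
    raw = [[12, 14], [10, 20]]
    dark = [[2, 2], [2, 2]]
    gain = [[0.5, 1.0], [2.0, 0.25]]
    print(calibrate(raw, dark, gain))
```

Each of the 1750 sub-apertures would be corrected independently in this way, which is what makes the step easy to spread across parallel hardware.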
• Nvidia's GPUs provide a computational speedup over the CPU implementation.
• Although the DSPs are faster, using 48 DSPs is more expensive than using 3 GPUs.
• GPUs scale flexibly with problem size: they can handle the computational complexity while providing more FLOPS/$ than DSPs.
Introduction
Advanced Technology Solar Telescope
FPGA-DSP solution
Conclusion
References
[Figure: Adaptive Optics system. Labels: Uncorrected light, Corrected light, Tip/Tilt Mirror, Deformable Mirror (DM), Beamsplitter, Shack-Hartmann Lenslet Array, CCD Camera, Processors, DM drive signal, Tilt drive signal.]
• The adaptive optics (AO) system senses the wavefront aberrations and applies the corresponding correction to the deformable mirror to improve the resolution of the telescope.
[Figure: HOAO real-time system block diagram. Blocks: WFS Camera, Cross-correlation slope computation, Average slope, Offscale slope detection, Matrix multiply, Actuator servos, Tip/Tilt servos, Data collection, Zernike offload process. Parameters and calibration data: Dark field, Flat field, Reference image, Slope offsets, Reconstruction matrix, Actuator offsets, Actuator gains, Servo parameters, Offscale slope tolerance. Driven devices: Tip/Tilt mirror, Deformable mirror.]
[Figure: Sub-aperture processing pipeline. Raw pixels (20x20), dark correction with dark pixels (20x20), flat correction with flat pixels (20x20), 2D cross-correlation with reference pixels (20x20), find maximum, x and y interpolation, reconstruction. Platform labels: GPU, FPGA.]
[Figure: FFT correlation. FFT of the flat-corrected image and FFT of the reference image, complex conjugate multiplication, IFFT.]
[Figure: 7x7 correlation. The original reference image (26x26 pixels) is split into 49 precomputed reference regions of 20x20 pixels each (Region 1, Region 2, ..., Region 49).]
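The FFT correlation path (FFT, complex conjugate multiplication, IFFT, find maximum) can be illustrated as follows. This is a sketch under stated assumptions: a naive DFT from the standard library stands in for a real FFT so the example has no dependencies, the names `dft`, `dft2` and `correlate_shift` are hypothetical, and the x and y interpolation step is left out, so the result is a whole-pixel shift.

```python
import cmath


def dft(x, inverse=False):
    """Naive 1D DFT, used here as a stand-in for a real FFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def dft2(img, inverse=False):
    """2D DFT computed as row transforms followed by column transforms."""
    rows = [dft(list(r), inverse) for r in img]
    cols = [dft(list(c), inverse) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def correlate_shift(image, reference):
    """Return the (dx, dy) whole-pixel shift of image relative to reference."""
    f_img = dft2(image)
    f_ref = dft2(reference)
    # Multiply the image spectrum by the complex conjugate of the reference.
    product = [[a * b.conjugate() for a, b in zip(ra, rb)]
               for ra, rb in zip(f_img, f_ref)]
    corr = dft2(product, inverse=True)
    n_rows, n_cols = len(corr), len(corr[0])
    # Find the position of the correlation maximum.
    _, y, x = max((corr[r][c].real, r, c)
                  for r in range(n_rows) for c in range(n_cols))
    # The correlation is circular, so large indices mean negative shifts.
    dy = y - n_rows if y > n_rows // 2 else y
    dx = x - n_cols if x > n_cols // 2 else x
    return dx, dy


if __name__ == "__main__":
    ref = [[0.0] * 8 for _ in range(8)]
    img = [[0.0] * 8 for _ in range(8)]
    ref[3][2] = 1.0   # feature at row 3, column 2
    img[4][5] = 1.0   # same feature moved 3 pixels in x and 1 in y
    print(correlate_shift(img, ref))
```

Because the correlation is circular, shifts are only unambiguous up to half the image size, which is one reason a sub-aperture pipeline would follow the maximum search with an interpolation step.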
[Figure: FPGA-GPU solution. Labels: two camera data halves, FPGA 1 and FPGA 2, 12 optical fiber channels per FPGA, PCI-e bus, GPU/CPU.]
[Figure: FPGA-DSP solution. Labels: two camera data halves, FPGA 1 and FPGA 2, 12 optical fiber channels per FPGA, 12 channels from each FPGA, 48 DSPs.]
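[Figure: Wavefront reconstruction as a matrix-vector product. The x and y shifts for the 1750 sub-aperture images form a vector of 3500 values; the 1900x3500 reconstruction matrix multiplies this vector to give accumulated values for 1900 actuators.]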
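The reconstruction step is a plain matrix-vector product, sketched below at a small size. The function name `reconstruct` and the ordering of the slope vector (all x shifts, then all y shifts) are assumptions made for illustration; the poster gives only the dimensions.

```python
def reconstruct(recon_matrix, x_shifts, y_shifts):
    """Multiply the reconstruction matrix by the stacked x and y shifts.

    recon_matrix: one row per actuator, one column per slope value
                  (1900 x 3500 on the poster).
    Assumption: the slope vector holds all x shifts followed by all y shifts.
    """
    slopes = list(x_shifts) + list(y_shifts)
    for row in recon_matrix:
        if len(row) != len(slopes):
            raise ValueError("matrix width must equal the number of slopes")
    # One accumulated dot product per actuator.
    return [sum(w * s for w, s in zip(row, slopes)) for row in recon_matrix]


if __name__ == "__main__":
    matrix = [[1, 0, 2, 0],
              [0, 1, 0, 3]]
    print(reconstruct(matrix, [1, 2], [3, 4]))
```

At full size this is 1900 x 3500 = 6,650,000 multiply-accumulate operations per frame, and each actuator's dot product is independent of the others.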
Process partitioning and mapping
[Figure labels: FFT correlation, 7x7 correlation, 1900x3500 reconstruction matrix.]
[Figure: FFT correlation results. Time in us against No. of images (1, 50) for Tesla C1060 and Tesla C2050. Data labels: 1889, 1510, 1619, 1188.]
[Figure: 7x7 correlation results. Time in us against No. of images (1, 50, 584) for Tesla C1060 and Tesla C2050. Data labels: 300.93, 307.49, 312.9, 281.39, 279.35, 278.36.]
[Figure: Wavefront reconstruction results. Time in us (logarithmic axis, 1 to 100000) for Tesla C1060, Tesla C2050, DSP and CPU. Data labels: 46769, 229, 956, 964.]
• Test platform: two quad-core 2.6 GHz AMD processors running Ubuntu Linux, with Nvidia Tesla C1060 and C2050 GPUs.
• Implementation strategy: 48 DSPs or GPUs?
• 7x7 correlation is faster than FFT correlation on GPUs.
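The 7x7 correlation can be read as a direct search over the 49 offsets at which a 20x20 sub-aperture fits inside the 26x26 reference image. The sketch below follows that reading. The name `correlate_7x7`, the unnormalized sum-of-products score, and the sign convention of the returned shift are assumptions, and the reference regions are sliced on the fly here rather than precomputed.

```python
def correlate_7x7(image, reference):
    """Direct cross-correlation of an image against a larger reference.

    image: N x N pixels (20x20 on the poster).
    reference: R x R pixels (26x26 on the poster), giving
               (R - N + 1)^2 candidate offsets, i.e. 7x7 = 49.
    Returns the (dx, dy) offset of the best match relative to the centre.
    """
    n = len(image)
    n_offsets = len(reference) - n + 1
    best_score = float("-inf")
    best = (0, 0)
    for oy in range(n_offsets):
        for ox in range(n_offsets):
            # Sum of products between the image and one reference region.
            score = sum(image[y][x] * reference[y + oy][x + ox]
                        for y in range(n) for x in range(n))
            if score > best_score:
                best_score = score
                best = (ox, oy)
    centre = n_offsets // 2
    return best[0] - centre, best[1] - centre


if __name__ == "__main__":
    reference = [[0.0] * 26 for _ in range(26)]
    reference[12][14] = 1.0
    # Cut the 20x20 image out of the reference at row offset 2, column offset 5.
    image = [row[5:25] for row in reference[2:22]]
    print(correlate_7x7(image, reference))
```

This costs 49 x 400 = 19,600 multiply-accumulates per sub-aperture with no transforms, which may help explain why the direct method compares well with the FFT route at this image size.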