An FFT/IFFT Accelerator for OCT Application
Zhenhong Liu
What is OCT?
• OCT = Optical Coherence Tomography
• An optical analogue of ultrasound tomography
• Provides micrometer resolution
• Light source is not harmful (unlike X-ray)
Optical coherence tomogram of a fingertip (http://en.wikipedia.org/wiki/File:HautFingerspitzeOCT.gif)
Data Processing
• 3 FFT/IFFT operations in the algorithm
• The number of data points is large: 1024/2048
Sample Data
• 16-bit int for image data, converted to floating-point during processing
• single precision floating-point for background and calibration data
• output to a grayscale BMP file, 1024x1024 pixels
Using Floating-point
Using Fixed-point
Fixed-point number (WL, FL):
• Keep twiddle factors at (32, 30)
• Change the fractional length for input/output data
• Prevent overflow during FFT/IFFT: arithmetic right-shift the output by 1 bit after every butterfly operation in the FFT or IFFT
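The scaling scheme above can be sketched in a few lines of Python. This is only an illustration, not the slides' implementation: it reads WL as word length and FL as fractional length, assumes a radix-2 DIF butterfly, and the names `to_fixed` and `butterfly_dif` are hypothetical. Python's `>>` on integers is an arithmetic shift, which matches the slide.

```python
TW_FL = 30  # fractional length of the twiddle factors, from the (32, 30) format

def to_fixed(x, fl, wl=32):
    """Quantize a real value to a signed (wl, fl) fixed-point integer, saturating."""
    lo, hi = -(1 << (wl - 1)), (1 << (wl - 1)) - 1
    return max(lo, min(hi, int(round(x * (1 << fl)))))

def butterfly_dif(a, b, w):
    """Radix-2 DIF butterfly on (re, im) integer pairs.

    a, b share one data format; w uses fractional length TW_FL.
    Both outputs are shifted right by 1 bit to prevent overflow.
    """
    sr, si = a[0] + b[0], a[1] + b[1]          # top output: a + b
    dr, di = a[0] - b[0], a[1] - b[1]          # difference: a - b
    # (a - b) * w, rescaled back to the data format
    pr = (dr * w[0] - di * w[1]) >> TW_FL
    pi = (dr * w[1] + di * w[0]) >> TW_FL
    return (sr >> 1, si >> 1), (pr >> 1, pi >> 1)

# Example with data in a (32, 4) format and twiddle factor w = 1
a = (to_fixed(1.0, 4), 0)
b = (to_fixed(0.5, 4), 0)
w_one = (to_fixed(1.0, TW_FL), 0)
top, bottom = butterfly_dif(a, b, w_one)
```

Because every stage halves its outputs, the final result is scaled down by 2 per stage relative to an unscaled transform.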
[Figures: output images for fixed-point formats (32, 2), (32, 4), and (32, 6); the (32, 6) image is shown next to the floating-point (f-p) result]
Fixed-point + Approx. Twiddle Factor
• Very sensitive to twiddle factor
• Simply reducing the fractional length is not effective:
  • OK for twiddle factors >> 0
  • Large errors for twiddle factors ~ 0
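A quick numeric check shows why shortening the fractional length hurts small twiddle factors most. The sketch below is illustrative only: the fractional length of 8 bits and the helper names `quantize` and `rel_error` are assumptions chosen for the example, not values from the slides.

```python
import math

def quantize(x, fl):
    """Round x to the nearest multiple of 2**-fl."""
    return round(x * (1 << fl)) / (1 << fl)

def rel_error(x, fl):
    """Relative error introduced by quantizing x to fl fractional bits."""
    return abs(quantize(x, fl) - x) / abs(x)

N = 2048
small = math.sin(2 * math.pi / N)  # twiddle component close to 0
large = math.cos(2 * math.pi / N)  # twiddle component close to 1

err_small = rel_error(small, 8)    # tens of percent
err_large = rel_error(large, 8)    # a few parts per million
```

The absolute quantization step is the same for both values, but relative to a value near 0 it is a large fraction of the value itself.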
Approx. Twiddle Factor
A suitable approx. multiplier
• Finish a multiplication in n iterations
• Round A to a number that has at most n 1's
• Store the positions of the 1's in SRAM
• Requires that A does not change often
• The larger n is, the more accurate the product is
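The multiplier described above amounts to a shift-and-add over the stored bit positions. The following is a minimal sketch under stated assumptions: it handles non-negative A only, it truncates A to its n most significant 1 bits (the slides say "round", so the actual rounding rule may differ), and the names `top_one_positions` and `approx_mul` are hypothetical.

```python
def top_one_positions(a, n):
    """Positions of the n most significant 1 bits of a non-negative integer a.

    This list plays the role of the SRAM entry for A.
    """
    positions = []
    while a and len(positions) < n:
        p = a.bit_length() - 1   # index of the highest remaining 1 bit
        positions.append(p)
        a -= 1 << p
    return positions

def approx_mul(positions, b, fl=0):
    """Approximate A*b: one shift-add per stored position, then rescale by fl bits."""
    acc = 0
    for p in positions:          # at most n iterations
        acc += b << p
    return acc >> fl

# A = 0b1011 (11), B = 5; the exact product is 55
coarse = approx_mul(top_one_positions(0b1011, 2), 5)  # uses 0b1010 -> 50
exact = approx_mul(top_one_positions(0b1011, 3), 5)   # all three 1 bits -> 55
```

Since A is precomputed into positions, this suits twiddle factors, which are fixed for a given transform size.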
Fixed-point + Approx. Twiddle Factor
[Figures: output images for n = 1, 2, 3, and 4; the n = 4 image is shown next to the floating-point (fp) result]
Hardware Implementation
• Original design only supports positive A
  • Need an extra sign bit in SRAM for each entry
  • XOR B with the sign bit
• Support for complex multiplication
  • Two units share one SRAM
  • No add/sub operation after multiplying
• Cannot pipeline the design; use multiple units in the butterfly unit to increase throughput
  • n iterations -> n units in a butterfly unit
• For IFFT, only a 1-bit control signal is needed
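One way to read these points is sketched below as a software model; it is an interpretation, not the actual hardware. The assumptions are: each SRAM entry holds a sign bit plus the 1-bit positions of the magnitude; each of the two units accumulates both of its partial products in one accumulator (so no separate add/sub follows); and the IFFT control bit selects the conjugate twiddle factor. In hardware the conditional negation would be an XOR of B with the sign bit plus a carry-in; here it is modeled as `-b`. All names are hypothetical.

```python
def encode(w_int, n):
    """Hypothetical SRAM entry: (sign bit, positions of the top n 1 bits of |w|)."""
    sign = 1 if w_int < 0 else 0
    mag, positions = abs(w_int), []
    while mag and len(positions) < n:
        p = mag.bit_length() - 1
        positions.append(p)
        mag -= 1 << p
    return sign, positions

def accumulate(acc, entry, b, negate=0):
    """Add the approximate product entry*b (or its negative) into acc."""
    sign, positions = entry
    if sign ^ negate:
        b = -b                   # models the XOR-based conditional negation of B
    for p in positions:          # one shift-add per iteration
        acc += b << p
    return acc

def complex_mul(b, w_entries, fl, ifft=0):
    """Approximate b * w (FFT) or b * conj(w) (IFFT) on (re, im) integer pairs."""
    br, bi = b
    wr, wi = w_entries
    # Real unit: br*wr - bi*wi; the IFFT bit flips the sign of the wi term
    re = accumulate(accumulate(0, wr, br), wi, bi, negate=1 ^ ifft)
    # Imaginary unit: br*wi + bi*wr
    im = accumulate(accumulate(0, wi, br, negate=ifft), wr, bi)
    return re >> fl, im >> fl

# w = -j in a format with 4 fractional bits, n = 2
w = (encode(0, 2), encode(-16, 2))
fft_out = complex_mul((3, 5), w, fl=4)           # (3 + 5j) * (-j) = 5 - 3j
ifft_out = complex_mul((3, 5), w, fl=4, ifft=1)  # (3 + 5j) * (+j) = -5 + 3j
```

Each unit runs two n-iteration products back to back, which is consistent with one complex multiplication taking 2n cycles.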
Hardware Implementation
Schematic: one complex multiplication in 2n cycles
Hardware Implementation
Schematic: butterfly unit, DIF FFT/IFFT
Hardware Implementation
• For N-point FFT/IFFT, each stage takes N/2 cycles
• Hardware cost is even smaller than with an accurate fixed-point multiplier
• Should be more power efficient
• No visible changes to the output images
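For a rough sense of scale, the per-stage figure can be turned into a total cycle count. The sketch assumes a radix-2 transform with log2(N) stages and ignores memory latency and any start-up overhead, neither of which the slides quantify; `fft_cycles` is a hypothetical helper.

```python
def fft_cycles(n_points):
    """Estimated cycles for one N-point radix-2 FFT/IFFT at N/2 cycles per stage."""
    stages = n_points.bit_length() - 1           # log2(N) for a power of two
    if n_points < 2 or (1 << stages) != n_points:
        raise ValueError("n_points must be a power of two >= 2")
    return stages * (n_points // 2)

cycles_1024 = fft_cycles(1024)   # 10 stages * 512 cycles
cycles_2048 = fft_cycles(2048)   # 11 stages * 1024 cycles
```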
Thank you!