Introduction to
DSP Using FPGAs
Guillermo Güichal, UTN – FRBB
Program
Morning
Introduction to DSP What is DSP? How is it done?
Why use FPGAs for DSP? Comments on DSP algorithms and FPGA implementations.
Issues related to DSP using FPGAs Clock frequency, sampling, bit count, arithmetic operations.
FPGA Design flow for DSP applications Design alternatives: HDLs, dedicated tools, etc.
Basic design examples
Program
Afternoon Intro to Xilinx System Generator
Xilinx Sysgen and its interaction with Matlab, Simulink & ISE.
Use of Xilinx SysGen for DSP Use of Sysgen for simulation and synthesis.
Examples
Additional Topics Other tools and additional comments
References
“On the Roots of Digital Signal Processing, Parts 1 & 2”, Andreas Antoniou, IEEE Circuits and Systems Magazine, Vol. 7, Numbers 1 and 4
Berkeley Design Technology, Inc. whitepapers, www.bdti.com
DSP-FPGA.com articles, www.dsp-fpga.com
Andraka Consulting articles, www.andraka.com
Programmable Logic Design Line articles, www.pldesignline.com
FPGA and Structured ASIC Journal articles, www.fpgajournal.com
ACM Queue articles, www.acmqueue.com
“Applying Data Converters”, Texas Instruments
“The Scientist and Engineer’s Guide to DSP”, Steven W Smith, www.DSPGuide.com
IEEE papers, www.ieeexplore.ieee.org
“Digital Signal Processing with FPGAs”, Uwe Meyer-Baese, Springer
Xilinx, Altera and Lattice documentation, www.xilinx.com, www.altera.com, www.latticesemi.com
The Mathworks documentation, www.mathworks.com
etc.
Let’s go over some background information on DSP
A Propos of the “Treatise on Cubic Form” by Juan de Herrera
Salvador Dali, 1960
What is DSP?
From Wikipedia: Digital signal processing (DSP) is the study of signals in a digital representation and the processing methods of these signals. DSP and analog signal processing are subfields of signal processing. DSP includes subfields like: audio and speech signal processing, sonar and radar signal processing, sensor array processing, spectral estimation, statistical signal processing, digital image processing, signal processing for communications, biomedical signal processing, seismic data processing, etc.
What is DSP?
Digital Signal Processing (not Processors)
DIGITAL: Digital domain, as opposed to analog. Everything is digital nowadays…
SIGNAL: A physical quantity that changes over time.
PROCESSING: Do something with the signal, manipulate it in useful ways.
What is DSP?
We have always “processed” signals…
… to communicate
… to understand and summarize scientific data
… for entertainment
... etc.
Now we do it digitally, research new methods and algorithms and constantly find new challenging and complex applications.
Many of the mathematical methods and algorithms used for signal processing are well known, and were developed within other contexts.
Refer to the Circuits and Systems magazine series “On the Roots of DSP”, by Andreas Antoniou for a history of DSP.
What is DSP?
So…
We want to manipulate signals… which are usually real signals like audio, temperature, currents and voltages, seismic, sonar, RF waves (communications), images, biological, etc.
We take them into the digital domain because it makes life easier for us.
We manipulate (process) the signals using algorithms and methods to transform them in ways that are useful for our purposes.
What is DSP?
Real signals Analog signal conditioning Bandwidth, amplitude, etc.
Make this as simple as possible
Digital domain Sampling (discretization and quantization) We try to do this as early as possible in the process
Processing Signal processing algorithms and methods Implies mathematical operations, delay lines. Lots of theory and tools… implementation issues! An interesting blend of theory and practice
And we want to do all this in the simplest, cheapest manner…
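The sampling step above (discretization in time, quantization in amplitude) can be sketched in a few lines of Python. This is only an illustration: the 1 kHz tone, the 8 kHz sampling rate and the 8-bit word are assumed values, and the function name is hypothetical.

```python
import math

def sample_and_quantize(freq_hz, fs_hz, n_samples, n_bits):
    """Sample a unit-amplitude sine and quantize it to signed n_bits integers."""
    full_scale = 2 ** (n_bits - 1) - 1                     # largest positive code
    codes = []
    for n in range(n_samples):
        x = math.sin(2 * math.pi * freq_hz * n / fs_hz)    # discretization in time
        codes.append(round(x * full_scale))                # quantization in amplitude
    return codes

# Assumed example values: 1 kHz tone, 8 kHz sampling rate, 8-bit samples.
codes = sample_and_quantize(freq_hz=1000.0, fs_hz=8000.0, n_samples=8, n_bits=8)
print(codes)
```

Fewer bits give a coarser staircase, which is the quantization error discussed later under word widths.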
What is DSP?
DSP is made possible by mathematical research, the digital computer and IC technology
Discretization and interpolation have been part of mathematics since classical times
Work by Fourier, Poisson, Laurent and others during the 1700s and 1800s
Work during the 1900s by Nyquist, Shannon, Bode and others
Calculating machines, ENIAC and the modern digital computer
Integrated circuit technology in the 1950s
Numerical filtering methods during the 1960s
Creation of specific processors (DSPs), ADCs and DACs
Powerful processors, IC technology and alternatives to ASICs
Tools, compilers, simulators make our job easier
How is DSP Done?
DSP algorithms… Filters
FIR IIR
Discrete Fourier Transform
DSP algorithms… Direct Digital Synthesis (DDS)
Digital Up-converter
How is DSP Done?
DSP algorithms… OFDM Receiver (used in benchmark article)
How is DSP Done?
DSP applications (DSPs: Back to the Future, ACM Queue article)
How is DSP Done?
DSP algorithms… shape DSP architectures Fast multiplication and other DSP tasks
Single cycle, multiply accumulate (MAC), ALUs, shifter, wide accumulators
Flexible and efficient memory access Data delay lines, FIFOs, dedicated address generation (inverted, circular addressing), high bandwidth (multiple busses, coefficient )
Efficient Looping Zero overhead looping, addressing and calculations in parallel
Real time, speed High clock frequency, parallelism (MAC, ALU, address generation, SIMD), special instruction sets (low end DSP), multiple execution units (high end, VLIW)
Streamlined I/O and interfaces Must connect to ADC, DACs and transfer data in and out in real time and with little overhead
Data formats Diverse precisions, accumulator guard bits. Support for rounding, saturation and shifting. Speed, cost & power -> Fixed point, Numeric fidelity -> floating point
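Two of the features listed above, the multiply-accumulate (MAC) and circular addressing of a delay line, are easy to see in a small FIR filter model. The sketch below is a plain Python illustration, not any particular DSP's instruction set; the class name and the coefficients are hypothetical.

```python
class CircularFir:
    """FIR filter built from a circular delay line and a multiply-accumulate loop."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)
        self.delay = [0] * len(self.coeffs)   # delay line, reused in place
        self.head = 0                         # index of the newest sample

    def step(self, sample):
        # Circular addressing: overwrite the oldest sample instead of shifting all of them.
        self.head = (self.head - 1) % len(self.delay)
        self.delay[self.head] = sample
        acc = 0                               # stands in for a wide accumulator
        for k, c in enumerate(self.coeffs):
            acc += c * self.delay[(self.head + k) % len(self.delay)]   # one MAC per tap
        return acc

fir = CircularFir([1, 2, 3])                  # assumed example coefficients
print([fir.step(x) for x in [1, 0, 0, 0]])    # impulse response
```

A DSP runs the inner loop with one MAC per cycle and hardware address generation; an FPGA can instead unroll it into parallel multipliers.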
How is DSP Done?
What else does Wikipedia say… DSP algorithms have traditionally run on specialized processors called digital signal processors (DSPs). Algorithms requiring more performance than DSPs could provide were typically implemented using application-specific integrated circuits (ASICs). Today however there are a number of technologies used for digital signal processing.
These include more powerful general purpose microprocessors, field-programmable gate arrays (FPGAs), digital signal controllers (mostly for industrial apps such as motor control), and stream processors, among others.
How is DSP Done?
Nowadays there are several options for DSP applications ASICs (Application Specific Integrated Circuits)
ASSP (Application Specific Standard Product)
DSP (Digital Signal Processor)
FPGA (Field Programmable Gate Array)
GPP and MCUs with DSP enhancements
High end CPUs
We have to choose the right platform for each problem! Speed? Cost? Power? Tools? Time to market? Flexibility? Other tasks?
How is DSP Done?
How do we choose? What are my needs? What are each option’s strengths? What are each option’s limitations? What are my strengths and weaknesses? Are there tools available? What will be around the DSP portion of my design?
Strengths and limitations of each option change over time… and they change quickly!
Update your information and don’t take anything for granted
How is DSP Done?
FPGA Technology Overview
Reading
Salvador Dali, 1981
FPGA Overview
An FPGA is a “sea of gates”. Lots of logic that can be connected together to form different combinational and sequential digital circuits.
An FPGA inside
Function generation (combinational logic)
Registers and latches (sequential logic)
Memory
Clock management
Power management
DSP functions!!!
FPGA Overview
Xilinx Spartan 3 FPGA – General FPGA Architecture
FPGA Overview
Xilinx Spartan 3 FPGA
CLB Structure
FPGA Overview
Xilinx Spartan 3 Memory
FPGA Overview
Xilinx Spartan 3 Clock Management
FPGA Overview
Xilinx Spartan 3 Routing
FPGA DSP Functions
High end FPGAs – Function generators & registers Xilinx Virtex 5
Altera Stratix III
DSP: Low cost FPGAs Xilinx Spartan 3 has multipliers
Altera Cyclone III
FPGA DSP Functions
DSP: Low cost FPGAs LatticeECP-DSP
FPGA DSP Functions
DSP: Low cost FPGAs LatticeECP-DSP vs Spartan 3 (Lattice Article)
FPGA DSP Functions
DSP: Low cost FPGAs Xilinx Spartan 3A-DSP (XtremeDSP DSP48 slices)
FPGA DSP Functions
DSP: High end FPGAs
Xilinx and Altera both have High End FPGAs with DSP enhancements
High speed multipliers
Flexible Multiply–Accumulate logic
DSP block cascading and interconnection
Rounding and saturation units
Barrel shifter
Support for floating point multiplication
Advanced clock and power management
Support for additional DSP Intellectual Property (IP)
FPGA DSP Functions
DSP: High end FPGAs Altera Stratix III DSP Blocks
FPGA DSP Functions
DSP: High end FPGAs Xilinx Virtex 5 DSP48 Slice
FPGA DSP Functions
Do we want to use an FPGA?
Tower of Enigmas
Salvador Dali, 1981
FPGA Overview
Remember… a DSP is essentially a sequential processing machine, with support to execute (although most DSPs do several things in parallel)
Analog Devices’ AD21xx architecture
FPGA Overview
… but some are very powerful processing machines!
TI’s C6712 AD’s Blackfin
When to use FPGAs
When do we choose FPGAs to do DSP?
FPGAs are good for…
Lots of parallel processing
Many simple and rigid, repetitive tasks
High sampling rates and data bandwidth
Fixed point operations
Implementing small DSP blocks within lots of digital logic
Prototyping or replacing ASICs
Flexible or dynamic hardware configuration
Mapping a block diagram directly into hardware
Multirate systems
Configurable word lengths and precision
What else?
When do we choose FPGAs to do DSP?
FPGAs are not that good for…
Sequential tasks (if we have C code available)
Complex tasks with lots of decision making and branching
Very low power applications (but that is changing)
Floating point operations
…
What else?
When to use FPGAs
When to use FPGAs
From Xilinx slides..
FPGA vs DSP From an ACM Queue article
From FPGA vendor’s article (Altera)
Platform   Time to Market  Performance  Price  Ease of Use  Power  Flexibility
ASIC       Longest         High         Low    Hardest      Low    Low
ASSP       Shortest        High         Low    Easiest      Low    Low
DSP        Short           High         Low    Easy         Low    High
FPGA       Short           High         High   Hard         High   High
MCU        Short           Lowest       Low    Easy         Low    High
RISC/GPP   Short           Low          High   Easy         High   High
When to use FPGAs
FPGA vs DSP From FPGA vendor (Altera at FPGA-DSP.com article)
When to use FPGAs
FPGA vs DSP From FPGA vendor (Xilinx at DSP Engineering article)
FPGAs for high end applications
Improved performance (parallelism)
Lower system power (compared to DSP clusters)
Reconfigurable hardware (evolving standards)
Custom bit precision
Optimization of computation hardware (not possible in DSPs) (distributed arithmetic, etc., see Andraka)
High I/O bandwidth
When to use FPGAs
FPGA vs DSP and other options
ASICs (Application Specific Integrated Circuits)
ASSP (Application Specific Standard Product)
DSP (Digital Signal Processor)
FPGA (Field Programmable Gate Array)
GPP and MCUs with DSP enhancements (dsPIC, ARM DSP extensions)
High end CPUs (Intel, AMD doing processing for audio and images)
Comments? Opinions? Other issues?
When to use FPGAs
We’ve decided to use an FPGA! What issues affect our implementation?
Portrait of Mrs. Mary Sigall
Salvador Dali, 1948
Implementation Issues
Issues that affect the implementation on an FPGA Data frequency, sampling frequency, clock frequency
Number representation, word widths, precision, rounding
Arithmetic operations, parallel, serial, distributed, overflow, underflow, saturation
Look-Up tables, block ram or distributed memory, optimizations
Implementation Issues
Frequencies Sampling frequency: Frequency at which samples of the data are taken and processed.
Clock frequency: Frequency of the system clock (Clock driving the FPGA registers)
Data rate: Rate at which new data arrives and needs to be processed
Multiple frequencies: several different data rates, multiple sampling frequencies and/or different clock domains
These rates and frequencies will affect and limit the possible architectures and solutions
Implementation Issues
Sampling frequency Data sampling must meet the Nyquist criterion
External data must be band limited before it is sampled (filters)
Data can be oversampled or undersampled
Oversampling used to increase SNR or reduce effects of quantization noise
Undersampling used in IF or RF signals
Multirate systems have several sampling frequencies
Relationship between them affects data transfers between them
FPGA will drive ADC control signals
ADC timing, relationship between system clock and ADC signals
Meet data setup and hold times in FPGA
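To see why the Nyquist criterion matters, and why undersampling an IF signal can still work, it helps to compute where a tone lands after sampling. The helper below is a minimal sketch for a single real tone; the function name and the 70 MHz / 56 MHz figures are assumptions chosen for illustration.

```python
def alias_frequency(f_signal_hz, fs_hz):
    """Apparent frequency of a real tone after sampling at fs_hz."""
    f = f_signal_hz % fs_hz        # fold into [0, fs)
    return min(f, fs_hz - f)       # fold into [0, fs/2]

print(alias_frequency(3_000.0, 8_000.0))             # below fs/2: unchanged
print(alias_frequency(5_000.0, 8_000.0))             # above fs/2: aliases to 3 kHz
print(alias_frequency(70_000_000.0, 56_000_000.0))   # undersampled IF lands at 14 MHz
```

With undersampling the alias is intentional, which is why the signal must be band limited around the IF before it reaches the ADC.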
Implementation Issues
Sampling frequency Manage synchronization to system clock
Integer division of system clock (Sample rate = Clock / 4 is easy!)
Manage multiple sample rates (Downsampling by 43 is hard)
Use asynchronous FIFOs to move data between the processing logic and the DAC and ADC
FPGA will probably drive ADC and DAC control signals
ADC and DAC timing
Relationship between system clock and converter signals
Must meet data setup and hold times for FPGA and DAC
Implementation Issues
Clock frequency Higher clock frequencies will enable higher sampling rates
Higher clock frequencies will consume more power
Use as few clock domains as possible and control rates with “Clock Enable” input
Clock dividers generate “CE” signals at lower rates
Logic can be reused
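The clock enable idea can be modeled cycle by cycle: a counter runs on the system clock and raises CE for one clock period every N cycles, so all registers stay in a single clock domain. The Python model below is only a behavioral sketch (a real design would describe this in an HDL), and the function name is hypothetical.

```python
def clock_enable(divide_by, n_clocks):
    """Return one CE value per system clock cycle, high once every divide_by cycles."""
    counter = 0
    ce = []
    for _ in range(n_clocks):
        ce.append(counter == 0)               # CE is high for exactly one clock period
        counter = (counter + 1) % divide_by   # free-running divider
    return ce

# Sample rate = Clock / 4: CE is high on cycles 0, 4 and 8.
ce = clock_enable(divide_by=4, n_clocks=12)
print([int(v) for v in ce])
```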
Implementation Issues
Clock frequency If the sampling frequency is lower than the system clock, several clock cycles can be used to process each sample
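A quick calculation shows how this trades time for hardware. The sketch below assumes an N-tap filter where each multiplier performs one multiplication per clock cycle; the function names and the clock and sample rates are illustrative assumptions.

```python
def cycles_per_sample(f_clk_hz, f_sample_hz):
    """Whole system clock cycles available between two consecutive samples."""
    return f_clk_hz // f_sample_hz

def multipliers_needed(n_taps, cycles):
    """Multipliers required if each one does one multiplication per clock cycle."""
    return -(-n_taps // cycles)    # ceiling division

cps = cycles_per_sample(100_000_000, 1_000_000)   # 100 MHz clock, 1 MHz samples
print(cps, multipliers_needed(64, cps))           # one shared multiplier is enough
print(multipliers_needed(64, 4))                  # at Clock / 4, 16 multipliers
```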
Implementation Issues
Data rate Useful data might come at lower frequencies than the sampling frequency
Some data may not need to be sampled at the same rate as others
Implementation Issues
Clock frequency and sample rate
Data at clock rate
Filter result available on every clock cycle
Data at CE rate
Filter result takes several clock cycles to complete
Note: Data has N bits and each FF represents N registers
Implementation Issues
Clock frequency and sample rate Clock at high speed
Data at CE rate (sample frequency)
Each coefficient is multiplied at CE2 rate
Multiplication is implemented with shift-add logic, and takes several cycles to complete
Filter result takes several clock cycles to complete
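A shift-add multiplier handles one bit of the multiplier operand per clock cycle, which is why the result takes several cycles. The model below is a simplified sketch for unsigned operands only (an assumption; the slide does not say how signs are handled), with a hypothetical function name.

```python
def shift_add_multiply(a, b, n_bits):
    """Multiply two unsigned n_bits integers, examining one bit of b per clock cycle."""
    acc = 0
    cycles = 0
    for i in range(n_bits):
        if (b >> i) & 1:       # if this bit of b is set...
            acc += a << i      # ...add a shifted copy of a to the accumulator
        cycles += 1            # one clock cycle per bit of b
    return acc, cycles

print(shift_add_multiply(13, 11, 4))   # product and number of cycles used
```

The cost is one adder and a shifter instead of a full parallel multiplier, paid for with n_bits cycles per product.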
Implementation Issues
Clock frequency and sample rate In the filter shown, timing between all signals must be synchronized to achieve results
Use SYNCHRONOUS logic, as recommended for FPGAs
Different CE signals control what parts of the process are activated by enabling FFs
Implementation Issues
Number representation, Bits and Word Widths Fixed point or floating point. Number representation.
Operations on the data will change the word size to maintain full precision
Scaling
Overflow, underflow, rounding
Implementation Issues
Number systems for binary representations
Fixed point numbers on FPGAs (For now! High end FPGAs have support for some floating point operations)
Each system has advantages and disadvantages for implementations in digital circuits or arithmetic operations.
Our examples will use fixed point two’s complement representations
Fixed point
Traditional: Two’s complement, One’s complement, Sign-Magnitude, Diminished-1
Non traditional: Signed digit, RNS
Implementation Issues
Fixed point binary numbers Fixed point sets the binary point at a fixed location within the binary word
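Placing the binary point is equivalent to storing an integer and agreeing on a scale factor of 2 to the power of the number of fractional bits. The conversion helpers below are a minimal sketch with hypothetical names; the 3 fractional bits are an assumed example.

```python
def to_fixed(value, frac_bits):
    """Convert a real number to the integer code of a fixed point word."""
    return round(value * (1 << frac_bits))

def to_float(code, frac_bits):
    """Convert a fixed point integer code back to the real number it represents."""
    return code / (1 << frac_bits)

# With 3 fractional bits, 0.75 is stored as 6 (binary 0.110).
print(to_fixed(0.75, 3), to_float(6, 3))
# 0.3 is not representable: it is stored as 2, which reads back as 0.25.
print(to_fixed(0.3, 3), to_float(to_fixed(0.3, 3), 3))
```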
Implementation Issues
Operation results have longer word lengths
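The usual rules of thumb for full-precision word growth in two's complement can be written down directly. The helper names are hypothetical, and the 16-bit, 64-tap figures are assumed example values.

```python
def sum_bits(bits_a, bits_b):
    """Bits needed to hold the sum of two signed operands without overflow."""
    return max(bits_a, bits_b) + 1

def product_bits(bits_a, bits_b):
    """Bits needed to hold the product of two signed operands."""
    return bits_a + bits_b

def accumulator_bits(bits_product, n_terms):
    """Bits needed to accumulate n_terms products: add ceil(log2(n_terms)) guard bits."""
    return bits_product + (n_terms - 1).bit_length()

# Assumed example: 16-bit data and coefficients in a 64-tap filter.
p = product_bits(16, 16)
print(sum_bits(16, 16), p, accumulator_bits(p, 64))
```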
Implementation Issues
Overflow, saturation, rounding & scaling Overflow results when an arithmetic operation requires more bits than are available in the result register
Rounding helps keep the number of bits low
May introduce offsets or cumulative errors
Scaling can be used to reduce the number of bits used
If all numbers are between -1 and 1, multiplication will result in a number between -1 and 1
Will result in larger round-off or quantization errors
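Scaling to the range between -1 and 1 is often done with a fractional format. The sketch below assumes a 16-bit word with 15 fractional bits (commonly called Q15); the format choice and the function name are assumptions, not something the slides specify.

```python
def q15_multiply(a, b):
    """Multiply two Q15 codes and drop 15 fractional bits to return to Q15."""
    return (a * b) >> 15       # Python's >> floors, like an arithmetic shift right

half = 1 << 14                         # 0.5 in Q15
print(q15_multiply(half, half))        # 8192, which is 0.25 in Q15

# Corner case: -1 x -1 = +1, which is one step above the largest Q15 code (32767).
print(q15_multiply(-32768, -32768))    # 32768: must be saturated or avoided
```

Dropping the 15 low bits of the product is where the extra round-off error comes from.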
Implementation Issues
Overflow Consider these 3-bit two’s complement numbers: 010 (2), 011 (3)
010 + 011 = 101, which reads as -3 in 3-bit two’s complement. Overflow! Maintaining the same number of bits gives an incorrect result.
An extra bit for the result will give the correct answer
Sign extension: 0011 + 0010 = 0101 (4-bit number = 5)
To avoid overflow we can use extra bits in the accumulator (guard bits)
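The 3-bit example can be reproduced by wrapping an exact sum to a given word length. The `wrap` helper is a hypothetical name for this sketch.

```python
def wrap(value, n_bits):
    """Keep the low n_bits of value and read them as a two's complement number."""
    value &= (1 << n_bits) - 1            # discard bits that do not fit
    if value >> (n_bits - 1):             # sign bit set: the word is negative
        value -= 1 << n_bits
    return value

print(wrap(2 + 3, 3))   # 3-bit result: overflow, reads as -3
print(wrap(2 + 3, 4))   # one extra (guard) bit: correct result, 5
```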
Implementation Issues
Saturation
In the previous operation, 010 + 011:
Sign extension: 0011 + 0010 = 0101 (4-bit number = 5). Result is OK but has an extra bit
Overflow is detected by comparing the old sign bit position (bit 3) with the new sign bit (bit 4): if bit 4 = bit 3 there is no overflow. Here they differ (0 vs 1), so the 3-bit result overflows
Saturate the result: 0101 saturated to 011 (maximum number that can be represented by 3 bits)
In filters, maybe we can saturate the result, but not the intermediate values
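Saturation clamps the full-precision result to the largest or smallest value the output word can hold. The sketch below computes the sum at full precision first and saturates only at the end, in line with the remark about intermediate values; the function names are hypothetical.

```python
def saturate(value, n_bits):
    """Clamp value to the range of an n_bits two's complement word."""
    hi = (1 << (n_bits - 1)) - 1      # e.g. 011 = 3 for 3 bits
    lo = -(1 << (n_bits - 1))         # e.g. 100 = -4 for 3 bits
    return max(lo, min(hi, value))

def add_saturating(a, b, n_bits):
    """Add at full precision, then saturate the final result."""
    return saturate(a + b, n_bits)

print(add_saturating(2, 3, 3))     # 5 does not fit in 3 bits: saturates to 3
print(add_saturating(-4, -3, 3))   # saturates to -4
print(add_saturating(1, 1, 3))     # fits: unchanged
```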
Implementation Issues
Rounding
Get rid of least significant bits in result (multiplication)
Round coefficients and/or datapath
Different rounding methods: Truncation, round floor, round ceiling, round half-up, round half-even, etc.
Refer to the Programmable Logic Design Line article “An introduction to different rounding algorithms” (Jan 4, 2006)
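Three of the methods named above differ only in what happens to the bits being dropped. The sketch below works on integer codes and removes `shift` least significant bits; the function names are hypothetical, and in two's complement simply dropping bits behaves as round floor.

```python
def drop_lsbs(value, shift):
    """Truncation: discard the low bits (rounds toward minus infinity)."""
    return value >> shift

def round_half_up(value, shift):
    """Add half an LSB of the result before dropping bits."""
    return (value + (1 << (shift - 1))) >> shift

def round_half_even(value, shift):
    """Round to nearest; exact ties go to the even result (no average bias)."""
    floor = value >> shift
    remainder = value & ((1 << shift) - 1)    # the bits being dropped
    half = 1 << (shift - 1)
    if remainder > half or (remainder == half and floor & 1):
        floor += 1
    return floor

# Dropping 1 bit from the codes 5, 7 and -5 (i.e. 2.5, 3.5 and -2.5):
for v in (5, 7, -5):
    print(v, drop_lsbs(v, 1), round_half_up(v, 1), round_half_even(v, 1))
```

Truncation costs no logic but introduces a negative offset; the other two need an adder.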
Implementation Issues
Rounding
Implementation Issues
Arithmetic Operations and Implementation Structures Operations can be done using different approaches
Different word widths, number representations, etc.
Different clock frequencies and data rates
Others (Distributed arithmetic, optimizations)
Several factors determine the type of implementation
Required precision
Required data rates and sampling frequency
Resources available in the FPGA
Dedicated DSP blocks, multipliers, logic, memory, etc
Others (Power. External logic.)
Implementation Issues
Arithmetic Operations: Some multipliers (from Andraka’s web site, www.andraka.com)
Scaling Accumulator
Ripple-Carry Array
Row Adder Tree
Computed Partial Product
Partial Product LUT
Implementation Issues
Structures, Calculations and Optimizations Filter structures
Pipelining
Power of 2 operations
CORDIC algorithms
Resource sharing
Look up tables for operations, look up table optimization
Type of memory used (block or distributed)
Etc.
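As an example of one item in this list, CORDIC computes sines and cosines using only shifts, additions and a small table of angles, which suits FPGA logic well. The sketch below is a rotation-mode CORDIC written with floating point for readability (a hardware version would use fixed point words and shifts); the function name and the 16 iterations are assumptions.

```python
import math

def cordic_sin_cos(angle_rad, iterations=16):
    """Rotation-mode CORDIC, valid for angles between -pi/2 and pi/2."""
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]   # small lookup table
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))                 # constant CORDIC gain
    x, y, z = 1.0 / gain, 0.0, angle_rad                         # pre-scale by 1/gain
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0                              # rotate toward z = 0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i      # shift-and-add step
        z -= d * angles[i]
    return y, x                                                  # (sin, cos)

s, c = cordic_sin_cos(0.5)
print(s, c)
```

Each iteration adds roughly one bit of accuracy, so the iteration count can be matched to the word length.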
Implementation Issues
Filter Structures (from Peled and Liu paper)
Implementation Issues
Pipelining
Implementation Issues
Operations and Structures Refer to papers and articles:
“Multiplication in FPGAs”, Ray Andraka, www.andraka.com
“FPGAs: the high end alternative for DSP applications”, Chris Dick, DSP Engineering
“A New Hardware Realization of Digital Filters”, A. Peled and B. Liu, IEEE Trans. on Acoust., Speech, Signal Processing, Dec. 1974
“Application of Distributed Arithmetic to Digital Signal Processing: A Tutorial Review”, Stanley White, IEEE ASSP Magazine, July 1989
“High Speed Binary Addition“, Robert Jackson, Sunil Talwar, IEEE Signals, Systems and Computers, 2004
Etc.
How do we go about doing DSP on an FPGA?
The Disintegration of the Persistence of Memory
Salvador Dali, 1954
Implementation Issues
Typical DSP design flow
Task                        Work on…
Design – Simulate           Models
Code – Compile – Simulate   Code (Assembly, C, HDLs)
Run – Test – Debug          Platform
This is valid for CPU, DSP or FPGA based approaches… we are probably more careful if building an ASIC.
Implementation Issues
Possible design flows Code-based
Model-based
Mixed
Tools?
A basic example: Filtering
Three Apparitions of the Visage of Gala
Salvador Dali, 1945