the big picturethe big picture - sase · titis’s arm processor arm processor-based sitara™...

13
Eng. Julian S. Bruno REAL TIME DIGITAL SIGNAL PROCESSING www.electron.frba.utn.edu.ar/dplab Simposio Argentino de Sistemas Embebidos UTN - FRBA 2011 Introduction Why Digital? A brief comparison with analog. UTN - FRBA 2011 Eng. Julian S. Bruno Advantages Advantages Flexibility. Easily modifiable and upgradeable. Reproducibility. Don’t depend on components tolerance. Exactly reproduced from one unit to other. Reliability. No age or environmental drift. Comple it Allo s sophisticated applications Complexity. Allows sophisticated applications in only one chip. UTN - FRBA 2011 Eng. Julian S. Bruno The BIG picture The BIG picture Dt Real time Real time Data Real time algorithms Real time algorithms Results FAST UTN - FRBA 2011 Eng. Julian S. Bruno Real Time DSP System FAST

Upload: others

Post on 12-Oct-2019

45 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The BIG pictureThe BIG picture - SASE · TITIs’s ARM Processor ARM Processor-Based Sitara™ ARM™ ARM Microprocessors Cortex™-A8 and ARM9™-based embedded microprocessors Clock

Eng. Julian S. Bruno

REAL TIMEDIGITALSIGNAL

PROCESSING

www.electron.frba.utn.edu.ar/dplab

Simposio Argentino de Sistemas EmbebidosUTN - FRBA 2011

IntroductionWhy Digital? A brief comparison with analog.

UTN - FRBA 2011 Eng. Julian S. Bruno

AdvantagesAdvantages

Flexibility. Easily modifiable and upgradeable. Reproducibility. Don’t depend on p y p

components tolerance. Exactly reproduced from one unit to other.

Reliability. No age or environmental drift.Comple it Allo s sophisticated applications Complexity. Allows sophisticated applications in only one chip.

UTN - FRBA 2011 Eng. Julian S. Bruno

The BIG pictureThe BIG picture

D t

Real timeReal time

Data

Real time algorithmsReal time algorithms

Results

FAST

UTN - FRBA 2011 Eng. Julian S. Bruno

Real Time DSP SystemFAST

Page 2: The BIG pictureThe BIG picture - SASE · TITIs’s ARM Processor ARM Processor-Based Sitara™ ARM™ ARM Microprocessors Cortex™-A8 and ARM9™-based embedded microprocessors Clock

Sampling signals: A very important fifirst step.

Fc Fs

N

LF

MIPSMFLOPS

UTN - FRBA 2011 Eng. Julian S. Bruno

Real Time DSP System

Sampling low-pass signals (CT)Sampling low pass signals (CT)

The sampling theorem indicates that a continuous signalThe sampling theorem indicates that a continuous signal can be properly sampled, only if it does not contain frequency components above one-half of the sampling

trate.

UTN - FRBA 2011 Eng. Julian S. Bruno

NS ff 2Nyquist sampling theorem

Aliasing and frequency ambiguityAliasing and frequency ambiguityFs = 6KHzFs = 6KHz

UTN - FRBA 2011 Eng. Julian S. Bruno

Sampling band-pass signalsSampling band pass signals

IF samplingHarmonic

samplingsamplingSub-Nyquist

samplingUndersamplingp g

for any positive integer m22 BfBf

UTN - FRBA 2011 Eng. Julian S. Bruno

for any positive integer m, where fs ≥ 2B is accomplished.1

22

mBff

mBf c

sc

Page 3: The BIG pictureThe BIG picture - SASE · TITIs’s ARM Processor ARM Processor-Based Sitara™ ARM™ ARM Microprocessors Cortex™-A8 and ARM9™-based embedded microprocessors Clock

Sampling band-pass signalsSampling band pass signals

m (2Fc-B)/m (2Fc-B)/(m+1) Optimum Fs

1 35.0 MHz 22.5 MHz 22.5 MHz

2 17.5 MHz 15.0 MHz 17.5 MHz

3 11.66 MHz 11.25 MHz 11.25 MHz

4 8 75 MH 9 0 MH4 8.75 MHz 9.0 MHz -

5 7.0 MHz 7.5 MHz -

Optimum Fs is defined here as that optimum frequency p q ywhere spectral replications do no butt up against each other

UTN - FRBA 2011 Eng. Julian S. Bruno

except at zero Hz

Reconstruction signalsReconstruction signals

UTN - FRBA 2011 Eng. Julian S. Bruno

Real Time DSP System

Reconstruction signalsReconstruction signals

Analog signal XAnalog signal X can be Analog signal XAnalog signal X can be reconstructed from its samples by using thef ll i f l

S

SS

k TkTtkTXtX sinc)()(

following formula: The reconstructionreconstruction is

based on the interpolationinterpolation

Sk

based on the interpolationinterpolationof shifted sincsinc functionsfunctions..

It is very difficult to generate i f ti b l t isinc functions by electronic

circuitry. An approximation of a sinc An approximation of a sinc

function is a pulse. Sample Sample and holdand hold circuit performs this approximationthis approximation.

UTN - FRBA 2011 Eng. Julian S. Bruno

Reconstruction ErrorsReconstruction ErrorsSample and hold circuits

The gain in the desired central band is not constant

UTN - FRBA 2011 Eng. Julian S. Bruno

The are high-frequency replica of the signal spectrum

Page 4: The BIG pictureThe BIG picture - SASE · TITIs’s ARM Processor ARM Processor-Based Sitara™ ARM™ ARM Microprocessors Cortex™-A8 and ARM9™-based embedded microprocessors Clock

Reconstruction SolutionsReconstruction Solutions

The gain in the desired central band is not constant

It is possible to compensate for this non-ideality by p p y yusing an inverse filter as part of the DSP component

The are high-frequency replica of the signal spectrum

UTN - FRBA 2011 Eng. Julian S. Bruno

which can be removed by using a lowpass filter

Reconstruction Errors ExampleReconstruction Errors Example

UTN - FRBA 2011 Eng. Julian S. Bruno

Real Time constraintsReal Time constraints

Signal Path

UTN - FRBA 2011 Eng. Julian S. Bruno

Real Time DSP System

Real time constraintsReal time constraints

Algorithms time (tA) MUST fit between two consecutive sampling periods (tS).

Thus tA limits the maximum frequency that a system can work.

The definition of real time is VERY application dependant (faster speed of evolution of the system).( y )

UTN - FRBA 2011 Eng. Julian S. Bruno

Page 5: The BIG pictureThe BIG picture - SASE · TITIs’s ARM Processor ARM Processor-Based Sitara™ ARM™ ARM Microprocessors Cortex™-A8 and ARM9™-based embedded microprocessors Clock

Real time constraintsReal time constraints

Bl k P i M d Block Processing Mode 4 memory buffers of

length N are required f fffor double-buffering method.

2 memory buffers (in y (and out) are needed for internal processing by the processor.

A delay of 2NTs is incurred in block processing.

More complicated programming is needed More complicated programming is needed to manage the switching between buffers.

Can be configured the ADC and DAC to transfer data samples into the internal

UTN - FRBA 2011 Eng. Julian S. Bruno

transfer data samples into the internal memory of processor using the serial ports and the DMA.

DSP hardwareDSP hardware

UTN - FRBA 2011 Eng. Julian S. Bruno

Real Time DSP System

What can we do with a DSP?What can we do with a DSP?

Almost any linear and nonlinear system (PID controller).

Digital filters (FIR-IIR). Adaptive systems (LMS algorithm) Adaptive systems (LMS algorithm). Modulators and demodulators. Any mathematical intensive algorithm (FFT-

DCT-WT). Image compression algorithms Real time video processing

UTN - FRBA 2011 Eng. Julian S. Bruno

Real time video processing

Linear systems implementationLinear systems implementation

Being x(n) and h(n) are arrays of Being x(n) and h(n) are arrays of numbers. If we want to compute y(n) we have to multiply and sum the last M samples being M the length of h(n)samples, being M the length of h(n). This repeated for every new sample received from de ADC.

As you can see, any linear system uses multiplications, accumulations ( ) d l i t i l

UTN - FRBA 2011 Eng. Julian S. Bruno

(sums), and loops intensively.

Page 6: The BIG pictureThe BIG picture - SASE · TITIs’s ARM Processor ARM Processor-Based Sitara™ ARM™ ARM Microprocessors Cortex™-A8 and ARM9™-based embedded microprocessors Clock

Fast Fourier Transform FFTFast Fourier Transform FFT

UTN - FRBA 2011 Eng. Julian S. Bruno

Summary of desirable features of a DSPDSPF t i th ti ti d Fast in mathematics operations, and combinations of them (multiply and sum

i ll )specially). Flexible addressing modes (bit reversal, circular

ff )buffers, zero overhead loops) DSP specific instruction set (arithmetic shifting, ( g

saturating arithmetic, rounding, normalization) Minimum overhead peripherals (communications p p (

devices specially) DSP instructions for specific applications (Video, DSP instructions for specific applications (Video,

Control, Audio)UTN - FRBA 2011 Eng. Julian S. Bruno

So those are DSP math featuresSo, those are DSP math features

M lti l d A l t Multiply and Accumulators (MAC’s) units.

ALU’s (fixed and floating ALU s (fixed and floating point).

Barrel shifters. Depending on DSP

application, more than one it t i dunit are present in modern

DSP’s, allowing parallelism. Harvard (modified) Harvard (modified)

architecture provide multiple operations per cycle.

UTN - FRBA 2011 Eng. Julian S. Bruno

Architectural Features for Efficient P iProgramming

Specialized addressing modesp g Hardware Loop Constructs

Cacheable memories Cacheable memories Multiple operations per cycle Interlocked pipeline Another important features Another important features

UTN - FRBA 2011 Eng. Julian S. Bruno

Page 7: The BIG pictureThe BIG picture - SASE · TITIs’s ARM Processor ARM Processor-Based Sitara™ ARM™ ARM Microprocessors Cortex™-A8 and ARM9™-based embedded microprocessors Clock

Architectural Features for Efficient P iProgramming

SpecializedSpecialized addressingaddressing modesmodespp gg Hardware Loop Constructs

Cacheable memories Cacheable memories Multiple operations per cycle Interlocked pipeline Another important features Another important features

UTN - FRBA 2011 Eng. Julian S. Bruno

Specialized Addressing ModesSpecialized Addressing Modes

Circular buffering Bit-Riversal

B0 = 0x00; L0 = 44; // Base and lengthI0 = 0x00; M0 = 16; // Index and increment

B0 = 0x00; L0 = 0; // Base and lengthI0=0; M0=1; // Index and incrementI2=256; P0 = 8;

R0 = [I0++M0]; // R0=1 & I0=0x10R0 = [I0++M0]; // R0=5 & I0=0x20R0 = [I0++M0]; // R0=9 & I0=0x04R0 [I0 M0] // R0 2 & I0 0 14

LOOP(start, end) LC0 = P0;start: // I0 automatically incremented in B-R progressionR0 = [I0] || I0 += M0 (BREV);end: // I2 point to bit-riversed buffer

R0 = [I0++M0]; // R0=2 & I0=0x14R0 = [I0++M0]; // R0=6 & I0=0x24

end: // I2 point to bit riversed buffer[I2++] = R0;

UTN - FRBA 2011 Eng. Julian S. Bruno

Architectural Features for Efficient P iProgramming

Specialized addressing modesp g Hardware Hardware LoopLoop ConstructsConstructs

Cacheable memories Cacheable memories Multiple operations per cycle Interlocked pipeline Another important features Another important features

UTN - FRBA 2011 Eng. Julian S. Bruno

Hardware Loop ConstructsHardware Loop Constructs

Looping is a critical feature in communications processing algorithms.

There are two key looping-related features that can improve performance on a wide variety of p p yalgorithms: “zero-overhead hardware loop” zero overhead hardware loop “hardware loop buffers”

UTN - FRBA 2011 Eng. Julian S. Bruno

Page 8: The BIG pictureThe BIG picture - SASE · TITIs’s ARM Processor ARM Processor-Based Sitara™ ARM™ ARM Microprocessors Cortex™-A8 and ARM9™-based embedded microprocessors Clock

Architectural Features for Efficient P iProgramming

Specialized addressing modesp g Hardware Loop Constructs

CacheableCacheable memoriesmemories CacheableCacheable memoriesmemories Multiple operations per cycle Interlocked pipeline Another important features Another important features

UTN - FRBA 2011 Eng. Julian S. Bruno

Cacheable memoriesCacheable memories

Today’s high-speedprocessors would effectively run at much yslower speeds because larger applications would only fit in slower external memory.

Programmers would be forced to manually move ykey code in and out of internal SRAM.

Adding data and instruction gcaches into the architecture, external memory becomes much

blmore manageable.

UTN - FRBA 2011 Eng. Julian S. Bruno

Architectural Features for Efficient P iProgramming

Specialized addressing modesp g Hardware Loop Constructs

Cacheable memories Cacheable memories MultipleMultiple operationsoperations per per cyclecycle Interlocked pipeline Another important features Another important features

UTN - FRBA 2011 Eng. Julian S. Bruno

Multiple operations per cycleMultiple operations per cycleIn addition to In addition toperforming multiple ALU/MAC operations eachoperations each core processor cycle, additional data loads anddata loads and stores can also be completed in the same cycle.y

The memory is typically portioned into sub-banks thatinto sub banks that can be dual-accessed by the core and optionally p yby a DMA controller.

UTN - FRBA 2011 Eng. Julian S. Bruno

There are two multi-issue architectures: VLIW and superscalar

Page 9: The BIG pictureThe BIG picture - SASE · TITIs’s ARM Processor ARM Processor-Based Sitara™ ARM™ ARM Microprocessors Cortex™-A8 and ARM9™-based embedded microprocessors Clock

Architectural Features for Efficient P iProgramming

Specialized addressing modesp g Hardware Loop Constructs

Cacheable memories Cacheable memories Multiple operations per cycle InterlockedInterlocked pipelinepipeline Another important features Another important features

UTN - FRBA 2011 Eng. Julian S. Bruno

Interlocked pipelineInterlocked pipeline

I d t i th h t DSP d i d t In order to increase throughput, DSPs are designed to be pipelined

When assembly programming is required the pipeline When assembly programming is required, the pipeline can make programming more challenging.

The processor automatically handles stalls and The processor automatically handles stalls and bubbles.

UTN - FRBA 2011 Eng. Julian S. Bruno

Architectural Features for Efficient P iProgramming

Specialized addressing modesp g Hardware Loop Constructs

Cacheable memories Cacheable memories Multiple operations per cycle Interlocked pipeline AnotherAnother importantimportant featuresfeatures AnotherAnother importantimportant featuresfeatures

UTN - FRBA 2011 Eng. Julian S. Bruno

Another important featuresAnother important features

RISC lik i t d i t ti t RISC like registers and instruction set Multiple data/program buses. DMA controller for handling peripherals In traditional fixed-point DSPs, word sizes are In traditional fixed point DSPs, word sizes are

usually fixed. However, there is an advantage to having data registers that can be treated as:to having data registers that can be treated as: One 64-bit word Two 32 bit word Two 32-bit word Four 16-bit word Eight 8-bit word

UTN - FRBA 2011 Eng. Julian S. Bruno

Page 10: The BIG pictureThe BIG picture - SASE · TITIs’s ARM Processor ARM Processor-Based Sitara™ ARM™ ARM Microprocessors Cortex™-A8 and ARM9™-based embedded microprocessors Clock

DSP clasificationDSP clasification

Fixed or Floating point arithmetic. Millions of multiply–accumulate operations per p y p p

second, MMACs. Millions of floating-point operations per Millions of floating point operations per

second, MFLOPS. Application specific feat res ( ideo a dio Application specific features (video, audio, control, communications).

Memory

UTN - FRBA 2011 Eng. Julian S. Bruno

Why DSP hardware?Why DSP hardware?

Special-purpose (custom) chips such as application-specific integrated circuits (ASIC).

Field-programmable gate arrays (FPGA). Field programmable gate arrays (FPGA). General-purpose microprocessors or microcontrollers (μP/μC). General-purpose digital signal processors (DSP processors).

UTN - FRBA 2011 Eng. Julian S. Bruno

DSP processors with application-specific hardware (HW) accelerators.

TI ProcessorsTI Processors

C6000™ Hi h P f DSP C6000™ High Performance DSPsIdeal for imaging, broadband infrastructure and performance audio applications.

C6000™ Performance Value DSPs C6000 Performance Value DSPsIdeal for broadband infrastructure and performance audio applications. Lower cost.

C6000™ Floating-point DSPsIdeal for professional audio products, biometrics, medical, industrial, digital imaging, speech recognition, conference phones and voice-over packetimaging, speech recognition, conference phones and voice over packet

C5000™ Power-Efficient DSPsOptimized for power- and cost-efficient embedded signal processing solutions

C2000™ 32-bit Real-time MCUsOptimized core can run multiple complex control algorithms at speeds

f d di t l li ti

UTN - FRBA 2011 Eng. Julian S. Bruno

necessary for demanding control applications

C5000™ DSP Platform RoadmapC5000 DSP Platform Roadmap

UTN - FRBA 2011 Eng. Julian S. Bruno

Page 11: The BIG pictureThe BIG picture - SASE · TITIs’s ARM Processor ARM Processor-Based Sitara™ ARM™ ARM Microprocessors Cortex™-A8 and ARM9™-based embedded microprocessors Clock

C6000™ DSP Platform RoadmapC6000 DSP Platform Roadmap

UTN - FRBA 2011 Eng. Julian S. Bruno

TI’s ARM Processor-BasedTI s ARM Processor BasedSitara™ ARM Microprocessors Sitara™ ARM Microprocessors Cortex™-A8 and ARM9™-based embedded microprocessors Clock Speed: 300 MHz to 1.5 GHz 3D Graphics Accelerator and Power technology (OMAP) 3D Graphics Accelerator and Power technology (OMAP)

Stellaris® MCU ARM Cortex-M3

Cl k S d U t 100 MH Clock Speed: Up to 100 MHz Up to 125 MIPS (at 100 MHz) Advanced integration: Serial interfaces, motion control, system, analogOMAP™ A li ti P OMAP™ Applications Processors ARM9-based devices. LP, general-purpose, multimedia and graphics

processing ARM® Cortex™ A8 core ARM® Cortex™-A8 core C64x+™ DSP and Video Accelerators 3-D Graphics AccelerationDaVinci™ Digital Media Processors DaVinci™ Digital Media Processors Optimized for digital video systems ARM9 only, ARM9 + DSP and DSP only.

UTN - FRBA 2011 Eng. Julian S. Bruno

TI’s ARM Processor-Based ProductsTI s ARM Processor Based Products

UTN - FRBA 2011 Eng. Julian S. Bruno

ADI ProcessorsADI Processors TigerSHARC® Processors TigerSHARC® Processors

32-bit fixed-point as well as floating-point Clock Speed: 250MHz to 600MHz 4.8 GMACs of 16-bit performance / 3.6 GFLOPs 4.8 GMACs of 16 bit performance / 3.6 GFLOPs 24 Mbits of on- chip memory 5 Gbytes of I/O bandwidth

SHARC® Processors SHARC® Processors 32-Bit floating-point Clock Speed: 150MHz to 400MHz / 2.4 GFLOPs. Accelerator Architecture: FIR, IIR, FFT.

Blackfin® Processors 16/32-bit fixed point Clock Speed: 200MHz to 756MHz / 1.5 GMACsp Very low power consumption: 0.23mW/Mhz RTOS supported. Multicore 600MHz / 2.4 GMACs.

ADSP-21xx Processors

UTN - FRBA 2011 Eng. Julian S. Bruno

16/32-bit fixed point Clock Speed: 75MHz to 160MHz Analog Devices brought first programmable processor to market in 1986

Page 12: The BIG pictureThe BIG picture - SASE · TITIs’s ARM Processor ARM Processor-Based Sitara™ ARM™ ARM Microprocessors Cortex™-A8 and ARM9™-based embedded microprocessors Clock

ADSP-21xx ProcessorsADSP-2191 BLOCK DIAGRAM

UTN - FRBA 2011 Eng. Julian S. Bruno

Blackfin ProcessorsADSP-BF536/ADSP-BF537 BLOCK DIAGRAM

UTN - FRBA 2011 Eng. Julian S. Bruno

SHARC ProcessorsADSP-2146x BLOCK DIAGRAM

UTN - FRBA 2011 Eng. Julian S. Bruno

TigerSHARC ProcessorgADSP-TS201S BLOCK DIAGRAM

UTN - FRBA 2011 Eng. Julian S. Bruno

Page 13: The BIG pictureThe BIG picture - SASE · TITIs’s ARM Processor ARM Processor-Based Sitara™ ARM™ ARM Microprocessors Cortex™-A8 and ARM9™-based embedded microprocessors Clock

Markets and ApplicationsMarkets and Applications

UTN - FRBA 2011 Eng. Julian S. Bruno

Recommended bibliographyRecommended bibliography

RG L U d t di Di it l Si l P i RG Lyons, Understanding Digital Signal Processing 2nd ed. Prentice Hall 2004. Ch2: Periodic Sampling Ch2: Periodic Sampling

SW Smith, The Scientist and Engineer’s guide to DSP. California Tech. Pub. 1997. Ch1: The Breadth and Depth of DSP Ch3: ADC and DAC SM K BH L R l Ti Di it l Si l P i SM Kuo, BH Lee. Real-Time Digital Signal Processing 2nd ed. John Wiley and Sons. 2006 Ch1:Introduction to Real-Time Digital Signal Processing Ch1:Introduction to Real Time Digital Signal Processing

NOTE: Many images used in this presentation were extracted from the

UTN - FRBA 2011 Eng. Julian S. Bruno

recommended bibliography.

Questions?

Thank you!Thank you!

UTN - FRBA 2011Eng. Julian S. Bruno