digital signal processing with field programmable gate arrays · digital signal processing with...

95
Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA Javier Serrano, CERN, Geneva, Switzerland

Upload: others

Post on 09-Jul-2020

17 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Digital Signal processing with Field Programmable Gate Arrays

Beam Instrumentation Workshop 2008Tahoe City, CA, USA

Javier Serrano, CERN, Geneva, Switzerland

Page 2: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Outline

Analog vs. digital.DSP vs. FPGA.Digital design techniques.

Performance enhancement techniques.Safe design.

Digital Signal Processing blocks.Examples.

2

Page 3: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Outline

Analog vs. digital.DSP vs. FPGA.Digital design techniques.

Performance enhancement techniques.Safe design.

Digital Signal Processing blocks.Examples.

3

Page 4: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

The “DSP Revolution”

DSP has displaced analog solutions in many consumer and industrial markets. Main reasons:

Reliability.Repeatability.Programmability.Performance (e.g. FIR filters have no analog equivalent).

4

Page 5: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

An example: amplifier and mixer distortion

5Courtesy Tony Rohlev

Page 6: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Saturation (IP 3) and its effects on phase accuracy

6

Page 7: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Strategy: model and “invert”

7

Page 8: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Finding coefficients D and G

8

Page 9: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Extracting the original Vi

Direct?

Taylor Expansion?

Newton’s Method?

NO.

PLEASE NO!

9

Page 10: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

FPGA implementation

Or unroll the loop and increase throughput…

10

Page 11: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Simulation

11

Page 12: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Simulation

12

Page 13: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Simulation

13

Page 14: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Simulation

14

Page 15: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Simulation

15

Page 16: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Simulation

16

Page 17: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Simulation

17

Page 18: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Outline

Analog vs. digital.DSP vs. FPGA.Digital design techniques.

Performance enhancement techniques.Safe design.

Digital Signal Processing blocks.Examples.

18

Page 19: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Basic FPGA architecture

19

Page 20: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

FPGA: a “box” of DSP blocks

20

Page 21: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Programmable logic: one (simplistic) way of thinking of it

21

S1

S2

D

C ENB

MultiplexerIN1

IN2

OUT

PROGRAM

Page 22: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Why FPGAs for DSP? (1)

Conventional DSP Device(Von Neumann architecture)

Data Out

RegData In

MAC unit....C0

Data Out

C1 C2 C255

FPGA

Reg0 Reg1 Reg2 Reg255Data In

All 256 MAC operations in 1 clock cycle256 Loops needed to process samples

Reason 1: FPGAs handle high computational workloads

FPGAs benefit twice from Moore’s law: speed and space.22

Page 23: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Why FPGAs for DSP? (2)

Q = (A x B) + (C x D) + (E x F) + (G x H)

can be implemented in parallel

××

×× +

+

+

+

+

+

A

BC

DE

FG

H

Q

Reason 2: Tremendous Flexibility

But is this the only way in the FPGA?23

Page 24: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

××

×× +

+

+

+

+

+ ×+

+D Q

××

+

+

+

+D Q

Parallel Semi-Parallel Serial

Customize Architectures to Suit Your Ideal Algorithms

FPGAs allow Area (cost) / Performance tradeoffs

Optimized for?Speed Cost

24

Page 25: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

DDCDDC

A/D

A/D

D/A

D/A

MACs

ControlDDCDDC

DUCDUC

DUCDUC

MACs

Control

DSPProcs.

DUCDUC

DUCDUC

DDCDDC DDCDDC

SDRAMAFE

FPGA DSPCard

Hundreds of Termination Resistors

PowerPC

SDRAM

SSTL3TranslatorsQuad

TRx

QuadTRx

ASSP

FPGA NetworkCard

SDRAM

A/D

A/D

D/A

D/A

Control

Control

PL4

CORBA

PowerPC

MACs, DUCs, DDCs, Logic

PowerPC

PowerPCPowerPC

3.125 Gbps

ASSP

SDRAM

Reason 3: Integration simplifies PCBs

Why FPGAs for DSP? (3)

25

Page 26: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Outline

Analog vs. digital.DSP vs. FPGA.Digital design techniques.

Performance enhancement techniques.Safe design.

Digital Signal Processing blocks.Examples.

26

Page 27: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Basics of digital design

Combinational logic: state of outputs depend on current state of inputs alone (forgetting about propagation delays for the time being). E.g. AND, OR, mux, decoder, adder...

D-type Flip flops propagate D to Q upon a rising edge in the clk input.

Unless you really know what you are doing, stick to synchronous design: sandwiching bunches of combinational logic in between flip flops.

Synchronous design simplifies design analysis, which is good given today’s logic densities.

27

Page 28: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Don’t do this!

Toggle flip-flops get triggered by glitches produced by different path lengths of counter bits.

28

Page 29: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Basics of (synchronous) Digital Design

dataSelectC

dataAC[31:0]

dataBC[31:0]

dataSelectCD1

dataACd1[31:0]DataOut_3[31:0]

0

1DataOut[31:0]

sum_1[31:0]+

sum[31:0]

DataOut[31:0][31:0]DataInB[31:0] [31:0]

DataInA[31:0] [31:0]

DataSelect

Clk

Q[0]D[0]

[31:0]Q[31:0][31:0] D[31:0]

[31:0]Q[31:0][31:0] D[31:0]

Q[0]D[0]

[31:0]Q[31:0][31:0] D[31:0] [31:0]

[31:0][31:0] [31:0]Q[31:0][31:0] D[31:0]

[31:0]

[31:0][31:0] [31:0]Q[31:0][31:0] D[31:0]

High clock rate: 144.9 MHz on a Xilinx Spartan IIE.

Higher clock rate: 151.5 MHz on the same chip.

dataSelectC

dataAC[31:0]

dataBC[31:0]

sum[31:0]+

DataOut_3[31:0]

0

1DataOut[31:0]

DataOut[31:0][31:0]

DataInB[31:0] [31:0]

DataInA[31:0] [31:0]

DataSelect

ClkQ[0]D[0]

[31:0]Q[31:0][31:0] D[31:0]

[31:0]Q[31:0][31:0] D[31:0]

[31:0]

[31:0][31:0]

[31:0]

[31:0][31:0] [31:0]Q[31:0][31:0] D[31:0]

6.90 ns

6.60 ns

Illustrating the latency/throughput tradeoff 29

Page 30: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Basic FPGA architecture

30

Page 31: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

The logic block: a summary view

Example: using a LUTas a full adder.

31

Page 32: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

FPGA state of the art

In addition to logic gates and routing, in a modern FPGAyou can find:

Embedded processors (soft or hard).Multi-Gb/s transceivers with equalization and hard IP for serial standards as PCI Express and Gbit Ethernet.Lots of embedded MAC units, with enough bits to implement single precision floating point arithmetic efficiently.Lots of dual-port RAM.Sophisticated clock management through DLLs and PLLs.System monitoring infrastructure including ADCs.On-substrate decoupling capacitors to ease PCB design.Digitally Controlled Impedance to eliminate on-board termination resistors.

32

Page 33: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

A practical example: Xilinx Virtex II Pro family

Overview Configurable Logic Block (CLB)

Embedded PowerPC

Digitally Controlled Impedance (DCI)33

Page 34: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

A practical example: Xilinx Virtex II Pro family

Slice

Detail of half-slice

34

Page 35: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

A practical example: Xilinx Virtex II Pro family

Routing resources

35

Page 36: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Traditional design flow 1/3

HDL

Synthesis

Implementation

Download

HDLImplement your design using VHDL or Verilog

Functional Simulation

TimingSimulation

In-Circuit Verification

BehavioralSimulation

36

Page 37: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Traditional design flow 2/3

BehavioralSimulationHDL

Synthesis

Implementation

Download

HDL

Synthesize the design to create an FPGA netlist

Functional Simulation

TimingSimulation

In-Circuit Verification

37

Page 38: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Traditional design flow 3/3

BehavioralSimulationHDL

Synthesis

Implementation

Download

HDL

Translate, place and route, and generate a bitstream to download in the FPGA

Functional Simulation

TimingSimulation

In-Circuit Verification

38

Page 39: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

VHDL 101

Both VHDL code segments produce exactly the same hardware.39

Page 40: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Simulink design flow

40

Page 41: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Simulink design flow

41

Page 42: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Outline

Analog vs. digital.DSP vs. FPGA.Digital design techniques.

Performance enhancement techniques.Safe design.

Digital Signal Processing blocks.Examples.

42

Page 43: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Buffering

Delay in modern designs can be as much as 90% routing, 10% logic. Routing delay is due to long nets + capacitive input loading.Buffering is done automatically by most synthesis tools and reduces the fan out on affected nets:

net1 net2 net1

net2

net3

Before buffering After buffering

43

Page 44: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Replicating registers (and associated logic if necessary)

Producer

Consumer 1

Consumer 2

Consumer 4

Consumer 3

Consumer 1

Consumer 2

Consumer 4

Consumer 3

Producer

Before After44

Page 45: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Retiming (a.k.a. register balancing)

Large combinational logic delay

Large combinational logic delay

Small Delay

Small Delay

Balanced delay

Balanced delay

Balanced delay

Balanced delay

Before

After

45

Page 46: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Pipelining

Large combinational logic delay

Large combinational logic delay

Small delay

Small delay

Before

After

Small delay

Small delay

Small delay

Small delay

46

Page 47: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Time multiplexing

Data In

100 MHz

50 MHz logic

50 MHz logic

50 MHz logic

50 MHz logic

50 MHz logic

50 MHz logic

50 MHz logic

50 MHz logic

50 MHz logic

50 MHz logic

50 MHz logic

50 MHz logic

Data Out

50 MHz

De-multiplexer Multiplexer

47

Page 48: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

An example: boosting the performance of an IIR filter (1/2)

Simple first order IIR: y[n+1] = ay[n] + b x[n]

Z-1X +

b

x

Xa

y

Performance bottleneck in the feedback path

Problem found in the phase filter of a PLL used to track bunch frequency in CERN’s PS

48

Page 49: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

An example: boosting the performance of an IIR filter (2/2)

Look ahead scheme: From y[n+1] = ay[n] + b x[n] we gety[n+2] = ay[n+1] + bx[n+1] = a2y[n] + abx[n] + bx[n+1]

Z-1

X

+

x

Xa2

y

abX

b

Z-1 + Z-2

FIR filter (can be pipelined to increase throughput)

Now we have two clock ticks for the feedback!

49

Page 50: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Outline

Analog vs. digital.DSP vs. FPGA.Digital design techniques.

Performance enhancement techniques.Safe design.

Digital Signal Processing blocks.Examples.

50

Page 51: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Single Event Effects (SEE) created by neutrons

Cosmic raysCosmic rays

NeutronsNeutronsSpaceSpaceAtmosphereAtmosphere

EarthEarth

p-

n+

Alpha Alpha particleparticle

n+

-+ -+ -+ -

+

NeutronNeutronGateGate DrainDrainSourceSource

Silicon Silicon nucleusnucleus

Sensitive Sensitive regionregion

Memory Cell: Memory Cell: CMOSCMOSConfiguration Latch (Configuration Latch (CCLCCL))

Sensitive Sensitive regionregion

UpsetUpset: 0 : 0 --> 1> 1 1 1 --> 0> 0

51

Page 52: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Classification of SEEs

SingleEvent Effect(SEE)

Single Event Functional Interrupt (SEFI)

Bit-Flip specifically in a control register – POWER ON RESET/JTAG etc.

Single Event Upset (SEU)

Bit-Flip Somewhere

Single Event Latch-Up (SEL)

Parasitic transistors activated in a device, causing internal short

Single Event Transient (SET)

A signal briefly fluctuates somewhere in design

52

Page 53: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

SEU Failures in Time (FIT)Defined as the number of failures expected in 109 hours.In practice, configuration RAM dominates. Example:

Average of only 10% of FPGA configuration bits are used in typical designs

Even in a 99% full design, only up to 30% are usedMost bits control interconnect muxesMost mux control values are “don’t-care”

Must include this ratio for accurate SEU FIT rate calculations.

FPGA Interconnect

ONOFF

DON’T-CAREActive Wire

Virtex XCV1000 memory Utilization

Memory Type# of bits %

Configuration 5,810,048 97.4

Block RAM 131,072 2.2

CLB flip-flops 26,112 0.4

53

Page 54: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Not all parts of the design are critical

Average of only 40% of circuits in FPGA designs are critical

Substantial circuit overhead for startup logic, diagnostics, debug, monitoring, fault-handling, control path, etc.

Must also include this ratio in SEU FIT rate calculations

FPGA Design

Critical Non-critical54

Page 55: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Safe state machines

One-hot encoding:s0 => 0001s1 => 0010s2 => 0100s3 => 100012 “illegal states” not covered, or covered with a “when others” in VHDL or equivalent.→ Use option in synthesis tool to prevent optimization of illegal states.

55

Page 56: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Mitigation techniques: scrubbingReadback and verification of configuration.

Most internal logic can be verified during normal operation.Sets limits on duration of upsets.

Partial configurationNot supported by all FPGA vendors/families.Allows fine grained reconfiguration.Does not reset entire device.

Allows user logic to continue to function.Complete reconfiguration

Required after SEFI.No user functionality for the duration of reconfiguration.

Verification by dedicated deviceUsually radiation tolerant antifuse FPGASecure storage of checksums and configuration an issue

FLASH is radiation sensitiveSelf verification

Often the only option for existing designsNot possible in all device families

Utilizes logic intended for dynamic reconfigurationVerification logic has small footprint

• Usually a few dozen CLBs and 1 block RAM (for checksums).56

Page 57: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Triple Module Redundancy (TMR)

Feedback TMRThree copies of user logicState feedback from voter

• Counter exampleHandles faultsResynchronizes

• Operational through repairSpeed penalty due to feedbackDesirable for state based logic

Counter

Counter

Counter

Voter

Voter

Voter

57

Page 58: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

AlternativesAntifuse

Configuration based on physical shortsInvulnerable to upsetCannot be altered

Over 90% smaller upset cross section for comparable geometrySignal routing more efficient

Much lower power dissipation for similar device geometryLags SRAM in fabrication technology

Usually one generation behindLatch up more of a problem than in SRAM devices

Rad-hard AntifuseAll flip-flops TMRed in silicon

Unmatched reliabilityHigh (extreme) costUnimpressive performance

• Feedback TMR built in• Usually larger geometry• Not available in highest densities offered by antifuse

FLASH FPGAsMiddle ground in base susceptibilityReadback/Verification problematic

Usually only JTAG (slow) supportedMaximum number of write cycles an issue 58

Page 59: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Outline

Analog vs. digital.DSP vs. FPGA.Digital design techniques.

Performance enhancement techniques.Safe design.

Digital Signal Processing blocks.Examples.

59

Page 60: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Fixed point (2’s complement) binary numbers

Example: 3 integer bits and 5 fractional bits

60

Page 61: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Fixed point truncation vs. rounding

Note that in 2’s complement, truncation is biased while rounding isn’t.

61

Page 62: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Interesting property of 2’s complement numbers

62

Page 63: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

The Full Adder (FA)

63

Page 64: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Add/subtract circuit

S = A+B when Control=‘0’S = A-B when Control=‘1’

64

Page 65: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Saturation

You can’t let the data path become arbitrarily wide. Saturation involves overflow detection and a multiplexer. Useful in accumulators (like the one in a PI controller).

65

Page 66: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Multiplication: pencil & paper approach

66

Page 67: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

A 4-bit unsigned multiplier using Full Adders and AND gates

Of course, you can use embedded multipliers if your chip has them!67

Page 68: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Constant coefficient multipliers using ROM

For “easy” coefficients, there are smarter ways. E.g. to multiply a number A by 31, left-shift A by 5 places then subtract A.

68

Page 69: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Smart multiplier examples

69

Page 70: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Division: pencil & paper

Uses add/subtract blocks presented earlier.MSB produced first: this will usually imply we have to wait for whole operation to finish before feeding result to another block.Longer combinational delays than in multiplication: an N by N division will always take longer than an N by N multiplication.

70

Page 71: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Pipelining the division array

71

Page 72: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Square root

Take a division array, cut it in half (diagonally) and you have square root. Square root is therefore faster than division!Although with less ripple through, this block suffers from the same problems as the division array.Alternative approach: first guess with a ROM, then use an iterative algorithm such as Newton-Raphson. 72

Page 73: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Distributed Arithmetic (DA) 1/2

∑−

=

⋅=1

0][][

N

nnxncy

∑ ∑−

=

=

⋅⋅=

1

0

1

02][][

N

n

B

b

bb nxncy

Digital filtering is about sums of products:

Let’s assume:c[n] constant (prerequisite to use DA)x[n] input signal B bits wide

Then:xb[n] is bit number b of x[n] (either 0 or 1)

And after some rearrangement of terms: ∑ ∑

=

=

⋅⋅=

1

0

1

0][][2

B

b

N

nb

b nxncy

This can be implemented with an N-input LUT

73

Page 74: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Distributed Arithmetic (DA) 2/2

∑ ∑−

=

=

⋅⋅=

1

0

1

0][][2

B

b

N

nb

b nxncy

xB[0] …… x1[0] x0[0]

xB[1] …… x1[1] x0[1]

xB[N-1] …… x1[N-1] x0[N-1]

……

....

……

....

……

.... LUT

+ Register

2-1

y

Generates a result every B clock ticks. Replicating logic one can trade off speed vs. area, to the limit of getting one result per clock tick. 74

Page 75: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

COrdinate Rotation DIgital Computer

75

Page 76: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Pseudo-rotations

76

Page 77: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Basic CORDIC iterations

77

Page 78: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Angle accumulator

78

Page 79: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

The scaling factor

79

Page 80: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Rotation Mode

80

Page 81: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Example: calculate sin and cos of 30º

81

Page 82: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Vectoring Mode

Vector magnitude

82

Page 83: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Outline

Analog vs. digital.DSP vs. FPGA.Digital design techniques.

Performance enhancement techniques.Safe design.

Digital Signal Processing blocks.Examples.

83

Page 84: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Courtesy Christos Zamantzas84

LHC BLM Tunnel CardBasic components used:

Off-the-shelves Current-to-Frequency Converter (x8)

Irradiation tests at cyclotrons of Lauvain and PSI

Actel FPGA (54SX32A) (x1)208 pin32,000 LEOne-time-programmableRedundant Inputs

Triple counter inputsRedundant outputs

Double 16bit data busDouble 4bit control bus

Custom Radiation Tolerant ASICsA/D Converter AD74240 CMOS (x2)

Quad 12bit40Ms/s Parallel output

Line Driver LVDS_RX CMOS (x6)8 LVDS to CMOS line receivers

Temperature Sensor DCU2 (x1)12 bit output

GOL (Gigabit Optical Link) (x2)Analogue parts needed to drive the laser. Algorithm running that corrects SEU.8b/10b encoding.16 or 32 bit input.Error reporting (SEU, loss of synchronisation,..)

Part Name Integral Dose (KGy)*

CFC 0.5

ACTEL 3.2

AD41240 10

LVDS_RX 10

DCU2 10

GOH 3.14

Table: Radiation dose withstood by component without error.

Figure: CFC card top view.

* For 20 years of nominal operation it is expected to receive around 200 Gy.

Page 85: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Courtesy Christos Zamantzas85

LHC BLM Tunnel Card

Page 86: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

CERN PS Beam Trajectory and Orbit

86

LPF

DDS

Phase table

PU signalInitial F

GATE, BLR

PLL to track non constant revolution frequencyCourtesy Greg Kasprowicz

Only adders

Page 87: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

CERN PS Beam Trajectory and Orbit

87

Sum

∆Y

∆X BLR GATE LO

.

FPGA

Loop Gain Fmax Fmin

ADC

ARM SINGLE BOARD

COMPUTER

CLOCK DISTRIBUTION

ADC

BASELINE RESTORER INTEGRATOR

FILTER

DDS

PHASE TABLE

Local Bus REGISTER

SET

ETHERNET INTERFACE

DDR II SDRAM

MEMORY

MEMORY

CONTROLLER

POINTER MEMORY

& SYNCHRONISATION

C timing

HC timing

INJ timing

ST timing

EMBEDDED

SIGNAL ANALYSER

CHIPSCOPE ANALYSER

JTAG

BASELINE RESTORER

BASELINE RESTORER

INTEGRATOR

INTEGRATOR ADC

Page 88: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

CERN PS Beam Trajectory and Orbit

88

Page 89: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Courtesy Uli Raich 89

Baseline restoration

Low pass filter the signal to get an estimate of the base lineAdd this to the original signal

Page 90: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Courtesy Uli Raich90

Problems with the baseline restorer

Simulated signal After high pass filter

Expected baseline corrected signal What we get!

Page 91: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Courtesy Uli Raich 91

Baseline correction solution

0.008

-0.25

BLR

PU signal Y

Page 92: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

LHC tune PLL

92Courtesy Andrea Boccardi

PLL functional diagram

Phase detector

Page 93: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

LHC tune FFT

93

• No multiplications in FFT4.• 32 bit in, 32 bit out.• Rounding strategy for 0.5: keep last decision in memory.• 1024 point FFT in 0.5 ms, dominated by memory access (acquisition time 100 ms).

Page 94: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

The future

94

K

LLRF1FPGA

LLRF2FPGA

K

LLRF15FPGA

K

LLRF16FPGA

K

LLRF17FPGA

K

LLRF3FPGA

LLRF4FPGA

K

LLRF21FPGA

MatrixModule

144 X 144144 X 144 EthernetNMPref

Reference Distribution

FPGA

72X72

16

16

16

Matrix Backplane ( TCA™)

16

16

16

16

1616

72 X 72Full-Duplex

Cross Point Switch

FPGASerial Links“Rocket I/O”

BackplaneSerial Links

Fiber SFPRx/Tx

Fiber I/O

Page 95: Digital Signal processing with Field Programmable Gate Arrays · Digital Signal processing with Field Programmable Gate Arrays Beam Instrumentation Workshop 2008 Tahoe City, CA, USA

Acknowledgements

Many thanks to the following people for material used in this presentation:

Tony Rohlev, Elettra.Jeff Weintraub, Xilinx University program.Prof. Stewart, University of Strathclyde.Andrea Boccardi, CERN.Christos Zamantzas, CERN.Greg Kasprowicz, CERN.Uli Raich, CERN.Matt Stettler, LANL. 95