FFT Circuit Design - Welcome. WITS Lab. Wireless Information...

Post on 03-Apr-2020


FFT Circuit Design

Outline

Applications of FFT in Communications
Fundamental FFT Algorithms
FFT Circuit Design Architectures
Conclusions

DAB Receiver

[Block diagram: Tuner, OFDM Demodulator (256/512/1024/2048-point FFT), Channel Decoder, Packet Demux, MPEG-2 Audio Decoder, Controller, Control Panel]

WLAN OFDM System

[Block diagram]
MAC Layer: 6 Mbps ~ 54 Mbps
TRANSMITTER: FEC Coder, S/P, 64-pt IFFT, Guard Interval Insertion, D/A, LPF, Up Converter
RECEIVER: Down Converter, LPF, A/D, Guard Interval Removal, 64-pt FFT, P/S, FEC Decoder

ADSL (Discrete Multi-tone) System

[Block diagram]
TRANSMITTER: Data In, S/P, QAM encoders, 512-pt IFFT, add cyclic prefix, P/S, D/A + transmit filter
channel
RECEIVER: receive filter + A/D, TEQ, remove cyclic prefix, S/P, 512-pt FFT, FEQ, QAM decoders, P/S, Data Out

Applications of FFT in Communications

Comm. System   Modulation   FFT Size
WLAN           OFDM         64
DAB            OFDM         256/512/1024/2048
DVB            OFDM         2048/8192
ADSL           DMT          512
VDSL           DMT          512/1024/2048/4096


Fundamental FFT Algorithms

Discrete Fourier Transform Pair
Radix-2 FFT (N = 2^ν)
  Decimation-in-time (DIT)
  Decimation-in-frequency (DIF)
FFT for composite N (N = N1 N2)
  Cooley-Tukey Algorithms
  Radix-r FFT

Discrete Fourier Transform Pair

Let

$$x[n] \overset{\mathrm{DFT}}{\longleftrightarrow} X[k]$$

denote a DFT pair. We have

$$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{kn}, \quad k = 0, 1, \ldots, N-1,$$

$$x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, W_N^{-kn}, \quad n = 0, 1, \ldots, N-1,$$

where $W_N = e^{-j(2\pi/N)}$.
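
The DFT pair can be evaluated directly from its definition. The following is a minimal Python sketch; the function names `twiddle`, `dft`, and `idft` are illustrative and do not come from the slides.

```python
import cmath

def twiddle(N, e):
    """Return W_N^e = exp(-j*2*pi*e/N)."""
    return cmath.exp(-2j * cmath.pi * e / N)

def dft(x):
    """Direct evaluation of X[k] = sum_n x[n] W_N^{kn}."""
    N = len(x)
    return [sum(x[n] * twiddle(N, k * n) for n in range(N)) for k in range(N)]

def idft(X):
    """Direct evaluation of x[n] = (1/N) sum_k X[k] W_N^{-kn}."""
    N = len(X)
    return [sum(X[k] * twiddle(N, -k * n) for k in range(N)) / N for n in range(N)]

x = [1, 2, 3, 4]
X = dft(x)          # X[0] is the sum of the samples
x_back = idft(X)    # recovers x up to rounding error
```

Each output sample needs N multiply-accumulate steps, so the direct form costs on the order of N^2 operations.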

Observations

$W_N^k$ is N-periodic: $W_N^{k+N} = W_N^k$.
$W_N^k$ is conjugate symmetric: $W_N^{N-k} = (W_N^k)^*$.
Both x[n] and X[k] are N-periodic.
If x[n] is real, then X[k] is conjugate symmetric, and vice versa.

Observations

A direct calculation requires approximately N^2 complex multiplications and additions. FFT algorithms reduce the computational complexity to the order of N log N.
Algorithms developed for the FFT also work for the IFFT with only minor modifications.
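
One common way to reuse a forward transform for the inverse is the conjugation identity x[n] = (1/N) · conj(DFT(conj(X)))[n]. The sketch below assumes this approach; the slides do not say which modification they have in mind. The names `forward_dft` and `inverse_via_forward` are illustrative, and a direct DFT stands in for whatever forward FFT is available.

```python
import cmath

def forward_dft(x):
    """Reference forward transform (direct DFT); any forward FFT could be used instead."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def inverse_via_forward(X, forward=forward_dft):
    """Inverse transform built from a forward transform:
    conjugate the input, run the forward transform, conjugate and scale by 1/N."""
    N = len(X)
    y = forward([complex(v).conjugate() for v in X])
    return [v.conjugate() / N for v in y]
```

In hardware terms this amounts to swapping the sign of the imaginary parts at the input and output, plus a scaling by 1/N.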

Example: Zero-Padding (WLAN)

WLAN has 52 sub-carriers and uses a 64-point FFT.

[Figure: sub-carriers mapped onto IFFT inputs 0 to 63, producing time-domain outputs 0 to 63]

IFFT input   Sub-carrier
0            Null
1 to 26      #1 to #26
27 to 37     Null
38 to 63     #-26 to #-1
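
The mapping of the 52 sub-carriers onto the 64 IFFT inputs can be sketched as follows. The function names `ifft_bin` and `map_subcarriers` are illustrative; the rule used (sub-carrier index taken modulo 64) reproduces the figure: inputs 1 to 26 carry #1 to #26, inputs 38 to 63 carry #-26 to #-1, and the rest are nulls.

```python
def ifft_bin(subcarrier, n_fft=64, n_side=26):
    """Map a signed sub-carrier number (-26..-1, 1..26) to an IFFT input index."""
    if subcarrier == 0 or not -n_side <= subcarrier <= n_side:
        raise ValueError("sub-carrier must be in -26..-1 or 1..26")
    return subcarrier % n_fft  # negative sub-carriers wrap to the upper inputs

def map_subcarriers(symbols, n_fft=64):
    """Place symbols (dict: sub-carrier number -> complex value) on the IFFT
    inputs; unused inputs stay at zero (the null sub-carriers)."""
    bins = [0j] * n_fft
    for k, value in symbols.items():
        bins[ifft_bin(k, n_fft)] = value
    return bins

# Fill all 52 sub-carriers with a dummy symbol and list the null inputs.
bins = map_subcarriers({k: 1 + 0j for k in range(-26, 27) if k != 0})
nulls = [i for i, v in enumerate(bins) if v == 0]
```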

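Decimation-in-Time Radix-2 FFT

Assume N is an even number.

$$\begin{aligned}
X[k] &= \sum_{n=0}^{N-1} x[n]\, W_N^{kn} \\
&= \sum_{r=0}^{N/2-1} x[2r]\, W_{N/2}^{kr} + W_N^{k} \sum_{r=0}^{N/2-1} x[2r+1]\, W_{N/2}^{kr} \\
&= G[k] + W_N^{k} H[k], \quad k = 0, 1, \ldots, N-1.
\end{aligned}$$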

Observations

G[k] is the DFT of the even samples of x[n].
H[k] is the DFT of the odd samples of x[n].
G[k] and H[k] are N/2-periodic.
$W_N^{k+N/2} = -W_N^{k}$.

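DIT Radix-2 FFT

$$\begin{aligned}
X[r] &= G[r] + W_N^{r} H[r], \\
X[r + N/2] &= G[r] + W_N^{r+N/2} H[r] \\
&= G[r] - W_N^{r} H[r], \quad 0 \le r < N/2.
\end{aligned}$$

[Figure: butterfly with inputs G[r] and H[r], branch gains $W_N^r$ and $-W_N^r$, and outputs X[r] and X[r+N/2]]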
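
Applying the butterfly equations recursively gives a complete radix-2 DIT FFT. The following is a minimal recursive sketch for N a power of two; the function name `fft_dit` is illustrative, and a hardware design would not use recursion.

```python
import cmath

def fft_dit(x):
    """Recursive radix-2 decimation-in-time FFT (length must be a power of two)."""
    N = len(x)
    if N == 1:
        return list(x)
    if N % 2:
        raise ValueError("length must be a power of two")
    G = fft_dit(x[0::2])  # DFT of the even samples
    H = fft_dit(x[1::2])  # DFT of the odd samples
    X = [0j] * N
    for r in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * r / N) * H[r]  # W_N^r * H[r]
        X[r] = G[r] + t
        X[r + N // 2] = G[r] - t
    return X
```

Only one complex multiplication per butterfly is needed because the same product W_N^r H[r] feeds both outputs.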

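Decimation-in-Time Radix-2 FFT

Butterfly for Radix-2 DIT FFT

[Figure: butterfly between the (M-1)th stage and the Mth stage, drawn first with two multipliers ($W_N^r$ and $-W_N^r$) and then in the equivalent form with a single multiplier $W_N^r$ and a -1 branch]

In-place computation.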

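Decimation-in-Time Radix-2 FFT

First layer decimation

[Figure: 8-point DIT signal flow graph after the first decimation. The even samples x[0], x[2], x[4], x[6] feed one N/2-point DFT producing G[0] to G[3]; the odd samples x[1], x[3], x[5], x[7] feed a second N/2-point DFT producing H[0] to H[3]. Butterflies with twiddle factors $W_N^0$ to $W_N^3$ and -1 branches combine them into X[0] to X[7].]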

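Decimation-in-Time Radix-2 FFT

[Figure: complete 8-point DIT signal flow graph. Inputs in bit-reversed order x[0], x[4], x[2], x[6], x[1], x[5], x[3], x[7]; outputs X[0] to X[7]. Twiddle factors: $W_N^0$ in the first stage, $W_N^0$ and $W_N^2$ in the second stage, $W_N^0$ to $W_N^3$ in the third stage.]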

Bit Reversal

[Figure: binary tree that sorts x[n2 n1 n0] on n0 first, then n1, then n2, giving the order x[0 0 0], x[1 0 0], x[0 1 0], x[1 1 0], x[0 0 1], x[1 0 1], x[0 1 1], x[1 1 1]]
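
The bit-reversed ordering can be generated by reversing the binary digits of each index. A minimal sketch follows; the function names are illustrative.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` binary digits of index (n2 n1 n0 -> n0 n1 n2)."""
    rev = 0
    for _ in range(bits):
        rev = (rev << 1) | (index & 1)
        index >>= 1
    return rev

def bit_reversed_order(N):
    """Return the input ordering used by an N-point radix-2 DIT flow graph."""
    bits = N.bit_length() - 1
    if N < 1 or (1 << bits) != N:
        raise ValueError("N must be a power of two")
    return [bit_reverse(i, bits) for i in range(N)]

order = bit_reversed_order(8)  # [0, 4, 2, 6, 1, 5, 3, 7]
```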

Decimation-in-frequency Radix-2 FFT

Assume N is an even number.

$$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{kn}, \quad k = 0, 1, \ldots, N-1,$$

$$X[2r] = \sum_{n=0}^{(N/2)-1} \big(x[n] + x[n+N/2]\big)\, W_{N/2}^{rn},$$

$$X[2r+1] = \sum_{n=0}^{(N/2)-1} \big(x[n] - x[n+N/2]\big)\, W_N^{n}\, W_{N/2}^{rn}, \quad r = 0, 1, \ldots, N/2 - 1.$$

Decimation-in-frequency Radix-2 FFT

$$X[2r] = \sum_{n=0}^{(N/2)-1} g[n]\, W_{N/2}^{rn},$$

$$X[2r+1] = \sum_{n=0}^{(N/2)-1} h[n]\, W_N^{n}\, W_{N/2}^{rn},$$

where $g[n] = x[n] + x[n+N/2]$, $h[n] = x[n] - x[n+N/2]$, and $r = 0, 1, \ldots, N/2 - 1$.
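
The g[n] and h[n] decomposition can be applied recursively to obtain a radix-2 DIF FFT. A minimal recursive sketch for N a power of two follows; the function name `fft_dif` is illustrative.

```python
import cmath

def fft_dif(x):
    """Recursive radix-2 decimation-in-frequency FFT (length must be a power of two)."""
    N = len(x)
    if N == 1:
        return list(x)
    if N % 2:
        raise ValueError("length must be a power of two")
    half = N // 2
    # g[n] = x[n] + x[n + N/2]; h[n] = x[n] - x[n + N/2], then twiddled by W_N^n
    g = [x[n] + x[n + half] for n in range(half)]
    h = [(x[n] - x[n + half]) * cmath.exp(-2j * cmath.pi * n / N) for n in range(half)]
    X = [0j] * N
    X[0::2] = fft_dif(g)  # even-indexed outputs X[2r]
    X[1::2] = fft_dif(h)  # odd-indexed outputs X[2r+1]
    return X
```

In contrast to the DIT form, the twiddle multiplication comes after the add/subtract step.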

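Decimation-in-frequency Radix-2 FFT

Butterfly for Radix-2 DIF FFT

[Figure: butterfly between the (M-1)th stage and the Mth stage; the upper output is the sum of the two inputs, the lower output is their difference (-1 branch) multiplied by $W_N^n$]

In-place computation.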

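Decimation-in-frequency Radix-2 FFT

First layer decimation

[Figure: 8-point DIF signal flow graph after the first decimation. Butterflies combine x[0] to x[7] into g[0] to g[3] (sums) and h[0] to h[3] (differences through -1 branches, multiplied by twiddle factors $W_N^0$ to $W_N^3$). One N/2-point DFT produces X[0], X[2], X[4], X[6]; a second N/2-point DFT produces X[1], X[3], X[5], X[7].]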

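Decimation-in-frequency Radix-2 FFT

[Figure: complete 8-point DIF signal flow graph. Inputs x[0] to x[7]; outputs in bit-reversed order X[0], X[4], X[2], X[6], X[1], X[5], X[3], X[7]. Twiddle factors: $W_N^0$ to $W_N^3$ in the first stage, $W_N^0$ and $W_N^2$ in the second stage, $W_N^0$ in the third stage.]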

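Butterfly Comparison

[Figure: Butterfly (decimation-in-frequency), (M-1)th stage to Mth stage: add/subtract, then multiply the lower output by $W_N^n$]

[Figure: Butterfly (decimation-in-time), (M-1)th stage to Mth stage: multiply the lower input by $W_N^r$, then add/subtract]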

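Cooley-Tukey Algorithm

Let $N = N_1 \cdot N_2$ and

$$n = N_2 n_1 + n_2, \quad \begin{cases} 0 \le n_1 \le N_1 - 1, \\ 0 \le n_2 \le N_2 - 1, \end{cases}$$

$$k = k_1 + N_1 k_2, \quad \begin{cases} 0 \le k_1 \le N_1 - 1, \\ 0 \le k_2 \le N_2 - 1. \end{cases}$$

2D point re-arrangement: $x[n] = x[N_2 n_1 + n_2]$, $X[k] = X[k_1 + N_1 k_2]$.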

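Cooley-Tukey Algorithms

$$X[k_1 + N_1 k_2] = \sum_{n_2=0}^{N_2-1} \left[ \left( \sum_{n_1=0}^{N_1-1} x[N_2 n_1 + n_2]\, W_{N_1}^{n_1 k_1} \right) W_N^{n_2 k_1} \right] W_{N_2}^{n_2 k_2},$$

where the inner sum is $G[n_2, k_1]$, $W_N^{n_2 k_1}$ is the twiddle factor, and the bracketed product is $\tilde{G}[n_2, k_1] = G[n_2, k_1]\, W_N^{n_2 k_1}$.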
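
The two-dimensional decomposition can be checked numerically. The sketch below follows the index mapping n = N2·n1 + n2 and k = k1 + N1·k2; the function names `w` and `cooley_tukey` are illustrative, and the short inner DFTs are evaluated directly rather than decomposed further.

```python
import cmath

def w(N, e):
    """Return W_N^e = exp(-j*2*pi*e/N)."""
    return cmath.exp(-2j * cmath.pi * e / N)

def cooley_tukey(x, N1, N2):
    """One level of the Cooley-Tukey decomposition for N = N1 * N2."""
    N = N1 * N2
    if len(x) != N:
        raise ValueError("len(x) must equal N1 * N2")
    # Step 1: N1-point DFTs over n1 give G[n2, k1]; multiply by the twiddle
    # factor W_N^{n2 k1} to obtain G~[n2, k1].
    Gt = [[sum(x[N2 * n1 + n2] * w(N1, n1 * k1) for n1 in range(N1)) * w(N, n2 * k1)
           for k1 in range(N1)]
          for n2 in range(N2)]
    # Step 2: N2-point DFTs over n2 give X[k1 + N1 k2].
    X = [0j] * N
    for k1 in range(N1):
        for k2 in range(N2):
            X[k1 + N1 * k2] = sum(Gt[n2][k1] * w(N2, n2 * k2) for n2 in range(N2))
    return X
```

For example, `cooley_tukey(x, 3, 4)` computes a 12-point DFT from 3-point and 4-point DFTs plus one set of twiddle multiplications.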

Observations

N1 = 2, N2 = N/2 -> 1st stage of the decimation-in-frequency radix-2 FFT.
N1 = N/2, N2 = 2 -> 1st stage of the decimation-in-time radix-2 FFT.
In general, N = N1 N2 … Nn.
If N = r^n -> Radix-r.

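Radix-3 FFT (DIF)

Assume N is a multiple of 3.

$$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{kn},$$

$$X[3r] = \sum_{n=0}^{(N/3)-1} \big(x[n] + x[n+N/3] + x[n+2N/3]\big)\, W_{N/3}^{rn},$$

$$X[3r+1] = \sum_{n=0}^{(N/3)-1} \big(x[n] + e^{-j2\pi/3}\, x[n+N/3] + e^{j2\pi/3}\, x[n+2N/3]\big)\, W_N^{n}\, W_{N/3}^{rn},$$

$$X[3r+2] = \sum_{n=0}^{(N/3)-1} \big(x[n] + e^{j2\pi/3}\, x[n+N/3] + e^{-j2\pi/3}\, x[n+2N/3]\big)\, W_N^{2n}\, W_{N/3}^{rn}.$$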

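Radix-3 FFT (DIF)

Butterfly for Radix-3 DIF FFT

[Figure: three-input butterfly between the (M-1)th stage and the Mth stage, with internal branch gains $e^{j2\pi/3}$ and $e^{-j2\pi/3}$ and output twiddle factors $W_N^{n}$ and $W_N^{2n}$]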

Radix-4 FFT (DIF)

Assume N is a multiple of 4.

$$X[4r] = \sum_{n=0}^{(N/4)-1} \big(x[n] + x[n+N/4] + x[n+2N/4] + x[n+3N/4]\big)\, W_{N/4}^{rn},$$

$$X[4r+1] = \sum_{n=0}^{(N/4)-1} \big(x[n] + (-j)\,x[n+N/4] + (-1)\,x[n+2N/4] + (j)\,x[n+3N/4]\big)\, W_N^{n}\, W_{N/4}^{rn},$$

$$X[4r+2] = \sum_{n=0}^{(N/4)-1} \big(x[n] + (-1)\,x[n+N/4] + x[n+2N/4] + (-1)\,x[n+3N/4]\big)\, W_N^{2n}\, W_{N/4}^{rn},$$

$$X[4r+3] = \sum_{n=0}^{(N/4)-1} \big(x[n] + (j)\,x[n+N/4] + (-1)\,x[n+2N/4] + (-j)\,x[n+3N/4]\big)\, W_N^{3n}\, W_{N/4}^{rn}.$$
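
The four radix-4 DIF equations translate into a recursive sketch when N is a power of four. The function name `fft_radix4_dif` is illustrative. Note that the internal butterfly uses only additions and multiplications by ±1 and ±j, which are sign changes and real/imaginary swaps rather than true multiplications.

```python
import cmath

def fft_radix4_dif(x):
    """Recursive radix-4 decimation-in-frequency FFT (length must be a power of four)."""
    N = len(x)
    if N == 1:
        return list(x)
    if N % 4:
        raise ValueError("length must be a power of four")
    q = N // 4
    branches = [[], [], [], []]
    for n in range(q):
        a, b, c, d = x[n], x[n + q], x[n + 2 * q], x[n + 3 * q]
        sums = (a + b + c + d,            # feeds X[4r]
                a - 1j * b - c + 1j * d,  # feeds X[4r+1]
                a - b + c - d,            # feeds X[4r+2]
                a + 1j * b - c - 1j * d)  # feeds X[4r+3]
        for m in range(4):
            # Twiddle factor W_N^{m n} applied after the butterfly
            branches[m].append(sums[m] * cmath.exp(-2j * cmath.pi * m * n / N))
    X = [0j] * N
    for m in range(4):
        X[m::4] = fft_radix4_dif(branches[m])
    return X
```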

Radix-4 FFT (DIF)

Butterfly for Radix-4 DIF FFT

[Figure: four-input butterfly between the (M-1)th stage and the Mth stage]

Split Radix FFT

Mixes Radix-2 and Radix-4 architectures.
Computes the even transform coefficients with the Radix-2 strategy and the odd coefficients with the Radix-4 strategy.
Can perform the FFT for N = 2^ν.
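
The split-radix idea can be sketched recursively by combining the radix-2 DIF expression for X[2r] with the radix-4 DIF expressions for X[4r+1] and X[4r+3]. This is an illustrative software sketch under those assumptions (the function name `fft_split_radix` is not from the slides), not an optimized implementation.

```python
import cmath

def fft_split_radix(x):
    """Recursive split-radix DIF FFT (length must be a power of two)."""
    N = len(x)
    if N == 1:
        return list(x)
    if N == 2:
        return [x[0] + x[1], x[0] - x[1]]
    if N % 4:
        raise ValueError("length must be a power of two")
    h, q = N // 2, N // 4
    # Even outputs: radix-2 step, X[2r] = DFT_{N/2}(x[n] + x[n + N/2])
    even = [x[n] + x[n + h] for n in range(h)]
    # Odd outputs: radix-4 step for X[4r+1] and X[4r+3]
    odd1, odd3 = [], []
    for n in range(q):
        d0 = x[n] - x[n + h]
        d1 = x[n + q] - x[n + 3 * q]
        odd1.append((d0 - 1j * d1) * cmath.exp(-2j * cmath.pi * n / N))
        odd3.append((d0 + 1j * d1) * cmath.exp(-2j * cmath.pi * 3 * n / N))
    X = [0j] * N
    X[0::2] = fft_split_radix(even)
    X[1::4] = fft_split_radix(odd1)
    X[3::4] = fft_split_radix(odd3)
    return X
```

Unlike the pure radix-4 form, this works for every power-of-two length, including N = 8 or N = 32.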

Simplified Butterfly Representations

[Figure: simplified symbols for the Radix-2 and Radix-4 butterflies]

Split-Radix FFT

[Figure: split-radix signal flow graph]

Computational Complexity

Method    # of Complex Multiplications   # of Complex Additions
DFT       N^2                            N(N-1)
Radix-2   (N/2) log2 N                   N log2 N
Radix-4   (3N/8) log2 N                  (3N/2) log2 N

The above numbers do not tell the whole story! Architecture is the key issue in trading off performance, cost, hardware complexity, etc.
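
To get a feel for the table, the formulas can be evaluated for a concrete size. The sketch below simply plugs N into the expressions above; the function name `complexity` is illustrative, and N is assumed to be a power of four so that every row applies.

```python
def complexity(N):
    """Evaluate the table's operation counts as (multiplications, additions)."""
    log2N = N.bit_length() - 1
    if N < 4 or (1 << log2N) != N or log2N % 2:
        raise ValueError("N must be a power of four")
    return {
        "DFT": (N * N, N * (N - 1)),
        "Radix-2": (N * log2N // 2, N * log2N),
        "Radix-4": (3 * N * log2N // 8, 3 * N * log2N // 2),
    }

counts = complexity(64)
# For N = 64: DFT (4096, 4032), Radix-2 (192, 384), Radix-4 (144, 576)
```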


FFT Architecture Design Considerations

Trade-off among accuracy, speed, hardware complexity, and power consumption: the best-fit architecture is application dependent.
Main architecture differences:
  Degree of parallelism: number and complexity of processing elements.
  Control scheme: hardware utilization and data flow control.

Degree of Parallelism

One simple processing unit or multiple simple processing units

[Figure: 8-point radix-2 signal flow graph with inputs x[0] to x[7], outputs X[0] to X[7], and twiddle factors $W_N^0$ to $W_N^3$]

Degree of Parallelism

Simple processing units versus complicated processing units

Memory-based FFT architecture

Single butterfly or processing element.
Required memory size = N.
A control unit ensures the right data flows to compute the FFT.
Firmware-like.
Low complexity.
Low speed.
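
A software model can illustrate the memory-based approach: one butterfly function is reused for every operation, an N-word array plays the role of the RAM, and nested loops play the role of the control unit's address generation. This is only a behavioral sketch under those assumptions (radix-2 DIT, in-place, bit-reversed load); all names are illustrative and no timing or word length is modeled.

```python
import cmath

def reverse_bits(i, bits):
    """Reverse the lowest `bits` binary digits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def butterfly(a, b, coeff):
    """The single processing element: one radix-2 DIT butterfly."""
    t = coeff * b
    return a + t, a - t

def memory_based_fft(x):
    N = len(x)
    stages = N.bit_length() - 1
    if N < 1 or (1 << stages) != N:
        raise ValueError("N must be a power of two")
    # RAM of N words, loaded in bit-reversed order
    ram = [0j] * N
    for i, value in enumerate(x):
        ram[reverse_bits(i, stages)] = value
    # Coefficient ROM holding W_N^0 .. W_N^{N/2-1}
    rom = [cmath.exp(-2j * cmath.pi * k / N) for k in range(N // 2)]
    # Control unit: generates RAM and ROM addresses stage by stage
    for s in range(stages):
        half = 1 << s            # distance between the two butterfly inputs
        step = N // (2 * half)   # ROM address stride for this stage
        for base in range(0, N, 2 * half):
            for j in range(half):
                a, b = base + j, base + j + half
                ram[a], ram[b] = butterfly(ram[a], ram[b], rom[j * step])  # in place
    return ram
```

The butterfly is invoked (N/2)·log2 N times in sequence, which is why this style is low in complexity but also low in speed.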

Memory-based FFT Block Diagram

[Block diagram: Data In, Input Buffer, RAM, Butterfly or Processing Element, Coefficients ROM or Generator, Control Unit (Control), Data Out]

Pipeline Architectures

FFT Signal Flow Graph

Multiple path delay commutator

Single path delay commutator

Single path delay feedback

Radix-2 Signal Flow Graph (DIT)

[Figure: 8-point DIT signal flow graph mapped stage by stage onto a pipeline of three BF2 butterfly units, each with a coefficient ROM, separated by buffers]

Radix-2 Signal Flow Graph (DIF)

[Figure: 8-point DIF signal flow graph mapped stage by stage onto a pipeline of three BF2 butterfly units, each with a coefficient ROM, separated by buffers]

Multi-Path Delay Commutator

[Figure: commutator (switch), delay elements on the parallel paths, and butterfly]

Radix-2 Multi-Path Delay Commutator

[Figure: 8-point DIF signal flow graph mapped onto a pipeline of three butterflies separated by switches and delays. The input stream is 7 6 5 4 3 2 1 0. Sample-index sequences shown on the two paths: 3 2 1 0 and 4 5 6 7 at the first butterfly, 5 4 1 0 and 7 6 3 2 at the second, 6 4 2 0 and 7 5 3 1 at the third.]

Radix-2 Multi-Path Delay Commutator

[Figure: N = 16 pipeline of four C2 commutators and four BF2 butterflies. Delay lengths shown: 8; 4, 4; 2, 2; 1, 1.]

Radix-4 Multi-Path Delay Commutator

[Figure: N = 256 pipeline of four C4 commutators and four BF4 butterflies. Delay lengths shown: 192, 128, 64; 48, 32, 16 (twice); 12, 8, 4 (twice); 3, 2, 1 (twice).]

Single Path Delay Commutator

[Figure: delay commutator followed by a butterfly]

Radix-2 Single Path Delay Commutator

[Figure: N = 16 pipeline: DC2, BF2, DC2, BF2, DC2, BF2, DC2, BF2]

Radix-4 Single Path Delay Commutator

[Figure: N = 256 pipeline: DC4, BF4, DC4, BF4, DC4, BF4, DC4, BF4]

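Single Path Delay Feedback

[Figure: butterfly with a delay element in its feedback path]

Radix-2 Single Path Delay Feedback

[Figure: N = 16 pipeline of four BF2 units with feedback delays of 8, 4, 2, and 1]

Radix-4 Single Path Delay Feedback

[Figure: N = 256 pipeline of four BF4 units with feedback delays of 64x3, 16x3, 4x3, and 1x3]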
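
The feedback delay lengths in the two single-path delay feedback figures follow a regular pattern: each radix-r stage has r - 1 delay lines whose length shrinks by a factor of r per stage. The sketch below reproduces the figure annotations under that assumption; the function names are illustrative.

```python
def sdf_feedback_delays(N, radix):
    """Return one (number_of_delay_lines, line_length) pair per pipeline stage."""
    stages, n = 0, N
    while n > 1:
        if n % radix:
            raise ValueError("N must be a power of the radix")
        n //= radix
        stages += 1
    return [(radix - 1, N // radix ** (s + 1)) for s in range(stages)]

def total_memory(delays):
    """Total number of delay words across all stages."""
    return sum(count * length for count, length in delays)

r2 = sdf_feedback_delays(16, 2)    # [(1, 8), (1, 4), (1, 2), (1, 1)]
r4 = sdf_feedback_delays(256, 4)   # [(3, 64), (3, 16), (3, 4), (3, 1)]
```

In both cases the total comes to N - 1 words (15 and 255 respectively).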

R2^2SDF

[Figure: N = 256 pipeline of alternating BF2I and BF2II units with feedback delays of 128, 64, 32, 16, 8, 4, 2, and 1]

Hardware Comparison

Architecture   Multiplier #      Adder #     Memory Size   Control
R2MDC          2(log4 N - 1)     4 log4 N    3N/2 - 2      simple
R2SDF          2(log4 N - 1)     4 log4 N    N - 1         simple
R4MDC          3(log4 N - 1)     8 log4 N    5N/2 - 4      simple
R4SDF          log4 N - 1        8 log4 N    N - 1         medium
R4SDC          log4 N - 1        3 log4 N    2N - 2        complex
R2^2SDF        log4 N - 1        4 log4 N    N - 1         simple
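
Plugging a concrete size into the comparison makes the differences easier to see. The sketch below evaluates the multiplier, adder, and memory formulas for N a power of four; the function names and dictionary layout are illustrative.

```python
def log4(N):
    """Integer log base 4; N must be a power of four."""
    stages = 0
    while N > 1:
        if N % 4:
            raise ValueError("N must be a power of four")
        N //= 4
        stages += 1
    return stages

def hardware_cost(N):
    """Evaluate the comparison formulas as (multipliers, adders, memory words)."""
    L = log4(N)
    return {
        "R2MDC":   (2 * (L - 1), 4 * L, 3 * N // 2 - 2),
        "R2SDF":   (2 * (L - 1), 4 * L, N - 1),
        "R4MDC":   (3 * (L - 1), 8 * L, 5 * N // 2 - 4),
        "R4SDF":   (L - 1,       8 * L, N - 1),
        "R4SDC":   (L - 1,       3 * L, 2 * N - 2),
        "R2^2SDF": (L - 1,       4 * L, N - 1),
    }

cost = hardware_cost(256)
# e.g. cost["R4MDC"] == (9, 32, 636) and cost["R2^2SDF"] == (3, 16, 255)
```

The memory figures agree with the delay lengths drawn in the pipeline diagrams: for N = 16 the R2MDC delays 8 + 4 + 4 + 2 + 2 + 1 + 1 add up to 22 = 3N/2 - 2.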

Conclusions

Effective FFT computation is essential to many communication applications that use OFDM or DMT techniques.
A pipelined FFT architecture is applied where high real-time performance is required. A memory-based FFT architecture can be adopted when cost is a greater concern than speed.
The best-fit FFT architecture depends on application-specific requirements and trades off accuracy, speed, chip size, power consumption, etc.
