7. Discrete Fourier Transform
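\[
\begin{pmatrix} c_0 \\ c_1 \\ c_2 \\ \vdots \\ c_{n-1} \end{pmatrix}
= \frac{1}{n}
\begin{pmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & \bar\omega & \bar\omega^{2} & \cdots & \bar\omega^{n-1} \\
1 & \bar\omega^{2} & \bar\omega^{4} & \cdots & \bar\omega^{2(n-1)} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \bar\omega^{n-1} & \bar\omega^{2(n-1)} & \cdots & \bar\omega^{(n-1)(n-1)}
\end{pmatrix}
\cdot
\begin{pmatrix} v_0 \\ v_1 \\ v_2 \\ \vdots \\ v_{n-1} \end{pmatrix}
\]

\[
\bar\omega = \operatorname{conj}(\omega) = \exp(-2\pi i / n)
\]

\[
c_k = \frac{1}{n} \sum_{j=0}^{n-1} v_j \, \bar\omega^{kj}, \qquad k = 0, 1, \dots, n-1
\]

c = DFT(v). Vector c is the Discrete Fourier Transform of vector v.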
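The definition above can be checked with a direct matrix-times-vector evaluation. The following is only a minimal sketch: the function name `dft` is chosen here for illustration, and it assumes the convention of the formula above (factor 1/n and the conjugate root exp(-2πi/n)).

```python
import cmath

def dft(v):
    """Direct DFT: c_k = (1/n) * sum_j v_j * conj(omega)**(k*j)."""
    n = len(v)
    omega_bar = cmath.exp(-2j * cmath.pi / n)  # conj(omega) = exp(-2*pi*i/n)
    # One inner product per output entry: n entries, n terms each -> O(n^2).
    return [sum(v[j] * omega_bar ** (k * j) for j in range(n)) / n
            for k in range(n)]

c = dft([1, 2, 3, 4])  # c[0] is the mean of the input, 2.5
```

Each of the n entries costs n multiplications, which is the quadratic cost that the fast algorithm avoids.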
Discrete Fourier Transform
\[
c_k = \frac{1}{n} \sum_{j=0}^{n-1} v_j \, \bar\omega^{kj}, \qquad k = 0, 1, \dots, n-1
\]
The DFT is nothing but a matrix-vector product, i.e. n inner products.
Therefore, the costs are
sequentially: O(n^2)
in parallel: n*log(n) processors, log(n) time steps, using n fan-in processes.
The DFT is very important in many applications.
Therefore, fast algorithms have been developed by divide-and-conquer:
FFT = Fast Fourier Transform
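The log(n) time steps of the trivial parallelization come from summing each inner product by fan-in. The sketch below only simulates this sequentially. The name `fan_in_sum` is hypothetical, and it simply counts how many levels of pairwise additions are needed.

```python
def fan_in_sum(values):
    """Sum by pairwise fan-in; returns (total, number of parallel steps)."""
    level = list(values)
    steps = 0
    while len(level) > 1:
        # All additions within one level are independent of each other,
        # so they could run at the same time on different processors.
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:      # an odd leftover element is passed on unchanged
            nxt.append(level[-1])
        level = nxt
        steps += 1
    return level[0], steps

total, steps = fan_in_sum([1, 2, 3, 4, 5, 6, 7, 8])  # (36, 3): 3 = log2(8) steps
```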
Odd – even Partitioning
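\[
\begin{aligned}
v_j &= \sum_{k=0}^{n-1} c_k \exp\!\left(\frac{2\pi i\, jk}{n}\right) \\
&= \sum_{k=0}^{n/2-1} c_{2k} \exp\!\left(\frac{2\pi i\, j\,2k}{n}\right)
 + \sum_{k=0}^{n/2-1} c_{2k+1} \exp\!\left(\frac{2\pi i\, j(2k+1)}{n}\right) \\
&= \sum_{k=0}^{m-1} c_{2k} \exp\!\left(\frac{2\pi i\, jk}{m}\right)
 + \omega^{j} \sum_{k=0}^{m-1} c_{2k+1} \exp\!\left(\frac{2\pi i\, jk}{m}\right)
\end{aligned}
\]

with m = n/2 and ω = exp(2πi/n).

Butterfly:
a_j → a_j + ω^j b_j
b_j → a_j - ω^j b_j

\[
\begin{aligned}
v_{j+m} &= \sum_{k=0}^{m-1} c_{2k} \exp\!\left(\frac{2\pi i\, (j+m)k}{m}\right)
 + \omega^{j+m} \sum_{k=0}^{m-1} c_{2k+1} \exp\!\left(\frac{2\pi i\, (j+m)k}{m}\right) \\
&= \sum_{k=0}^{m-1} c_{2k} \exp\!\left(\frac{2\pi i\, jk}{m}\right)
 - \omega^{j} \sum_{k=0}^{m-1} c_{2k+1} \exp\!\left(\frac{2\pi i\, jk}{m}\right)
\end{aligned}
\]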
FFT: Recursive Algorithm
FUNCTION (v_0, ..., v_{n-1}) = IDFT(c_0, ..., c_{n-1}, n)
  IF n == 1
    v_0 = c_0 ;
  ELSE
    m = n/2 ;
    (g_0, ..., g_{m-1}) = IDFT(c_0, c_2, ..., c_{n-2}, m) ;
    (u_0, ..., u_{m-1}) = IDFT(c_1, c_3, ..., c_{n-1}, m) ;
    ω = exp(2iπ/n) ;
    FOR j = 0 : m-1
      v_j = g_j + ω^j u_j ;
      v_{j+m} = g_j - ω^j u_j ;
    END
  END
Partitioning the sums in the IDFT into even and odd coefficients delivers the first and the second half of v = IDFT(c).
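The recursive algorithm can be tried out directly. The following sketch assumes that n is a power of two and that the inverse transform carries no 1/n factor, as in the sums above. The Python function name `idft` is chosen here for illustration.

```python
import cmath

def idft(c):
    """Recursive radix-2 IDFT: v_j = sum_k c_k * exp(2*pi*i*j*k/n)."""
    n = len(c)
    if n == 1:
        return [complex(c[0])]
    m = n // 2
    g = idft(c[0::2])   # transform of the even-indexed coefficients
    u = idft(c[1::2])   # transform of the odd-indexed coefficients
    v = [0j] * n
    for j in range(m):
        w = cmath.exp(2j * cmath.pi * j / n)   # omega**j
        v[j] = g[j] + w * u[j]                 # first half of v
        v[j + m] = g[j] - w * u[j]             # second half of v
    return v

v = idft([0, 1, 0, 0])  # approximately [1, i, -1, -i]
```

Each call does O(n) work for the butterflies and makes two calls of half the size, which gives O(n log n) operations in total.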
Recursive splitting of the coefficients for n = 8:

(c0, c1, c2, c3, c4, c5, c6, c7)
(c0, c2, c4, c6)   (c1, c3, c5, c7)
(c0, c4) (c2, c6)   (c1, c5) (c3, c7)
(c0) (c4) (c2) (c6)   (c1) (c5) (c3) (c7)
[Figure: data flow of the recursive FFT for n = 8. The input c0, …, c7 is split into (c0, c2, c4, c6) and (c1, c3, c5, c7), then into (c0, c4), (c2, c6), (c1, c5), (c3, c7); butterfly stages combine the partial results g and u into g0, …, g3 and u0, …, u3 and finally into v0, …, v7.]
FFT sequentially
The recursive formulation of the FFT can be written as log(n) simple loops.
The first step is the reordering of the variables by bit reversal:
Index k = (k_p, …, k_1)_2 → (k_1, …, k_p)_2,
e.g. 5 = (0 0 1 0 1)_2 → (1 0 1 0 0)_2 = 16 + 4 = 20, c5 → c20
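The bit reversal of an index can be sketched as follows. The function name `bit_reverse` and its parameter `p` (the number of bits, so n = 2^p) are chosen here for illustration.

```python
def bit_reverse(k, p):
    """Reverse the p-bit binary representation of k."""
    r = 0
    for _ in range(p):
        r = (r << 1) | (k & 1)   # append the lowest bit of k to r
        k >>= 1
    return r

print(bit_reverse(5, 5))                       # 20, the example above
print([bit_reverse(k, 3) for k in range(8)])   # [0, 4, 2, 6, 1, 5, 3, 7]
```

For n = 8 the resulting order 0, 4, 2, 6, 1, 5, 3, 7 matches the leaves of the recursive even-odd splitting.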
After the permutation, the butterflies have to be applied between elements at a certain distance.
Butterfly:
a_j → a_j + ω^j b_j
b_j → a_j - ω^j b_j
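Putting the permutation and the butterflies together gives a loop-based version. This is a sketch under the same assumptions as the recursion (n a power of two, no 1/n factor). The names `idft_iterative` and `bit_reverse` are illustrative, and the twiddle factor used in a block of length 2d is exp(2πij/(2d)).

```python
import cmath

def bit_reverse(k, p):
    """Reverse the p-bit binary representation of k."""
    r = 0
    for _ in range(p):
        r = (r << 1) | (k & 1)
        k >>= 1
    return r

def idft_iterative(c):
    """Loop-based radix-2 IDFT; len(c) must be a power of two."""
    n = len(c)
    p = n.bit_length() - 1
    # Step 1: bit-reversal permutation of the input.
    v = [complex(c[bit_reverse(k, p)]) for k in range(n)]
    # Step 2: log2(n) butterfly stages with distance d = 1, 2, 4, ...
    d = 1
    while d < n:
        for start in range(0, n, 2 * d):            # blocks of length 2*d
            for j in range(d):
                w = cmath.exp(1j * cmath.pi * j / d)  # exp(2*pi*i*j/(2*d))
                a, b = v[start + j], v[start + j + d]
                v[start + j] = a + w * b
                v[start + j + d] = a - w * b
        d *= 2
    return v

v = idft_iterative([0, 1, 0, 0])  # approximately [1, i, -1, -i]
```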
[Figure: butterfly stage with distance 2 for n = 8. After the bit reversal and the first stage, each block (g0, g1, u0, u1) is combined into (v0, v1, v2, v3): g0 and u0 give v0 and v2, g1 and u1 give v1 and v3.]
[Figure: final butterfly stage with distance 4 for n = 8. (g0, g1, g2, g3) and (u0, u1, u2, u3) are combined into v0, …, v7: g_j and u_j give v_j and v_{j+4}.]
FFT in Parallel
Costs in parallel: n = 2^p processors k = 0, 1, 2, …, 2^p - 1.
Each processor k computes its bit reversal π(k) and sends its entry to the resulting processor π(k).
Then each two neighbouring processors compute a butterfly, log(n) times with growing distance.
In total: n processors, log(n) time steps.
The advantage over the trivial parallelization lies only in the number of processors.
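The scheme can be simulated sequentially to count the time steps. In this sketch the loop over k stands in for the n processors working at the same time. The names `parallel_idft` and `bit_reverse` are hypothetical, and the partner of processor k at distance d is taken to be k XOR d, which is an assumption consistent with the butterfly pattern described above.

```python
import cmath

def bit_reverse(k, p):
    """Reverse the p-bit binary representation of k."""
    r = 0
    for _ in range(p):
        r = (r << 1) | (k & 1)
        k >>= 1
    return r

def parallel_idft(c):
    """Simulate n processors; returns (v, number of synchronous butterfly steps)."""
    n = len(c)
    p = n.bit_length() - 1
    old = [0j] * n
    for k in range(n):                       # processor k sends c[k] to pi(k)
        old[bit_reverse(k, p)] = complex(c[k])
    steps, d = 0, 1
    while d < n:
        new = [0j] * n
        for k in range(n):                   # conceptually: all k at once
            w = cmath.exp(1j * cmath.pi * (k % d) / d)
            partner = k ^ d                  # index at distance d
            if (k & d) == 0:
                new[k] = old[k] + w * old[partner]   # upper butterfly output
            else:
                new[k] = old[partner] - w * old[k]   # lower butterfly output
        old = new
        steps += 1
        d *= 2
    return old, steps

v, steps = parallel_idft([0, 1, 0, 0])  # steps == 2 == log2(4)
```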
[Figure: FFT data flow for n = 8 in parallel: bit reversal of c0, …, c7 as the first step, then butterfly stages producing the intermediate values g and u and the result v0, …, v7.]
FFT on Hypercube
Distribute the entries on the vertices of a hypercube.
The butterfly always has to be applied between entries at distance 1, 2, 4, …
Hence, the binary indices differ in only one position.
Therefore, butterflies have to be computed only between neighboring vertices.
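That the butterfly partners are hypercube neighbors can be checked by listing the pairs per stage. The function name `hypercube_partners` is hypothetical, and the vertices are assumed to be numbered by their binary index.

```python
def hypercube_partners(p):
    """For each stage s, list the butterfly pairs (k, k XOR 2**s), n = 2**p."""
    n = 1 << p
    stages = []
    for s in range(p):
        d = 1 << s                            # distance 1, 2, 4, ...
        stages.append([(k, k ^ d) for k in range(n) if (k & d) == 0])
    return stages

for s, pairs in enumerate(hypercube_partners(3)):
    print(s, pairs)
# 0 [(0, 1), (2, 3), (4, 5), (6, 7)]
# 1 [(0, 2), (1, 3), (4, 6), (5, 7)]
# 2 [(0, 4), (1, 5), (2, 6), (3, 7)]
```

In every pair the two indices differ in exactly one bit, so each stage uses only the edges of one hypercube dimension.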
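[Figure: 3-dimensional hypercube with the entries c0, …, c7 placed on the vertices 000, 001, …, 111; the edges in the directions 1, 2, 3 connect vertices whose binary indices differ in one position.]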