vlsi implementation of high throughput mimo …nopr.niscair.res.in/bitstream/123456789/15159/1/ijems...
TRANSCRIPT
Indian Journal of Engineering & Materials Sciences
Vol. 19, October 2012, pp. 307-319
VLSI implementation of high throughput MIMO OFDM transceiver for
4th generation systems
J Raja* & M Kannan
Department of Electronics Engineering, MIT, Anna University, Chennai 600 044, India
Received 24 February 2012; accepted 13 August 2012
This paper aims to maximize throughput by minimizing power as minimum as possible. Scores of optimization
techniques such as FFT, IFFT and memory optimization are available for reducing power of mobile OFDM systems. An
approach for achieving reduction in power of MIMO OFDM system by optimizing FFT architecture is addressed in this
paper. Memory references in MIMO OFDM transceiver are costly due to their long delay and high power consumption. To
implement fast Fourier transform (FFT) algorithms on MIMO OFDM, involves many memory references to access butterfly
inputs and twiddle factors. Conventional FFT implementations require unused memory references to load the same twiddle
factors for butterflies from different stages in FFT diagrams. To minimize memory references due to twiddle factors for
implementing FFT algorithms in MIMO OFDM systems, memory reference reduction method is incorporated here. Twiddle
factor is calculated using binary scaling technique. The proposed FFT structure is the combination of memory reference
reduction method with binary scaling technique and Radix-4 booth multiplier. Here multipliers in FFT are realized with
shifters, adders and subtractors. The proposed structure is evaluated using performance parameters such as BER and SNR.
Structural realization and analysis pertaining to timing, power and throughput are implemented in Virtex-4 and analysis is
carried out in Altera respectively.
Keywords: Multiple input multiple output (MIMO), Orthogonal frequency division multiplexing (OFDM),
Fast Fourier transform (FFT), Inverse fast Fourier transform (IFFT), Inter symbol interference (ISI)
Multiple input multiple output (MIMO) system
consists of multiple antennas at the transmitter and
receiver ends to improve link reliability and data rates
of the wireless communication system. Orthogonal
frequency division multiplexing is efficient in
synchronizing the received signal under fading
environment and has been used in past times in
applications that require a huge data rate. Fast Fourier
transform (FFT)/ inverse FFT (IFFT) processors are
proposed for multiple-input multiple-output
orthogonal frequency division multiplexing based
IEEE 802.11n. Here the processor not only supports
the operation of FFT/IFFT but also provides sufficient
throughput rates but the drawback is hardware
complexity is more, compared with conventional
approaches1. FFT architecture which is optimized for
a processor with a memory system containing a cache
is discussed elsewhere2. The processor uses a cached
memory with increased efficiency and performance
for MIMO system, but the achieved throughput is
very less. The architecture is mainly focused on
minimum mean square error detector in decoding
section for wireless specification, but the difficulty is
hardware complexity3. The reported architecture
4 is
implemented in matrix inversion circuit in MIMO
detection wireless specification, the problem is the
increase in hardware complexity and throughput is
less compared with conventional approach. Variable
length FFT processor design based on Radix 2, Radix
4 and Radix 8 with complex multipliers containing 3
multiplications described earlier5, consumes more
power.
ASIC based MIMO for OFDM provides data rate
of 192 Mbps with a 20 MHz bandwidth for IEEE
802.11a standard. Here the paper mainly focuses on
silicon complexity of MIMO OFDM system.
Throughput of the system is very less compared with
other systems6. MIMO OFDM Base band transceiver
implementation is based on ASIC. Verification is
based on testbed and GUI monitor. Here the authors
have developed separate testbeds for MIMO OFDM
system for verification purpose but haven’t
concentrated on throughput and BER7. In order to
meet IEEE 802.11n requirements8,9
the processor not
only supports the operation of FFT/IFFT in 128 points
and 64 points but can also provide different _______________________
*Corresponding author (E-mail: [email protected])
INDIAN J. ENG. MATER. SCI., OCTOBER 2012
308
throughput rates for simultaneous data sequences. The
difficulty is number of point’s increases more and
more. Hence, the cost and complexity of the system is
also increased. IEEE 802.11a established for WLAN
standard, provides 54 Mbps throughput using SISO
OFDM transceiver. Here the throughput of SISO
OFDM is very less; to increase the throughput,
MIMO OFDM is preferred10
. The IEEE 802-11n
provides data rate up to 600 Mbps with transmission
speed of 80 MHz. Latency is incurred by
pre-processing the channel matrices for MIMO
detection11
. Different approaches for design and
implementation of 4×4 MIMO OFDM transceiver
which provides date rate of 1 Gbps is discussed
elsewhere12-15
. Here MMSE MIMO detector is used to
reduce the latency but the disadvantage is that it
requires larger circuit for high speed transmission. In
any n bit 2’s complement multiplier the maximum
number of addition and subtraction operation is n/216
.
However, the above mentioned results are still
insufficient in day to day requirements like
transmission speed and power. In this paper, the VLSI
implementation of MIMO OFDM transceiver suitable
for low power application has been studied. The FFT
optimization helps to reduce the delay time, power
thereby simultaneously increasing speed.
Orthogonal Frequency Division Multiplexing
Orthogonal frequency division multiplexing
(OFDM) is a technique which is a subset of frequency
division multiplexing in which a single channel uses
multiple sub-carriers on neighbouring frequencies.
The sub-carriers of an OFDM system overlap each
other. Normally the neighbouring channels which
overlap may obstruct one another. Since the sub-
carriers in an OFDM system are exactly orthogonal to
one another, they are capable of overlapping without
interfering. As a result, OFDM systems are capable of
maximizing spectral efficiency without causing
neighbouring channel interference. Since multiple
sub-carriers are used to transmit in an individual
channel, an OFDM communication system must
perform several steps, which are described in Fig.1.
In an OFDM system, each channel is divided into a
number of sub-carriers. A serial bit stream is converted
into several parallel bit streams. After the bit stream
division among the individual sub-carriers, each sub-
carrier is modulated. The receiver performs the vice-
versa process, which first divides the incoming signal
into appropriate sub-carriers and then demodulate them
individually before recovering the original bit stream.
The data is modulated into waveform at inverse
fast Fourier transform (IFFT) stage of the transmitter.
The IFFT is used to modulate each sub-channel onto
the appropriate carrier. The receiver performs the
vice-versa process of the transmitter. FFT is used to
demodulate the data.
One of the main drawback in wireless
communication systems is ISI which is caused due to
the multi-path reflections in the channel. In order to
reduce ISI a cyclic prefix is added. The added cyclic
prefix is a part of the first stage of a symbol which is
described in Fig. 2. This addition is necessary because
it enables multi-path representations of the source
signal to fade, so that it won’t interfere with the
subsequent symbol.
After the addition of cyclic prefix to the sub-carrier
channels, it must be transmitted as a single signal.
Thus, the process of parallel to serial conversion stage
is done by adding all sub-carriers and combining them
into a single signal. At the end of this process all
sub-carriers are generated perfectly in a simultaneous
manner.
MIMO OFDM
MIMO OFDM (multiple input multiple output,
orthogonal frequency division multiplexing) is a
technology that combines MIMO and OFDM together
to transmit data in wireless communications, in order
to deal with frequency selective channel effect. The
OFDM signal on each subcarrier can overcome
narrowband fading. Therefore, OFDM can transform
Fig. 1 OFDM structure
Fig. 2 Cyclic prefix insertion format
RAJA & KANNAN: VLSI IMPLEMENTATION OF HIGH THROUGHPUT MIMO OFDM TRANSCEIVER
309
frequency-selective fading channels into parallel flat
ones. Then by combining MIMO and OFDM
technology together, MIMO algorithms can be
applied in broadband transmission
A MIMO OFDM system transmits data modulated
by OFDM from multiple antennas simultaneously. At
the receiver, after OFDM demodulation, the signals
are recovered by decoding each sub-channel from all
transmitting antennas.
In the basic structure of MIMO OFDM shown in
Fig. 3, the signals are modulated by OFDM
modulator, then they are transmitted by MIMO
system, finally, the signals are recovered by the
OFDM demodulator.
Therefore, MIMO OFDM achieves spectral
efficiency, increased throughput and thus the
inter-symbol interference (ISI) can be reduced.
Fast Fourier Transform
Conventional FFT algorithm
The modulation of OFDM can be done using an
IDFT. The fast implementation of IDFT is
accomplished by deploying IFFT, which reduces
processing time and hardware usage. The
demodulation is through an DFT or better by using
FFT, which is an efficient way of implementation.
In order to reduce numerous calculations involved
in calculation of DFT, FFT is used. To achieve the
increased efficiency an additional step is involved
(to reverse order the data). These additional steps
won’t increase the computational complexity of the
FFT calculation. As a result, FFT is an extremely
efficient algorithm that provides a good
implementation in hardware.
If N is very large, the computation efficiency is
increased by dividing DFT successively in the smaller
calculation. The DFT of discrete signal x(n) can be
computed directly using Eq. (1).
1
0
( ) ( )N
nk
N
n
X K x n W−
=
=∑ , k=0,1,2,3............N-1 … (1)
Calculation of FFT is done using decimation in
time (DIT) or decimation in frequency (DIF)
algorithm. If it is DIT, first twiddle factor is
multiplied and later it is summed up. It is computed
with the help of Eq. (2). The butterfly diagram
pertaining to DIT is shown in Fig. 4.
)1,2,1,0( )()(
)()(
)12()2(
)()()()(
21
12
0 2
2
12
0 2
1
12
0
)12(
12
0
2
1
)(0
1
)(0
1
0
−=+=
+=
++=
+==
∑∑
∑∑
∑∑∑
−
=
−
=
−
=
+
−
=
−
=
−
=
−
=
NkkXWkX
WrxWWrx
WrxWrx
WnxWnxWnxkX
k
N
N
r
rk
N
k
N
N
r
rk
N
N
r
kr
N
N
r
rk
N
N
oddn
nk
N
N
evenn
nk
N
N
n
nk
N
�
… (2)
If it is DIF, first the input is summed up and then
the twiddle factor is multiplied. It is computed using
the Eq. (3). The butterfly diagram pertaining to DIF is
shown in Fig. 5.
)1,1,0( )2
()1()(
)2
()(
)2
()(
)()()()(
12
0
12
0
2
12
0
)2
(1
2
0
1
2
12
0
1
0
−=
+−+=
++=
++=
+==
∑
∑
∑∑
∑∑∑
−
=
−
=
−
=
+
−
=
−
=
−
=
−
=
NkWN
nxnx
WN
nxWnx
WN
nxWnx
WnxWnxWnxkX
N
n
nk
N
k
N
n
nk
N
kN
N
N
n
kN
n
N
N
n
nk
N
N
Nn
nk
N
N
n
nk
N
N
n
nk
N
�
… (3)
The conventional N point FFT, requires (N/2) log2
N multiplication operations and N log2 N addition
operations. If it is N bit multiplier, it will generate 2N
partial products. The entire FFT butterfly module of
Fig. 3 MIMO OFDM structure
Fig. 4 Radix 2 DIT FFT butterfly diagram
INDIAN J. ENG. MATER. SCI., OCTOBER 2012
310
conventional FFT is shown in Fig. 5a. Due to the
partial products the delay is increased and it consumes
more power. Hence, it will affect the throughput of
the mobile system. Thus, there is a need for reducing
the consumed power with decreased delay thereby
dragging down the throughput of the system for
which we propose modified FFT algorithms which
helps to overcome the above mentioned issues.
Modified fast Fourier transform algorithm
During FFT implementation identical twiddle
factors occupies enormous memory. In this paper
identical twiddle factors are grouped together and
computed, before computing the other twiddle factors.
The twiddle factors are fed in to the structure only
once and the amount of memory due to the identical
twiddle factors can be minimized. The twiddle factors
are computed using the binary scaling method. In
binary scaling the twiddle factors are calculated by
simply shifting the bits. The entire multiplication
operation is completely replaced by shifting
operation.
For example, in DIT butterfly diagram shown in
Fig. 3, the butterflies with twiddle factor W160 appears
in Stage 1, Stage 2, Stage 3 and Stage 4. Hence, these
twiddle factors are grouped together and computed in
any order in the 16-pt radix-2 DIT FFT diagram. This
can be achieved using the below identities (5)-(8).
WWWWNm
N
Nm
N
N
N
m
Nj
4/4/4/.
−−
−== m (N/4, N/2) … (5)
= WWWNm
N
Nm
N
N
N
2/2/2/.
−−
−= m (N/2, 3N/4) … (6)
= WWWNm
N
Nm
N
N
Nj
4/34/34/3.
−−
= m (3N/4, N) … (7)
= Wm
N
m (0, N/4) … (8)
The butterflies with twiddle factor W164 which
appears in Stage 4, Stage 2 and Stage 3 are computed.
Hence, twiddle factors are grouped together and
computed in any order in the 16-pt radix-2 DIT FFT
diagram. Similarly, the butterflies with twiddle factors
W162 and W16
6 appears only in Stage3 and Stage4 of the
16-pt radix-2 DIT FFT diagram. These twiddle factors
are grouped together and computed, without affecting
the computations of other butterflies. The butterflies
can be computed together by loading only one twiddle
factor mWm
Nand it is described in Eqs (9)-(11).
=m (n mod N/2i-1
)x2i-1
… (9)
jWWiiiii xNn
N
kxNNn
N −
−−−−+
=+
11111 2))2/mod(2))2/mod())2/((
… (10)
jWWiiii xNn
N
xNn
N −
−−−−
=
1111 2))2/mod(2))2/mod(
… (11)
where i is the number of stages. Finally, twiddle factors W16
1, W16
3, W16
5 and W16
7
appear only in Stage 4. These twiddle factors are
grouped together and computed, without affecting the
computations of other butterflies. Each twiddle factor is
loaded only once and redundant memory references for
identical twiddle factors are removed which is shown in
Fig. 4. Twiddle factor W165 can be replaced by -jW16
1
with a simple derivation as described in Eq. (12).
Fig. 5– Radix 2 DIF FFT butterfly diagram
Fig. 5a Block diagram of conventional FFT butterfly module
Fig. 6 Modified DIT FFT butterfly diagram
RAJA & KANNAN: VLSI IMPLEMENTATION OF HIGH THROUGHPUT MIMO OFDM TRANSCEIVER
311
W16 5
= W16 –j*(2π/16)*5
= (W16 –j*(2π/16)*4
) * (W16 –j*(2π/16)*1
)
= -jW16 1
… (12)
Similarly, twiddle factors W166 and W16
7 can be
replaced by -jW162 and -jW16
3, respectively.
The entire FFT butterfly module of modified FFT
is shown in Fig. 6a. Modified FFT is the combination
of binary scaling, which will convert floating point in
to fixed point and Radix 4 multiplier, which will
generates N/3 partial product.
Here, binary scaling technique is used to calculate
the twiddle factor. Binary scaling is a programming
technique used mainly in DSP to perform a pseudo
floating point using integer arithmetic. It is both faster
and more accurate than directly using floating point.
In binary scaling the floating point is multiplied with
256, which will convert floating point in to fixed
point and it is given in Table 1.
Radix-4 multiplier
The complexity of conventional FFT is dominated
by multiplier. Ordinary multipliers generate 2N partial
products. Due to the partial products, the latencies and
power consumption of the MIMO OFDM transceiver
are increased. The Radix-4 booth multiplier generates
N/3 partial product. It consumes less power and short
delay because of its high speed parallel multiplier. The
proposed FFT structure is the combination of binary
scaling technique and Radix-4 booth multiplier.
In general Radix-4 booth multiplier is preferred rather
than Radix-2 booth multiplier because of the following
disadvantages of Radix-2 booth multiplier: (i) the number
of addition, subtraction and shift operations in Radix-2
booth algorithm are variable and it is very difficult to
incorporate in designing of parallel multipliers, (ii) the
Radix-2 multiplier becomes suitable only when there are
isolated 1’s and (iii) the number of partial product is N.
The above problems are nullified by using
modified Radix-4 booth algorithm. It requires only
N/3 partial product. The procedure of Radix-4 booth
algorithm is described as: (i) if N is even then extend
the sign bit, (ii) append a 0 to the LSB of the
multiplier and (iii) based on the block value, each
partial product will be 0, +X, -X, +2X and -2X. The
negative values of X are obtained by taking the 2’s
complement. Based on the above result the booth
encoded table is developed and tabulated in the Table 2.
The recoding is done by encoding three bits of
multiplier X.
The proposed FFT processor described above has
been implemented using Xilinx Virtex-4 FPGA.
Based on the analysis, device utilization summary is
given in Table 3. From the results, we can observe
that the modified FFT algorithm requires only 10% of
slices, 11% of flip flops and 7.5% of LUTs compared
to the conventional FFT Algorithm.
Fig. 6a Block diagram of modified FFT Butterfly Module
Table1 Binary scaled value
S.No Twiddle factor Binary scaled value
1 1+j0 256+j0
2 0+j1 0+j256
3 0-j1 0-j256
4 0.707+j0.707 181+j181
5 -0.707-j0.707 -181-j181
Table 2 Radix-4 encoded table
Block Partial product
000 0*X
001 1*X
010 1*X
011 2*X
100 -2*X
101 -1*X
110 -1*X
111 0*X
Table 3 Device utilization summary
Modified FFT
algorithm
Conventional FFT
algorithm
Logic
Utilization
Used Available Used Available
No of slices 74 6144 782 6144
No of slices
Flip flop
92 12288 1040 12288
No of 4 input
LUTS
142 12288 1080 12288
No. of
bonded IOBs
187 240 321 240
No of
GCLKS
1 32 1 32
No. of DSP
48s
12 32 32 32
INDIAN J. ENG. MATER. SCI., OCTOBER 2012
312
Throughput of the mobile system entirely depends
on the power consumption and delay. The proposed
system consumes less power and short delay. Hence,
the throughput of the mobile system is increased.
Based on the results, the performance of the MIMO
OFDM is compared and given in Table 4.
Results of Simulation Code development and simulation were carried out
on a system running on Intel P4 configuration having
4 GB RAM working in windows 7 Platform. The
MIMO-OFDM structure for determination of
constellation points and BER were coded and
simulated in MATLAB. Structural realisation of
MIMO OFDM transceiver is implemented in Xilinx.
Analysis pertaining to timing, frequency is done in
Altera. The experimental results and analysis obtained
using Matlab, Xilinx and Altera are presented and
discussed.
MATLAB implementation output
QAM able to carry higher data rates than the
other schemes. QAM sends digital information by
periodically adjusting the phase and amplitude. Four-
QAM uses four combinations of phase and amplitude.
Moreover, each combination is assigned a 2-bit digital
pattern. For example, to generate the bit stream
(1,1,0,0,1,1). Because each symbol has a unique 2-bit
digital pattern, these bits are grouped in two’s so that
they can be mapped to the corresponding symbols. In
our example, the original bit stream (1,1,0,0,1,1)
grouped into three symbols (11,00,11). The
constellation plot in Fig. 7 shows the symbol map of
4-QAM with each possible phase (Θ) and amplitude
(A) of a carrier signal in polar coordinate form.
Bit error rate (BER) is defined as the ratio of
number of error bits and total number of bits. There
are many ways of reducing BER. Here, we focused on
Alamouti code and modulation techniques. In this
study we have taken AWGN is communication
channel where noise gets spread over the entire
spectrum of frequencies. BER has been calculated by
comparing the transmitted signal with the received
signal and computing the error count over the total
number of bits. For any given modulation, the BER is
normally expressed in terms of signal to noise ratio
(SNR). The bit error rate is improved from 10-6
with
the almost need of 10 dB of SNR as shown in Fig. 8.
FPGA implementation output
Table 5 shows the percentage minimization of
power, memory references, space in bytes and clock
cycles, and Table 6 shows the reduction in number of
slices, Flip flops, LUT’s and I/O blocks.
Fig. 7 QPSK signal constellation points
Table 4– Performance Comparisons
S.No Parameter MIMO
OFDM
Mixed Radix
FFT
architecture
based
MIMO
OFDM
Modified
FFT
architecture
based
MIMO
OFDM
1 Power dissipation 597 mW 540 mW 111 mW
2 Number of
memory reference
32 24 15
3 Storage space in
bytes
32 30 12
4 No. of clock cycle 698 600 540
5 Throughput 600 Mbps 700 Mbps 1.4 Gbps
6. Delay 9.43 ns 8.89 ns 4.04 ns
7. Logic elements 9940 9813 5844
Table 5 Percentage reduction of parameters compared with
conventional approaches
Sl. No Parameter A B C
1 % of power
dissipation
90 20 5
2 % of
memory
reference
75 45 11
3 % of space
in bytes
94 38 9
4 % of clock
cycle
86 77 19
A – Mixed Radix FFT + MIMO OFDM Transceiver;
B – Modified FFT + MIMO OFDM Transceiver
C – Modified FFT + SISO OFDM Transceiver
RAJA & KANNAN: VLSI IMPLEMENTATION OF HIGH THROUGHPUT MIMO OFDM TRANSCEIVER
313
Synthesis output SISO OFDM RTL view
The OFDM structure was synthesized and
simulated in Vertex-4 as shown in Fig. 9. The
proposed method can achieve average of 11%
reduction in number of memory references, 9%
saving of memory space due to the twiddle factor and
average of 19% reduction in number of clock cycles.
MIMO OFDM RTL view
MIMO OFDM structure was synthesized and simulated
in Vertex-4 FPGA as shown in Fig. 10. The proposed
structure was constructed using four transmitting and four
receiving antennas. Transmitter outputs and receiver inputs
are combined using spatial multiplexing technique. The
proposed method can achieve average of 20% reduction in
number of memory reference, 38% saving of memory
space due to twiddle factor and average of 77% reduction
in the number of clock cycle as compared with
conventional approaches which is evident in Table 5.
FFT RTL View
To implement the conventional FFT method, it
requires 1040 number of FFs, 321 numbers of I/O
blocks and 1080 number of LUTs. Number of FFs
and number of LUTs decide the complexity and cost
which is shown in Fig. 11.
Table 6 Device utilization expressed in percentage
Sl. No Utilization A B
1 % of Slices 3 13
2 % of Slices flip flop 1 8
3 No. of 4 Input LUTS 1.1 9
4 % of bonded IOBs 78 99.99
A – Modified FFT; B– Conventional FFT
Fig. 8 SNR versus BER
Fig. 9 OFDM transceiver RTL output
INDIAN J. ENG. MATER. SCI., OCTOBER 2012
314
Proposed FFT RTL View
To implement the proposed FFT method, it requires
92 numbers of FFs, 187 numbers of I/O blocks and
142 numbers of LUTs. When compared with conventional
method, the proposed method requires only 1% of FFs, 78%
of IOBs and 1.1% of LUTs. This is depicted in Fig. 12.
Analysis output
Power analysis
Power consumed by the MIMO OFDM
transceiver system is 111 mW is shown in Fig. 13.
Total power is the sum of the static power, dynamic
power and I/O power.
Fig. 10 MIMO OFDM transceiver RTL output
Fig. 11 FFT RTL output
RAJA & KANNAN: VLSI IMPLEMENTATION OF HIGH THROUGHPUT MIMO OFDM TRANSCEIVER
315
Timing analysis
Fmax of MIMO OFDM system is 235 Mhz. Based
on the Fmax Throughput of the system is calculated
as 1.4 Gbps. Fmax of MIMO OFDM transceiver is
shown in the Fig. 14.
Simulation output
FFT Simulation output
Simulation output of FFT is shown in Fig. 15. The
binary inputs in0, in1, in2, in3, in4, in5, in6 and in7
are passed through the butterfly units. Finally, the real
part outputs are taken from outr0,outr1,outr2,outr3,
Fig. 12 Modified FFT with binary scaling technique and radix-4 Booth multiplier RTL output
Fig. 13 Power analysis Output
INDIAN J. ENG. MATER. SCI., OCTOBER 2012
316
outr4,outr5,outr6 and outr7 and imaginary part
outputs are taken from outi0,outi1,outi2,outi3,
outi4,outi5,outi6 and outi7.
Modified FFT simulation output
Simulation output of modified FFT with binary
scaling radix-4 booth multiplier technique is shown in
the Fig. 16. The binary inputs in0, in1, in2, in3, in4,
in5, in6 and in7 are passed through the memory
reference reduction technique and higher radix
multiplier. Finally, the real part outputs are taken
from outr0,outr1, outr2, outr3, outr4, outr5, outr6 and
outr7 and imaginary part outputs are taken from outi0,
outi1, outi2,outi3, outi4, outi5, outi6 and outi7.
Simulation output
Simulation output 4×4 MIMO OFDM transceiver
is shown in Fig. 17. The inputs in, in2, in3 and in4 are
Fig. 14 Timing analysis Output
Fig. 15 FFT simulation output
RAJA & KANNAN: VLSI IMPLEMENTATION OF HIGH THROUGHPUT MIMO OFDM TRANSCEIVER
317
passed through the modified FFT algorithm in the
transmitter section. The transmitted outputs are
spatially multiplexed and the transmitted data are
received by receiver using MRC (maximum ratio
combining) techniques. Finally, the outputs are taken
from out1, out2, out3 and out4.
Performance analysis
Power analysis
Proposed system requires less power which is very
less compared with other systems. Power is inversely
proportional to throughput. Maximum throughput is
achieved by minimizing power as minimum as
possible. The proposed system power is 111 mW
which is minimum. Hence, throughput of the system
is increased as shown in Fig. 18.
Delay analysis
Proposed system generates short delay which is
very less compared with other systems. Delay is
inversely proportional to the throughput. Maximum
throughput is achieved by minimizing delay as
Fig. 16 Modified FFT with binary technique and radix-4 booth multiplier simulation output
Fig. 17 MIMO OFDM transceiver Simulation output
INDIAN J. ENG. MATER. SCI., OCTOBER 2012
318
minimum as possible. The gate delay of proposed
system is 4.04 ns which is very small. Hence, the
throughput of the system can be increased as shown in
Fig. 19.
Throughput analysis
The throughput of the proposed system is 1.4 Gbps,
this can be achieved with the help of modified FFT
architecture. Modified FFT architecture reduces
redundant memory reference, number of partial
products and number of non zero digits as shown in
Fig. 20.
Logic elements analysis
Only 58% of logic elements is required to
implement the proposed system. The number of logic
elements determines the cost and complexity of the
hardware. When compared with other system the
proposed system requires very minimum number of
logic elements, hence the cost and complexity of the
system is reduced as shown in Fig. 21.
Power dissipation versus throughput
Maximum throughput is obtained at minimum
power that is 111 mW power. When the power
increases more and more correspondingly throughput
of the system gradually decreases as shown in Fig. 22.
From the graph we conclude that the proposed system
consumes less power and transmits maximum number
bits per second.
Memory reference versus performance parameters of different
MIMO OFDM systems
The main objective of this paper is to reduce the
power and increase the throughput. Memory reference
Fig. 18 Power versus different MIMO OFDM systems
Fig. 19 Delay versus different MIMO OFDM systems
Fig. 20 Throughput versus different MIMO OFDM systems
Fig. 21 Logic element analysis versus different MIMO OFDM
systems
RAJA & KANNAN: VLSI IMPLEMENTATION OF HIGH THROUGHPUT MIMO OFDM TRANSCEIVER
319
in FFT is costly due to the long delay and high power
consumption. Memory reference reduction method is
used in FFT to eliminate redundant memory
reference. Based on the result the graph is plotted in
Fig. 23. From the graph we conclude that the
proposed method delivers 1.4 Gbps maximum
throughput with minimal number of memory
references of 15 with the corresponding delay of
4.04 ns and the power is 111 mW.
Conclusions
In this work, we have presented a VLSI
implementation of high throughput MIMO OFDM
transceiver for 4G system which achieves 1.4 Gbps
throughput. The circuit was implemented in a 28 nm
(transistor gate size) library with less circuit area and
evaluated in lower power dissipation. This can be
achieved by optimizing FFT architecture (memory
reference reduction and binary scaling methods are
used to minimize redundant memory references) and
Radix-4 booth multiplier algorithm (Radix-4
multiplier technique is used to reduce the power
dissipation) which reduces number of partial products.
References 1 Fu Bo & Ampadu Paul, J Signal Process Syst, 56(1) (2009)
59-68.
2 Chang Y & Park S C, IEICE Trans Fundamentals, E87-A
(11) (2004) 3020-3024.
3 Kim Hun Seok, Zhu Weijun, Bhatia Jatin, Mohammed
Karim, Shah Anish & Daneshrad Babak, EURASIP J Adv
Signal Process, 2008.
4 LaRoche Isabelle & Roy Sebastien, An efficient regular
matrix inversion circuit architecture for MIMO processing,
IEEE Int Symp on Circuits and Systems (ISCAS), May 2006,
pp.4819-4822.
5 Lin Y T, Tsai P Y & Chiueh T D, IEE Proc Comput Digit
Technol, 152(4) (2005) 499-506.
6 Perels D, Haene S, Luethi P, Burg A, Felber N, Fichtner W
& Bolcskei H, IEEE Trans VLSI Syst, 5 (2005) 215-218.
7 Greisen Pierre, Haene Simon & Burg, EURASIP J Embedded
Syst, 2008, Article ID242584.
8 Reisis D & Vlassopoulos N, IEEE Trans Circuits Syst I,
55(11) (2008) 3438-3447.
9 Radhouane R, Liu P & Modlin C, in Proc. IEEE Int Symp
Circuits Syst, 1 (May 2000) 116-119.
10 Yoshizawa Shingo & Miyanaga Yoshikazu, VLSI
implementation of SISO-OFDM and MIMO-OFDM
transceivers, IEEE Int Symp Communtions and Information
Technologies (ISCIT), No. T2D-4, Oct 2006.
11 Yoshizawa Shingo, Yamauchi Yasushi & Miyanaga
Yoshikazu, A complete pipelined MMSE detection
architecture in a 4x4 MIMO-OFDM receiver, IEEE Int
Symp on Circuits and Systems (ISCAS), May 2008, pp.
1248-1251.
12 Yoshizawa Shingo, Yamauchi Yasushi & Miyanaga
Yoshikazu, VLSI Architecture of a 4x4 MIMO-OFDM with
an 80-MHz Channel Bandwidth Transceiver, IEEE Int Symp
on Circuits and Systems (ISCAS), May 2009, pp. 4244-4247.
13 Yoshizawa Shingo, Yamauchi Yasushi & Miyanaga
Yoshikazu, VLSI Implementation of a 4x4 MIMO-OFDM
Transceiver for 1Gbps Data Transmission, IEEE Int Symp
on Circuits and Systems (ISCAS), May 2010, pp. 1743-1746.
14 Lamarca Rey F & Vazquez M G, IEEE Trans Signal
Process, 53 (3) (2009) 1741-1755.
15 Shin M & Lee H, A high-speed four-parallel radix-2 4
FFT/IFFT processor for UWB applications, Proc. IEEE Int.
Symp. Circuits and Systems, May 2008, pp.960-963.
16 Ma G K & Taylor F J, IEEE ASSP Mag, Jan 1990, pp. 6-20.
Fig. 22 Power versus throughput
Fig. 23 Memory Reference versus Performance parameters of
Different MIMO OFDM Systems