docu
DESCRIPTION
signed and unsigned booths multiplierTRANSCRIPT
High speed Modified Booth Encoder multiplier for signed and unsigned numbers
1. INTRODUCTION
In digital computing systems multiplication is an arithmetic operation. The multiplication
operation consists of producing partial products and then adding these partial products the
final product is obtained. Thus the speed of the multiplier depends on the number of partial
product and the speed of the adder. Since the multipliers have a significant impact on the
performance of the entire system, many high performance algorithms and architectures have
been proposed. The very high speed and dedicated multipliers are used in pipeline and vector
computers. The high speed Booth multipliers and pipelined Booth multipliers are used for
digital signal processing (DSP) applications such as for multimedia and communication
systems . High speed DSP computation applications such as Fast Fourier transform (FFT)
require additions and multiplications.
Multiplication is a basic arithmetic operation that is important in microprocessors,
digital signal processing, and other modern electronic machines. VLSI Designers have
recognized this importance by dedicating significant resources and area to integer and
floating point multipliers. As a result, it is desirable to reduce the cost of these multipliers by
using efficient algorithms that do not compromise performance
Dept of ECE,M.Tech(VLSI),SITAMS Page 1
High speed Modified Booth Encoder multiplier for signed and unsigned numbers
2. BRIEF LITERATURE REVIEW
. This report describes an extension to signed and unsigned modified booth’s
encoder(SUMBE) for binary multiplication that reduces the cost of such high performance
multipliers.
To explain this scheme, background material on existing algorithms is presented.
These algorithms are then extended to produce the new method. Since the method is
somewhat complex, this report will attempt to stay away from implementation details, but
instead concentrate on the algorithms in a hardware independent manner.
Multiplication is more complicated than addition, being implemented by shifting as well as
addition.
Multiplication: Partial products generation + accumulation
Because of the partial products involved in most multiplication algorithms, more
time and more circuit area is required to compute, allocate, and sum the partial products to
obtain the multiplication result.
In order that the basic algorithms are not obscured with small details, unsigned
multiplication only will be considered here, but the algorithms presented are easily
generalized to deal with signed numbers.
Multiplication consists of three major steps:
1) recoding and generating partial products;
2) reducing the partial products by partial product reduction schemes (e.g., Wallace
tree to two rows;
3) adding the remaining two rows of partial products by using a carry-propagate
adder (e.g., carry look ahead adder) to obtain the final product. There are already
many techniques developed in the past years for these three steps to improve the
performance of multipliers.
Dept of ECE,M.Tech(VLSI),SITAMS Page 2
High speed Modified Booth Encoder multiplier for signed and unsigned numbers
Fig 1:Types of digital multiplier
Types of high-speed multipliers:
Sequential multiplier - generates partial products sequentially and adds each
newly generated product to previously accumulated partial product
Parallel multiplier - generates partial products in parallel, accumulates
using a fast multi-operand adder
2.1TREE MULTIPLIERS
2.1.1 WALLACE TREE- Reduce the number of operands at the earliest opportunity.
If there are m dots in a column, apply m/3 full adders to that column. Tends to
minimize overall delay by making the final CPA as short as possible.
2.1.2 DADDA TREE -Reduce the number of operands in the tree to the next lower
n(h) number in the table using the fewest FA’s and HA’s possible. Reduces the
hardware cost without increasing the number of levels in the tree.
Dept of ECE,M.Tech(VLSI),SITAMS Page 3
High speed Modified Booth Encoder multiplier for signed and unsigned numbers
2.2UNSIGNED MULTIPLIERS
2.2.1 ARRAY MULTIPLIER
Array multiplier is well known due to its regular structure. Multiplier circuit is based
on add and shift algorithm. Each partial product is generated by the multiplication of the
multiplicand with one multiplier bit. The partial product are shifted according to their bit
orders and then added. The addition can be performed with normal carry propagate adder. N-1
adders are required where N is the multiplier length.
o No separate circuits for generation and accumulation
o Reduced execution time but increased hardware complexity
Fig 2: 4x4 Array Multiplier
2.2.2 BRAN MULTIPLIER
It is a simple parallel multiplier generally called as carry save array multiplier. It
has been restricted to perform signed bits. The structure consists of array of AND gates and
adders arranged in the iterative manner and no need of logic registers. This can be called as
non –addictive multipliers. Architecture: An n*n bit Braun multiplier is constructed with n
(n-1) adders and n^2 AND gates as shown in the fig., where,
The internal structure of the full adder can be realized using FPGA. Each products
can be generated in parallel with the AND gates. Each partial product can be added with
the sum of partial product which has previously produced by using the row of adders. The
Dept of ECE,M.Tech(VLSI),SITAMS Page 4
High speed Modified Booth Encoder multiplier for signed and unsigned numbers
carry out will be shifted one bit to the left or right and then it will be added to the sum
which is generated by the first adder and the newly generated partial product.
Fig 3:Braun Array
The shifting would carry out with the help of Carry Save Adder (CSA) and the Ripple
carry adder should be used for the final stage of the output. Braun multiplier performs well
for the unsigned operands that are less than 16 bits in terms of speed, power and area. But
it is simple structure when compared to the other multipliers. The main drawback of this
multiplier is that the potential susceptibility of Glitching problem due to the Ripple Carry
Adder in the last stage. The delay depends on the delay of the Full Adder and also a final
adder in the last. The power and area can also be reduced by using two bypassing
techniques called Row bypassing technique and Column bypassing technique
2.3 SIGNED MULTIPLIERS
2.3.1 BAUGH WOOLEY ALGORITHM
The Baugh-Wooley algorithm is a different scheme for signed
multiplication, but is not so widely adopted because it may be complicated to deploy on
irregular reduction trees. We use the Baugh-Wooley algorithm in our High Performance
Multiplier (HPM) tree, which combines a regular layout with a logarithmic logic depth.
The Baugh-Wooley (BW) algorithm [8] is a relatively straightforward way of doing signed
multiplications; Fig. 5 illustrates the algorithm for an 8-bit case, where the partial-product
Dept of ECE,M.Tech(VLSI),SITAMS Page 5
High speed Modified Booth Encoder multiplier for signed and unsigned numbers
bits have been reorganized according to Hatamian’s scheme [9]. The creation of the
reorganized partial-product array comprises three steps:
iii) The most significant bit (MSB) of the first N-1 partial-product rows and all bits of
the last partial-product row, except its MSB, are inverted.
ii) A ’1’ is added to the Nth column.
iii) The MSB of the final result is inverted.
Fig 4: Illustration of an 8-bit Baugh-Wooley multiplication
2.3.2 BOOTHS ALGORITHM
The Booth recoding algorithm is the most frequently used method to generate partial
products . This algorithm allows for the reduction of the number of partial products to be
compressed in a carry-save adder tree. Thus the compression speed can be enhanced. This
Booth–Mac Sorley algorithm is simply called the Booth algorithm, and the two-bit
recoding using this algorithm scans a triplet of bits to reduce the number of partial products
by roughly one half.
The 2-bit recoding means that the multiplier B is divided into groups of two bits, and
the algorithm is applied to this group of divided bits. The Booth algorithm is implemented
into two steps: Booth encoding and Booth selecting. The both encoding step is to generate
one of the five values from the adjacent three bits. The booth selector generates a partial
product bit by utilizing the output signals.
Dept of ECE,M.Tech(VLSI),SITAMS Page 6
High speed Modified Booth Encoder multiplier for signed and unsigned numbers
Table 1:Booth’s encoding.
Disadvantages:
1. Sign-extension for negative multiplicands not applicable for negative multipliers
2. long sequences of 1’s in the multiplier large number of summands
Solution for problem 1.: in case of a negative multiplier, negate both operands by applying
the 2’s complement
Solution for problem 2.:analyze groups of 1’s in the multiplier and replace them by a
shorter and more efficient representation (→Booth algorithm)
2.3.3 MODIFIED BOOTH ALGORITHEM
The original Booth algorithm for the multiplication s = x*y recodes partial
products by considering two bits at a time of one of the operands (x) and encoding them
into (-2,-1,0,1,2). Each such encoded higher-radix number is subsequently multiplied with
the second operand (y), yielding one row of recoded partial-product bits. The advantage of
Booth recoding is that the number of recoded partial products is fewer than the number of
un-recoded partial products, and this can be translated into higher performance in the
circuitry, which reduces all partial-product bits into the final product.
The drawback of the original Booth algorithm is that the number of recoded partial
products depends on input operand x, which makes this algorithm unsuitable for
implementation in hardware. The modified-Booth (MB) algorithm by MacSorley remedies
this by looking at three bits at a time of operand x.
Dept of ECE,M.Tech(VLSI),SITAMS Page 7
High speed Modified Booth Encoder multiplier for signed and unsigned numbers
Table 2:Modified booth’s encoding
Then we are guaranteed that only half the number of partial products will be
generated, compared to a conventional partial-product generation using 2-input AND
gates. Since it has a fixed number of partial products, the MB algorithm is suitable for
hardware implementation. An MB multiplier works internally with a two’s complement
representation of the partial products, in order to be able to multiply the encoded (-1,-2)
with the y operand. To avoid having to sign extend the partial products, we are using the
scheme presented by Fadavi-Ardekani In the two’s complement representation, a change of
sign includes the insertion of a ’1’ at the least significant bit (LSB) position (henceforth
called LSB insertion). To avoid an irregular implementation of the partial-product
reduction circuitry, we draw on the idea called modified partial-product array .
Here, the impact of LSB insertion on the two least significant bits positions of the
partial product is pre-computed. The pre-computation redefines the LSB of the partial
product (Eq. 1) and moves the potential ’1’, which results from the LSB insertion, to the
second least significant position (Eq. 2). Note that our Eq. 2 is different from the
corresponding equation used in illustrates an 8-bit MB multiplication using sign extension
prevention and the modified partial-product array scheme.
Fig 5: Illustration of an 8-bit modified-Booth multiplication.
Dept of ECE,M.Tech(VLSI),SITAMS Page 8
High speed Modified Booth Encoder multiplier for signed and unsigned numbers
3. PROBLEM DEFINITION
The papers presents a design methodology for high speed Booth encoded parallel
multiplier. For partial product generation, a new Modified Booth encoding (MBE) scheme is
used to improve the performance of traditional MBE schemes. But this multiplier is only for
signed number multiplication operation
The conventional modified Booth encoding (MBE) generates an irregular partial
product array because of the extra partial product bit at the least significant bit position of
each partial product row. Therefore papers presents a simple approach to generate a regular
partial product array with fewer partial product rows and negligible overhead, thereby
lowering the complexity of partial product reduction and reducing the area, delay, and power
of MBE multipliers. But the drawback of this multiplier is that it functions only for signed
number operands.
The modified-Booth algorithm is extensively used for high-speed multiplier
circuits. Once, when array multipliers were used, the reduced number of generated partial
products significantly improved multiplier performance. In designs based on reduction trees
with logarithmic logic depth, however, the reduced number of partial products has a limited
impact on overall performance. The Baugh-Wooley algorithm is a different scheme for
signed multiplication, but is not so widely adopted because it may be complicated to deploy
on irregular reduction trees. Again the Baugh-Wooley algorithm is for only signed number
multiplication.
The array multipliers and Braun array multipliers operates only on the unsigned
numbers. Thus, the requirement of the modern computer system is a dedicated and very high
speed multiplier unit that can perform multiplication operation on signed as well as unsigned
numbers. In this paper we designed and implemented a dedicated multiplier unit that can
perform multiplication operation on both signed and unsigned numbers, and this multiplier is
called as SUMBE multiplier.
Dept of ECE,M.Tech(VLSI),SITAMS Page 9
High speed Modified Booth Encoder multiplier for signed and unsigned numbers
4. METHODOLOGY
The new MBE recoder was designed according to the following analysis. Table 1 presents
the truth table of the new encoding scheme. The Z signal makes the output zero to
compensate the incorrect X2_b and Neg signals presents the circuit diagram of the encoder
and decoder. The encoder generates X1_b, X2_b, and Z signals by encoding the three x-
signals. The yLSB signal is the LSB of the y signal and is combined with x-signals to
determine the Row_LSB and the Neg_cin signals. Similarly, yMSB is combined with x-
signals to determine the sign extension signals. shows an overview of the partial product
array for an 8 × 8 multiplier. The sign extension circuitry developed . The conventional MBE
partial product array has two drawbacks:
1) an additional partial product term at the (n-2)th bit position;
2) poor performance at the LSB-part.
To remedy the two drawbacks, the LSB part of the partial product array is
modified. Referring to Fig. 2a, the Row_LSB (gray circle) and the Neg_cin terms are
combined and further simplified using Boolean minimization. The new equations for the
Row_LSB and Neg_cin can be written as , respectively.
Table 3: Truth Table of MBE Scheme.
Dept of ECE,M.Tech(VLSI),SITAMS Page 10
High speed Modified Booth Encoder multiplier for signed and unsigned numbers
Fig 6: MBE encoder Fig 7:MBE decoder
Fig 8: 8× 8 MBE partial product array. (a) Traditional MBE partialproductarray. (b) New MBE partial product array.
Dept of ECE,M.Tech(VLSI),SITAMS Page 11
High speed Modified Booth Encoder multiplier for signed and unsigned numbers
The Fig. 8(a) has widely been adopted in parallel multipliers since it can reduce the
number of partial product rows to be added by half, thus reducing the size and
enhancing the speed of the reduction tree. However, as shown in Fig.7,8, the conventional
MBE algorithm generates n/2 + 1 partial product rows rather than n/2 due to
the extra partial product bit (neg bit) at the least significant bit position of each partial
product row for negative encoding, leading to an irregular partial product array and a
complex reduction tree. Therefore, the Modified Booth multipliers with a regular partial
product array produce a very regular partial product array. No only each negi is shifted left
and replaced by ci but also the last neg bit is removed. This approach reduces the partial
product rows from n/2 + 1 to n/2 by incorporating the last neg bit into the sign extension
bits of the first partial product row, and almost no overhead is introduced to the partial
product generator. More regular partial product array and fewer partial product rows result
in a small and fast reduction tree, so that the area, delay, and power of MBE multipliers
can further be reduced.
Dept of ECE,M.Tech(VLSI),SITAMS Page 12
High speed Modified Booth Encoder multiplier for signed and unsigned numbers
5. SYSTEM REQUIREMENTS
i) Hardware requirements : Spartan FPGA Embedded kit
ii) Software requirements : Xililnx
Dept of ECE,M.Tech(VLSI),SITAMS Page 13
High speed Modified Booth Encoder multiplier for signed and unsigned numbers
6. REFERENCES
[1] W. –C. Yeh and C. –W. Jen, “High Speed Booth encoded Parallel Multiplier
Design,” IEEE transactions on computers, vol. 49, no. 7, pp. 692-701, July 2000.
[2] Shiann-Rong Kuang, Jiun-Ping Wang, and Cang-Yuan Guo, “Modified Booth
multipliers with a Regular Partial Product Array,” IEEE Transactions on circuits and
systems-II, vol 56, No 5, May 2009.
[3] Li-Rong Wang, Shyh-Jye Jou and Chung-Len Lee, “A well-tructured Modified
Booth Multiplier Design” 978-1-4244-1617-2/08/$25.00 ©2008 IEEE.
[4] Soojin Kim and Kyeongsoon Cho “Design of High-speed Modified Booth
Multipliers Operating at GHz Ranges” World Academy of Science, Engineering and
Technology2010.
[5] Magnus Sjalander and Per Larson-Edefors. “The Case for HPM-Based Baugh-
Wooley Multipliers,” Chalmers University of Technology, Sweden, March 2008.
[6] Z Haung and M D Ercegovac, “High performance Low Power left to right array
multiplier design” IEEE trans.Computer, vol 54 no3, page 272-283 Mar 2005.
[7] Hsing-Chung Liang and Pao-Hsin Huang, “Testing Transition Delay Faults in
Modified BoothMultipliers by Using C-testable and SIC Patterns”IEEE2007, 1-4244-
1272-2/07.
[8] Aswathy Sudhakar, and D. Gokila, “Run-Time Reconfigurable Pipelined
Modified Baugh-Wooley Multipliers,” Advances in Computational Sciences and
Technology ISSN 0973-6107 Volume 3 Number 2 (2010) pp. 223–235.
[9] Myoung-Cheol Shin, Se-Hyeon Kang, and In-Cheol Park, “An Area-Efficient
Iterative Modified-Booth Multiplier Based on Self-Timed Clocking,” Industry, and Energy
through the project System IC 2010,and by IC Design Education Center (IDEC).
[10] Leandro Z. Pieper, Eduardo A. C. da Costa, Sérgio J. M. de Almeida,
“Efficient Dedicated Multiplication Blocks for2´s Complement Radix-2m Array
Multipliers,” JOURNAL OF COMPUTERS, VOL. 5, NO. 10, OCTOBER 2010.
Dept of ECE,M.Tech(VLSI),SITAMS Page 14