4x4 Signed Multiplication, 2’s Complement: $X = -x_{n-1}2^{n-1} + \text{Unsigned}$, $Y \times X = Z$
6-1
Multiplications
• 4x4 Signed Multiplication
• 2’s Complement: $X = -x_{n-1}2^{n-1} + \text{Unsigned}$, where Unsigned $= \sum_{i=0}^{n-2} x_i 2^i$
• $Y \times X = Z$
6-2
2’s C Sequential MUL 1-1
ADD Y if x_i = 1 (i = 0 to n–2); Shift
SUB Y if x_{n–1} = 1; Shift
EX: Y = –5, X = 3
Y = 1 0 1 1 = –5
X = 0 0 1 1 = 3
+=     1 0 1 1
Shift  1 1 0 1 1
+=     1 0 0 0 1
Shift  1 1 0 0 0 1
Shift  1 1 1 0 0 0 1
Shift  1 1 1 1 0 0 0 1
6-3
2’s C Sequential MUL 1-2
ADD Y if x_i = 1 (i = 0 to n–2); Shift
SUB Y if x_{n–1} = 1; Shift
Y = 1 0 1 1 = –5
X = 1 1 0 1 = –3
+=     1 0 1 1
Shift  1 1 0 1 1
Shift  1 1 1 0 1 1
+=     1 0 0 1 1 1
Shift  1 1 0 0 1 1 1
–=     0 0 0 1 1 1 1
Shift  0 0 0 0 1 1 1 1
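The ADD / SUB / Shift rule on these two slides can be sketched in a few lines of Python. This is a minimal value-level sketch, not the slide's hardware: the function name `seq_mul_twos_complement` is hypothetical, and a Python integer stands in for the 2n-bit shift register (its `>>` on negative values behaves like an arithmetic shift).

```python
def seq_mul_twos_complement(y: int, x: int, n: int = 4) -> int:
    """Shift-and-add product of two n-bit two's complement integers."""
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    if not (lo <= y <= hi and lo <= x <= hi):
        raise ValueError("operands must fit in n bits")
    acc = 0  # stands in for the 2n-bit accumulator / shift register
    for i in range(n):
        if (x >> i) & 1:          # bit x_i of the two's complement pattern of X
            if i < n - 1:
                acc += y << n     # ADD Y into the upper half
            else:
                acc -= y << n     # SUB Y for the sign bit x_{n-1}
        acc >>= 1                 # Shift (arithmetic shift right by one)
    return acc


print(seq_mul_twos_complement(-5, 3))   # slide example 1-1
print(seq_mul_twos_complement(-5, -3))  # slide example 1-2
```

After n iterations the term added for bit i has been shifted right n - i times, so it carries weight 2^i, which is why the final subtraction alone handles the negative weight of the sign bit.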
6-4
1’s C Sequential MUL 1-1
1’s C: $X = -x_{n-1}(2^{n-1} - \mathrm{ulp}) + \text{Unsigned}$
A(0) = Y (ulp) if x_{n–1} = 1, else A(0) = 0
ADD Y if x_i = 1 (i = 0 to n–2); Shift
SUB Y if x_{n–1} = 1; Shift
EX1: Y = 5, X = –3
6-5
1’s C Sequential MUL 1-2
5 * –3 = –22 ???
Y = 0 1 0 1 = 5
X = 1 1 0 0 = –3
A(0)   0 1 0 1
Shift  0 0 1 0 1
Shift  0 0 0 1 0 1
+=     0 1 1 0 0 1
Shift  0 0 1 1 0 0 1
–      0 1 0 1
=      1 1 0 1 0 0 1
Shift  1 1 1 0 1 0 0 1
6-6
1’s C Sequential MUL 1-3
5 * –3 = –15
Y = 0 1 0 1 0 0 0 = 5 (extended with 0s)
X = 1 1 0 0 = –3
A(0)   0 1 0 1
Shift  0 0 1 0 1
Shift  0 0 0 1 0 1
+=     0 1 1 0 0 1
Shift  0 0 1 1 0 0 1
–      0 1 0 1 0 0 0
=      1 1 1 0 0 0 0
Shift  1 1 1 1 0 0 0 0
6-7
1’s C Sequential MUL 1-4
–5 * –3 = 10 !!! What’s wrong???
Y = 1 0 1 0 = –5
X = 1 1 0 0 = –3
A(0)   1 0 1 0
Shift  1 1 0 1 0
Shift  1 1 1 0 1 0
+=     1 1 0 0 0 1 0
Shift  1 1 0 0 0 1 0
–=     1 0 0 0 1 0 1 0
Shift  0 0 0 0 1 0 1 0
6-8
1’s C Sequential MUL 1-5
End-around carry: –5 * –3 = 12 !!!???
Y = 1 0 1 0 = –5
X = 1 1 0 0 = –3
A(0)   1 0 1 0
Shift  1 1 0 1 0
Shift  1 1 1 0 1 0
+=     1 1 0 0 0 1 0
Shift  1 1 0 0 0 1 1
–=     1 0 0 0 1 0 1 1
Shift  0 0 0 0 1 1 0 0
6-9
1’s C Sequential MUL 1-6
Add 1s instead of 0s: –5 * –3 = 15 O.K.
Y = 1 0 1 0 1 1 = –5 (extended with 1s)
X = 1 1 0 0 = –3
A(0)   1 0 1 0
Shift  1 1 0 1 0
Shift  1 1 1 0 1 0
+=     1 1 0 0 1 0 1
Shift  1 1 0 0 1 1 0
–=     1 0 0 0 1 1 1 0
Shift  0 0 0 0 1 1 1 1
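The three fixes explored on the 1’s complement slides (the A(0) = Y correction, end-around carry, and extending a negative Y with 1s) can be checked together with a small Python sketch. It is an illustration under assumptions, not the slide's register-level procedure: all function names are hypothetical, a full 2n-bit adder is modelled, and each addend is placed at its final weight instead of shifting the accumulator right.

```python
def ones_encode(v: int, bits: int) -> int:
    """Bit pattern of v in `bits`-bit one's complement."""
    mask = (1 << bits) - 1
    return v if v >= 0 else (~(-v)) & mask


def ones_decode(p: int, bits: int) -> int:
    """Value of a `bits`-bit one's complement pattern (negative zero maps to 0)."""
    mask = (1 << bits) - 1
    return p if (p >> (bits - 1)) == 0 else -((~p) & mask)


def ones_add(a: int, b: int, bits: int) -> int:
    """One's complement addition with end-around carry."""
    mask = (1 << bits) - 1
    s = a + b
    return (s & mask) + (s >> bits)


def seq_mul_ones_complement(y: int, x: int, n: int = 4) -> int:
    limit = (1 << (n - 1)) - 1
    if not (-limit <= y <= limit and -limit <= x <= limit):
        raise ValueError("operands must fit in n-bit one's complement")
    w = 2 * n                                   # 2n-bit adder, as on the comparison slide
    mask = (1 << w) - 1
    x_bits = ones_encode(x, n)
    sign = (x_bits >> (n - 1)) & 1
    acc = ones_encode(y, w) if sign else 0      # A(0) = Y (ulp) if x_{n-1} = 1
    for i in range(n - 1):
        if (x_bits >> i) & 1:
            # Encoding y * 2^i fills the low bits with 1s when y is negative.
            acc = ones_add(acc, ones_encode(y * (1 << i), w), w)
    if sign:
        # SUB Y: add the bitwise complement of Y * 2^(n-1).
        acc = ones_add(acc, ~ones_encode(y * (1 << (n - 1)), w) & mask, w)
    return ones_decode(acc, w)


print(seq_mul_ones_complement(5, -3))
print(seq_mul_ones_complement(-5, -3))
```

Dropping either the end-around carry or the fill with 1s reproduces the kind of wrong answers shown on slides 1-4 and 1-5.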
6-10
Compare 2’s C & 1’s C Seq. MUL
2’s C: n-bit adder & 2n-bit shift register
1’s C: 2n-bit adder & 2n-bit shift register
6-11
Multiplications(2)
• Two SUB; (n–2) ADD
• X*Y = Z = (z1, z2, …, zn)
6-12
Multiplications(3)
• 4x4 Signed Multiplication
• Sign extension
• One SUB; (n–2) ADD
6-13
Multiplications(4)
• 4x4 Signed Multiplication
• All ADD (2n)
6-14
Multiplications(5)
• 4x4 Signed Multiplication
• All ADD (2n–1)
6-15
Serial Multiplications (1)
[Figure: bit-serial multiplier. (a) One cell built from full adders (FA) and a MUX, with a CLR control path; inputs X, Y, Pin, Cin, Rin, LastBit; outputs Pout, Cout, Rout. (b) FirstBit / LastBit timing. (c), (d) Cells #1, #2, …, #(N-1) cascaded, each with the same ports; signals X, Y, P.]
6-16
Serial Multiplications(2)
• Serial Multiplication Scheme
[Figure: cells 0 1 2 3; inputs X_i, Y_i = X0, Y0]
6-17
Serial Multiplications(3)
• Serial Multiplication Scheme
[Figure: cells 0 1 2 3; inputs X_i, Y_i = X1, Y1; output Z0; cell terms: W10 + W01; W11]
6-18
Serial Multiplications(4)
• Serial Multiplication Scheme
[Figure: cells 0 1 2 3; inputs X_i, Y_i = X2, Y2; output Z1; cell terms: W22; V2 + W20 + W02; W21 + W12]
6-19
Serial Multiplications(5)
• Serial Multiplication Scheme
[Figure: cells 0 1 2 3; inputs X_i, Y_i = X3, Y3; output Z2; cell terms: W33; V3 + W30 + W03; V4 + W31 + W13; W32 + W23]
6-20
Serial Multiplications(6)
• Serial Multiplication Scheme
[Figure: cells 0 1 2 3; output Z3; cell terms: V4 + W30 + W03; V5 + W31 + W13; V6 + W32 + W23; W33 + W33 + W33]
6-21
Serial Multiplications(7)
• Serial Multiplication Scheme
[Figure: cells 0 1 2 3; output Z4; cell terms: V5 + W30 + W03; V6 + W31 + W13; V7; + W32 + W23; + W33 + W33]
6-22
Serial Multiplications(7)
• Serial Multiplication Scheme
[Figure: cells 0 1 2 3; output Z5; cell terms: V6 + W30 + W03; V7 + W31 + W13; + W32 + W23; + W33 + W33]
6-23
Serial Multiplications(7)
• Serial Multiplication Scheme
[Figure: cells 0 1 2 3; output Z6; cell terms: V7 + W30 + W03]
6-24
Serial/Parallel Multiplier(1-1)
Partial-product matrix:
W00   W10   W20   W30
W01   W11   W21   W31
W02   W12   W22   W32
W0'3  W1'3  W2'3  W3'3
Product bits: P0 P1 P2 P3 P4 P5 P6 P7
[Figure: timing diagram with carry labels C1 to C9, an MSB marker, and term labels W30, X31, X32]
6-25
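Serial/Parallel Multiplier(1-2)
[Figure: serial/parallel multiplier circuit. AND gates (&) combine the serial input bi with a3, a2, a1, a0; each stage has a full adder (FA) with FFR registers; control signals STR and MSB.]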
6-26
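Serial/Parallel Multiplier(2)
Partial-product matrix:
W00   W10   W20   W'30
W01   W11   W21   W'31
W02   W12   W22   W'32
W'03  W'13  W'23  W33
Added constants: 1 1
Product bits: P0 P1 P2 P3 P4 P5 P6 P7
[Figure: timing diagram with carry labels C1 to C4 and an MSB marker; circuit with an inverter, AND gates (&) on a3, a2, a1, a0 and bi, full adders (FA), FFR registers, and control signals STR and MSB]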
6-27
Serial/Parallel Multiplier(3-1)
Partial-product matrix:
W00   W10   W20   W30
W01   W11   W21   W31
W02   W12   W22   W32
W0'3  W1'3  W2'3  W3'3
Product bits: P0 P1 P2 P3 P4 P5 P6 P7
[Figure: timing diagram with two interleaved carry sequences C1A, C1B, C2A, C2B, C3A, C3B, C4A, C4B, an MSB marker, and term labels W30, X31, X32]
6-28
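Serial/Parallel Multiplier(3-2)
[Figure: two-input serial/parallel multiplier circuit. Serial inputs InA and InB are each ANDed (&) with a3, a2, a1, a0 and feed two rows of full adders (FA); rising-edge-triggered and falling-edge-triggered flip-flops (FF); MSB control.]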
6-29
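Serial/Parallel Multiplier(4-1)
Partial-product matrix:
W00   W10   W20   W'30
W01   W11   W21   W'31
W02   W12   W22   W'32
W'03  W'13  W'23  W33
Added constants: 1, 1 1
Product bits: P0 P1 P2 P3 P4 P5 P6 P7
[Figure: timing diagram from STR to the next STR with carry labels C1 to C8; the final carries are discarded by MB]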
6-30
Serial/Parallel Multiplier(4-2)
[Figure: testable serial/parallel multiplier circuit. Stages FA3 to FA0, each with an AND gate (&) on a3 to a0 and bi, a sum register (SF3R to SF0R), and a carry register (CF3P, CF2R, CF1R, CF0R, CF1P); MUXes with D inputs, an FFR, CLK, and control signals STR and MB; output P. Nodes are annotated with the test vectors (T1 to T7) that detect stuck-at-0 / stuck-at-1 faults (S@0/S@1).]
6-31
Serial/Parallel Multiplier(4-3)
Table 1 Test patterns
        STR  MB  bi  a_{n-1}  a_{n-2}...a0  SFs        CFs
T1      1    0   1   0        1             1111...11  1000...00
T2      0    1   0   1        1             1000...00  0111...11
T3      0    0   1   1        0             0011...11  0100...0
T4 (1)  0    1   1   0        1             0101...11  0000...0
(1) Repeat vector T4 (n-1) times; shift out 0000....0
6-32
Serial/Parallel Multiplier(5)
Table 2 Area and delay of the logic elements and multipliers
        NAND2  Inverter  XOR2   FF      FA
Area    1      0.5       1.5    3.5     7.5
Delay   1      0.5       2.3    3.8     4.6
[2]     n      n         (n-1)  2(n-1)  2(n-1)
[1]     n+1    n+1       n      2n+1    n+1
[4]     n      1         n      2n      n
Table 3 Comparisons
      clock  last clock  Area        Delay       AT                   Testable  I/O pins
[2]   12.2   4.6n+8.1    25n-23.5    16.8n-4.1   420n^2-497.3n+98.35  ?         2n+4
[1]   12.2   12.2        17.5n+12.5  24.4n+12.2  427n^2+518n+152.5    ?         n+4
[4]   11.7   11.7        17n-0.5     23.4n       397.8n^2-11.7n       Yes       n+4
6-33
High-Speed Multiplications
• Reduce Partial Product terms
• Accelerate Addition
• 3 Types of Multiplication: Parallel MUL, HS Seq. MUL, Array MUL
6-34
Reduce Partial Product terms
• Booth Algorithm
• Concept: A * 0011…110 = A * 0100…0(–1)0
• Ex.: Old PP# = 4, New PP# = 2
6-35
Booth Algorithm
• A*X → A*Y
• Conversion table (below)
• Start from the LSB (add a 0); overlap 1 bit
• EX: A*01110011 → A*0 1 1 1 0 0 1 1 (0) → A*100(–1)010(–1)
x_i  x_{i–1}  y_i   Comment
0    0        0     string of 0s
0    1        1     end of 1s
1    0        (–1)  start of 1s
1    1        0     string of 1s
6-36
Booth Algorithm(2)
• 0100*0111 → 0100*100(–1)
x_i  x_{i–1}  y_i   Comment
0    0        0     string of 0s
0    1        1     end of 1s
1    0        (–1)  start of 1s
1    1        0     string of 1s
A = 0 1 0 0 = 4
Y = 1 0 0 (–1) = 7
–=     1 1 0 0
Shift  1 1 1 0 0
Shift  1 1 1 1 0 0
Shift  1 1 1 1 1 0 0
+=     0 0 1 1 1 0 0
Shift  0 0 0 1 1 1 0 0
6-37
Booth Algorithm(3)
• 1011*1101 → 1011*0(–1)1(–1)
x_i  x_{i–1}  y_i   Comment
0    0        0     string of 0s
0    1        1     end of 1s
1    0        (–1)  start of 1s
1    1        0     string of 1s
A = 1 0 1 1 = –5
Y = 0 (–1) 1 (–1) = –3
–=      0 1 0 1
Shift   0 0 1 0 1
+=      1 1 0 1 1
Shift   1 1 1 0 1 1
–=      0 0 1 1 1 1
Shift2  0 0 0 0 1 1 1 1
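The Booth conversion table amounts to y_i = x_{i–1} – x_i with a 0 appended below the LSB. The following Python sketch recodes a multiplier that way and reproduces the three examples from the Booth slides; the names `booth_recode` and `booth_multiply` are hypothetical, and the multiply works on integer values instead of modelling the shift register.

```python
def booth_recode(x: int, n: int) -> list:
    """Booth digits y_0..y_{n-1} (LSB first), each in {-1, 0, 1}, for n-bit two's complement x."""
    digits = []
    prev = 0                       # the 0 appended to the right of the LSB
    for i in range(n):
        bit = (x >> i) & 1
        digits.append(prev - bit)  # y_i = x_{i-1} - x_i
        prev = bit
    return digits


def booth_multiply(a: int, x: int, n: int) -> int:
    """a * x using one add or subtract per nonzero Booth digit."""
    return sum(d * (a << i) for i, d in enumerate(booth_recode(x, n)))


# Digits printed MSB first to match the slide notation.
print(booth_recode(0b01110011, 8)[::-1])
print(booth_recode(-3, 4)[::-1], booth_multiply(-5, -3, 4))
print(booth_recode(7, 4)[::-1], booth_multiply(4, 7, 4))
```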
6-38
Modified Booth Algorithm
• Drawbacks of the Booth algorithm: the number of ADD/SUB operations is variable; inefficient for isolated 1s
• Modified Booth Alg.: scan 3 bits at a time; overlap 1 bit
• If n is even, it can handle 2’s C numbers
x_i  x_{i–1}  x_{i–2}  y_i   y_{i–1}
0    0        0        0     0
0    0        1        0     1
0    1        0        0     1
0    1        1        1     0
1    0        0        (–1)  0
1    0        1        0     (–1)
1    1        0        0     (–1)
1    1        1        0     0
6-39
Modified Booth Algorithm(2)
• Original: 1011*1101 → 1011*0(–1)1(–1)
• Now: 1011*1101(0) → 1011*0(–1)01
x_i  x_{i–1}  x_{i–2}  y_i   y_{i–1}
0    0        0        0     0
0    0        1        0     1
0    1        0        0     1
0    1        1        1     0
1    0        0        (–1)  0
1    0        1        0     (–1)
1    1        0        0     (–1)
1    1        1        0     0
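Each row of the modified Booth table encodes the radix-4 digit 2·y_i + y_{i–1} = –2·x_i + x_{i–1} + x_{i–2}. The sketch below (hypothetical function names, n assumed even, a 0 appended below the LSB) produces those digits and also splits each one back into the (y_i, y_{i–1}) pair used on the slide.

```python
def modified_booth_recode(x: int, n: int) -> list:
    """Radix-4 digits (least significant first), each in {-2..2}, for n-bit two's complement x."""
    if n % 2:
        raise ValueError("n must be even")
    digits = []
    for i in range(1, n, 2):                       # triplet (x_i, x_{i-1}, x_{i-2})
        hi = (x >> i) & 1
        mid = (x >> (i - 1)) & 1
        lo = (x >> (i - 2)) & 1 if i >= 2 else 0   # appended 0 for the first triplet
        digits.append(-2 * hi + mid + lo)
    return digits


# Split a radix-4 digit into the slide's (y_i, y_{i-1}) pair.
PAIR = {0: (0, 0), 1: (0, 1), 2: (1, 0), -2: (-1, 0), -1: (0, -1)}


def modified_booth_multiply(a: int, x: int, n: int) -> int:
    return sum(d * a * 4 ** k for k, d in enumerate(modified_booth_recode(x, n)))


digits = modified_booth_recode(-3, 4)        # multiplier 1101 from the slide
print([PAIR[d] for d in reversed(digits)])   # pairs, most significant first
print(modified_booth_multiply(-5, -3, 4))
```

For the multiplier 1101 the pairs come out as (0, –1) and (0, 1), i.e. the slide's 0(–1)01, so only n/2 partial products are needed.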
6-40
Canonical Recoding
• Find the minimum number of +/– operations for MUL: find the minimal SD representation, i.e., minimize the number of nonzero digits |y_i|
• Z = (z_1, z_2, …, z_n) is minimal if z_i·z_{i+1} = 0
  (111) is min. but z_i·z_{i+1} ≠ 0
• Canonical Recoding: find such a Z using sequential steps
x_{i+1}  x_i  c_i  z_i   c_{i+1}
0        0    0    0     0
0        0    1    1     0
0        1    0    1     0
0        1    1    0     1
1        0    0    0     0
1        0    1    (–1)  1
1        1    0    (–1)  1
1        1    1    0     1
6-41
Canonical Recoding(2)
• EX: Assume c_0 = 0
X = 011001   z_0 = 1,   c_1 = 0
X = 01100    z_1 = 0,   c_2 = 0
X = 0110     z_2 = 0,   c_3 = 0
X = 011      z_3 = –1,  c_4 = 1
X = 01       z_4 = 0,   c_5 = 1
X = (0)0     z_5 = 1,   c_6 = 0
Z = 10(–1)001
x_{i+1}  x_i  c_i  z_i   c_{i+1}
0        0    0    0     0
0        0    1    1     0
0        1    0    1     0
0        1    1    0     1
1        0    0    0     0
1        0    1    (–1)  1
1        1    0    (–1)  1
1        1    1    0     1
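The canonical recoding table can be condensed into two rules: c_{i+1} = 1 when at least two of x_{i+1}, x_i, c_i are 1, and z_i = x_i + c_i – 2·c_{i+1}. The Python sketch below applies them from the LSB upward. The name `canonical_recode` is hypothetical, and the sketch assumes an unsigned X (x_n is taken as 0, and a final carry becomes an extra leading digit).

```python
def canonical_recode(x: int, n: int) -> list:
    """Canonical signed-digit form of unsigned n-bit x, least significant digit first."""
    digits = []
    c = 0                                             # c_0 = 0
    for i in range(n):
        xi = (x >> i) & 1
        xi1 = (x >> (i + 1)) & 1 if i + 1 < n else 0  # x_n assumed 0
        c_next = 1 if xi1 + xi + c >= 2 else 0
        digits.append(xi + c - 2 * c_next)            # z_i in {-1, 0, 1}
        c = c_next
    if c:
        digits.append(1)                              # leftover carry c_n
    return digits


z = canonical_recode(0b011001, 6)   # the slide's example X = 011001
print(z[::-1])                      # most significant digit first
```

Because a nonzero z_i forces z_{i+1} = 0, no two adjacent digits are nonzero, which is the z_i·z_{i+1} = 0 condition stated on the previous slide.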
6-42
Array Multipliers
• Conventional CSM
                              x4    x3    x2    x1    x0
                              y4    y3    y2    y1    y0
                        1     W’40  W30   W20   W10   W00
                        W’41  W31   W21   W11   W01
                  W’42  W32   W22   W12   W02
            W’43  W33   W23   W13   W03
      W44   W’34  W’24  W’14  W’04
Z9    Z8    Z7    Z6    Z5    Z4    Z3    Z2    Z1    Z0
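A quick way to convince oneself that such an all-positive array yields the signed product is to sum the bit matrix numerically. The Python sketch below does that under stated assumptions: W_ij = AND(x_i, y_j), W’_ij = NAND(x_i, y_j) for the terms that involve exactly one sign bit, a constant 1 at weight 2^n (the "1" visible in the first row of the matrix), and a second constant 1 at weight 2^(2n–1). That second constant is assumed from the standard derivation of this scheme; it is not legible in the transcript. The function name is hypothetical.

```python
def csm_signed_product(x: int, y: int, n: int = 5) -> int:
    """Sum the AND/NAND partial-product bits of an n x n two's complement array."""
    xb = [(x >> i) & 1 for i in range(n)]    # two's complement bits of x
    yb = [(y >> j) & 1 for j in range(n)]
    total = 0
    for i in range(n):
        for j in range(n):
            w = xb[i] & yb[j]                # W_ij = AND(x_i, y_j)
            if (i == n - 1) != (j == n - 1):
                w ^= 1                       # W'_ij = NAND(x_i, y_j)
            total += w << (i + j)
    total += 1 << n                          # constant 1 in column n
    total += 1 << (2 * n - 1)                # assumed constant 1 in column 2n-1
    total &= (1 << (2 * n)) - 1              # keep Z_{2n-1}..Z_0
    if total >> (2 * n - 1):                 # reinterpret as a signed 2n-bit value
        total -= 1 << (2 * n)
    return total


print(csm_signed_product(-5, 3), csm_signed_product(-16, -16))
```

The two constants come from rewriting each negative term –a·2^k as (1 – a)·2^k – 2^k and collecting the leftover powers of two modulo 2^(2n).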
6-43
Array Multipliers(2)
• Conventional CSM
[Figure: 5x5 carry-save array of full adders (FA; inputs In1, In2, Cin; outputs Sum, Cout) producing Z0 to Z9, with constant inputs 0 and 1. W_ij = AND(X_i, Y_j); W'_ij = NAND(X_i, Y_j). Primed terms: W'40, W'41, W'42, W'43, W'04, W'14, W'24, W'34.]
6-44
Array Multipliers (3-0)
• Pezaris Array Multiplier
• –1 = –2·1 + 1;  0 = –2·0 + 0
                              x4    x3    x2    x1    x0
                              y4    y3    y2    y1    y0
                              W40   W30   W20   W10   W00
                        W41   W31   W21   W11   W01
                  W42   W32   W22   W12   W02
            W43   W33   W23   W13   W03
      W44   W34   W24   W14   W04
Z9    Z8    Z7    Z6    Z5    Z4    Z3    Z2    Z1    Z0
6-45
Array Multipliers(3)
• Pezaris Array Multiplier
[Figure: 5x5 array of full adders (FA; inputs A, B, C; outputs S, Co) producing Z0 to Z9, with W_ij = AND(X_i, Y_j) and constant inputs 0. The array mixes several FA types, labelled in the figure as: -2Co+S = A-B-C; -2Co-S = -A-B-C; -2Co+S = A+B+C.]
6-46
Array Multipliers(4)
• Modified Pezaris Array Multiplier
[Figure: 5x5 array of full adders (FA; inputs A, B, C; outputs S, Co) producing Z0 to Z9, with W_ij = AND(X_i, Y_j) and constant inputs 0. FA types labelled in the figure: -2Co+S = A-B-C; -2Co-S = -A-B-C; -2Co+S = A+B+C; 2Co-S = B+C-A.]
6-47
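Array Multipliers (5-01)
• Baugh-Wooley Array Multiplier
$-2^4 x_4 \sum_{i=0}^{3} y_i 2^i = -(0, 0, x_4y_3, \dots, x_4y_0)\cdot 2^4$
If $x_4 = 1$: ADD $(1, 1, y'_3, \dots, y'_0)\cdot 2^4$ and ADD $2^4$; if $x_4 = 0$: ADD 0.
Both cases together: $[(1, x'_4, x_4y'_3, \dots, x_4y'_0) + (0, 1, 0, \dots, x_4)]\cdot 2^4$
So is $-2^4 y_4 \sum_{i=0}^{3} x_i 2^i$.
Sum of the two: $(1,\ x'_4 + y'_4,\ [x_4y'_3 + x'_3y_4],\ \dots,\ [x_4y'_0 + y_4x'_0 + x_4 + y_4])\cdot 2^4$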
6-48
Array Multipliers (5-02)
• Baugh-Wooley Array Multiplier
                              x4    x3    x2    x1    x0
                              y4    y3    y2    y1    y0
                              W40’  W30   W20   W10   W00
                        W41’  W31   W21   W11   W01
                  W42’  W32   W22   W12   W02
            W43’  W33   W23   W13   W03
      W44   W3’4  W2’4  W1’4  W0’4
1     x’4                     x4
      y’4                     y4
Z9    Z8    Z7    Z6    Z5    Z4    Z3    Z2    Z1    Z0
6-49
Array Multipliers(5)
• Baugh-Wooley Array Multiplier
[Figure: 5x5 carry-save array of full adders (FA; inputs In1, In2, Cin; outputs Sum, Cout) producing Z0 to Z9, with constant inputs 0 and 1. W_ij = AND(X_i, Y_j); W_i'j = AND(X'_i, Y_j). Primed terms: W40', W41', W42', W43', W0'4, W1'4, W2'4, W3'4. Extra FAs take X4, Y4, X'4, Y'4, and the constant 1.]
6-50
Array Multipliers(6)
• The On-the-fly CSM
[Figure: 5x5 carry-save array producing Z0 to Z9, built from full adders (FA; inputs In1, In2, Cin; outputs Sum, Cout) and half adders (HA), with T, M, and B cells forming the on-the-fly final addition. W_ij = AND(X_i, Y_j); W'_ij = NAND(X_i, Y_j); inputs X4, Y4 and a constant 1. Nodes are annotated with delays ranging from 0.5 to 6.5.]
6-51
Array Multipliers(7)
• The On-the-fly
[Figure: (b) binary addition versus on-the-fly addition; (c) arrangement of T, R, HA, M, and B cells with carry-out; (d) M cell: MUX (inputs 0, 1) with VDD; (e) M* and B* cells. The T, R, and M cells combine an adder (+) with a MUX.]
6-52
On-the-fly Application(1)
• The Gray code
– Binary to Gray (parallel):
  • $g_{n-1} = b_{n-1}$
  • $g_i = b_{i+1} \oplus b_i$
– Gray to Binary (sequential):
  • $b_{n-1} = g_{n-1}$
  • $b_i = b_{i+1} \oplus g_i$
Binary  Gray
000     000
001     001
010     011
011     010
100     110
101     111
110     101
111     100
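The two conversion rules translate directly into code. In this minimal Python sketch (function names are hypothetical), binary-to-Gray is a single parallel XOR, while Gray-to-binary walks from the MSB down because each b_i needs the already-recovered b_{i+1}, which is the sequential dependence the slide points out.

```python
def binary_to_gray(b: int) -> int:
    """g_i = b_{i+1} XOR b_i for all bits at once (g_{n-1} = b_{n-1})."""
    return b ^ (b >> 1)


def gray_to_binary(g: int, n: int) -> int:
    """b_{n-1} = g_{n-1}; b_i = b_{i+1} XOR g_i, computed MSB first."""
    b = 0
    for i in range(n - 1, -1, -1):
        upper = (b >> (i + 1)) & 1          # b_{i+1}; 0 above the MSB
        b |= (upper ^ ((g >> i) & 1)) << i
    return b


for value in range(8):
    gray = binary_to_gray(value)
    print(format(value, "03b"), format(gray, "03b"), gray_to_binary(gray, 3))
```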
6-53
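On-the-fly Application(2)
• The Gray code adder
[Figure: Gray operands ga and gb pass through Gray-to-binary converters (G/B), an n-bit binary adder, and a binary-to-Gray converter (B/G).]
• G/B delay: $(n-1)\,T_{XOR}$ for a chain; $\log_2 n \cdot T_{XOR}$ for a tree
• n-bit binary adder delay: $T_{ADDn}$
• B/G delay: $T_{XOR}$
• Total delay: $n\,T_{XOR} + T_{ADDn}$ for the chain structure; $(1 + \log_2 n)\,T_{XOR} + T_{ADDn}$ for the tree structure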
6-54
On-the-fly Application(3)
• The Gray code adder
[Figure: binary addition versus on-the-fly Gray code addition. Bit slices n-1, n-2, …, 1, 0 take ga_{n-1}, gb_{n-1}, …, ga_0, gb_0 through G/B conversion, adders (+) with MUXes, and B/G conversion. Delay labels along the on-the-fly path, in units of T_XOR: 1, 2, …, (n-1), n, (n+1), (n+2).]
6-55
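Additive Multiply Module (AMM)
• $P_{5-0} = A_{3-0} \times B_{1-0} + C_{3-0} + D_{1-0}$
• Delay: $\max\{T_{C_i}, T_{d_i}, T_{a_ib_j}\} + k\,T_{FA}$
[Figure: two rows of four full adders (FA). The rows take the partial products a3b0, a2b0, a1b0, a0b0 together with C3 to C0 and d0, d1; nodes carry arrival-time annotations. Block symbol: AMM with inputs A3-0, B3-0, C3-0, D3-0 and output P5-0.]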
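The AMM function P = A*B + C + D always fits in 6 bits, since 15*3 + 15 + 3 = 63. The Python sketch below models one module and, as an illustration only, chains two of them into a 4x4 unsigned multiplier by feeding the upper bits of the first result into the C input of the second. The function names and this particular chaining are assumptions for the example, not the wiring shown on the AMM slides.

```python
def amm(a: int, b: int, c: int, d: int) -> int:
    """Additive multiply module: 4-bit a, 2-bit b, 4-bit c, 2-bit d -> 6-bit result."""
    if not (0 <= a < 16 and 0 <= b < 4 and 0 <= c < 16 and 0 <= d < 4):
        raise ValueError("operand out of range")
    return a * b + c + d            # at most 15*3 + 15 + 3 = 63


def mul4x4_with_amm(a: int, b: int) -> int:
    """4x4 unsigned product from two AMMs (illustrative chaining)."""
    low = amm(a, b & 0b11, 0, 0)            # A * B[1:0]
    high = amm(a, b >> 2, low >> 2, 0)      # A * B[3:2] + upper bits of `low`
    return (high << 2) | (low & 0b11)


print(amm(15, 3, 15, 3))
print(mul4x4_with_amm(13, 11))
```

The additive inputs are what let a module absorb bits produced by its neighbours, so larger multipliers need no separate adder stage between modules.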
6-56
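AMM (2)
[Figure: array of AMMs for a larger multiplication; each module is labelled with the product bit range it produces: 3-0, 7-4, 5-0, 7-2, 9-4, 11-6, 9-4, 13-8, 15-10, 11-6.]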
6-57
AMM (3)
[Figure: the same AMM array with output bit ranges 5-0, 7-2, 9-4, 11-6, 9-4, 13-8, 15-10, 11-6, inputs 3-0, 1-0, 5-2, 1-0 and constants 0, annotated with signal arrival times in units of T_FA that grow from 1 to 25 across the array.]
6-58
AMM (4)
• Same column → same a_{4k}, a_{4k+1}, a_{4k+2}, a_{4k+3}
• Same diagonal → same b_{2i}, b_{2i+1}
• 4m*4m unsigned multiplication uses m*(2m) = 2m^2 AMMs
• Delay = T_AND + T_dia + T_col
  T_dia = 5m·T_FA
  T_col = 5(3m–1)·T_FA
[Figure: AMM array with dimensions m and 2m]