
Upload: joaquin-sonn

Post on 14-Dec-2015


1

Integer Multipliers

2

Multipliers

• A must-have circuit in most DSP applications
• A variety of multipliers exists that can be chosen based on their performance
• Serial, Serial/Parallel, Shift and Add, Array, Booth, Wallace Tree, …

[Figure: multiplier symbol X with inputs A and B and output P]

3

16x16 multiplier

[Figure: 16x16 multiplier block diagram. Registers RA, RB, and RC, each with reset and en inputs and a converter, surround the multiplier X (inputs A and B, output P).]

4

Multiplication Algorithm

X = X_{n-1} X_{n-2} … X_0   (Multiplicand)
Y = Y_{n-1} Y_{n-2} … Y_0   (Multiplier)

Partial products (each row is shifted one position to the left of the row above it):

Y_{n-1}X_0      Y_{n-2}X_0      Y_{n-3}X_0      …  Y_1X_0      Y_0X_0
Y_{n-1}X_1      Y_{n-2}X_1      Y_{n-3}X_1      …  Y_1X_1      Y_0X_1
Y_{n-1}X_2      Y_{n-2}X_2      Y_{n-3}X_2      …  Y_1X_2      Y_0X_2
…
Y_{n-1}X_{n-2}  Y_{n-2}X_{n-2}  Y_{n-3}X_{n-2}  …  Y_1X_{n-2}  Y_0X_{n-2}
Y_{n-1}X_{n-1}  Y_{n-2}X_{n-1}  Y_{n-3}X_{n-1}  …  Y_1X_{n-1}  Y_0X_{n-1}
----------------------------------------------------------------------
P_{2n-1}  P_{2n-2}  P_{2n-3}  …  P_2  P_1  P_0

5

1. Multiplication Algorithms

Implementation of multiplication of binary numbers boils down to how to do the additions. Consider the two 8-bit numbers A and B, which generate the 16-bit product P. First generate the 64 partial products and then add them up. Each row below is shifted one position to the left of the row above it.

A7 A6 A5 A4 A3 A2 A1 A0
B7 B6 B5 B4 B3 B2 B1 B0
-----------------------------------------------
A7.B0 A6.B0 A5.B0 A4.B0 A3.B0 A2.B0 A1.B0 A0.B0
A7.B1 A6.B1 A5.B1 A4.B1 A3.B1 A2.B1 A1.B1 A0.B1
A7.B2 A6.B2 A5.B2 A4.B2 A3.B2 A2.B2 A1.B2 A0.B2
A7.B3 A6.B3 A5.B3 A4.B3 A3.B3 A2.B3 A1.B3 A0.B3
A7.B4 A6.B4 A5.B4 A4.B4 A3.B4 A2.B4 A1.B4 A0.B4
A7.B5 A6.B5 A5.B5 A4.B5 A3.B5 A2.B5 A1.B5 A0.B5
A7.B6 A6.B6 A5.B6 A4.B6 A3.B6 A2.B6 A1.B6 A0.B6
A7.B7 A6.B7 A5.B7 A4.B7 A3.B7 A2.B7 A1.B7 A0.B7
-----------------------------------------------
P15 P14 P13 P12 P11 P10 P9 P8 P7 P6 P5 P4 P3 P2 P1 P0

The equation is:

\[ P(m+n) = A(m)\,B(n) = \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} a_i\, b_j\, 2^{i+j} \]
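The "generate the partial products, then add them up" idea can be sketched in a few lines of Python. This is only an illustration of the double sum above, not a hardware description; the function names `partial_products` and `sum_partial_products` are hypothetical, and the operands 0x89 and 0xAB are chosen arbitrarily.

```python
def partial_products(a: int, b: int, n: int = 8) -> list[list[int]]:
    """Row j holds the bits a_i AND b_j for i = 0..n-1 (least significant first)."""
    return [[((a >> i) & 1) & ((b >> j) & 1) for i in range(n)]
            for j in range(n)]


def sum_partial_products(rows: list[list[int]]) -> int:
    """Add every partial-product bit with its weight 2^(i+j)."""
    total = 0
    for j, row in enumerate(rows):
        for i, bit in enumerate(row):
            total += bit << (i + j)
    return total


rows = partial_products(0x89, 0xAB)
product = sum_partial_products(rows)
print(sum(len(r) for r in rows), hex(product))  # 64 0x5b83
```

The way the additions are organized (serially, in an array, or in a tree) is what distinguishes the multipliers discussed in the rest of the slides.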

6

Multiplier Design

[Figure: MU (16X16 Multiplier Unit) with REG IN1, REG OUT, Control Unit, and Storage blocks]

7

[Figure sequence, Slides 1 to 21: multiplier built from gate G1 (which forms the product bit x_i y_j), gate G2, an adder (+), a 1-bit REG, and a Serial Register, clocked by CLK and CLK/(N+1).]

X: x3 x2 x1 x0    Y: y3 y2 y1 y0

Input Sequence for G1 (the rightmost symbol is applied first):

X input:  0 0 x3x2x1x0 0 x3x2x1x0 0 x3x2x1x0 0 x3x2x1x0
Y input:  0 0 y3y3y3y3 0 y2y2y2y2 0 y1y1y1y1 0 y0y0y0y0
Reset:    0 1 0000 1 0000 1 0000 1 0000

Si: the ith bit of the final result
Ci: the only carry from column i
Sij: the jth partial sum for column i
Cij: the jth partial carry from column i

Slide  Reset  G1 output  Fed back  Carry in  Sum    Carry out  Serial Register   Result bits out
1      0      x0y0       0         0         x0y0   0          0, 0, 0
2      0      x1y0       0         0         x1y0   0          S0, 0, 0
3      0      x2y0       0         0         x2y0   0          x1y0, S0, 0
4      0      x3y0       0         0         x3y0   0          x2y0, x1y0, S0
5      1      0          0         0         0      0          x3y0, x2y0, x1y0  S0
6      0      x0y1       x1y0      0         S1     C1         0, x3y0, x2y0     S0
7      0      x1y1       x2y0      C1        S20    C20        S1, 0, x3y0       S0
8      0      x2y1       x3y0      C20       S30    C30        S20, S1, 0        S0
9      0      x3y1       0         C30       S40    C40        S30, S20, S1      S0
10     1      0          0         C40       S50    C50=0      S40, S30, S20     S1 S0
11     0      x0y2       S20       0         S2     C21        S50, S40, S30     S1 S0
12     0      x1y2       S30       C21       S31    C31        S2, S50, S40      S1 S0
13     0      x2y2       S40       C31       S41    C41        S31, S2, S50      S1 S0
14     0      x3y2       S50       C41       S51    C51        S41, S31, S2      S1 S0
15     1      0          0         C51       S60    C60=0      S51, S41, S31     S2 S1 S0
16     0      x0y3       S31       0         S3     C32        S60, S51, S41     S2 S1 S0
17     0      x1y3       S41       C32       S4     C42        S3, S60, S51      S2 S1 S0
18     0      x2y3       S51       C42       S5     C52        S4, S3, S60       S2 S1 S0
19     0      x3y3       S60       C52       S6     C61        S5, S4, S3        S2 S1 S0
20     1      0          0         C61       S7     0          S6, S5, S4        S3 S2 S1 S0
21     0      0          0         0         0      0          S7, S6, S5        S4 S3 S2 S1 S0
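The cycle-by-cycle behaviour shown in Slides 1 to 21 can be imitated in software. The sketch below is a behavioural model under stated assumptions, not the circuit itself: one product bit x_i y_j enters the adder per clock, the running partial sum is kept in a list (standing in for the Serial Register), and every (N+1)th clock is the Reset cycle that flushes the carry and releases one result bit. The name `serial_multiply` and its return shape are hypothetical.

```python
def serial_multiply(x: int, y: int, n: int = 4) -> tuple[int, int]:
    """Bit-serial multiply: returns (product, number of clock cycles used)."""
    partial = [0] * n          # upper bits of the running partial sum
    result_bits = []           # final result bits, least significant first
    cycles = 0
    for j in range(n):         # one pass per multiplier bit y_j
        carry = 0
        row = []
        for i in range(n):     # one clock per multiplicand bit x_i
            g1 = ((x >> i) & 1) & ((y >> j) & 1)   # product bit from G1
            total = g1 + partial[i] + carry        # full-adder inputs
            row.append(total & 1)
            carry = total >> 1
            cycles += 1
        cycles += 1            # Reset cycle: the last carry joins the partial sum
        result_bits.append(row[0])                 # lowest bit is now final
        partial = row[1:] + [carry]
    result_bits.extend(partial)                    # remaining upper bits
    product = sum(bit << k for k, bit in enumerate(result_bits))
    return product, cycles


print(serial_multiply(0b1011, 0b1101))  # (143, 20)
```

For N = 4 the model uses N(N+1) = 20 clocks to finish the additions, which corresponds to the four Reset pulses in the Reset sequence; the final slide only shifts the remaining bits out.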

29

[Figure sequence, Slides 1 to 8: multiplier in which y0, y1, y2, y3 are applied in parallel and x is shifted in serially through D flip-flops (D), with adders (+) producing one result bit per slide.]

Si: the ith bit of the final result
Ci: the only carry from column i
Sij: the jth partial sum for column i
Cij: the jth partial carry from column i

Slide  Serial input  Product bits formed       Sums           Carries shown                      Result so far
1      x0            x0y0                      S0                                                S0
2      x1            x1y0, x0y1                S1             C1                                 S1 S0
3      x2            x2y0, x1y1, x0y2          S20, S2        C1, C20, C21                       S2 S1 S0
4      x3            x3y0, x2y1, x1y2, x0y3    S30, S31, S3   C20, C21, C30, C31, C32            S3 S2 S1 S0
5      0             x3y1, x2y2, x1y3          S40, S41, S4   C30, C31, C32, C40, C41, C42       S4 S3 S2 S1 S0
6      0             x3y2, x2y3                S51, S5        C40, C41, C42, C50, C51            S5 S4 S3 S2 S1 S0
7      0             x3y3                      S6             C50, C51, C60                      S6 S5 S4 S3 S2 S1 S0
8      0             (none)                    S7             C6                                 S7 S6 S5 S4 S3 S2 S1 S0
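The eight slides above produce one result bit per clock, column by column. A minimal behavioural sketch of that scheme follows, assuming x is shifted in LSB first and y is held in parallel; the column carry is modelled as a single integer rather than as the individual carry flip-flops of the figure, and the name `serial_parallel_multiply` is hypothetical.

```python
def serial_parallel_multiply(x: int, y: int, n: int = 4) -> int:
    """Serial/parallel multiply: x enters one bit per clock, y is applied in parallel."""
    shift = [0] * n            # shift[j] holds x_(t-j), as in a chain of D flip-flops
    carry = 0                  # carries waiting to be added into the next column
    result_bits = []
    for t in range(2 * n):     # 2n clocks give the 2n result bits
        x_in = (x >> t) & 1 if t < n else 0        # zeros follow the last x bit
        shift = [x_in] + shift[:-1]
        # Column t: all products x_i * y_j with i + j == t, plus earlier carries.
        column = sum(shift[j] & ((y >> j) & 1) for j in range(n)) + carry
        result_bits.append(column & 1)
        carry = column >> 1
    return sum(bit << t for t, bit in enumerate(result_bits))


print(serial_parallel_multiply(0b1011, 0b1101))  # 143
```

Compared with the fully serial scheme, this arrangement needs only 2N clocks at the cost of N product gates and adders working in parallel.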

37

Shift Add Multiplier Design Implementation

[Figure: datapath with an 8 bit Adder, a MUX with a 0 input, and registers REGA, REGB, and REGC. Inputs: Ain (7 downto 0), Bin (7 downto 0), CLOCK. Outputs: Result (15 downto 8) and Result (7 downto 0).]

38

Synchronous Shift and Add Multiplier Controller

Multiplication process: 5 states: Idle, Init, Test, Add, and Shift&Count.

• Idle: the process starts when the Start signal is received;
• Init: the multiplicand and the multiplier are loaded into a load register and a shift register, respectively;
• Test: the LSB of the shift register that contains the multiplier is tested to decide the next state;

39

Synchronous Shift and Add Multiplier Controller Design

• Add: if the LSB is '1', the next state adds the new partial product to the accumulated result, and the state machine then moves to the Shift&Count state;
• Shift&Count: if the LSB is '0', the state machine goes directly to this state. The two shift registers shift their contents one bit to the right, and the counter counts up by one step. After that, the state machine returns to the Test state;
• When the counter reaches N, a Stop signal is asserted and the state machine goes to the Idle state;
• Idle: in the Idle state, a Done signal is asserted to indicate the end of the multiplication.
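The five-state controller described above can be sketched as a small state machine. This is a software illustration under assumptions, not the author's design: the accumulator and the multiplier shift register are modelled as plain integers, the Start signal is taken as already received, and the names `State` and `run_controller` are hypothetical.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    INIT = "Init"
    TEST = "Test"
    ADD = "Add"
    SHIFT_COUNT = "Shift&Count"


def run_controller(multiplicand: int, multiplier: int, n: int = 8):
    """Return (product, list of visited states) for an n-bit unsigned multiply."""
    state = State.INIT                     # Start was received while in Idle
    acc = q = count = 0
    visited = []
    while state is not State.IDLE:
        visited.append(state)
        if state is State.INIT:            # load the registers, clear the counter
            acc, q, count = 0, multiplier, 0
            state = State.TEST
        elif state is State.TEST:          # test the LSB of the multiplier
            state = State.ADD if q & 1 else State.SHIFT_COUNT
        elif state is State.ADD:           # accumulate the partial product
            acc += multiplicand
            state = State.SHIFT_COUNT
        else:                              # Shift&Count: shift right, count up
            q = (q >> 1) | ((acc & 1) << (n - 1))
            acc >>= 1
            count += 1
            state = State.IDLE if count == n else State.TEST   # Stop at N
    return (acc << n) | q, visited         # Done: product sits in acc and q


product, states = run_controller(0x89, 0xAB)
print(hex(product), len(states))
```

The length of `states` shows the data dependence of the latency: every '1' in the multiplier costs one extra Add state.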

40

[Figure, Slide 1: Shift and Add Control Logic with Add and Shift Right control signals, an n-bit Adder, the Multiplicand register, carry bit C, register A (A_{n-1} … A_1 A_0), and Multiplier register Q (Q_{n-1} … Q_1 Q_0)]

n-bit Multiplier:

Q0=1: Multiplicand is added to register A; the result is stored in register A; registers C, A, Q are shifted to the right one bit

Q0=0: Registers C, A, Q are shifted to the right one bit
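The Q0 rule above can be traced register by register. The following sketch keeps C, A, and Q as separate integers so that each Add and each Shift can be printed; the function name `shift_add_trace` and the tuple layout are assumptions made for illustration.

```python
def shift_add_trace(multiplicand: int, multiplier: int, n: int = 4):
    """Return a list of (step, C, A, Q) tuples for an n-bit shift-and-add multiply."""
    mask = (1 << n) - 1
    c, a, q = 0, 0, multiplier & mask
    trace = [("init", c, a, q)]
    for _ in range(n):
        if q & 1:                                  # Q0 = 1: add multiplicand to A
            total = a + multiplicand
            c, a = total >> n, total & mask        # C catches the adder carry
            trace.append(("add", c, a, q))
        # Shift C, A, Q one bit to the right as a single long register.
        q = (q >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (c << (n - 1))
        c = 0
        trace.append(("shift", c, a, q))
    return trace


for step, c, a, q in shift_add_trace(0b1011, 0b1101):
    print(f"{step:5s} C={c} A={a:04b} Q={q:04b}")
```

With multiplicand 1011 and multiplier 1101 the trace ends with A = 1000 and Q = 1111, i.e. the product 10001111.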

41

Example: 4-bit Multiplier (Slides 2 to 9)

[Figure: 4-bit Adder, Shift and Add Control Logic, Multiplicand register, carry bit C, register A, and Multiplier register Q]

Multiplicand = 1 0 1 1

Step                  Add  Shift Right  C  A     Q
Initial Values        -    -            0  0000  1101
First Cycle--Add      1    0            0  1011  1101
First Cycle--Shift    0    1            0  0101  1110
Second Cycle--Shift   0    1            0  0010  1111
Third Cycle--Add      1    0            0  1101  1111
Third Cycle--Shift    0    1            0  0110  1111
Fourth Cycle--Add     1    0            1  0001  1111
Fourth Cycle--Shift   0    1            0  1000  1111

After the fourth shift, A and Q together hold the product 1000 1111 (1011 x 1101).

49

4*4 Synchronous Shift and Add Multiplier Design: Layout Design

[Figure: Floor plan of the 4*4 Synchronous Shift and Add Multiplier]

50

Comparison between Synchronous and Asynchronous Approaches


51

Example (simulated by Ovais Ahmed, Fall_03 project):

Multiplicand = 10001001_2 = 89_16

Multiplier = 10101011_2 = AB_16

Expected Result = 101101110000011_2 = 5B83_16

52

  

Array Multiplier

• Regular structure based on the add and shift algorithm.
• Addition is mainly done by the carry save algorithm.
• Sign bit extension results in a higher capacitive load and slows down the circuit.

53

Addition with CLA

[Figure: Product (A*B) of A = a3a2a1a0 and B = b3b2b1b0, formed with three Four-bit Adders chained through Cin and Cout. The rows a3a2a1a0 are gated by b0, b1, b2, and b3, and the unused adder inputs are tied to 0.]

54

Array Multiplier with CSA

[Figure: 4*4 array of 12 full adders (F.A, each with outputs Ci and Si) fed by A3 A2 A1 A0 and B0 to B3. The partial products P00 … P33 enter the array, unused inputs are tied to 0, and the result bits are R7 … R0. Total of 16 gates for the partial products.]

\[ P_{ij} = A_i B_j, \qquad 0 \le i \le 3,\ 0 \le j \le 3 \]

55

Critical Path with Array Multipliers

[Figure: three rows of HA FA FA FA, showing two of the possible paths for the Ripple-Carry based 4*4 Multiplier]

Area = (N*N) AND gates + (N-1)N full adders

Delay = τ_HA + (2N-1) τ_FA
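The two formulas above are easy to evaluate for a given word length. The helper below simply plugs numbers into them; the function name `array_multiplier_cost` and the example delay values (1.0 and 2.0 time units) are hypothetical and chosen only to make the arithmetic visible.

```python
def array_multiplier_cost(n: int, tau_ha: float, tau_fa: float):
    """Evaluate the area and delay formulas for an N*N ripple-carry array multiplier."""
    and_gates = n * n                      # one AND gate per partial-product bit
    adders = (n - 1) * n                   # adder cells in the array
    delay = tau_ha + (2 * n - 1) * tau_fa  # critical path from the slide formula
    return and_gates, adders, delay


print(array_multiplier_cost(4, tau_ha=1.0, tau_fa=2.0))   # (16, 12, 15.0)
print(array_multiplier_cost(32, tau_ha=1.0, tau_fa=2.0))  # (1024, 992, 127.0)
```

The area grows with N squared while the delay grows linearly in N, which is the motivation for the tree structures discussed next.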

56

57

Wallace Tree

[Figure: Wallace tree for a 5*5 multiplication. The partial products x_i y_j (0 ≤ i, j ≤ 4) are grouped by column and reduced through adders (+) to the product bits P9 … P0.]
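The column-wise reduction in the Wallace tree figure can be approximated in software. The sketch below is a simplified column-compression scheme in the spirit of a Wallace tree, not a gate-accurate copy of the figure: every group of three bits in a column is replaced by a full-adder sum (same column) and carry (next column) until no column holds more than two bits, and the remaining two rows are added conventionally. The name `wallace_multiply` is hypothetical.

```python
def wallace_multiply(x: int, y: int, n: int = 5) -> int:
    """Multiply two n-bit unsigned numbers by column compression with full adders."""
    width = 2 * n
    cols = [[] for _ in range(width)]
    for i in range(n):                       # partial-product bits by column i + j
        for j in range(n):
            cols[i + j].append(((x >> i) & 1) & ((y >> j) & 1))

    while max(len(c) for c in cols) > 2:     # one reduction level per iteration
        nxt = [[] for _ in range(width)]
        for k, col in enumerate(cols):
            idx = 0
            while len(col) - idx >= 3:       # full adder: 3 bits -> sum + carry
                a, b, c = col[idx:idx + 3]
                idx += 3
                nxt[k].append(a ^ b ^ c)
                if k + 1 < width:            # a carry out of the top column is always 0
                    nxt[k + 1].append((a & b) | (a & c) | (b & c))
            nxt[k].extend(col[idx:])         # leftover bits pass to the next level
        cols = nxt

    # Final carry-propagate addition of the two remaining rows.
    row_a = sum(c[0] << k for k, c in enumerate(cols) if len(c) > 0)
    row_b = sum(c[1] << k for k, c in enumerate(cols) if len(c) > 1)
    return row_a + row_b


print(wallace_multiply(0b10111, 0b11001))  # 575
```

Each reduction level works on all columns at once, so in hardware the number of levels, rather than the number of rows, sets the delay before the final adder.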

58

Array Multiplier + Wallace Tree

59

04/18/23 Concordia VLSI Lab

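Background

Baugh-Wooley Algorithm

• Convert negative partial products to positive representation
• No sign-extension required

\[ X \cdot Y = \Bigl(-x_{k-1}2^{k-1} + \sum_{i=0}^{k-2} x_i 2^i\Bigr) \cdot \Bigl(-y_{k-1}2^{k-1} + \sum_{i=0}^{k-2} y_i 2^i\Bigr) \]

\[ = x_{k-1}y_{k-1}2^{2k-2} + \sum_{i=0}^{k-2}\sum_{j=0}^{k-2} x_i y_j 2^{i+j} - 2^{k-1}\sum_{i=0}^{k-2} x_{k-1} y_i 2^i - 2^{k-1}\sum_{i=0}^{k-2} y_{k-1} x_i 2^i \]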
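The expansion of the two's complement product into one positive top term, a positive core, and two negative cross terms can be checked numerically. The sketch below evaluates the four terms of the expansion for k-bit operands and compares the result with ordinary signed multiplication; it does not perform the Baugh-Wooley conversion of the negative terms itself. The helper names `to_signed` and `expanded_product` are hypothetical.

```python
def to_signed(pattern: int, k: int) -> int:
    """Interpret a k-bit pattern as a two's complement number."""
    return pattern - (1 << k) if (pattern >> (k - 1)) & 1 else pattern


def expanded_product(x: int, y: int, k: int) -> int:
    """Evaluate the four-term expansion of X*Y for k-bit two's complement patterns."""
    xb = [(x >> i) & 1 for i in range(k)]
    yb = [(y >> i) & 1 for i in range(k)]
    top = (xb[k - 1] * yb[k - 1]) << (2 * k - 2)
    core = sum((xb[i] * yb[j]) << (i + j)
               for i in range(k - 1) for j in range(k - 1))
    neg_x = sum((xb[k - 1] * yb[i]) << (k - 1 + i) for i in range(k - 1))
    neg_y = sum((yb[k - 1] * xb[i]) << (k - 1 + i) for i in range(k - 1))
    return top + core - neg_x - neg_y


k = 5
x, y = 0b10110, 0b01101            # -10 and +13 as 5-bit patterns
print(to_signed(x, k), to_signed(y, k), expanded_product(x, y, k))  # -10 13 -130
```

The Baugh-Wooley array replaces the two subtracted sums by additions of complemented bits plus constants, which is why no sign extension of the partial products is needed.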

60

Examples of 5-by-5 Baugh-Wooley

[Figure: The schematic logic circuit diagram of a 5-by-5 Baugh-Wooley two's complement array multiplier. An array of full adders (FA) sums the partial products a_i b_j; the terms that involve a4 or b4 enter complemented (a4b0', a4b1', a4b2', a4b3', a0'b4, a1'b4, a2'b4, a3'b4), together with a4b4, a4', b4', a4, b4, and a constant 1, producing P9 … P0.]

61

a7 a6 a5 a4 a3 a2 a1 a0
* a7 a6 a5 a4 a3 a2 a1 a0
-----------------------------------------------
a7*a0 a6*a0 a5*a0 a4*a0 a3*a0 a2*a0 a1*a0 a0*a0
a7*a1 a6*a1 a5*a1 a4*a1 a3*a1 a2*a1 a1*a1 a0*a1
a7*a2 a6*a2 a5*a2 a4*a2 a3*a2 a2*a2 a1*a2 a0*a2
a7*a3 a6*a3 a5*a3 a4*a3 a3*a3 a2*a3 a1*a3 a0*a3
a7*a4 a6*a4 a5*a4 a4*a4 a3*a4 a2*a4 a1*a4 a0*a4
a7*a5 a6*a5 a5*a5 a4*a5 a3*a5 a2*a5 a1*a5 a0*a5
a7*a6 a6*a6 a5*a6 a4*a6 a3*a6 a2*a6 a1*a6 a0*a6
a7*a7 a6*a7 a5*a7 a4*a7 a3*a7 a2*a7 a1*a7 a0*a7
-----------------------------------------------
a7*a6 a7*a5 a7*a4 a7*a3 a7*a2 a7*a1 a7*a0 a6*a0 a5*a0 a4*a0 a3*a0 a2*a0 a1*a0 '0' a0
a7*a7 a6*a5 a6*a4 a6*a3 a6*a2 a6*a1 a5*a1 a4*a1 a3*a1 a2*a1 a1*a1
a6*a6 a5*a4 a5*a3 a5*a2 a4*a2 a3*a2 a2*a2
a5*a5 a4*a3 a3*a3
a4*a4
-----------------------------------------------
S15 S14 S13 S12 S11 S10 S9 S8 S7 S6 S5 S4 S3 S2 S1 S0

62

[Figure: Example of an 8bit squarer (N*N, N=8bits). The folded partial products a_i a_j, the single bits a_i, and constant '0' inputs are added to produce S15 … S0.]
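The folding used in the squarer follows from two identities: a_i·a_i = a_i, and the pair a_i·a_j + a_j·a_i equals a_i·a_j shifted one position to the left. A short sketch that squares a number from the folded terms only is given below; the function name `square_folded` is hypothetical.

```python
def square_folded(a: int, n: int = 8) -> int:
    """Square an n-bit unsigned number using the folded partial-product matrix."""
    bits = [(a >> i) & 1 for i in range(n)]
    # Diagonal terms: a_i * a_i = a_i, with weight 2^(2i).
    total = sum(bits[i] << (2 * i) for i in range(n))
    # Off-diagonal pairs: a_i*a_j appears twice, so it is used once with weight 2^(i+j+1).
    for i in range(n):
        for j in range(i + 1, n):
            total += (bits[i] & bits[j]) << (i + j + 1)
    return total


print(square_folded(0xAB))  # 29241
```

Folding reduces the 64 partial-product bits of a general 8-bit multiplier to 8 diagonal bits plus 28 product bits, i.e. 36 in total.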

63

Array Multiplier

32 bits by 32 bits multiplier

64

Booth (Radix-4) Multiplier

• Radix-4 (3-bit recoding) reduces the number of partial products to be added by half.
• Great saving in area and increased speed.

\[ A = -a_{n-1}2^{n-1} + a_{n-2}2^{n-2} + a_{n-3}2^{n-3} + \dots + a_1 2 + a_0 \]

\[ B = -b_{n-1}2^{n-1} + b_{n-2}2^{n-2} + b_{n-3}2^{n-3} + \dots + b_1 2 + b_0 \]

• The base-4 redundant signed-digit representation of B is

\[ B = \sum_{i=0}^{(n/2)-1} 2^{2i} K_i \]

65

• K_i is calculated by the following equation:

\[ K_i = -2b_{2i+1} + b_{2i} + b_{2i-1}, \qquad i = 0, 1, 2, \dots, (n-2)/2 \]

• 3 bits of the multiplier B, b_{2i+1}, b_{2i}, b_{2i-1}, are examined and the corresponding K_i is calculated.
• B is always appended on the right with zero (b_{-1} = 0), and n is always even (B is sign extended if needed).
• The product AB is then obtained by adding n/2 partial products:

\[ AB = P = \sum_{i=0}^{(n/2)-1} 2^{2i} K_i A \]
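The recoding rule for K_i and the resulting sum of n/2 partial products can be tried out directly. The sketch below works on Python integers rather than on hardware signals; `booth_digits` and `booth_multiply` are hypothetical names, and the operands are assumed to fit in n bits of two's complement with n even.

```python
def booth_digits(b_pattern: int, n: int) -> list[int]:
    """Radix-4 Booth digits K_0 .. K_(n/2 - 1) of an n-bit two's complement pattern."""
    def bit(k: int) -> int:
        return 0 if k < 0 else (b_pattern >> k) & 1    # b_(-1) = 0
    return [-2 * bit(2 * i + 1) + bit(2 * i) + bit(2 * i - 1)
            for i in range(n // 2)]


def booth_multiply(a: int, b: int, n: int) -> int:
    """Multiply signed a by signed b by adding the n/2 recoded partial products."""
    digits = booth_digits(b & ((1 << n) - 1), n)
    return sum(k * a * 4 ** i for i, k in enumerate(digits))


print(booth_digits(0b011101, 6))      # [1, -1, 2]
print(booth_multiply(3, 29, 6))       # 87
print(booth_multiply(-3, -29, 6))     # 87
```

Each digit is one of 0, ±1, ±2, so every partial product is obtained from A by at most a one-bit shift and a negation.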

 

 

66

Booth Algorithm

Decoding of multiplier to generate signals for hardware use

Xi+1  Xi  Xi-1  OP  NEG  ZERO  TWO
0     0   0     0   0    1     0
1     0   0     2   1    0     1
0     1   0     1   0    0     0
1     1   0     1   1    0     0
0     0   1     1   0    0     0
1     0   1     1   1    0     0
0     1   1     2   0    0     1
1     1   1     0   1    1     0

67

Booth Algorithm

A Booth recoded multiplier examines three bits of the multiplier at a time. It determines whether to add zero, 1, -1, 2, or -2 times the multiplicand at that rank. The operation to be performed is based on the current two bits of the multiplier and the previous bit.

Xi+1  Xi  Xi-1  Zi/2
0     0   0     0
0     0   1     1
0     1   0     1
0     1   1     2
1     0   0     -2
1     0   1     -1
1     1   0     -1
1     1   1     0

68

BIT                OPERATION                                              M is multiplied by
2^1  2^0  2^-1
Xi   Xi+1 Xi+2
0    0    0        add zero (no string)                                   +0
0    0    1        add multiplicand (end of string)                       +X
0    1    0        add multiplicand (a string)                            +X
0    1    1        add twice the multiplicand (end of string)             +2X
1    0    0        subtract twice the multiplicand (beginning of string)  -2X
1    0    1        subtract the multiplicand (-2X and +X)                 -X
1    1    0        subtract the multiplicand (beginning of string)        -X
1    1    1        subtract zero (center of string)                       -0

69

Booth Algorithm: a higher radix Multiplication

Multiplicand          A = ● ● ● ●
Multiplier            B = (●●)(●●)
Partial product bits      ● ● ● ●      (B1B0)_2 A 4^0
Partial product bits      ● ● ● ●      (B3B2)_2 A 4^1
Product               P = ● ● ● ● ● ● ● ●

70

 

Example

The following example shows how the calculation is done.

Multiplicand X = 000011
Multiplier Y = 011101

A 0 is added to the right of the multiplier: 0 1 1 1 0 1 0

After Booth decoding, Y is decoded so as to multiply X by +2, -1, +1 separately; each partial product is then shifted two bits relative to the previous one and they are added together.

X * +1    000000000011
X * -1    1111111101
X * +2    00000110
----------------------
          000001010111

71

Sign Extension

72

Sign extension

Traditional sign-extension scheme

• Segment the input operands based on the size of embedded blocks
• Multiply the segmented inputs and extend the sign bit of each of the partial products
• Sum all partial products

[Figure: the segmented input operands are multiplied (×), the sign of each of the partial products is extended, and the partial products are summed (+) into the final result]

73

Booth Algorithm-Example 1

Multiplier:    011101 (+29), recoded with an appended 0 as +2 -1 +1
Multiplicand:  000011 (+3)

X * +1    000000000011
X * -1    1111111101
X * +2    00000110
----------------------
          000001010111  (+87)

74

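Booth Algorithm-Example 2

Multiplier:    011101 (+29), recoded with an appended 0 as +2 -1 +1
Multiplicand:  111101 (-3)

X * +1    111111111101
X * -1    0000000011      (2's complement of multiplicand)
X * +2    11111010
----------------------
          111110101001  (-87)

Notice the sign extensions.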

75

Booth Algorithm-Example 3

Multiplier:    100011 (-29), recoded with an appended 0 as -2 +1 -1
Multiplicand:  111101 (-3)

X * -1    000000000011
X * +1    1111111101
X * -2    00000110        (shifted 2's complement)
----------------------
          000001010111  (+87)

Notice the sign extensions.

76

Comparison of Booth and parallel multiplier shift and Add


77

Please note that each operand is 17 bit ie. the 17th bit is the sign bit. Also negative numbers are entered as 1’s complement, this is why you need to add the S in the right hand side of the diagram. If you use 2’complement

then the S’s on right side of the diagram can be removed

Template to reduce sign extensions for Booth Algorithm

For hardware implementation
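The general idea behind such templates can be checked numerically. The sketch below assumes 2's complement rows and uses the identity -s·2^(w-1) = (1-s)·2^(w-1) - 2^(w-1): each row keeps only an inverted sign bit, and the leftover constants are summed once. It illustrates the principle only and is not the exact template drawn on the slide; the function names are hypothetical.

```python
def sum_with_sign_extension(rows, width, out_bits):
    """rows: list of (pattern, shift); pattern is a width-bit 2's complement row."""
    total = 0
    for pattern, shift in rows:
        value = pattern - (1 << width) if pattern >> (width - 1) else pattern
        total += value << shift                 # full sign extension is implicit
    return total & ((1 << out_bits) - 1)


def sum_with_template(rows, width, out_bits):
    """Same sum, using inverted sign bits plus one data-independent constant."""
    # The constant depends only on row positions, so hardware can hard-wire it.
    constant = -sum(1 << (width - 1 + shift) for _, shift in rows)
    total = constant
    for pattern, shift in rows:
        sign = pattern >> (width - 1)
        low = pattern & ((1 << (width - 1)) - 1)
        total += (low | ((sign ^ 1) << (width - 1))) << shift
    return total & ((1 << out_bits) - 1)


# Booth rows for 25 * -35 (digits +1, -1, +2, -1), as 10-bit patterns
rows = [(25, 0), (-25 & 0x3FF, 2), (50, 4), (-25 & 0x3FF, 6)]
```

Both functions return the same 16-bit pattern for these rows, which is the pattern of -875.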

78

Comparison of Template and the sign extension

[Figure: two partial-product layouts for P = A × B. Sign extension: each row is extended to the left with copies of its sign bit (S1 S1 S1 S1 S1 S1 S1; S2 S2 S2 S2 S2; S3 S3 S3; S4). Sign template: the rows carry only S1 S1 S1; S2 1; S3.]

79

Partial product matrix generated for a 16 * 16 bit multiplication, using Booth and the template given in the previous slide.

[Figure: columns numbered 32 down to 0; nine rows of 17-bit partial products A0 to A8, each shifted two columns left of the previous one, with their sign bits (S0 to S8) and the constant 1s from the template.]

80

Using the Template: 25 * -35

Example of using the template: 25 * -35 with -35 as the multiplier, using 8-bit representation.

Multiplicand (25): 0 0 0 1 1 0 0 1

Multiplier (-35, with a 0 added on the right): 1 1 0 1 1 1 0 1 0

Partial products:

1 0 0 0 0 0 1 1 0 0 1    * 1
1 0 1 1 1 0 0 1 1 1      * -1
1 0 0 1 1 0 0 1 0        * 2
1 1 0 0 1 1 1            * -1

Annotations in the figure: sign bit; add S S; add inverted S; add inverted sign and add 1; add inverted sign bit; no sign bit.

Sum: 1 1 1 1 0 0 1 0 0 1 0 1 0 1

This is a negative number. Converting it gives 0 0 0 0 1 1 0 1 1 0 1 0 1 1

512 + 256 + 64 + 32 + 8 + 2 + 1 = 875, so the product is -875.

81

              

 

Booth Multiplier Components

Multiplier

Multiplicand

Booth Encoder

PPU (Partial products unit) 

  PPA(Partial products adding unit)

  Product


82

Wallace Tree and Ripple Carry Adder Structure of 8*8 Multiplier With Pipeline

[Figure: partial products PP0, PP1, PP2 (15 downto 0) and PP3 (15 downto 0) are reduced by rows of adders (+), followed by a ripple carry adder that produces P16 to P0. The critical path and the pipeline register are marked.]

83

Hardware implementation of Booth with shift and add

[Figure: block diagram. Recoverable block labels: FSM (signals Start, Stop, Init, Shift, Doubleshift, Mulbegin, Mulend, Mux0, Mux11, Mux12, A3bit / QA(0-2), CLK, CLR); three reg_2left32 shift registers (SH, LD, D, CLK, CLR, Q); a reg2right17 shift register; a 2's complement unit; two *2 shifters; mux4-32 (inputs A, B, C, D; ctrl0, ctrl1); sign expansion; Adder37; Mux37; Register37; Counter20 (Finish); endcheck; output Result.]

84

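Simulation Plan

[Figure: a 32-bit signal generator A and a 32-bit signal generator B drive A[31:0] and B[31:0] into both a behavioral multiplier (A * B, output P[63:0]) and the multiplier under test ("My Multiplier", output My_P[63:0]). A 64-bit comparator compares P and My_P and reports the result and the failed number.]

"My Multiplier" is one of:

• Array Multiplier
• Modified Booth Multiplier
• Wallace Tree Multiplier
• Modified Booth-Wallace Tree Multiplier
• Twin Pipe Serial-Parallel Multiplier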
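The simulation plan can be mimicked in software. The sketch below is an illustration only: `shift_and_add_multiply` is a hypothetical stand-in for "My Multiplier", random unsigned operands replace the signal generators, and `run_simulation` plays the role of the comparator by counting mismatches.

```python
import random


def behavioral_multiply(a: int, b: int) -> int:
    """Reference model: the built-in multiplication."""
    return a * b


def shift_and_add_multiply(a: int, b: int, bits: int = 32) -> int:
    """Stand-in design under test: unsigned shift-and-add multiplier."""
    product = 0
    for i in range(bits):
        if (b >> i) & 1:
            product += a << i
    return product & ((1 << (2 * bits)) - 1)


def run_simulation(dut, trials: int = 1000, bits: int = 32, seed: int = 0) -> int:
    """Return the number of operand pairs for which dut and reference differ."""
    rng = random.Random(seed)
    failed = 0
    for _ in range(trials):
        a, b = rng.getrandbits(bits), rng.getrandbits(bits)
        if dut(a, b) != behavioral_multiply(a, b):
            failed += 1
    return failed
```

A failed count of zero over many random vectors raises confidence in the design but does not prove it correct; corner cases such as all-zero and all-one operands are worth adding explicitly.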

85

Testing the Design

86

Simulation For Parallel Multipliers

Signed Number:

Unsigned Number:

87

Simulation For Signed S/P Multipliers

There is a 340 ns delay between the operands and the result because of the D flip-flop delays.

88

FPGA after implementation, areas of programming shown clearly

89

Another implementation of the above after pipelining; place and route has placed the design in different locations.

90

Spartacus FPGA board


91

Testing the multiplication system


92

Comparison of Multipliers

Table 7. Performance comparison for two’s complement multipliers. By Chen Yaoquan, M.Eng. 2005

| Metric | Array Multiplier | Modified Booth Multiplier | Wallace-Tree Multiplier | Modified Booth-Wallace Tree Multiplier | Twin Pipe Serial-Parallel Multiplier | Behavioral Multiplier |
|---|---|---|---|---|---|---|
| Area, total CLBs (#) | 3076.50 | 2649.50 | 3325.50 | 2672.50 | 490.00 | 2993.50 |
| Maximum delay D (ns) | 35.78 | 24.43 | 18.93 | 18.53 | 107.52 (3.36 x 32) | 49.33 |
| Total dynamic power P (W) | 7.52 | 6.33 | 7.46 | 6.41 | 0.28 | 6.24 |
| Delay·Power product (DP) (ns W) | 268.98 | 154.64 | 141.14 | 118.76 | 30.62 | 307.58 |
| Area·Power product (AP) (# W) | 23128.20 | 16771.60 | 24793.93 | 17127.79 | 139.54 | 18665.07 |
| Area·Delay product (AD) (# ns) | 1.10E+05 | 6.47E+04 | 6.30E+04 | 4.95E+04 | 5.27E+04 | 1.48E+05 |
| Area·Delay² product (AD²) (# ns²) | 3.94E+06 | 1.58E+06 | 1.19E+06 | 9.18E+05 | 5.66E+06 | 7.28E+06 |
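The four derived rows of the table are products of the three measured ones. A small sketch, with the hypothetical helper `figures_of_merit`, reproduces them for the array multiplier; small differences from the table are expected because the table was presumably computed from unrounded measurements.

```python
def figures_of_merit(area: float, delay: float, power: float) -> dict[str, float]:
    """Derived comparison metrics from area (CLBs), delay (ns) and power (W)."""
    return {
        "DP": delay * power,        # delay-power product, ns W
        "AP": area * power,         # area-power product, # W
        "AD": area * delay,         # area-delay product, # ns
        "AD2": area * delay ** 2,   # area-delay-squared product, # ns^2
    }


# Array multiplier column of the signed table
array_fom = figures_of_merit(area=3076.50, delay=35.78, power=7.52)
for name, value in array_fom.items():
    print(f"{name}: {value:.4g}")
```

AD² weights delay more heavily than AD, which is why the twin pipe serial-parallel design scores well on AD but poorly on AD².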

93

Comparison of Multipliers

Table 7. Performance comparison for unsigned multipliers. By Chen Yaoquan, M.Eng. 2005

| Metric | Array Multiplier | Modified Booth Multiplier | Wallace-Tree Multiplier | Modified Booth-Wallace Tree Multiplier | Twin Pipe Serial-Parallel Multiplier | Behavioral Multiplier |
|---|---|---|---|---|---|---|
| Area, total CLBs (#) | 3280.50 | 2800.00 | 3321.50 | 2845.50 | 487.00 | 3003.00 |
| Maximum delay D (ns) | 37.23 | 25.33 | 18.93 | 18.33 | 107.52 | 44.50 |
| Total dynamic power P (W) | 7.57 | 6.66 | 7.32 | 6.66 | 0.29 | 6.26 |
| Delay·Power product (DP) (ns W) | 281.88 | 168.77 | 138.60 | 122.13 | 30.66 | 278.53 |
| Area·Power product (AP) (# W) | 24837.98 | 18656.40 | 24319.36 | 18959.57 | 138.89 | 18795.78 |
| Area·Delay product (AD) (# ns) | 1.22E+05 | 7.09E+04 | 6.29E+04 | 5.22E+04 | 5.24E+04 | 1.34E+05 |
| Area·Delay² product (AD²) (# ns²) | 4.55E+06 | 1.80E+06 | 1.19E+06 | 9.56E+05 | 5.63E+06 | 5.95E+06 |

94

Comparison of Multipliers

The relation of area and delay for the behavioral multiplier: the "banana curve"

[Figure: plot of Area (#), axis from 2950 to 3250, against Delay (ns), axis from 0 to 80.]

Changing the value of “set_max_delay” in the script file (ns):

| set_max_delay (ns) | 0 | 10 | 20 | 30 | 40 | 50 | 60 | >60 |
|---|---|---|---|---|---|---|---|---|
| Area (#) | 3014.5 | 3013.0 | 3110.0 | 3193.5 | 3019.5 | 2999.5 | 2978.5 | 2978.5 |
| Power (W) | 6.6499 | 6.6470 | 7.5683 | 8.1878 | 8.0645 | 8.0419 | 8.0156 | 8.0156 |
| Delay (ns) | 31.98 | 31.98 | 30.93 | 30.08 | 39.93 | 49.88 | 59.63 | 59.63 |

95

Comparison of Multipliers

By Chen Yaoquan, M.Eng. 2005

| Criterion | Array Multiplier | Modified Booth Multiplier | Wallace-Tree Multiplier | Modified Booth-Wallace Tree Multiplier | Twin Pipe Serial-Parallel Multiplier | Behavioral Multiplier |
|---|---|---|---|---|---|---|
| Area | Medium | Small | Large | Small | Smallest | Medium |
| Critical delay | Medium | Fast | Very fast | Fastest | Very large | Large |
| Power consumption | Large | Medium | Large | Medium | Smallest | Medium |
| Complexity | Simple | Complex | More complex | More complex | Simple | Simplest |
| Implementation | Easy | Medium | Difficult | Difficult | Easy | Easiest |

96

Pipelining Simulation

97

Synthesis for Signed Multipliers

[Figure: synthesis results, one panel each for Array, Modified Booth, Wallace Tree, Modified Booth-Wallace Tree, Twin Pipe S/P and Behavioral.]

98

Synthesis for Unsigned Multipliers

[Figure: synthesis results, one panel each for Array, Modified Booth, Wallace Tree, Modified Booth-Wallace Tree, Twin Pipe S/P and Behavioral.]

99

Conclusion

• Modified Booth and Wallace Tree are the best techniques for high speed multiplication.

• Wallace Tree has the best performance, but it is hard to implement.

• Booth algorithm based multipliers have lower area among parallel multipliers.

• For behavioral multipliers, the area will increase while the delay decreases.

100

Comparison

| Metric | Array Multiplier | Modified Booth Multiplier | Wallace Tree Multiplier | Modified Booth & Wallace Tree Multiplier | Twin Pipe Serial-Parallel Multiplier |
|---|---|---|---|---|---|
| Area, total CLBs (#) | 1165 | 1292 | 1659 | 1239 | 133 |
| Maximum delay (ns) | 187.87 | 139.41 | 101.14 | 101.43 | 22.58 (722.56) |
| Power consumption at highest speed (mW) | 16.6506 (at 188 ns) | 23.136 (at 140 ns) | 30.95 (at 101.14 ns) | 30.862 (at 101.43 ns) | 2.089 (at 722.56 ns) |
| Delay·Power product (DP) (ns mW) | 3128.15 | 3225.39 | 3130.28 | 3130.33 | 1509.42 |
| Area·Power product (AP) (# mW) | 19.397 x 10^3 | 29.891 x 10^3 | 51.346 x 10^3 | 38.238 x 10^3 | 277.837 |
| Area·Delay product (AD) (# ns) | 218.868 x 10^3 | 180.118 x 10^3 | 167.791 x 10^3 | 125.671 x 10^3 | 96.101 x 10^3 |
| Area·Delay² product (AD²) (# ns²) | 41.119 x 10^6 | 25.110 x 10^6 | 16.970 x 10^6 | 12.747 x 10^6 | 69.438 x 10^6 |

101

NOTICE

     The rest of these slides are for extra information only and are not part of the lecture 


102

Array Addition

103

Addition of 8 binary numbers using the Wallace tree principle
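The principle can be sketched with word-wide 3:2 counters (carry-save adders). This is a simplified model, assuming non-negative operands and whole-row grouping; the names `carry_save_add` and `wallace_sum` are hypothetical.

```python
def carry_save_add(a: int, b: int, c: int) -> tuple[int, int]:
    """Bitwise 3:2 counter: a + b + c == sum_word + carry_word."""
    sum_word = a ^ b ^ c
    carry_word = ((a & b) | (a & c) | (b & c)) << 1
    return sum_word, carry_word


def wallace_sum(operands):
    """Reduce the operands to two rows with 3:2 counters, then add them once.

    Returns (total, levels), where levels counts the carry-save stages.
    """
    rows = list(operands)
    levels = 0
    while len(rows) > 2:
        full = len(rows) // 3 * 3
        next_rows = []
        for i in range(0, full, 3):
            next_rows.extend(carry_save_add(rows[i], rows[i + 1], rows[i + 2]))
        next_rows.extend(rows[full:])     # leftover rows pass through unchanged
        rows = next_rows
        levels += 1
    return sum(rows), levels              # the final add is the carry-propagate adder
```

For eight operands the row count goes 8, 6, 4, 3, 2, so only four carry-save levels precede the single carry-propagating addition.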

104

105

106

107

Baugh-Wooley two's complement multiplier:

[Figure: array of full adders (FA) producing P9 to P0. Inputs are the terms a_i b_j, the complemented terms a4b0', a4b1', a4b2', a4b3' and a0'b4, a1'b4, a2'b4, a3'b4, the term a4b4, the bits a4', b4', a4, b4, and a constant 1; unused adder inputs are tied to 0.]

The schematic logic circuit diagram of a 5-by-5 Baugh-Wooley two’s complement array multiplier
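The partial-product terms named in the diagram can be checked in software. The sketch below assumes the classic Baugh-Wooley arrangement (terms a4·b_j', a_i'·b4, a4', b4', a4, b4 and a constant 1 at the top bit) generalized to n bits; the function name is hypothetical.

```python
def baugh_wooley_multiply(a: int, b: int, n: int = 5) -> int:
    """Multiply two n-bit two's complement numbers using only positive terms."""
    a &= (1 << n) - 1
    b &= (1 << n) - 1
    abit = lambda i: (a >> i) & 1
    bbit = lambda j: (b >> j) & 1
    a_s, b_s = abit(n - 1), bbit(n - 1)               # sign bits

    total = 0
    for i in range(n - 1):                            # ordinary a_i b_j terms
        for j in range(n - 1):
            total += (abit(i) & bbit(j)) << (i + j)
    total += (a_s & b_s) << (2 * n - 2)               # a4 b4
    for i in range(n - 1):
        total += (a_s & (bbit(i) ^ 1)) << (n - 1 + i)     # a4 b_i'
        total += ((abit(i) ^ 1) & b_s) << (n - 1 + i)     # a_i' b4
    total += ((a_s ^ 1) + (b_s ^ 1)) << (2 * n - 2)   # a4' and b4'
    total += (a_s + b_s) << (n - 1)                   # a4 and b4
    total += 1 << (2 * n - 1)                         # constant 1

    total &= (1 << (2 * n)) - 1                       # keep 2n product bits
    return total - (1 << (2 * n)) if total >> (2 * n - 1) else total
```

Because every term is non-negative, the array needs only AND gates, inverters and adders, with no subtraction cells.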

108

Example of Baugh-Wooley Two’s Complement Multiplication

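| | p9 | p8 | p7 | p6 | p5 | p4 | p3 | p2 | p1 | p0 |
|---|---|---|---|---|---|---|---|---|---|---|
| A | | | | | | a4 | a3 | a2 | a1 | a0 |
| × B | | | | | | b4 | b3 | b2 | b1 | b0 |
| | | | | | | a4b0' | a3b0 | a2b0 | a1b0 | a0b0 |
| | | | | | a4b1' | a3b1 | a2b1 | a1b1 | a0b1 | |
| | | | | a4b2' | a3b2 | a2b2 | a1b2 | a0b2 | | |
| | | a4b4 | a4b3' | a3b3 | a2b3 | a1b3 | a0b3 | | | |
| | | a4' | a3'b4 | a2'b4 | a1'b4 | a0'b4 | | | | |
| + | | b4' | | | | a4 | | | | |
| | 1 | | | | | b4 | | | | |
| P | p9 | p8 | p7 | p6 | p5 | p4 | p3 | p2 | p1 | p0 |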

Numeric examples (5-bit operands, 10-bit product):

13 and -5, product -65
Operands: 0 1 1 0 1 and 1 1 0 1 1
Rows: 0 0 1 0 0; 1 0 0 0 0; 0 1 0 1 1; 0 0 1 0 1 1; 0 1 0 1 1; + 1 1; 1 0
Product: 1 1 1 0 1 1 1 1 1 1

13 and 5, product 65
Operands: 0 1 1 0 1 and 0 0 1 0 1
Rows: 1 0 0 0 0; 0 0 0 0 0; 0 1 1 0 1; 0 0 0 0 0 0; 0 1 1 0 1; + 1 0; 1 0
Product: 0 0 0 1 0 0 0 0 0 1

-13 and -5, product 65
Operands: 1 0 0 1 1 and 1 1 0 1 1
Rows: 0 1 1 0 0; 0 0 0 1 1; 0 0 0 1 1; 1 0 0 0 1 1; 1 0 0 0 0; + 0 1; 1 1
Product: 0 0 0 1 0 0 0 0 0 1

13 and -5, product -65
Operands: 0 1 1 0 1 and 1 1 0 1 1
Rows: 1 0 0 1 0; 0 1 1 0 1; 0 1 1 0 1; 0 0 1 1 0 1; 0 0 0 0 0; + 0 0; 1 1
Product: 1 1 1 0 1 1 1 1 1 1

109

Cluster Multipliers

Divide the multiplier into smaller multipliers

110

Cluster Multipliers

8-bit cluster low power multiplier

[Figure: the multiplier segments (labelled A8~A7 and A3~A0) and the multiplicand segments (labelled B8~B7 and B3~B0) feed four 4-bit multipliers through 8-bit latches with CLK and /CLR inputs. The enable signals EN3, EN2, EN1 and EN0 gate the four clusters, and a final addition stage combines the four 8-bit results into the 16-bit product P. A separate circuit generates the enable signals.]

111

Cluster Multipliers

• Dividing the multiplication circuit into clusters (blocks) of smaller multipliers

• Applying clock gating techniques to disable the blocks that are producing a zero result.

• Features
– Low power (claims 13.4% savings)
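A behavioral sketch of the clustering idea follows. It assumes unsigned 8-bit operands split into 4-bit halves, and it assumes a cluster is enabled only when both of its inputs are nonzero; the actual enable circuit is not recoverable from the slide, and the name `cluster_multiply` is hypothetical.

```python
def cluster_multiply(a: int, b: int) -> tuple[int, list[bool]]:
    """8 x 8 unsigned multiply built from four 4 x 4 clusters.

    Returns the product and the enable flags of the four clusters
    (low*low, high*low, low*high, high*high).
    """
    a_hi, a_lo = (a >> 4) & 0xF, a & 0xF
    b_hi, b_lo = (b >> 4) & 0xF, b & 0xF
    clusters = [(a_lo, b_lo, 0), (a_hi, b_lo, 4), (a_lo, b_hi, 4), (a_hi, b_hi, 8)]

    product = 0
    enables = []
    for x, y, shift in clusters:
        enabled = x != 0 and y != 0     # a zero input means a zero result
        enables.append(enabled)
        if enabled:                     # a disabled cluster would be clock-gated
            product += (x * y) << shift
    return product, enables
```

With small operands such as 15 × 15 only the low cluster is active, which is the situation in which gating the other three saves switching power.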


112

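Multiplexer-Based Array Multipliers

$$P = \sum_{j=0}^{n-1} x_j y_j 2^{2j} + \sum_{j=1}^{n-1} Z_j 2^j$$

$$Z_j = x_j Y_j + y_j X_j, \qquad X_j = x_{j-1} x_{j-2} \dots x_0$$

with $Y_j$ defined in the same way from the bits of $y$.

[Figure: triangular arrangement of the bits of the terms $Z_j$ ($Z_1^0$; $Z_2^0, Z_2^1$; $Z_3^0, Z_3^1, Z_3^2$; $Z_4^0, \dots, Z_4^3$), together with the terms $x_j y_j$.]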

113

Multiplexer-Based Array Multipliers

Two types of cells:

Cell 1: produces the terms $Z_{ij} 2^j$ and includes a full adder of the carry-save adder array

Cell 2: produces the terms $x_j y_j 2^{2j}$ and includes a full adder of the carry-save adder array
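The decomposition behind the two cell types can be verified numerically. In the sketch below, X_j and Y_j denote the j low-order bits of x and y, and Z_j is chosen by a 4-way selection on (x_j, y_j), which is the role the multiplexers play; the function name is hypothetical and the code models arithmetic only, not the cell layout.

```python
def mux_based_multiply(x: int, y: int, n: int) -> int:
    """Unsigned n-bit multiply as sum of x_j*y_j*2^(2j) and Z_j*2^j terms."""
    total = 0
    for j in range(n):
        xj, yj = (x >> j) & 1, (y >> j) & 1
        low_mask = (1 << j) - 1
        X_j, Y_j = x & low_mask, y & low_mask   # bits j-1 .. 0 of each operand

        # Multiplexer: Z_j = x_j * Y_j + y_j * X_j takes one of four values
        if xj and yj:
            z = X_j + Y_j
        elif xj:
            z = Y_j
        elif yj:
            z = X_j
        else:
            z = 0

        total += (xj & yj) << (2 * j)           # diagonal term
        total += z << j                         # multiplexer term
    return total
```

No Booth-style recoding is needed: the operand bits select directly among 0, X_j, Y_j and X_j + Y_j.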

114

Multiplexer-Based Array Multipliers

• Characteristics
– Faster than Modified Booth
– Unlike Booth, does not require encoding logic
– Requires approximately N²/2 cells
– Has a zigzag shape, thus not layout-friendly

115

Multiplexer-Based Array Multipliers

• Improvement
– More rectangular layout
– Saves up to 40 percent area without penalties
– Outperforms the modified Booth multiplier in both speed and power by 13% to 26%

116

Gray-Encoded Array Multiplier

| Dec | Hyb | Dec | Hyb | Dec | Hyb | Dec | Hyb |
|---|---|---|---|---|---|---|---|
| 0 | 0000 | 4 | 0100 | -8 | 1100 | -4 | 1000 |
| 1 | 0001 | 5 | 0101 | -7 | 1101 | -3 | 1001 |
| 2 | 0011 | 6 | 0111 | -6 | 1111 | -2 | 1011 |
| 3 | 0010 | 7 | 0110 | -5 | 1110 | -1 | 1010 |

• 2’s complement hybrid coding
– Consecutive values within each group of four differ in a single bit
– Reduces the number of transitions, and thus power (for highly correlated streams).
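The table is consistent with Gray-coding each 2-bit group of the 2's complement pattern. That reading is inferred from the table rather than stated on the slide, so the sketch below, with the hypothetical name `hybrid_encode`, should be taken as one plausible reconstruction.

```python
def hybrid_encode(value: int, bits: int = 4) -> int:
    """Gray-code every 2-bit group of the two's complement pattern of value.

    Assumption: `bits` is even.
    """
    pattern = value & ((1 << bits) - 1)
    code = 0
    for shift in range(0, bits, 2):
        group = (pattern >> shift) & 0b11
        code |= (group ^ (group >> 1)) << shift   # 2-bit binary-to-Gray
    return code


for v in (0, 1, 2, 3, -8, -1):
    print(v, format(hybrid_encode(v), "04b"))
```

Within a group of four consecutive values only the low 2-bit group changes, so exactly one bit toggles per step; across group boundaries (for example 3 to 4) more than one bit can change.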

117

Gray-Encoded Array Multiplier

An 8-bit wide 2’s complement radix-4 array multiplier

118

Gray-Encoded Array Multiplier

• Characteristics
– Uses Gray code to reduce the switching activity of the multiplier
– Saves 45.6% power compared with Modified Booth
– Uses more area (26.4%) than Modified Booth

119

Ultra-high Speed Parallel Multiplier

• How is ultra-high speed achieved?
– Based on the Modified Booth algorithm and a tree structure (column compression)
– Chooses efficient counters (3:2 and 5:3)
– Uses the new compressor (20% faster)
– Uses the First Partial product Addition (FPA) algorithm (reducing the bits of the CLA by 50%)

120

Ultra-high Speed Parallel Multiplier

Calculate the partial products as soon as possible.

The final CLA is only 16-bit instead of 32-bit.

Divide into 3 rows or 5 rows only (most efficient).

Calculation process using parallel counter in case of 16x16---Totally reduce delay by about 30%

121

ULLRLF Multiplier

• ULLRLF stands for Upper/Lower Left-to-Right Leapfrog.

• Combines the following techniques:
– Signal flow optimization in the [3:2] adder array for partial product reduction
– Left-to-right leapfrog (LRLF) signal flow
– Splitting of the reduction array into upper/lower parts

122

ULLRLF Multiplier

1) Signal flow optimization in the [3:2] adder array
-- For n = 32, the delay is reduced by 30 percent.
-- Power is also saved.

PPij is always connected to pin A. Sin/Cin are connected to B/C; most Sin signals are connected to C.

123

ULLRLF Multiplier

2) Left-to-Right Leapfrog (LRLF) structure
-- The signal delays are better balanced.
-- Low power.

The sum signals skip over alternate rows.

124

ULLRLF Multiplier

3) Upper/Lower split structure
-- The long data path is broken into parallel short paths, which saves power.
-- The delay of the partial product reduction is reduced.

Only n+2 bits

125

ULLRLF Multiplier

Floorplan of ULLRLF (n = 32)

• ULLRLF multipliers have less power than optimized tree multipliers for n ≤ 32 while keeping similar delay and area.
• With more regularity and inherently shorter interconnects, the ULLRLF structure presents a competitive alternative to tree structures.

126

Signed Array Multiplier

32*32-Bit Array Multiplier for Signed Number

[Figure: rows of AND terms of A31 ... A0 with B0, B1, B2, B3, ... B31 feed half adders (HA) and full adders (FA), with a constant 1 input; each row is one stage of carry-save adder. Stages 4 to 30 each include 32 AND gates, 31 full adders, 1 half adder and 1 NOT gate. A 32-bit carry look-ahead adder produces the upper product bits; the outputs are P63 ... P0.]

127

Unsigned Array Multiplier

32*32-Bit Array Multiplier for Unsigned Number

[Figure: rows of AND terms of A31 ... A0 with B0, B1, B2, B3, ... B31 feed half adders (HA) and full adders (FA); each row is one stage of carry-save adder. Stages 4 to 30 each include 32 AND gates, 31 full adders and 1 half adder. A 32-bit carry look-ahead adder produces the upper product bits; the outputs are P63 ... P0.]

128

Signed Modified Booth Multiplier

129

Signed Modified Booth Multiplier

32*32-Bit Modified Booth Multiplier for Signed Number

[Figure: Booth encoders take B[1:0] with a 0 added on the right, B[3:1], B[5:3], ... (labelled up to B[31:5]) and generate X1[n], X2[n] and INVERT n for each row. Partial product selectors (SEL) choose from A31 ... A0 (with a 0 below A0 and A31 repeated at the top); the rows are summed with full adders (FA) and half adders (HA), with constant 1s and the INVERT signals added in. Stages 3 to 15 each include 33 PP selectors, 31 full adders, 1 half adder and 1 NOT gate. A 64-bit carry look-ahead adder produces P63 ... P0.]

130

Unsigned Modified Booth Multiplier

131

Unsigned Modified Booth Multiplier

32*32-Bit Modified Booth Multiplier for Unsigned Number

[Figure: Booth encoders take B[1:0] with a 0 added on the right, B[3:1], B[5:3], ..., B[i+1, i, i-1] and generate X1[i], X2[i] and S[i] for each row; a final encoder takes 0 0 B[31] and generates X1[16], X2[16] and S[16]. Partial product selectors (SEL, and SEL_END at the row ends) choose from A31 ... A0 (with a 0 below A0); the rows are summed with full adders (FA) and half adders (HA), with constant 1s and the S[i] signals added in. Stages 3 to 15 each include 33 PP selectors, 32 full adders, 1 half adder and 1 NOT gate. A 64-bit carry look-ahead adder produces P63 ... P0.]

132

Wallace Tree multipliers

[Figure: A[31:0] and B[31:0] produce 32 partial products, which are added in a Wallace tree adder. Its outputs C[63:0] and S[63:0] feed a 64-bit carry look-ahead adder that produces P[63:0].]

133

Wallace Tree multipliers

[Figure: dot diagram of the partial product reduction, levels 1 to 8.]

• Uses 3:2 counters and 2:2 counters
• Number of levels: log(32/2) / log(3/2) ≈ 6.8 as an estimate; the reduction shown uses 8 levels
• Irregular structure
• Fast
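The level count can be checked by simulating the row reduction directly. The sketch assumes whole rows are grouped in threes at each level, with leftover rows passed through; both function names are hypothetical.

```python
import math


def wallace_levels(rows: int) -> int:
    """Count 3:2 reduction levels needed to bring `rows` rows down to 2."""
    levels = 0
    while rows > 2:
        rows = 2 * (rows // 3) + rows % 3   # each full group of 3 rows becomes 2
        levels += 1
    return levels


def estimated_levels(rows: int) -> float:
    """Closed-form estimate log(rows / 2) / log(3 / 2)."""
    return math.log(rows / 2) / math.log(3 / 2)


for n in (32, 16):
    print(n, wallace_levels(n), round(estimated_levels(n), 2))
```

The closed form is a lower-bound style estimate; the integer reduction needs 8 levels for 32 rows and 6 levels for 16 rows, matching the level numbers in the dot diagrams.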

[Figure: a 3:2 counter takes three input bits from one column and a 2:2 counter takes two; each outputs a sum bit and a carry bit.]

134

Wallace Tree multipliers

64-Bit Carry Look Ahead Adder (2-level hierarchical)

[Figure: a carry propagate/generate unit forms P63 ... P0 and G63 ... G0 from A63 ... A0 and B63 ... B0. Eight 8-bit BCLA blocks take P7-P0/G7-G0 up to P63-P56/G63-G56 and produce the carries C7-C0 up to C63-C56 and the group signals PM0/GM0 to PM7/GM7. A further 8-bit BCLA takes the group signals and Cin and produces the block carries (C8, C16, C24, C40, C48 and C56 are labelled). A 64-bit summation unit combines P63 ... P0 and C63 ... C0 into S63 ... S0 and C64.]

135

Modified Booth-Wallace Tree Multipliers

136

Modified Booth-Wallace Tree Multipliers

• Uses 3:2 counters and 2:2 counters
• Number of levels: log(16/2) / log(3/2) ≈ 5.1 as an estimate; the reduction shown uses 6 levels
• Irregular structure
• Fast
• Less area

[Figure: dot diagram of the Booth partial products, rearranged and then reduced in levels 1 to 6.]

137

Twin pipe serial-parallel multipliers

Block diagram of 32*32-bit signed twin pipe serial-parallel multiplier with serial/parallel conversion logic

[Figure: two parallel-in, serial-out shift registers (controlled by Load/Shift, Reset and Clock) hold B31 B29 ... B3 B1 and B30 B28 ... B2 B0 and feed the 32-bit twin pipe serial-parallel multiplier unit, which also receives A31 A30 ... A1 A0 and Sign. Two serial-in, parallel-out shift registers collect P62 P60 ... P2 P0 and P63 P61 ... P3 P1; Result_ready signals completion.]

138

Signed twin pipe serial-parallel multipliers

32*32-bit twin pipe serial-parallel multiplier for signed number

[Figure: two chains of full adders (FA) and D flip-flops (D), one cell per bit from A31 down to A0, with 28 more units repeated in the middle and half adders (HA) at the end. The even data bits (... B2 B0 0 0) and the odd data bits (... B3 B1 0 0) enter separate chains, each labelled "on rising clock", with Clock and Reset inputs. A MUX, switched between rising_edge and falling_edge of the clock, selects between the even product and the odd product to form Product.]

“Sign” control line and the sign-change hardware

139

Unsigned twin pipe serial-parallel multipliers

32*32 bit twin pipe serial-parallel multiplier for unsigned number

[Figure: two chains of adders (HA at the A31 end, then FA) and D flip-flops (D), one cell per bit from A31 down to A0, with 28 more units repeated in the middle and half adders (HA) at the end. The even data bits (... B2 B0 0 0) and the odd data bits (... B3 B1 0 0) enter separate chains, each labelled "on rising clock", with Clock and Reset inputs. A MUX, switched between rising_edge and falling_edge of the clock, selects between the even product and the odd product to form Product.]

• Does not need the “Sign” control line and the sign-change hardware