cmpen 411 vlsi digital circuits spring 2012 lecture 20...

23
Sp12 CMPEN 411 L20 S.1 CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20: Multiplier Design [Adapted from Rabaey‘s Digital Integrated Circuits, Second Edition, ©2003 J. Rabaey, A. Chandrakasan, B. Nikolic]

Upload: lynhan

Post on 12-Mar-2018

217 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.1

CMPEN 411VLSI Digital Circuits

Spring 2012

Lecture 20: Multiplier Design

[Adapted from Rabaey‘s Digital Integrated Circuits, Second Edition, ©2003 J. Rabaey, A. Chandrakasan, B. Nikolic]

Page 2: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.2

Review: Basic Building Blocks

Datapath

Execution units

- Adder, multiplier, divider, shifter, etc.

Register file and pipeline registers

Multiplexers, decoders

Control

Finite state machines (PLA, ROM, random logic)

Interconnect

Switches, arbiters, buses

Memory

Caches (SRAMs), TLBs, DRAMs, buffers

Page 3: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.3

The Binary Multiplication

x

+

Partial products

Multiplicand

Multiplier

Result

1 0 1 0 1 0

1 0 1 0 1 0

1 0 1 0 1 0

1 1 1 0 0 1 1 1 0

0 0 0 0 0 0

1 0 1 0 1 0

1 0 1 1

Page 4: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.4

Multiply Operation

Multiplication is just a a lot of additions

multiplicand

multiplier

partial

product

array

double precision product

N

2N

N can be formed in parallel

Page 5: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.5

Multiplication Approaches

Right shift and add

Partial product array rows are accumulated from top to bottom on an N-bit adder

- After each addition, right shift (by one bit) the accumulated partial product to align it with the next row to add

Time for N bits Tserial_mult = O(N Tadder) = O(N2) for a RCA

Making it faster

Use a faster adder

Use higher radix (e.g., base 4) multiplication – O(N/2 Tadder)

- Use multiplier recoding to simplify multiple formation (booth)

Form the partial product array in parallel and add it in parallel

Making it smaller (i.e., slower)

Use serial-parallel mult

Use an array multiplier

- Very regular structure with only short wires to nearest neighbor cells. Thus, very simple and efficient layout in VLSI Can be easily and efficiently pipelined

Page 6: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.6

Serial-parallel multiplier structure

Page 7: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.7

The Array Multiplier

Y0

Y1

X3 X2 X1 X0

X3

HA

X2

FA

X1

FA

X0

HA

Y2X3

FA

X2

FA

X1

FA

X0

HA

Z1

Z3Z6Z7 Z5 Z4

Y3X3

FA

X2

FA

X1

FA

X0

HA

Z2

Z0

Page 8: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.11

Booth multiplier

Encoding scheme to reduce number of stages in multiplication.

Performs two bits of multiplication at once—requires half the stages.

Each stage is slightly more complex than simple multiplier, but adder/subtracter is almost as small/fast as adder.

Page 9: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.12

Booth encoding

Two‘s-complement form of multiplier:

y = -2nyn + 2n-1yn-1 + 2n-2yn-2 + ... (first bit is the sign bit)

(example, y=18=010010 y= -18 = 101110 )

Rewrite using 2a = 2a+1 - 2a:

y = 2n(yn-1-yn) + 2n-1(yn-2 -yn-1) + 2n-2(yn-3 -yn-2) + ...

Consider first two terms: by looking at three bits of y, we can determine whether to add x, 2x to partial product.

Page 10: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.13

Booth actions

yi yi-1 yi-2 increment

0 0 0 0

0 0 1 x

0 1 0 x

0 1 1 2x

1 0 0 -2x

1 0 1 -x

1 1 0 -x

1 1 1 0

y = 2n(yn-1-yn) + 2n-1(yn-2 -yn-1) + 2n-2(yn-3 -yn-2) + ...

Consider first two terms: by looking at three bits of y, we can determine whether to add x, 2x to partial product.

Page 11: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.14

Booth example

x = 1001 (910), y = 0111 (710).

P0 = 00000000

y3y2y1=011 y1y0y-1=11(0)

y1y0y-1 = 110, P1 = P0 - (1001) = 11110111

x shift left for 2 bits to be 100100

y3y2y1 = 011, P2 = P1 (10*100100) =

11110111+01001000 = 001111111 (6310)

An array multiplier needs N addtions, booth multiplier needs only N/2 additions

Page 12: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.15

Review: A 64-bit Adder/Subtractor

1-bit

FA S0

C0=Cin

C1

1-bit

FA S1

C2

1-bit

FA S2

C3

C64=Cout

1-bit

FA S63

C63

. .

.

Ripple Carry Adder (RCA) built out of 64 FAs

Subtraction – complement all subtrahend bits (xor gates) and set the low order carry-in

RCA

advantage: simple logic, so small (low cost)

disadvantage: slow (O(N) for N bits) and lots of glitching (so lots of energy consumption)

A0

B0

A1

B1

A2

B2

A63

B63

add/subt

Page 13: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.16

Booth structure

Page 14: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.17

Wallace-Tree Multiplier

6 5 4 3 2 1 0 6 5 4 3 2 1 0

Partial products First stage

Bit position

6 5 4 3 2 1 0 6 5 4 3 2 1 0

Second stage Final adder

FA HA

(a) (b)

(c) (d)

Page 15: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.18

Wallace-Tree Multiplier

Full adder = (3,2) compressor

Page 16: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.19

Making it Faster: Tree Multiplier Structure

partial

product

array

reduction

tree

fast carry

propagate

adder (CPA)P (product)

mux

+

reduction

tree (log N)

+

CPA (log N)

Q (‗ier)

D (‗icand)

D

D

D

0

0

0

0

multiple

forming

circuits

inte

rconnect

Page 17: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.20

(4,2) Counter

Built out of two (3,2) counters (just FA‘s!)

all of the inputs (4 external plus one internal) have the same weight (i.e., are in the same bit position)

the internal carry output is fed to the next higher weight position (indicated by the )

(3,2)

(3,2)Note: Two carry outs - one

―internal‖ and one ―external‖

Page 18: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.22

Tiling (4,2) Counters

Reduces columns four high to columns only two high

Tiles with neighboring (4,2) counters

Internal carry in at same ―level‖ (i.e., bit position weight) as the internal carry out

(3,2)

(3,2)

(3,2)

(3,2)

(3,2)

(3,2)

Page 19: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.24

4x4 Partial Product Array Reduction

multiplicand

multiplier

partial

product

array

reduced pp

array (to

CPA)

double

precision

product

Fast 4x4 multiplication using (4,2) counters

How would you lay it out?

five (4,2) counters

5-bit CPA

multiplicand

mu

ltiplie

r

8-bit product

Page 20: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.25

8x8 Partial Product Array Reduction

‘icand

‘ier

partial

product

array

reduced

partial

product

array

Wallace tree

multiplier

two rows of

nine (4,2)

counters

one row of

thirteen

(4,2)

counters

to a 13-bit fast CPA

Page 21: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.26

An 8x8 Multiplier Layout

How should it be laid out?multiplicand

mu

ltiplie

r

thirteen (4,2) counters

nine (4,2) counters

nine (4,2) counters

13-bit CPA

Page 22: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.32

Multipliers —Summary

• Optimization Goals Different Vs Binary Adder

• Once Again: Identify Critical Path

• Other possible techniques

- Data encoding (Booth)

- Pipelining

FIRST GLIMPSE AT SYSTEM LEVEL OPTIMIZATION

- Logarithmic versus Linear (Wallace Tree Mult)

Page 23: CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 20 ...kxc104/class/cmpen411/13s/lec/C411L20Multiplier… · VLSI Digital Circuits Spring 2012 Lecture 20: ... Use serial-parallel

Sp12 CMPEN 411 L20 S.33

Next Lecture and Reminders

Next lecture

Shifters, decoders, and multiplexers

- Reading assignment – Rabaey, et al, 11.5-11.6