Aug 2007 1 Shift Operations Source: David Harris

Upload: beryl-watts

Post on 27-Dec-2015


Aug 2007 1

Shift Operations

Source: David Harris

Aug 2007 2

Shifter Implementation

Regular layout, can be compact, use transmission gates to avoid threshold drop.

Not amenable to synthesis, high capacitive loading for large arrays.

Source: David Harris

Aug 2007 3

Shifter Implementation

Each level shifts by a power of two (1, 2, 4, ...), selected by one bit of the shift amount.

Amenable to synthesis, fast.
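
The level-by-level structure can be sketched in software. This is only an illustrative model, assuming a logical left shift and an 8-bit word; the function name and the `width` parameter are hypothetical and not taken from the slides.

```python
def log_shift_left(value, shamt, width=8):
    """Logical left shift built from log2(width) conditional stages."""
    mask = (1 << width) - 1
    stage = 0
    while (1 << stage) < width:
        # Bit `stage` of the shift amount selects whether this level
        # shifts by 2**stage or passes the value through unchanged.
        if (shamt >> stage) & 1:
            value = (value << (1 << stage)) & mask
        stage += 1
    return value
```

Each loop iteration corresponds to one row of 2:1 multiplexers, so an 8-bit shifter needs three levels.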

Aug 2007 4

Multiplication

Source: David Harris

Aug 2007 5

Array Multiplier with CPAs

Source: Jan Rabaey

Array multiplier with carry-propagate adders (CPAs); multiple near-critical paths

Aug 2007 6

Array Multiplier with CSAs

Only one critical path

Source: Jan Rabaey

Aug 2007 7

How do CSAs work?

CSA: Carry Save Adder

Want to add these four numbers together (same problem as adding partial products in a multiplier)

Source: David Harris

Aug 2007 8

How do CSAs work? (cont)

Can use a full adder network to add three numbers together if we view the carry-in inputs as a bus that contains the third number.

The output produces a sum vector and a carry vector, and these have to be added to produce the final result.

Source: David Harris

Aug 2007 9

How do CSAs work? (cont)

The carry vector has to be shifted left by 1 before being added to the sum vector, because the COUT bit has twice the weight of the sum bit.

Source: David Harris
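
A small software model may help make the sum-vector and carry-vector idea concrete. It is a sketch only, assuming non-negative Python integers stand in for the input buses; the function names are hypothetical.

```python
def carry_save_add(x, y, z):
    """One row of full adders: three input vectors in, (sum, carry) out."""
    sum_vec = x ^ y ^ z                        # per-bit sum output
    carry_vec = (x & y) | (x & z) | (y & z)    # per-bit COUT (majority)
    return sum_vec, carry_vec


def add_four(a, b, c, d):
    """Add four numbers with two CSA rows and one final carry-propagate add."""
    s, carry = carry_save_add(a, b, c)
    # The carry vector is shifted left by 1 because COUT has twice the weight.
    s, carry = carry_save_add(s, carry << 1, d)
    return s + (carry << 1)                    # final carry-propagate addition
```

No carry ripples inside `carry_save_add`; the only carry propagation happens in the last line.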

Aug 2007 10

CSA Multiplier

Carry is shifted to the left before being added.

This final addition is always N/2 bits wide if the product has N bits. For large multipliers, a fast adder structure is needed for this addition.

Source: Jan Rabaey

Aug 2007 11

Multiplier Layout

Layout can be made to be rectangular

Source: David Harris


Aug 2007 12

2’s Complement Multiply Definition

MSb has negative weight


4 bit 2’s complement example:

-5 = 0xB = 1011 = -1*2^3 + 0*2^2 + 1*2^1 + 1*2^0 = -8 + 0 + 2 + 1 = -5

Source: David Harris
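
The weighting rule can be checked with a few lines of code. This is a minimal sketch, assuming the number is given as a bit string with the MSb first; the function name is hypothetical.

```python
def twos_complement_value(bits):
    """Value of a two's complement bit string, MSb first."""
    n = len(bits)
    # The MSb carries weight -2**(n-1); all other bits have positive weight.
    value = -int(bits[0]) * 2 ** (n - 1)
    for i, b in enumerate(bits[1:], start=1):
        value += int(b) * 2 ** (n - 1 - i)
    return value
```

For the 4-bit example, `twos_complement_value("1011")` evaluates -8 + 0 + 2 + 1.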


Aug 2007 13

2’s Complement Multiplication

2’s complement

Source: David Harris


Aug 2007 14

Modified Baugh-Wooley Multiplier (2’s complement)

Pre-compute sums of constant ‘1’, push some terms upwards.

Source: David Harris
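
One common formulation of the modified Baugh-Wooley scheme can be modeled as follows. This is a sketch under stated assumptions, not necessarily the exact arrangement drawn on the slide: the bit products that involve exactly one sign bit are inverted (NAND instead of AND), and the pre-computed constant is a '1' in column n and a '1' in column 2n-1. The function name is hypothetical.

```python
def baugh_wooley_multiply(a, b, n=4):
    """Signed n x n multiply using only positive-weight partial product bits."""
    abit = [(a >> i) & 1 for i in range(n)]   # also works for negative ints
    bbit = [(b >> j) & 1 for j in range(n)]
    total = 0
    for i in range(n):
        for j in range(n):
            bit = abit[i] & bbit[j]
            # Exactly one of the two bits is a sign bit: this term has
            # negative weight, so its complement is added instead.
            if (i == n - 1) != (j == n - 1):
                bit ^= 1
            total += bit << (i + j)
    # Pre-computed constant that completes the two's complements (assumption:
    # a '1' at column n and a '1' at column 2n-1).
    total += (1 << n) + (1 << (2 * n - 1))
    total &= (1 << (2 * n)) - 1               # keep the 2n-bit product
    if total >> (2 * n - 1):                  # reinterpret as signed
        total -= 1 << (2 * n)
    return total
```

The array itself only ever adds non-negative bits, which is why the same CSA cells can be reused with a few modified (inverting) cells.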

Aug 2007 15

Multiplier Layout For Two’s Complement

Shaded Cells are modified cells for Baugh-Wooley.

Source: David Harris

Aug 2007 16

Booth Encoding

Previous multipliers use radix-2: one bit of the multiplier is observed at a time.

In general, radix-2^r multipliers produce N/r partial products (assuming an NxN multiplier).

Fewer partial products lead to smaller/faster CSA arrays.

A radix-4 = radix-2^2 multiplier produces N/2 partial products.

Two bits * two bits: Y1Y0 * X1X0 = Y*X, which is one of Y*0, Y*1, Y*2, Y*3 depending on X1X0.

Y*0, Y*1, Y*2 are easy/fast (Y*2 is a shift).

Y*3 is hard: it has to be computed as Y*3 = Y*(2+1) = 2Y + Y, which involves a carry-propagate addition.

Aug 2007 17

Radix-4 Partial Products

Y * X_{N-1}X_{N-2}...X_3X_2X_1X_0

= Y * X_1X_0

+ Y * X_3X_2

+ ...

+ Y * X_{N-1}X_{N-2}

Number of partial products is reduced.

Source: David Harris
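
The grouping into bit pairs can be illustrated with a short model. It is a sketch only, assuming an unsigned multiplier X with an even number of bits N; the function name is hypothetical.

```python
def radix4_partial_products(y, x, n):
    """Unsigned radix-4 partial products: one per pair of multiplier bits."""
    products = []
    for i in range(0, n, 2):
        digit = (x >> i) & 0b11          # X[i+1] X[i], a value in 0..3
        products.append((y * digit) << i)  # row is shifted left by i positions
    return products
```

Summing the returned list gives Y*X, and the list has N/2 entries instead of N. The `y * digit` step hides the hard Y*3 case that Booth encoding is meant to avoid.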

Aug 2007 18

Booth Encoding (cont.)

Observe that 2Y = 4Y – 2Y and 3Y = 4Y – Y.

4Y is simply Y in the next row of the partial products, so just add Y to the next row. In both cases the current row takes a negative multiple (–2Y or –Y) and a Y has to be added to the next partial product.

Booth encoding looks at current 2 bits, and MSB of previous 2 bits, and modifies the partial product.

If the MSB of the previous pair is ‘1’, add in ‘Y’ to current value.

Aug 2007 19

Booth Encoding (cont)

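Inputs are the current pair followed by the MSB of the previous pair (X_{2i+1} X_{2i} X_{2i-1}):

000: PP = 0*Y + 0 = 0
001: PP = 0*Y + Y = Y
010: PP = Y + 0 = Y
011: PP = Y + Y = 2Y
100: PP = -2Y + 0 = -2Y
101: PP = -2Y + Y = -Y
110: PP = -Y + 0 = -Y
111: PP = -Y + Y = 0

Encoder outputs: sign bit select, 2Y select, 1Y select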

Negative operations are done at the bit level as complements, with +1 added to the PP to complete the 2’s complement.

Source: David Harris
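
The encoding table can be exercised with a short model. This is a sketch under stated assumptions: the multiplier X is an n-bit two's complement number with n even, the bit below X[0] is taken as 0, and the arithmetic is done on Python integers rather than on select signals. All names are hypothetical.

```python
# Radix-4 Booth digit for (X[2i+1], X[2i], X[2i-1]).
BOOTH_DIGIT = {
    (0, 0, 0): 0, (0, 0, 1): 1, (0, 1, 0): 1, (0, 1, 1): 2,
    (1, 0, 0): -2, (1, 0, 1): -1, (1, 1, 0): -1, (1, 1, 1): 0,
}


def booth_digits(x, n):
    """Recode an n-bit two's complement multiplier into n/2 Booth digits."""
    digits = []
    prev_msb = 0                       # assumed 0 below the first pair
    for i in range(0, n, 2):
        lo = (x >> i) & 1
        hi = (x >> (i + 1)) & 1
        digits.append(BOOTH_DIGIT[(hi, lo, prev_msb)])
        prev_msb = hi                  # MSB of this pair feeds the next pair
    return digits


def booth_multiply(y, x, n=16):
    """Sum the Booth partial products digit * Y * 4**k."""
    return sum(d * y * 4 ** k for k, d in enumerate(booth_digits(x, n)))
```

Every digit is one of 0, ±Y, ±2Y, so no partial product needs the 3Y carry-propagate step. An unsigned multiplier would need one extra digit for the top pair, which this sketch does not model.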

Aug 2007 20

Booth Selection Logic

Replaces AND gates in CSA array

When –Y is chosen, there is a problem: a ‘1’ has to be added to complete the two’s complement.

Source: David Harris

Aug 2007 21

Unsigned R-4 Booth Array (16 x 16)

Sign extension: either all 1’s or all 0’s, for –Y terms

‘1’ or ‘0’ needed to complete 2’s complement

Extra PP in case last PP needed a ‘Y’ added in here (last two X bits were either 2 or 3)

Source: David Harris

Aug 2007 22

Optimized R-4 Booth Array (unsigned)

SSSS = 1111 + ~S, where ~S is the complement of S and the carry out is discarded.

Additional reduction produces this.

Source: David Harris
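
The sign-extension identity used here is easy to check numerically. This is a minimal sketch, assuming the single added bit is the complement of S and the arithmetic is modulo 2^k for a k-bit field (with plain S the identity does not hold); the function name is hypothetical.

```python
def sign_extension_field(s, k=4):
    """k copies of sign bit s, computed as all-ones plus the complement of s."""
    all_ones = (1 << k) - 1
    # s = 1: 1111 + 0 = 1111.  s = 0: 1111 + 1 wraps around to 0000.
    return (all_ones + (s ^ 1)) & all_ones
```

Because the all-ones part is a constant, it can be pre-added across rows, leaving only one extra bit per partial product.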

Aug 2007 23

Signed R-4 Booth Array (16 x 16)

ei = Mi xor y15

Last PP8 is not needed for signed multiply

Source: David Harris

Aug 2007 24

Booth Speedup

• Radix-4 arrays are 20 to 50% smaller than CSA arrays and up to 20% faster.

• Higher Radix multipliers are possible, but not worth it except for larger multipliers (at least 64 bits).

Aug 2007 25

Wallace Trees

A CSA adder just adds the PPs together one at a time:

A (3,2) counter is another name for a full adder.

Source: David Harris

Aug 2007 26

Wallace Trees (cont.)

A Wallace tree adds the partial products in parallel!

Number of levels is: ceil(log_{3/2}(N/2)) for N partial products

Layout is not regular, long wires can cause delay.

Source: David Harris

Aug 2007 27

4-2 Compressor

Used to reduce the number of levels in a Wallace Tree

Number of levels is: ceil(log_2(N/2)) for N partial products

Logic more complex than Full Adder

Layout is more regular.

Source: David Harris
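
The level counts for both reduction styles can be estimated by simulating the reduction of the partial product rows. This is a rough sketch, assuming every counter or compressor produces two output rows, rows that do not fill a group pass through unchanged, and a 4-2 compressor with one unused input handles three leftover rows. The function name and parameters are hypothetical.

```python
def reduction_levels(rows, inputs_per_cell=3):
    """Levels needed to reduce `rows` partial products down to two rows.

    inputs_per_cell=3 models (3,2) counters, inputs_per_cell=4 models
    4-2 compressors.
    """
    levels = 0
    while rows > 2:
        groups, leftover = divmod(rows, inputs_per_cell)
        # Each full group becomes 2 rows; up to 2 leftover rows pass through,
        # and 3 leftover rows (4-2 case only) are compressed to 2.
        rows = 2 * groups + min(leftover, 2)
        levels += 1
    return levels
```

For 16 partial products this model gives 6 levels of (3,2) counters versus 3 levels of 4-2 compressors; the remaining two rows still go to a final carry-propagate adder.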

Aug 2007 28

Multiplier Summary

• CSAs – simple, but many partial products

• Booth Encoding – reduces number of required PPs, achieves speedup over CSAs

• Wallace Trees – adds PPs in parallel