computer arithmetic
Datorteknik ArithmeticCircuits bild 1
Computer arithmetic
Some things you should know about digital arithmetic:
Principles
Architecture
Design
Datorteknik ArithmeticCircuits bild 2
Multiplication
Can be done in one clock cycle,
but then it is:
Very slow (a long clock cycle)
Expensive (needs a lot of hardware)
Datorteknik ArithmeticCircuits bild 3
Multiplication
[Figure: two multiplier array cells. Each cell has inputs Si, m, n and Ci, and outputs So and Co.]
Datorteknik ArithmeticCircuits bild 4
Multiplication
Partial products of a 4-bit multiply (a3..a0 times b3..b0), giving the product p7..p0:
                                b3    b2    b1    b0
a0:                           a0b3  a0b2  a0b1  a0b0
a1:                     a1b3  a1b2  a1b1  a1b0
a2:               a2b3  a2b2  a2b1  a2b0
a3:         a3b3  a3b2  a3b1  a3b0
        p7    p6    p5    p4    p3    p2    p1    p0
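The partial-product array above can be sketched in Python as a check on the idea. This is a minimal illustration, not the hardware: the function name `partial_product_multiply` and the `width` parameter are assumptions made for the example, and the column sums that the array adders would perform are done here with ordinary integer addition.

```python
def partial_product_multiply(a: int, b: int, width: int = 4) -> int:
    """Multiply two unsigned `width`-bit numbers by summing a_i AND b_j terms."""
    product = 0
    for i in range(width):              # row i belongs to multiplier bit a_i
        a_i = (a >> i) & 1
        for j in range(width):          # column j belongs to multiplicand bit b_j
            b_j = (b >> j) & 1
            # The term a_i b_j has weight 2**(i + j), i.e. it lands in column p(i+j).
            product += (a_i & b_j) << (i + j)
    return product
```

With `width = 4` there are 16 AND terms, which is why a single-cycle multiplier needs so much hardware as the word length grows.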
Datorteknik ArithmeticCircuits bild 5
To avoid those costs:
Multiplication is usually multiple-cycle
For example: Repeated add, shift
In the MIPS: "4 - 12 cycles for mult" (Databook p. 3.9)
Multiply instruction is not implemented in our simulator
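A repeated add-and-shift multiply can be sketched as below. This is an illustration only: the name `shift_add_multiply` and the one-step-per-cycle counting are assumptions for the example, and it does not model the 4 - 12 cycle timing quoted from the databook.

```python
def shift_add_multiply(multiplicand: int, multiplier: int, width: int = 32):
    """Unsigned multiply by repeated add and shift. Returns (product, cycles)."""
    mask = (1 << width) - 1
    multiplicand &= mask
    multiplier &= mask
    product = 0
    cycles = 0
    for _ in range(width):
        if multiplier & 1:              # test the lowest multiplier bit
            product += multiplicand     # add the shifted multiplicand
        multiplicand <<= 1              # shift for the next bit position
        multiplier >>= 1
        cycles += 1                     # one add/shift step per cycle (assumed)
    return product, cycles
```

Each step reuses one adder, so the hardware cost is small compared with the full array, at the price of several cycles per multiply.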
Datorteknik ArithmeticCircuits bild 6
Division...
is even worse....
Multiple cycle
Repeated shift - subtract - test
Databook ... instruction uses 35 cycles
Divide instruction is not implemented in our simulator
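The repeated shift - subtract - test loop can be sketched as a restoring division on unsigned numbers. The function name `shift_subtract_divide` and the `width` parameter are hypothetical, and the sketch makes no claim about how the MIPS divider is actually built.

```python
def shift_subtract_divide(dividend: int, divisor: int, width: int = 32):
    """Unsigned division, one quotient bit per step. Returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    quotient = 0
    remainder = 0
    for i in range(width - 1, -1, -1):
        # Shift: bring the next dividend bit into the partial remainder.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        # Subtract and test: keep the difference only if it is not negative.
        trial = remainder - divisor
        if trial >= 0:
            remainder = trial
            quotient |= 1 << i
    return quotient, remainder
```

One step per quotient bit gives a cycle count on the order of the word length, which fits the multi-cycle figure quoted above.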
Datorteknik ArithmeticCircuits bild 7
Division
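D / 2^n
can be done by shifting D right n bits
When D is negative, a plain arithmetic shift rounds toward minus infinity, so a correction is needed. For n = 1:
– If D is even:
D arithmetic shift right by 1
– If D is odd:
(D + 1) arithmetic shift right by 1
For general n, add 2^n - 1 to a negative D before the arithmetic shift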
Datorteknik ArithmeticCircuits bild 8
Example
- 1 / 2
-1 arith. shift right 1:
111 -> 111
result: -1 Wrong
(-1 + 1) arith. shift right 1:
000 -> 000
result: 0 OK
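The example can be reproduced in Python, whose `>>` on integers is an arithmetic shift. The sketch below adds the bias 2^n - 1 to negative values, which reduces to the "+ 1" used in the example when n = 1; the general bias and the name `divide_by_power_of_two` are assumptions that go slightly beyond what the slides state.

```python
def divide_by_power_of_two(d: int, n: int) -> int:
    """Divide d by 2**n, rounding toward zero, using an arithmetic right shift."""
    if d < 0:
        d += (1 << n) - 1       # bias for negative values; equals +1 when n == 1
    return d >> n               # arithmetic shift right by n


plain_shift = -1 >> 1                           # -1: the wrong result from the example
corrected = divide_by_power_of_two(-1, 1)       # 0: the corrected result
```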
Datorteknik ArithmeticCircuits bild 9
In the MIPS,
multiply and divide use special hardware (not the ALU)
and special registers "HI" and "LO" (not in our simulator)
Datorteknik ArithmeticCircuits bild 10
Floating point?
Needs its own hardware!
Co-processor, usually a separate chip
Main (integer) CPU
CP1: Floating point
CP0: Control
Datorteknik ArithmeticCircuits bild 11
So the ALU does
ADD
SUBTRACT
SIMPLE LOGIC
Simple logic is fast, but add / sub is slow
because of the long critical path
Datorteknik ArithmeticCircuits bild 12
Add two numbers
  1100     carry from step n-1
...010
...110  +
...000     sum
3 input bits in each step
[Figure: full adder with inputs A0, B0, Cin and outputs S0, Cout]
Datorteknik ArithmeticCircuits bild 13
The full adder
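A  B  Ci | S  Co
0  0  0  | 0  0
0  0  1  | 1  0
0  1  0  | 1  0
0  1  1  | 0  1
1  0  0  | 1  0
1  0  1  | 0  1
1  1  0  | 0  1
1  1  1  | 1  1
2-level logic or:
[Figure: gate-level full adder with inputs A, B, Ci and outputs S, Co, built from two XOR (=1) gates and three AND (&) gates]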
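The full adder truth table can be checked with a few lines of Python. The carry is written here in the two-level AND/OR (majority) form; that choice, and the names `full_adder` and `truth_table`, are assumptions for the sketch rather than a transcription of the gate diagram.

```python
def full_adder(a: int, b: int, ci: int):
    """One-bit full adder. Returns (s, co)."""
    s = a ^ b ^ ci                          # sum bit: odd number of ones
    co = (a & b) | (a & ci) | (b & ci)      # carry out: at least two ones
    return s, co


# Rows in the same order as the table: (A, B, Ci, S, Co)
truth_table = [
    (a, b, ci) + full_adder(a, b, ci)
    for a in (0, 1)
    for b in (0, 1)
    for ci in (0, 1)
]
```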
Datorteknik ArithmeticCircuits bild 14
The carry chain
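[Figure: 32 full adders in a chain. Stage i has inputs Ai, Bi and output Si, for i = 31, 30, 29, ..., 2, 1, 0. Cin enters stage 0, each carry out feeds the next stage, and Cout leaves stage 31.]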
Datorteknik ArithmeticCircuits bild 15
Addition
[Figure: the same chain of 32 full adders (inputs Ai, Bi, outputs Si, i = 31 ... 0), with the carry in to stage 0 tied to 0.]
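A bit-by-bit model of the carry chain makes the dependency visible: stage i cannot finish until the carry from stage i - 1 is known. This is a sketch under the assumption of unsigned operands; the name `ripple_carry_add` and its parameters are made up for the example.

```python
def ripple_carry_add(a: int, b: int, cin: int = 0, width: int = 32):
    """Add two unsigned `width`-bit numbers one full adder at a time. Returns (sum, cout)."""
    carry = cin
    total = 0
    for i in range(width):                  # stage 0 first, stage width-1 last
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        s = ai ^ bi ^ carry                 # sum bit of stage i
        carry = (ai & bi) | ((ai ^ bi) & carry)   # carry into stage i+1
        total |= s << i
    return total, carry
```

For plain addition the carry in is 0, as in the figure.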
Datorteknik ArithmeticCircuits bild 16
Subtraction
A - B ?
= A + Neg(B) (two's complement)
= A + Not(B) + 1 (one's complement + 1)
Datorteknik ArithmeticCircuits bild 17
Add and subtract
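[Figure: the 32-stage adder chain (inputs Ai, outputs Si, i = 31 ... 0) where each B input passes through an XOR (=1) gate. One control signal drives the second input of every XOR gate and the carry in of stage 0.]
Control signal: 0 -> add, 1 -> sub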
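The combined add/subtract circuit can be modelled as follows: the control bit inverts every B bit through XOR and also supplies the "+ 1" as carry in. The adder chain itself is replaced by Python's `+` to keep the sketch short; the name `add_sub` and the returned carry-out flag are assumptions for the example.

```python
def add_sub(a: int, b: int, sub: int, width: int = 32):
    """sub = 0 computes A + B, sub = 1 computes A - B. Returns (result, cout)."""
    mask = (1 << width) - 1
    control = mask if sub else 0
    b_in = (b ^ control) & mask             # XOR each B bit with the control signal
    total = (a & mask) + b_in + sub         # the control signal is also the carry in
    return total & mask, (total >> width) & 1
```

For example, with `width = 8`, `add_sub(3, 5, 1, 8)` gives the result `0xFE`, which is -2 in two's complement.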
Datorteknik ArithmeticCircuits bild 18
Timing analysis
There are six gate delays per stage*
* An XOR counts as two gate levels
There are 32 stages
The critical path is 6 * 32 gate delays! (Ripple adder)
We must break up that carry chain!
Datorteknik ArithmeticCircuits bild 19
Full adder again:
S = A xor B xor Ci
Co = (A and B) or ((A xor B) and Ci)
We define
P = A xor B
G = A and B
And we get
S = P xor Ci
Co = G or (P and Ci)
P needs one XOR (=1) gate on A and B, and G needs one AND (&) gate on A and B: computed quickly!
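A quick way to convince yourself that the P/G form is the same full adder is to compare it with ordinary addition for all eight input combinations. The function name `full_adder_pg` is hypothetical.

```python
def full_adder_pg(a: int, b: int, ci: int):
    """Full adder written with propagate (P) and generate (G). Returns (s, co)."""
    p = a ^ b               # propagate: a carry in is passed on
    g = a & b               # generate: a carry out is produced regardless of carry in
    s = p ^ ci
    co = g | (p & ci)
    return s, co


# True if the P/G form matches a + b + ci for every input combination.
all_rows_match = all(
    a + b + ci == full_adder_pg(a, b, ci)[0] + 2 * full_adder_pg(a, b, ci)[1]
    for a in (0, 1)
    for b in (0, 1)
    for ci in (0, 1)
)
```

P and G depend only on A and B, so all of them are available at the same time for every bit position, before any carry is known.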
Datorteknik ArithmeticCircuits bild 20
The full adder ....
Si = Pi xor Ci-1
Ci = Gi or (Pi and Ci-1)
If we could be given all of the Ci at the same time,
Si is just one more xor
Datorteknik ArithmeticCircuits bild 21
The full adder
C0 = G0 or (P0 and Cin)
C1 = G1 or (P1 and C0)
C1 = G1 or (P1 and (G0 or (P0 and Cin)))
C1 = G1 or P1G0 or P1P0Cin
In the k:th position:
Ck = Gk or PkGk-1 or .... or PkPk-1....P0Cin
This needs a wide OR and wide ANDs
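The expanded carry equations can be evaluated directly, each Ck from the P, G and Cin signals only, with no carry passed from stage to stage. The sketch below is an illustration with a hypothetical function name `lookahead_carries`; the inner loops stand in for the wide AND gates and `any` stands in for the wide OR gate.

```python
def lookahead_carries(a: int, b: int, cin: int = 0, width: int = 4):
    """Return [C0, C1, ..., C(width-1)], where Ck is the carry out of stage k."""
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]
    g = [((a >> i) & (b >> i)) & 1 for i in range(width)]
    carries = []
    for k in range(width):
        terms = []
        for j in range(k, -1, -1):
            term = g[j]                     # Gj generated at stage j ...
            for m in range(j + 1, k + 1):
                term &= p[m]                # ... and propagated by P(j+1) .. Pk
            terms.append(term)
        cin_term = cin
        for m in range(k + 1):
            cin_term &= p[m]                # Pk ... P0 Cin
        terms.append(cin_term)
        carries.append(int(any(terms)))     # wide OR over all product terms
    return carries
```

The number of product terms and their width both grow with k, which is the cost hinted at by the wide gates.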
Datorteknik ArithmeticCircuits bild 22
The carry lookahead adder
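[Figure: block diagram. A and B (32 bits each) feed the P / G generator (two-level logic), which outputs P and G (32 bits each). P, G and Cin feed the carry generator (two-level logic), which outputs C (32 bits). The final add (XOR) combines P and C into S (32 bits).]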
Datorteknik ArithmeticCircuits bild 23
At the worst...
An N-input AND (OR) built from 2-input gates has a delay of
log2(N) * (2-input gate delay)
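The log2(N) figure comes from arranging the 2-input gates as a balanced tree. A small sketch that counts the levels, with the hypothetical name `tree_depth`:

```python
def tree_depth(n_inputs: int) -> int:
    """Levels of 2-input gates needed to combine n_inputs signals in a balanced tree."""
    levels = 0
    signals = n_inputs
    while signals > 1:
        signals = (signals + 1) // 2    # each level pairs up the remaining signals
        levels += 1
    return levels
```

For a 32-input gate this gives 5 levels, and 33 inputs (32 propagate signals plus Cin) need 6.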
Datorteknik ArithmeticCircuits bild 24
The combination of carry lookahead and ripple carry
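[Figure: a 16-bit adder built from four 4-bit blocks (B) with carry units (C). Group signals: P0,3 G0,3; P4,7 G4,7; P8,11 G8,11; P12,15 G12,15. Carries between the blocks: C0, C4, C8, C12.]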
Datorteknik ArithmeticCircuits bild 25
The carry skip adder
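[Figure: four 4-bit full adder blocks with carry inputs C0, C4, C8, C12. Each block outputs group signals (P0,3 G0,3; P4,7 G4,7; P8,11 G8,11; P12,15 G12,15). Between the blocks, an AND (&) gate and an OR (≥1) gate form the next carry from the group P and G signals and the previous carry.]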
- If the full adder in step n generates a carry, it will be correct independent of carry in.
- A carry generated in step n is propagated through the and / or gates, not through the adders
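The two points above can be modelled in Python. Inside each block the sum bits still ripple, but the carry handed to the next block is formed from the group signals as G or (P and Cin). The block size of 4, the requirement that `width` is a multiple of `block`, and the name `carry_skip_add` are assumptions for this sketch.

```python
def carry_skip_add(a: int, b: int, cin: int = 0, width: int = 16, block: int = 4):
    """Carry skip adder model. Assumes width is a multiple of block. Returns (sum, cout)."""
    carry = cin
    total = 0
    for start in range(0, width, block):
        block_cin = carry
        ripple = block_cin      # carry rippling through the block, used for sum bits
        group_g = 0             # carry the block generates on its own (carry in = 0)
        group_p = 1             # 1 only if every bit position propagates
        for i in range(start, start + block):
            ai = (a >> i) & 1
            bi = (b >> i) & 1
            p = ai ^ bi
            g = ai & bi
            total |= (p ^ ripple) << i
            ripple = g | (p & ripple)
            group_g = g | (p & group_g)
            group_p &= p
        # The AND / OR gates between blocks: the carry in skips the block's adders.
        carry = group_g | (group_p & block_cin)
    return total, carry
```

The result is identical to a ripple adder; only the path the carry takes between blocks differs, and that is what shortens the critical path in hardware.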
Datorteknik ArithmeticCircuits bild 26
The carry select adder
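[Figure: the first full adder block takes inputs A, B and carry in C0. Each later block is duplicated: one copy computes with carry in 0 and the other with carry in 1, both from the same A and B inputs. AND (&) and OR (≥1) gates, steered by the carry from the previous block, select which copy's sum S and carry out are used.]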
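A carry select adder can be sketched as below: each block is added twice, once assuming carry in 0 and once assuming carry in 1, and the real carry only picks between the two finished results. For simplicity the sketch duplicates every block, including the first one that the figure feeds directly with C0. The helper `ripple_block`, the block size and the name `carry_select_add` are assumptions for the example.

```python
def ripple_block(a: int, b: int, cin: int, width: int):
    """Small ripple adder used inside each block. Returns (sum, cout)."""
    carry = cin
    total = 0
    for i in range(width):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        total |= (ai ^ bi ^ carry) << i
        carry = (ai & bi) | ((ai ^ bi) & carry)
    return total, carry


def carry_select_add(a: int, b: int, cin: int = 0, width: int = 16, block: int = 4):
    """Carry select adder model. Assumes width is a multiple of block. Returns (sum, cout)."""
    mask = (1 << block) - 1
    carry = cin
    total = 0
    for start in range(0, width, block):
        a_blk = (a >> start) & mask
        b_blk = (b >> start) & mask
        sum0, cout0 = ripple_block(a_blk, b_blk, 0, block)   # copy with carry in 0
        sum1, cout1 = ripple_block(a_blk, b_blk, 1, block)   # copy with carry in 1
        # The incoming carry acts as the select signal of a multiplexer.
        if carry:
            s, carry = sum1, cout1
        else:
            s, carry = sum0, cout0
        total |= s << start
    return total, carry
```

In hardware both copies of every block work in parallel, so the only serial part is the chain of selections, at the cost of roughly twice the adder hardware.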
Datorteknik ArithmeticCircuits bild 27
Asymptotic time and space requirements
                  Time         Space
Ripple carry      O(n)         O(n)
Carry lookahead   O(log n)     O(n log n)
Carry skip        O(sqrt n)    O(n)
Carry select      O(sqrt n)    O(n)