computer architecture chapter 3 instructions: arithmetic for computer

Computer Architecture

Chapter 3

Instructions: Arithmetic for Computer

Yu-Lun Kuo 郭育倫

Department of Computer Science and Information Engineering

Tunghai University, Taichung, Taiwan [email protected]

http://www.csie.ntu.edu.tw/~d95037/

http://www.csie.ntu.edu.tw/~d95037/Course/ca2009/ca.htm

Review: MIPS Organization

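[Figure: MIPS organization. Processor: a register file of 32 registers ($zero - $ra), each 32 bits wide, with 5-bit src1, src2, and dst addresses, 32-bit src1 data and src2 data outputs, and a 32-bit write data input; a 32-bit PC; a 32-bit ALU; and the Fetch (PC = PC+4), Decode, and Exec steps, with 32-bit adders that add 4 and the branch offset to the PC. Memory: 2^30 words of 32 bits each, with read/write addr, read data, and write data ports; word addresses (binary) 0…0000, 0…0100, 0…1000, 0…1100, …, 1…1100; byte addresses 0 to 7 shown in big-endian order.]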

Memory Address Binding

• How memory addresses are mapped between the high-level language and machine language
– Compile-time binding
– Load-time binding
» Relocation
– Execution-time binding
» Dynamic linking & dynamic loading

Review: MIPS Addressing Modes

1. Register addressing (R-type), fields op rs rt rd funct: the operand is a word in a register
2. Base/displacement addressing (I-type), fields op rs rt offset: the operand is a word or byte in memory, at the base register plus the offset
3. Immediate addressing (I-type, e.g., addi), fields op rs rt operand: the operand is held in the instruction itself
4. PC-relative addressing (branch), fields op rs rt offset: the branch destination instruction is in memory, at the Program Counter (PC) plus the offset
5. Pseudo-direct addressing (J-type), fields op jump address: the jump destination instruction is in memory, at the jump address concatenated (||) with the upper bits of the Program Counter (PC)

Overview

• Data type includes representation and operations

• Let’s look at some arithmetic operations:
– Addition
– Subtraction
– Sign extension
• We also look at overflow conditions for addition
• Multiplication, division, etc.
• Logical operations are also useful:
– AND
– OR
– NOT

Signed and Unsigned Numbers

• Humans use base 10; computers use base 2
• Bits are just bits (no inherent meaning)
– Conventions define the relationship between bits and numbers
• Binary numbers (base 2)
0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 ...
decimal: 0 ... $2^n - 1$
• Of course it gets more complicated:
– Numbers are finite (overflow)
– Negative numbers
» e.g., there is no MIPS subi instruction; addi can add a negative number


• Computer programs calculate with both positive and negative numbers
– We need a representation that distinguishes the positive from the negative
– Obvious solution
» Add a separate sign
» The sign can conveniently be represented in a single bit
» This representation is called sign and magnitude


• Sign and magnitude representation has several shortcomings

– Not obvious where to put the sign bit

» Right or left?

– Adders may need an extra step to set the sign

– Has both a positive and negative zero

» This can lead to problems for inattentive programmers


• How do we represent negative numbers?
– i.e., which bit patterns will represent which numbers?
» Do not use sign and magnitude
» Use two’s complement representation
• Leading 0s mean positive
• Leading 1s mean negative
• Two’s complement advantage
– All negative numbers have a 1 in the most significant bit
» Hardware needs to test only this bit to tell whether a number is positive or negative


• 32-bit signed numbers (2’s complement):
0000 0000 0000 0000 0000 0000 0000 0000 two = 0 ten
0000 0000 0000 0000 0000 0000 0000 0001 two = + 1 ten
...
0111 1111 1111 1111 1111 1111 1111 1110 two = + 2,147,483,646 ten
0111 1111 1111 1111 1111 1111 1111 1111 two = + 2,147,483,647 ten (maxint)
1000 0000 0000 0000 0000 0000 0000 0000 two = – 2,147,483,648 ten (minint)
1000 0000 0000 0000 0000 0000 0000 0001 two = – 2,147,483,647 ten
...
1111 1111 1111 1111 1111 1111 1111 1110 two = – 2 ten
1111 1111 1111 1111 1111 1111 1111 1111 two = – 1 ten
• The leftmost bit is the MSB; the rightmost bit is the LSB
• Negating 2: complement all the bits, then add 1
(2)10 = (0000 0000 0000 0000 0000 0000 0000 0010)2
complemented: (1111 1111 1111 1111 1111 1111 1111 1101)2
(-2)10 = (1111 1111 1111 1111 1111 1111 1111 1110)2


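• Value of a 32-bit two’s complement number with bits $x_{31} \dots x_0$ (covers both positive and negative numbers):
$(x_{31} \times -2^{31}) + (x_{30} \times 2^{30}) + (x_{29} \times 2^{29}) + \dots + (x_1 \times 2^1) + (x_0 \times 2^0)$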

• Converting values narrower than 32 bits into 32-bit values
– Copy the most significant bit (the sign bit) into the “empty” bits
0010 -> 0000 0010
1010 -> 1111 1010 (sign extend)
• Sign extend versus zero extend (lb vs. lbu)
• slt vs. slti (set on less than immediate)
• sltu (set on less than unsigned) vs. sltiu

Sign Extension

• To add two numbers, we must represent them with the same number of bits

• If we just pad with zeroes on the left:
4-bit        8-bit
0100 (4)     00000100 (still 4)
1100 (-4)    00001100 (12, not -4)
• Instead, replicate the MS bit -- the sign bit:
4-bit        8-bit
0100 (4)     00000100 (still 4)
1100 (-4)    11111100 (still -4)
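The two tables above can be reproduced with a few lines of Python. This is only a sketch for checking the bit patterns by hand; the helper names `sign_extend` and `zero_extend` are made up for illustration and are not MIPS instructions.

```python
def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Widen a two's complement bit pattern by replicating its sign bit."""
    value &= (1 << from_bits) - 1
    sign_bit = (value >> (from_bits - 1)) & 1
    if sign_bit:
        # Fill every newly added upper bit with a copy of the sign bit (1).
        upper_ones = ((1 << (to_bits - from_bits)) - 1) << from_bits
        return value | upper_ones
    return value  # sign bit is 0, so the new upper bits stay 0


def zero_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Widen a bit pattern by padding with zeroes on the left."""
    return value & ((1 << from_bits) - 1)


print(format(zero_extend(0b1100, 4, 8), "08b"))  # 00001100 (12, not -4)
print(format(sign_extend(0b1100, 4, 8), "08b"))  # 11111100 (still -4)
print(format(sign_extend(0b0100, 4, 8), "08b"))  # 00000100 (still 4)
```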

Signed versus Unsigned Comparison

• Suppose register $s0 has binary number

1111 1111 1111 1111 1111 1111 1111 1111

• Register $s1 has the binary number

0000 0000 0000 0000 0000 0000 0000 0001

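• What are the values of registers $t0 and $t1 after these two instructions?
slt $t0, $s0, $s1 #signed
sltu $t1, $s0, $s1 #unsigned
• Ans: $t0 = 1 and $t1 = 0
– Signed: $s0 is -1, which is less than 1
– Unsigned: $s0 is 4,294,967,295, which is not less than 1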
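The same comparison can be checked in Python. The functions below are illustrative stand-ins for slt and sltu (not a MIPS simulator): they take the two 32-bit register patterns and compare them under the signed or unsigned interpretation.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(bits: int) -> int:
    """Interpret a 32-bit pattern as a two's complement number."""
    bits &= MASK32
    return bits - (1 << 32) if bits & 0x80000000 else bits


def slt(rs: int, rt: int) -> int:
    """Set on less than, signed interpretation."""
    return 1 if to_signed32(rs) < to_signed32(rt) else 0


def sltu(rs: int, rt: int) -> int:
    """Set on less than, unsigned interpretation."""
    return 1 if (rs & MASK32) < (rt & MASK32) else 0


s0 = 0xFFFFFFFF  # 1111 ... 1111
s1 = 0x00000001  # 0000 ... 0001
t0 = slt(s0, s1)   # -1 < 1, so 1
t1 = sltu(s0, s1)  # 4,294,967,295 < 1 is false, so 0
print(t0, t1)      # 1 0
```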

MIPS Arithmetic Logic Unit (ALU)

• Must support the arithmetic/logic operations of the ISA
add, addi, addiu, addu
sub, subu
mult, multu, div, divu
sqrt
and, andi, nor, or, ori, xor, xori
beq, bne, slt, slti, sltiu, sltu
[Figure: ALU with 32-bit inputs A and B, a 4-bit operation select m, a 32-bit result, and 1-bit zero and ovf outputs]
• With special handling for
– sign extend: addi, addiu, slti, sltiu
– zero extend: andi, ori, xori, lbu
– no overflow detected: addu, addiu, subu, multu, divu, sltiu, sltu

Review: 2’s Complement Representation

2’sc binary   decimal
1000          -8     = $-2^3$
1001          -7     = $-(2^3 - 1)$
1010          -6
1011          -5
1100          -4
1101          -3
1110          -2
1111          -1
0000           0
0001           1
0010           2
0011           3
0100           4
0101           5
0110           6
0111           7     = $2^3 - 1$
• Negate: complement all the bits and add a 1
– e.g., 0101 (5): complement all the bits to get 1010, then add a 1 to get 1011 (-5)
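As a quick check of the negate rule on the 4-bit table, here is a small Python sketch. The function name `negate` and the default width of 4 are choices made for this illustration only.

```python
def negate(bits: int, width: int = 4) -> int:
    """Two's complement negation: complement all the bits, then add 1."""
    mask = (1 << width) - 1
    complemented = ~bits & mask        # flip every bit within the word
    return (complemented + 1) & mask   # add 1, discarding any carry out


print(format(negate(0b0101), "04b"))  # 1011, i.e. 5 -> -5
print(format(negate(0b1011), "04b"))  # 0101, i.e. -5 -> 5
print(format(negate(0b1000), "04b"))  # 1000: -8 has no 4-bit positive counterpart
```

The last line shows the asymmetry visible in the table: the range runs from -8 to 7, so negating -8 returns the same bit pattern.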

A 32-bit Ripple Carry Adder/Subtractor

• Remember 2’s complement is just
– complement all the bits
– add a 1 in the least significant bit
• Example: A - B with A = 0111 and B = 0110
0111 - 0110 = 0111 + 1001 + 1 = 1 0001 (4-bit result 0001, carry out 1)
[Figure: 32 1-bit full adders (FA) connected in a chain. Adder i takes Ai, Bi, and carry ci and produces Si and ci+1, with c0 = carry_in and c32 = carry_out. An add/sub control line (0 = add, 1 = sub) selects B0 if control = 0 and !B0 if control = 1.]
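A bit-level sketch of the ripple carry adder/subtractor in Python follows. It assumes, as the "add a 1 in the least significant bit" rule suggests, that the add/sub control line also drives c0 and that every B bit (not only B0) is inverted when control = 1. The names `full_adder` and `ripple_add_sub` are illustrative.

```python
def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """1-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return s, carry_out


def ripple_add_sub(a: int, b: int, control: int, width: int = 32) -> tuple[int, int]:
    """Add (control=0) or subtract (control=1) by chaining 1-bit full adders."""
    carry = control  # assumption: c0 = control supplies the "+1" for subtraction
    result = 0
    for i in range(width):
        a_i = (a >> i) & 1
        b_i = ((b >> i) & 1) ^ control  # Bi if control = 0, !Bi if control = 1
        s, carry = full_adder(a_i, b_i, carry)
        result |= s << i
    return result, carry  # carry is c_width, the final carry out


result, carry_out = ripple_add_sub(0b0111, 0b0110, control=1, width=4)
print(format(result, "04b"), carry_out)  # 0001 1
```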

Addition & Subtraction

• Just like in grade school (carry/borrow 1s)
  0111      0111      0110
+ 0110    - 0110    - 0101
• Two's complement operations are easy
– subtraction using addition of negative numbers
  0111
+ 1010
• Overflow (result too large for finite computer word):
– e.g., adding two n-bit numbers does not yield an n-bit number
  0111
+ 0001
  1000
– note that the term overflow is somewhat misleading: it does not mean a carry “overflowed”

Overflow

• If the operands are too big, then the sum cannot be represented as an n-bit 2’s complement number
  01000 (8)       11000 (-8)
+ 01001 (9)     + 10111 (-9)
  10001 (-15)     01111 (+15)

Overflow Detection

• Overflow
– the result is too large to represent in 32 bits
• Overflow occurs when
– adding two positives yields a negative
– adding two negatives gives a positive
– subtracting a negative from a positive gives a negative
– subtracting a positive from a negative gives a positive
• Examples (4-bit)
  0111 (7)        1100 (–4)
+ 0011 (3)      + 1011 (–5)
  1010 (–6)       0111 (7)
– In the first sum the carry into the sign bit is 1 and the carry out is 0; in the second the carry into the sign bit is 0 and the carry out is 1
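The sign rule for addition can be sketched in Python as follows. This is one common way to express the test (both operands have the same sign and the result's sign differs); the function name `add_with_overflow` and the 4-bit default width are assumptions for the example, not part of the MIPS ALU design.

```python
def add_with_overflow(a: int, b: int, width: int = 4) -> tuple[int, bool]:
    """Add two two's complement bit patterns and report signed overflow."""
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    total = (a + b) & mask
    # Overflow iff the result's sign differs from the sign of BOTH operands,
    # which can only happen when the operands have the same sign.
    overflow = ((a ^ total) & (b ^ total) & sign) != 0
    return total, overflow


print(add_with_overflow(0b0111, 0b0011))  # (10, True):  7 + 3 gives 1010
print(add_with_overflow(0b1100, 0b1011))  # (7, True):  -4 + -5 gives 0111
print(add_with_overflow(0b0111, 0b1010))  # (1, False):  7 + -6 gives 0001
```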

Tailoring the ALU to the MIPS ISA

• Need to support the logic operations (and, nor, or, xor)
– Bitwise operations (no carry operation involved)
– Need a logic gate for each function and a mux to choose the output
• Need to support the set-on-less-than instruction (slt)
– Use subtraction to determine if (a – b) < 0 (implies a < b)
– Copy the sign bit into the low-order bit of the result, set remaining result bits to 0
• Need to support test for equality (bne, beq)
– Again use subtraction: (a - b) = 0 implies a = b
– Additional logic to “nor” all result bits together
• Immediates are sign extended outside the ALU with wiring (i.e., no logic needed)

Logic Operation (*)

• Shift operations
– Shifts move all the bits in a word left or right
sll $t2, $s0, 8 #$t2 = $s0 << 8 bits
srl $t2, $s0, 8 #$t2 = $s0 >> 8 bits
– R-format fields: op rs rt rd shamt funct
• The shift operation is implemented by hardware separate from the ALU

Logic Operation (*)

• sll (R-format)
• Ex. If register $s0 is
0000 0000 0000 0000 0000 0000 0000 1101
and we execute sll $s2, $s0, 8
what is the value of $s2?
0000 0000 0000 0000 0000 1101 0000 0000
• Encoding of sll $s2, $s0, 8
op   rs           rt         rd         shamt   funct
0    0 (unused)   16 ($s0)   18 ($s2)   8       0
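The shift example can be verified in Python. The helpers `sll` and `srl` below are illustrative models of the two instructions on 32-bit values, not a real assembler or simulator API.

```python
MASK32 = 0xFFFFFFFF


def sll(rt: int, shamt: int) -> int:
    """Shift left logical: bits shifted past bit 31 are discarded."""
    return (rt << shamt) & MASK32


def srl(rt: int, shamt: int) -> int:
    """Shift right logical: zeroes are shifted in from the left."""
    return (rt & MASK32) >> shamt


s0 = 0b1101
s2 = sll(s0, 8)
print(format(s2, "032b"))          # 00000000000000000000110100000000
print(format(srl(s2, 8), "032b"))  # 00000000000000000000000000001101
```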

Arithmetic Logic Unit (ALU)

• Built from four kinds of hardware components

Logical Operations

• Operations on logical TRUE or FALSE
– two states -- takes one bit to represent: TRUE=1, FALSE=0

A B | A AND B
0 0 | 0
0 1 | 0
1 0 | 0
1 1 | 1

A B | A OR B
0 0 | 0
0 1 | 1
1 0 | 1
1 1 | 1

A | NOT A
0 | 1
1 | 0

Logic Operation (*)

• And (and) instruction

• Or (or) instruction

• Ex. If register $t2 is

0000 0000 0000 0000 0000 1101 0000 0000

If register $t1 is

0000 0000 0000 0000 0011 1100 0000 0000

– Then execute and $t0, $t1, $t2

The value of $t0 is

0000 0000 0000 0000 0000 1100 0000 0000
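The and example can be checked with Python's bitwise operators, which work bit by bit in the same way. The variable names simply mirror the registers in the example; the or result is added here for comparison and is not part of the slide.

```python
t2 = 0b0000_0000_0000_0000_0000_1101_0000_0000
t1 = 0b0000_0000_0000_0000_0011_1100_0000_0000

t0_and = t1 & t2  # and $t0, $t1, $t2: 1 only where both inputs have a 1
t0_or = t1 | t2   # or  $t0, $t1, $t2: 1 where either input has a 1

print(format(t0_and, "032b"))  # 00000000000000000000110000000000
print(format(t0_or, "032b"))   # 00000000000000000011110100000000
```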


Examples of Logical Operations

• AND
– useful for clearing bits
» AND with zero = 0
» AND with one = no change
    11000101
AND 00001111
    00000101
• OR
– useful for setting bits
» OR with zero = no change
» OR with one = 1
    11000101
OR  00001111
    11001111
• NOT
– unary operation -- one argument
– flips every bit
NOT 11000101
    00111010

Adder (1-bit)

• 1-bit full adder

[Figure: 1-bit full adder block with inputs a, b, and CarryIn, and outputs Sum and CarryOut]

Input              Output
a  b  CarryIn      Sum  CarryOut
0  0  0            0    0
0  0  1            1    0
0  1  0            1    0
0  1  1            0    1
1  0  0            1    0
1  0  1            0    1
1  1  0            0    1
1  1  1            1    1
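The truth table can be regenerated from gate-level equations. The sketch below uses one standard formulation (Sum as a three-way XOR, CarryOut as a majority function); the slide does not state which gate network it has in mind, so treat the equations as an assumption.

```python
from itertools import product


def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Return (Sum, CarryOut) for one bit position."""
    total = a ^ b ^ carry_in                               # odd number of 1s
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)  # at least two 1s
    return total, carry_out


# Enumerate inputs in the same order as the table: a, b, CarryIn.
truth_table = [
    (a, b, cin, *full_adder(a, b, cin)) for a, b, cin in product((0, 1), repeat=3)
]
for row in truth_table:
    print(*row)
```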

Adder (4-bit ripple carry adder)

• Each full adder inputs a Cin, which is the Cout of the previous adder


Adder (32 bit ALU)

• 32-bit adder requires 31 carry computations

[Figure: 1-bit ALU cell with inputs a, b, CarryIn, and Operation, and outputs Result and CarryOut]

• ALU notation

[Figure: ALU symbol with inputs a and b and an ALU operation control, and outputs Zero, Result, Overflow, and Carry out]

3.4 Multiplication (1/4)

• Binary multiplication is just a bunch of right shifts and adds

[Figure: an n-bit multiplicand and an n-bit multiplier form a partial product array, which is summed into a 2n-bit double precision product]
• The partial products can be formed in parallel and added in parallel for faster multiplication

Multiplication (2/4)

• More complicated than addition
– accomplished via shifting and addition
• More time and more area
• Ex. Unsigned multiplication: $(1000)_2 \times (1011)_2$
– multiplicand: $(1000)_2$
– multiplier: $(1011)_2$


• The length of the multiplication
– n-bit multiplicand
– m-bit multiplier
– Product is n + m bits long
» The n + m bits are required to represent all possible products
• We must cope with overflow
– Because we frequently want a 32-bit product as the result of multiplying two 32-bit numbers


• The design mimics the algorithm we learned in grammar school
• Assume
– Multiplier: in the 32-bit Multiplier register
– 64-bit Product register is initialized to 0
» New values are written into the Product register
– 64-bit Multiplicand register
» Need to move the multiplicand left one digit each step
» Over 32 steps a 32-bit multiplicand would move 32 bits to the left

The First Multiplication Algorithm

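• Product register is initialized to 0
• If each step takes a clock cycle
– The algorithm requires almost 100 clock cycles (3 steps repeated 32 times)
• Flowchart
1. Test Multiplier0
1a. If Multiplier0 = 1 (T): add the multiplicand to the product and place the result in the Product register
2. Shift the Multiplicand register left 1 bit
3. Shift the Multiplier register right 1 bit
– 32nd repetition? If not (F), go back to step 1; if so (T), done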

First Multiply Algorithm (ex. p.180)

• Using 4-bit numbers to save space
– Multiply 2ten X 3ten (0010two X 0011two)
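The 4-bit example can be traced with a short Python model of the first multiplication algorithm. The function name `multiply_v1` and the trace format are invented for this sketch; the register widths follow the description (Multiplicand and Product registers twice as wide as the Multiplier).

```python
def multiply_v1(multiplicand: int, multiplier: int, width: int = 4) -> int:
    """First algorithm: shift the multiplicand left, the multiplier right."""
    wide_mask = (1 << (2 * width)) - 1
    product = 0            # 2*width-bit Product register, initialized to 0
    mcand = multiplicand   # held in a 2*width-bit Multiplicand register
    for step in range(1, width + 1):
        if multiplier & 1:                        # 1. test Multiplier0
            product = (product + mcand) & wide_mask  # 1a. add to the product
        mcand = (mcand << 1) & wide_mask          # 2. shift Multiplicand left
        multiplier >>= 1                          # 3. shift Multiplier right
        print(f"step {step}: multiplier={multiplier:04b} "
              f"multiplicand={mcand:08b} product={product:08b}")
    return product


result = multiply_v1(0b0010, 0b0011)
print(result)  # 6
```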

The Second Multiplication Algorithm

• Multiplicand register, ALU, and Multiplier register are all 32 bits wide
• Only the Product register is 64 bits (initially 0)
– The multiplier is placed instead in the right half of the Product register

The Second Multiplication Algorithm

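• Flowchart
1. Test Multiplier0
1a. If Multiplier0 = 1 (T): add the multiplicand to the left half of the product and place the result in the left half of the Product register
2. Shift the Product register right 1 bit
3. Shift the Multiplier register right 1 bit
– 32nd repetition? If not (F), go back to step 1; if so (T), done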
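A Python model of the second algorithm is sketched below. It follows the flowchart, so it keeps a separate Multiplier register rather than storing the multiplier in the right half of the Product register; the name `multiply_v2` is hypothetical. Python integers are unbounded, so the carry out of the left-half addition is kept automatically and shifted back in, which a hardware design would have to handle explicitly.

```python
def multiply_v2(multiplicand: int, multiplier: int, width: int = 4) -> int:
    """Second algorithm: add into the left half of the product, shift it right."""
    product = 0  # 2*width-bit Product register, initialized to 0
    for _ in range(width):
        if multiplier & 1:                     # 1. test Multiplier0
            product += multiplicand << width   # 1a. add to the left half
        product >>= 1                          # 2. shift Product right 1 bit
        multiplier >>= 1                       # 3. shift Multiplier right 1 bit
    return product


print(multiply_v2(0b0010, 0b0011))  # 6
print(multiply_v2(0b1000, 0b1011))  # 88
```

Only the multiplicand-wide left half of the product takes part in each addition, which is why the Multiplicand register and ALU can shrink to half the width used by the first algorithm.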

Multiply in MIPS

• MIPS has two instructions
– Multiply: mult
– Multiply unsigned: multu
• MIPS multiply instructions ignore overflow
– It is up to the software to check whether the product is too big to fit in 32 bits

3.5 Division

• Division is just a bunch of quotient digit guesses and left shifts and subtracts

[Figure: long division layout. An n-bit divisor is applied to the dividend through a partial remainder array, producing an n-bit quotient and an n-bit remainder]

Division

• MIPS has two instructions
– Divide: div
– Divide unsigned: divu
• As with multiply, divide ignores overflow
– Software must determine if the quotient is too large
– Software must also check the divisor to avoid division by 0

3.6 Floating Point (p.189)

• We need a way to represent

– numbers with fractions, e.g., 3.14159265 (π)

– very small numbers, e.g., .000000001

– very large numbers, e.g., $3.15576 \times 10^9$

Floating Point

• Representation
– sign, exponent, significand: $(-1)^{\text{sign}} \times \text{significand} \times 2^{\text{exponent}}$
– more bits for significand gives more accuracy
– more bits for exponent increases range
• IEEE 754 floating point standard:
– single precision: 8 bit exponent, 23 bit significand
– double precision: 11 bit exponent, 52 bit significand

Representing Big (and Small) Numbers

• What if we want to encode the approx. age of the earth?
4,600,000,000 or $4.6 \times 10^9$
or the weight in kg of one a.m.u. (atomic mass unit)
0.0000000000000000000000000166 or $1.6 \times 10^{-27}$
• Floating point representation: $(-1)^{\text{sign}} \times F \times 2^{E}$
• Still have to fit everything in 32 bits (single precision)
s       E (exponent)   F (fraction)
1 bit   8 bits         23 bits

Floating Point Form

• Generally of the form
$(-1)^S \times F \times 2^E$
• Single precision (one 32-bit word)
S       Exponent   Significand
1-bit   8-bit      23-bit
• Double precision (two 32-bit words)
S       Exponent   Significand
1-bit   11-bit     20-bit
Significand (continued)
32-bit

Floating Point Form

• Generally of the form
$(-1)^S \times F \times 2^E$
• Single precision
– S: the sign of the floating-point number
– Exponent: the value of the 8-bit exponent field
– Fraction (significand): the 23-bit number
» Sign and magnitude representation
• The sign has a separate bit from the rest of the number
S       Exponent   Fraction
1-bit   8-bit      23-bit

Overflow & Underflow

• Overflow
– The positive exponent is too large to be represented in the exponent field
• Underflow
– The negative exponent is too large (in magnitude) to be represented in the exponent field

IEEE 754 floating-point standard (1/2)

• These formats go beyond MIPS
– They are part of the IEEE 754 floating-point standard
• To pack even more bits into the significand
– IEEE 754 makes the leading 1 bit of normalized binary numbers implicit
» The significand is 24 bits long in single precision
• Implied 1 and a 23-bit fraction
» The significand is 53 bits long in double precision
• Implied 1 and a 52-bit fraction

IEEE 754 floating-point standard (2/2)

• 0 has no leading 1
– It is given the reserved exponent value 0
– 000…00two represents 0
• The representation of the rest of the numbers
$(-1)^S \times (1 + \text{Fraction}) \times 2^E$
– F is stored in normalized form, where the msb of the significand is 1 (so there is no need to store it!): this is called the hidden bit
– E specifies the value in the exponent field
• If we number the bits of the fraction from left to right s1, s2, s3, …
$(-1)^S \times (1 + (s_1 \times 2^{-1}) + (s_2 \times 2^{-2}) + (s_3 \times 2^{-3}) + \dots) \times 2^E$


IEEE 754

• The designers of IEEE 754 wanted a floating-point representation that
– Has the sign in the most significant bit
» Quick test of less than, greater than, or equal to 0
– Could be easily processed by integer comparisons, especially for sorting
• Negative exponents pose a challenge to simplified sorting
– If we use two’s complement, negative exponents have a 1 in the MSB
– A negative exponent will look like a big number

IEEE 754 Biased Notation (1/2) (p.194)

• Biased notation
– The most negative exponent is represented as 00…00two
– The most positive exponent is represented as 11…11two
• Bias of 127 for single precision and 1023 for double precision
Exponent field = E + bias

E       Exponent field (bias = 127)
128     255
127     254
…       …
0       127
-1      126
…       …
-127    0

IEEE 754 Biased Notation (2/2) (p.194)

• IEEE 754 floating point standard

$(-1)^{\text{sign}} \times (1 + F) \times 2^{E - \text{bias}}$
– Formats for both single and double precision
» The bias is 127 for single and 1023 for double precision
– F is stored in normalized form, where the msb of the significand is 1 (so there is no need to store it!): this is called the hidden bit

Floating Point Representation (ex.195)

• Show the IEEE 754 binary representation of -0.75 in single precision
$-0.75_{ten} = -3/4_{ten} = -3/2^2_{ten} = -(1/2 + 1/4) = -0.11_{two}$
$-0.11_{two} \times 2^0 = -1.1_{two} \times 2^{-1}$
» Sign is 1 – number is negative
» Exponent field is 01111110 = 126 (decimal)
» Fraction is 0.100000000000… = 0.5 (decimal)
• Value = $-1.5 \times 2^{126-127} = -1.5 \times 2^{-1} = -0.75$
• Bit pattern (1-bit sign, 8-bit exponent, 23-bit fraction): 10111111010000000000000000000000
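The encoding of -0.75 can be cross-checked with Python's standard `struct` module, which packs a value as an IEEE 754 single precision number. The helper name `float_to_bits` is made up for this sketch.

```python
import struct


def float_to_bits(x: float) -> int:
    """Return the 32-bit IEEE 754 single precision pattern of x."""
    # Pack as a big-endian float, then reinterpret the same 4 bytes as an integer.
    return struct.unpack(">I", struct.pack(">f", x))[0]


bits = float_to_bits(-0.75)
sign = bits >> 31                # 1 bit
exponent = (bits >> 23) & 0xFF   # 8 bits, biased by 127
fraction = bits & 0x7FFFFF       # 23 bits, hidden leading 1 not stored

print(format(bits, "032b"))      # 10111111010000000000000000000000
print(sign, exponent, format(fraction, "023b"))
```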

Converting Binary to Decimal FP

• 1 10000001 01000000000000000000000
– The sign bit: 1
– The exponent field: 129
– The fraction field: $1 \times 2^{-2}$ (= 0.25)
$(-1)^{\text{sign}} \times (1 + F) \times 2^{E - \text{bias}}$
$= (-1)^1 \times (1 + 0.25) \times 2^{129 - 127}$
$= -1 \times 1.25 \times 2^2$
$= -1.25 \times 4$
$= -5.0$
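The conversion can be written as a small Python function that applies the formula field by field. This sketch handles only normalized numbers (it ignores the reserved exponent values used for zero, infinities, and NaN), and `decode_single` is a name chosen for illustration.

```python
def decode_single(bits: int) -> float:
    """Decode a normalized IEEE 754 single precision bit pattern."""
    sign = (bits >> 31) & 1
    exponent = (bits >> 23) & 0xFF            # biased exponent field
    fraction = (bits & 0x7FFFFF) / (1 << 23)  # 23 fraction bits as a value in [0, 1)
    return (-1) ** sign * (1 + fraction) * 2.0 ** (exponent - 127)


# Sign 1, exponent 10000001, fraction 01 followed by 21 zeroes.
pattern = "1" + "10000001" + "01" + "0" * 21
value = decode_single(int(pattern, 2))
print(value)  # -5.0
```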

Floating-Point Addition (p.199)

Addition (and subtraction) algorithm on p.200

$(F_1 \times 2^{E_1}) + (F_2 \times 2^{E_2}) = F_3 \times 2^{E_3}$

Step 0: Restore the hidden bit in F1 and in F2
Step 1: Align fractions by right shifting F2 by E1 - E2 positions (assuming E1 ≥ E2), keeping track of (three of) the bits shifted out in a round bit, a guard bit, and a sticky bit
Step 2: Add the resulting F2 to F1 to form F3
Step 3: Normalize F3 (so it is in the form 1.XXXXX …)
- If F1 and F2 have the same sign, F3 ∈ [1,4), so at most a 1-bit right shift of F3 and an increment of E3 are needed
- If F1 and F2 have different signs, F3 may require many left shifts, each time decrementing E3
Step 4: Round the sum F3 and possibly normalize F3 again
Step 5: Rehide the most significant bit of F3 before storing the result

Floating-Point Addition (ex.)

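• Adding 0.5ten and -0.4375ten in binary
$0.5 = (0.1)_2 = (1.000)_2 \times 2^{-1}$
$-0.4375 = -(0.0111)_2 = -(1.110)_2 \times 2^{-2}$
• Step 1. Shift the number with the smaller exponent so that its exponent matches the larger one
$-(1.110)_2 \times 2^{-2} = -(0.111)_2 \times 2^{-1}$
• Step 2. Add the significands
$(1.000)_2 \times 2^{-1} - (0.111)_2 \times 2^{-1} = (0.001)_2 \times 2^{-1}$
• Step 3. Normalize the sum
$(0.001)_2 \times 2^{-1} = (1.000)_2 \times 2^{-4}$
• Step 4. Round the sum (it already fits in 4 bits, so nothing changes)
$(1.000)_2 \times 2^{-4} = (0.0001000)_2 = (1/2^4)_{10} = (1/16)_{10} = (0.0625)_{10}$ … ANS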
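The four steps of this example can be mimicked with a toy Python model that keeps a 4-bit significand (1 integer bit, 3 fraction bits). It is a deliberately simplified sketch: significands are signed integers scaled by 2^3, bits shifted out during alignment are simply dropped, and the rounding step is omitted because the example needs none. The names `fp_add`, `to_float`, `FRAC_BITS`, and `ONE` are all invented here.

```python
FRAC_BITS = 3          # significand has the form 1.xxx
ONE = 1 << FRAC_BITS   # integer encoding of 1.000


def fp_add(sig1: int, exp1: int, sig2: int, exp2: int) -> tuple[int, int]:
    """Add sig1 * 2**exp1 and sig2 * 2**exp2 (significands scaled by ONE)."""
    # Step 1: align the operand with the smaller exponent to the larger one.
    if exp1 < exp2:
        sig1, exp1, sig2, exp2 = sig2, exp2, sig1, exp1
    magnitude = abs(sig2) >> (exp1 - exp2)   # shifted-out bits are dropped
    sig2 = -magnitude if sig2 < 0 else magnitude
    # Step 2: add the significands.
    total, exp = sig1 + sig2, exp1
    if total == 0:
        return 0, 0
    # Step 3: normalize so the magnitude has the form 1.xxx.
    while abs(total) >= 2 * ONE:             # too big: shift right, bump exponent
        total = (abs(total) >> 1) * (1 if total > 0 else -1)
        exp += 1
    while abs(total) < ONE:                  # too small: shift left, drop exponent
        total *= 2
        exp -= 1
    return total, exp


def to_float(sig: int, exp: int) -> float:
    return sig / ONE * 2.0 ** exp


# 0.5 = 1.000 x 2^-1 and -0.4375 = -1.110 x 2^-2
sig, exp = fp_add(0b1000, -1, -0b1110, -2)
print(format(sig, "04b"), exp, to_float(sig, exp))  # 1000 -4 0.0625
```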

•Q & A
