hints_sol_2.pdf

1.

i) a) sub $s0, $s1, $s2

b) addi $t0, $s2, -5 , $t0 is any free register

add $s0, $s1,$t0

(immediate subtraction is done by addi and taking negative of the number)

ii) a)1

b)2

iii) a) -1 ;$s0=2-3

b) 0 ; $s1=3-5=-2

; $s1=2-2=0

iv) a) $t0=$t0+4

b)$t0=$t1+$t2

$t0=$t3+$t0

v) a)5

b)9

2.

i) a) sub $t0,$t1,$t0

b) addi $t0,$t1, -2

add $t2,$t3,$t0

ii) a)1

b)2

iii) a)1 ; $t0=2-1

b)5 ; $t1=3-2 =1; $t2=4+1=5

iv) a) $t0=$t0+4

b) $t0=$t1+$t2

$t0=$t3-$t0

v) a)5

b)-1

3.

i) a) sub $t0,$zero,$t1 ; $t0=0-$t1=-$t1

sub $t3,$t0,$t3

b) sub $t0,$zero,$t3 ;$t0=0-$t3=-$t3

addi $t0,$t0,-5

add $t3,$t1,$t0

ii) a)2

b)3

iii) a)-3 ; $t0=-2; $t3=-2-1

b)-4 ; $t0=-1; $t0=-1-5=-6; $t3=2+(-6)=-4

iv) a) $t3=$t3- 4

b) $t1=$t0+$t2

$t3=$t1+$t3

v) a)-3

b)6

4.

i) a) 613,566,756

b) 1,073,741,824

ii) a) 613,566,756

b) 1,073,741,824

iii) a) 24924924

b) 5FBE4000

iv) a) 1111

b)0100 0000 0000

v) a) F

b)400

vi) a) 1

b) C00

5.

Hints

(Overflow conditions for addition and subtraction are summarised in the table below; the last column is the sign of the result that indicates overflow:

Operation   Operand a   Operand b   Result indicating overflow
a+b         +           +           -
a+b         -           -           +
a-b         +           -           -
a-b         -           +           +

)
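The sign rules in the table can be checked mechanically. The following is a minimal Python sketch, not part of the original solution: the helper names `to_signed32`, `add32`, and `sub32` are made up for illustration, and operands are assumed to be given as 32-bit unsigned bit patterns.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def add32(a, b):
    """Return (32-bit result, overflow flag) for the signed sum a + b."""
    sa, sb = to_signed32(a), to_signed32(b)
    result = (a + b) & MASK32
    sr = to_signed32(result)
    # Overflow: both operands have the same sign and the result's sign differs.
    overflow = ((sa >= 0) == (sb >= 0)) and ((sr >= 0) != (sa >= 0))
    return result, overflow

def sub32(a, b):
    """Return (32-bit result, overflow flag) for the signed difference a - b."""
    sa, sb = to_signed32(a), to_signed32(b)
    result = (a - b) & MASK32
    sr = to_signed32(result)
    # Overflow: operands differ in sign and the result's sign differs from a's.
    overflow = ((sa >= 0) != (sb >= 0)) and ((sr >= 0) != (sa >= 0))
    return result, overflow
```

For example, `add32(0x70000000, 0x70000000)` adds two positive values and yields a negative bit pattern, so the flag is set, while `add32(0xFFFFFFFF, 1)` is -1 + 1 and does not overflow.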

i)a) 50000000, overflow

b) 00000000, no overflow

ii) a) 5, no overflow

b) FFFFFFFE, overflow

iii)a) D0000000 , no overflow


iv) a) 00000000, no overflow


v) a) 80000000, overflow

b) A0000000, no overflow

vi) a) 2FFFFFFF, no overflow

b) FFFFFFFF, overflow

6.

i)a) 6FFFFFFF, no overflow

b) 70000400 , no overflow

ii) a) 7FFFFFFF, no overflow

b) 7FFFFC00, overflow

iii) a) 80000000, overflow

b) 7FFFFBFF, no overflow

iv) a) overflow

b) overflow

v) a) 94924924

b) CFBE4000

vi) a) 2492614948

b) 3485351936

7. (i) a). The rs, rt, and rd fields must each grow to 7 bits (2^7 = 128) to address the larger register file. The shamt field encodes a shift amount, not a register number, so it does not need to grow.

b) The op and funct fields depend on the number of instructions.

(ii). a) The rs and rt fields will each become 7 bits.

b) The size of the op field will increase.

(iii). Increasing the number of instructions could shrink assembly programs, since a task that would otherwise take several instructions may be done in one.

However it can also

(iv)

a) 17367058,

b) 2903048210

(v).

a) mflo (R-type, opcode: 000000 followed by funct)

b) sw (I-type)

(vi) .

a) sll - R-type (opcode: 000000)

b) sw - I-type (opcode: 101011)

8.

(i)

a) FF005A5A,

b) FFFFFFE7.

(ii)

(a) nor $t1,$t2,$zero

(b) nor $t3,$t3,$zero

or $t1,$t2,$t3
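Both answers rely on the identity that NOR with zero is a bitwise NOT. A small Python sketch of that identity on 32-bit values follows; the function names `nor32` and `not32` are illustrative only and do not come from the exercise.

```python
MASK32 = 0xFFFFFFFF

def nor32(a, b):
    """Bitwise NOR of two 32-bit values, as the MIPS nor instruction computes it."""
    return ~(a | b) & MASK32

def not32(a):
    """NOT built from NOR with the zero register: nor rd, rs, $zero."""
    return nor32(a, 0)
```

Since `a | 0` is just `a`, `nor32(a, 0)` flips every bit of `a`, which is why MIPS needs no separate NOT instruction.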

(iii)

(a)100111

(b)100111

100101

(iv) 0x00012340

(v). a) nor $s0,$s0,$zero

or $s0,$s1,$s0

b) lw $t0,0($s2)

sll $s0,$t0,4

where $s0=A, $s1=B, $s2=base address of C.

(vi) Check the bit level representation

9.

(i.) The answer is really the same for all. All of these instructions are either supported by an existing

instruction, or sequence of existing instructions.

(ii). a) I-type

b) I-type

(iii) a) addi $t2, $t3, -5

b) slti $t1, $t2,0

beq $t1,$0,LAB+4

addi $t2,$t2,1

LAB:

EXIT

(iv) a) 20,

b) 24

(v) a) B = 0;

for (i=0;i<10;i++) B = B+2;

b)B = 0;

for (i=10;i>=0;i--) B = B+2;

(vi) a) 3N

b)5N+ 2

10. (i)

a)

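fib:
    bgt  $a0,1,recurse     # n > 1: take the recursive path
    move $v0,$a0           # base case: fib(0)=0, fib(1)=1
    jr   $ra
recurse:
    addi $sp,$sp,-12       # room for 3 words
    sw   $ra,0($sp)
    sw   $a0,4($sp)
    addi $a0,$a0,-1
    jal  fib               # fib(n-1)
    sw   $v0,8($sp)        # save fib(n-1)
    lw   $a0,4($sp)
    addi $a0,$a0,-2
    jal  fib               # fib(n-2)
    lw   $t0,8($sp)
    add  $v0,$v0,$t0       # fib(n-2) + fib(n-1)
    lw   $ra,0($sp)
    addi $sp,$sp,12
    jr   $ra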
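As a cross-check of the control flow, the routine above corresponds roughly to the following Python sketch. It is only a high-level mirror under the assumption that the argument is a non-negative integer; the comments map each step back to the assembly.

```python
def fib(n):
    """Recursive Fibonacci with the same base case and call order as the assembly."""
    if n <= 1:            # the bgt to 'recurse' is not taken
        return n          # move $v0,$a0
    first = fib(n - 1)    # first jal; the result is kept at 8($sp)
    second = fib(n - 2)   # second jal, after reloading n from 4($sp)
    return second + first
```

Each active call holds its own `n` and `first`, which is what the 12-byte stack frame provides in the MIPS version.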

b)

positive:

addi $sp, $sp, -4

sw $ra, 0($sp)

add $s0, $a0, $0

add $s1, $a1, $0

jal addit

addi $t1, $0, 1

slt $t2, $0, $v0

bne $t2, $0, exit

add $t1, $0, $0

exit:

add $v0, $t1, $0

lw $ra, 0($sp)

addi $sp, $sp, 4

jr $ra

addit:

add $v0, $a0, $a1

jr $ra

(ii)

a)

Due to the recursive nature of the code, it is not possible for the compiler to in-line the function call.

b)

positive:

addi $sp, $sp, -4

sw $ra, 0($sp)

add $t0, $a0, $a1

slt $t1, $0, $t0

add $v0, $t1, $0

lw $ra, 0($sp)

addi $sp, $sp, 4

jr $ra

(iii)

a)

after calling function compare:

old $sp => 0x7ffffffc   ???
$sp     => -4           contents of register $ra

after calling function sub:

old $sp => 0x7ffffffc   ???
           -4           contents of register $ra
$sp     => -8           contents of register $ra   # return to compare

b) Similar as in part a

(iv) Reserve the stack space for saving the variables used in the functions. (using addi and sw)

Use move instruction to save the variables for going to function func.

Call func using jal.

Again save the variables using move instructions.

Again call func.

Then load the variables value from stack to registers using lw.

a) f: addi $sp,$sp,-8

sw $ra,4($sp)

sw $s0,0($sp)

add $s0,$a2,$a3

jal func

move $a0,$v0

move $a1,$s0

jal func

lw $ra,4($sp)

lw $s0,0($sp)

addi $sp,$sp,8

jr $ra

b) f: addi $sp,$sp,-4

sw $ra,0($sp)

add $t0, $a0, $a1

add $t1, $a2, $a3

bgt $t0, $t1, exit

add $s0, $a2, $a3

add $s1, $a0, $a1

jal func

exit:

add $s0, $a0, $a1

add $s1, $a2, $a3

jal func

lw $ra, 0($sp)

addi $sp, $sp, 4

jr $ra

(v) We can use the tail-call optimization for the second call to func, but then we must restore $ra and

$sp before that call.

(vi) Register $ra is equal to the return address in the caller function, registers

$sp and $s3 have the same values they had when function f was called, and register

$t5 can have an arbitrary value. For register $t5, note that although our function

f does not modify it, function func is allowed to modify it so we cannot assume

anything about the value of $t5 after function func has been called.

11.

(i.)

a)

After entering the function main

Old $sp=> 0x7ffffffc ???

$sp=> -4 contents of register $ra

After entering function my_function

Old $sp => 0x7ffffffc ???

-4 contents of register $ra

$sp=> -8 contents of register $ra(return to main)

Global pointers:

0x10008000 100 my_global

b)

After entering the function main

Old $sp=> 0x7ffffffc ???

$sp=> -4 contents of register $ra

After entering function my_function

Old $sp => 0x7ffffffc ???

-4 contents of register $ra

$sp=> -8 contents of register $ra(return to main)

Global pointers:

0x10008000 100 my_global

(ii)

a)

addi $sp, $sp, -4
sw $ra, 0($sp)
addi $a0, $0, 10
addi $a1, $0, 20
jal FUNC
add $t2, $v0, $0
lw $ra, 0($sp)
addi $sp, $sp, 4
jr $ra

FUNC: lw $a2, 0($s0)
sub $t0, $a0, $a1
add $v0, $t0, $a2
jr $ra

b)

addi $sp, $sp, -4
sw $ra, 0($sp)
lw $a0, 0($s0)
addi $a0, $a1, 1
jal FUNC
add $t0, $v0, $0
lw $ra, 0($sp)
addi $sp, $sp, 4
jr $ra

FUNC: addi $v0, $a0, 1
jr $ra

(iii)

Same as (ii)

(iv)

a)

check yourself

b).

addi $sp,$sp,-4

sw $ra,($sp)

add $a2, $a3, $a2

slt $a2, $a2, $a0

move $v0, $a1

beqz $a2, L

j L2

L:

move $a0, $a1

jal g

L2:

lw $ra,($sp)

addi $sp,$sp,+4

jr $ra

(v)

a) int func(int a, int b, int d)
{
    int c = a + b;
    if (d == 0)
        c = b - a;
    return c;
}

b)

//assume $a0 is a, $a1 is b, $a2 is c, $a3 is d, $v0 is retValue; the equivalent C code is:

c=c+d;

if(c<a) c=1;

else c=0;

retValue =b;

if(c==0)

{

a=b;

retValue = 500;

}

return retValue;

(vi)

a)

find the value

b)

value returned by the function is : 500

12. (i)

a)lw $t1,4($t0), $t0 points to the base of a record or structure

b)j 10000

Both j and jal are J-type instructions. J-type instructions use 6 bits for the opcode and 26 bits for the immediate value (called the target). The semantics of the j instruction are:

PC <- PC[31:28] || IR[25:0] || 00

The new address is formed by concatenating the upper 4 bits of the PC, the 26-bit immediate value, and two low-order 0 bits, so the address remains word-aligned.

This is called pseudo-direct addressing. Direct addressing would specify all 32 bits of the address; it is called pseudo-direct because some bits of the PC are used to compute the address. j can reach 1/16 of all possible word-aligned addresses.
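The address computation described above can be sketched in a few lines of Python. This is an illustration only: `jump_target` is a hypothetical helper name, and it follows the text in taking the upper four bits from the PC value passed in (real hardware uses the address of the instruction after the jump).

```python
def jump_target(pc, instruction):
    """Form a pseudo-direct jump address from the PC and a J-type instruction word."""
    target26 = instruction & 0x03FFFFFF      # low 26 bits: the target field
    upper4 = pc & 0xF0000000                 # upper 4 bits of the PC
    return upper4 | (target26 << 2)          # append two zero bits for word alignment

# A J-type word with opcode 2 (j) and a target field of 2500, i.e. byte address 10000.
J_10000 = (2 << 26) | 2500
```

With `pc = 0x00400000` the upper four bits are zero, so `jump_target(0x00400000, J_10000)` is simply 10000; with `pc = 0x10000000` the same instruction lands at `0x10002710`, which shows why a jump can only reach addresses inside the current 256 MB region.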

(ii)

a) I-type

b)J-type

(iii)

a)+Adv:Saves instruction for incrementing array index. No extra hardware is needed.

eg- add $10, $20, $13 ;$20 is base,$13 is index

lw $5, 0($10)

+Can jump to any 32b address

Disadv:-Requires that two registers be written at the same time, which needs more hardware.

eg- lw $10, 4($13) ; $10 <- Mem[$13+4]

addi $13, $13, 4 ; $13 <- $13+4

-need to load a register with a 32 bit address, which could take multiple cycles

b)Adv: + 26 bits of the address is embedded as the immediate, and is used as the instruction

offset within the current 256MB (64MWord) region defined by the MS 4 bits of the PC.

eg- j Label

+allows the PC to be set to the current PC +4 +/- BranchAddr, supporting quick forward and

backward branches

Disadv: - not directly supported in hardware

eg- blt $s0, $s1, Else is implemented as

slt $at, $s0, $s1

bne $at, $zero, Else

-range of branches is smaller than large programs

(iv)

a)0x1 000 3100

0x0 001 0020

b)0xx 000 0010

0x0 001 0020

(v)

Now, for I type, we can only move to +/-(2^8) and for J type we can only move to +/-(2^18). So we are

restricted to particular range.

(vi)

If the place where we have to jump is lying beyond the allowed area, we will have to use one more

jump instruction.

13.

LL $t1,0($s1): loads the contents of the memory location pointed to by $s1 into $t1.

SC $t0,0($s1): attempts to store $t0 to the memory location pointed to by $s1. If that location has not changed since the LL, the store succeeds and $t0 is set to 1; otherwise the store does not happen and $t0 is set to 0.

LL and SC are used together to build atomic operations.
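The success/failure rule for SC can be illustrated with a toy model. The Python sketch below is a simplification under stated assumptions: one shared word, a made-up `Memory` class, and a link that is broken for every processor as soon as any SC succeeds. Real hardware tracks links per cache line and can fail an SC for other reasons as well.

```python
class Memory:
    """Toy single-word memory that tracks which processors hold a valid link."""

    def __init__(self, value):
        self.value = value
        self.linked = set()          # ids of processors whose LL is still valid

    def ll(self, cpu):
        """Load linked: return the word and remember that cpu is watching it."""
        self.linked.add(cpu)
        return self.value

    def sc(self, cpu, new_value):
        """Store conditional: return 1 and store if cpu's link is intact, else 0."""
        if cpu not in self.linked:
            return 0
        self.value = new_value
        self.linked.clear()          # a successful store breaks every link
        return 1

# Interleaving: P2 loads, P1 loads, P2 stores 40, then P1 tries to store 2.
mem = Memory(99)
p2_t1 = mem.ll(2)
p1_t1 = mem.ll(1)
p2_t0 = mem.sc(2, 40)
p1_t0 = mem.sc(1, 2)
```

In this interleaving processor 2's store succeeds and memory becomes 40, while processor 1's store is rejected because the location changed after its LL.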

(i) 4 instructions

(ii) One of the locations specified by the LL instruction has no corresponding SC instruction.

(iii) try: MOV R3,R4

LL R2,0(R2)

SC R3,0(R2)

BEQZ R3,try

MOV R4,R2

a)

Processor 1     Processor 2     Cycle   Processor 1    MEM      Processor 2
                                        $t1    $t0     ($s1)    $t1    $t0
                                0       1      2       99       30     40
                ll $t1,0($s1)   1       1      2       99       99     40
ll $t1,0($s1)                   2       99     2       99       99     40
                sc $t0,0($s1)   3       99     2       40       99     1
sc $t0,0($s1)                   4       99     0       40       99     1

b)

Processor 1     Processor 2     Cycle   Processor 1    MEM      Processor 2
                                        $t1    $t0     ($s1)    $t1    $t0
                                0       1      2       99       30     40
ll $t1,0($s1)                   1       99     2       99       30     40
                ll $t1,0($s1)   2       99     2       99       99     40
                addi $t1,$t1,1  3       99     2       99       100    40
                sc $t0,0($s1)   4       99     2       40       100    1
sc $t0,0($s1)                   5       99     0       40       100    1

14.

i)

a) swap: lw $t0,0($a0)

lw $t1,0($a1)

sw $t0,0($a1)

sw $t1,0($a0)

b) swap:

lw $t0,0($a0)

lw $t1,0($a1)

add $t0,$t0,$t1

sub $t1,$t0,$t1

sub $t0,$t0,$t1

sw $t0,0($a0)

sw $t1,0($a1)

ii) We need to pass the addresses of the elements to be swapped, so we have:

add $a0,$s2,$t2

addi $t2,$t2,4

add $a1,$s2,$t2

$a0 and $a1 will contain the addresses of the elements that are to be swapped.

iii)

The offset gets reduced to 1, need to use byte instructions

iv) We need to use offsets corresponding to a byte

v)

a) The swap function is never called. The inner loop is entered once each time.

b) same.

vi)

a) Each time the swap function is called by the inner loop , i-1 times

b)same

15.

i) a) register operand

b) register+offset and update register

ii) a) lw $s0, 4($s1)

addi $s1,$s1,4

b) lw $s1,0($s0)

lw $s1,4($s0)

lw $s1,8($s0)

addi $s0,$s0,12

iii) $t1 holds r1 and $t0 holds r0.

a) addi $t0,$zero,10
loop:
add $t1,$t1,$t0
addi $t0,$t0,-1
slti $t2,$t0,1        # $t2 = 1 once $t0 < 1
bne $t2,$zero,exit    # leave the loop when the counter reaches 0
j loop
exit:

b) add $t0,$t0,$t1

# Add the 64-bit values $t2:$t3 + $t4:$t5, result in $t0:$t1

addu $t1, $t3, $t5 # add least significant word

sltu $t0, $t1, $t5 # set carry-in bit

addu $t0, $t0, $t2 # add in first most significant word

addu $t0, $t0, $t4
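The carry trick in the sltu line (the low-word sum is smaller than an operand exactly when the addition wrapped) can be checked with a short Python sketch. The function name `add64` and its argument order are illustrative assumptions; each half is treated as an unsigned 32-bit value.

```python
MASK32 = 0xFFFFFFFF

def add64(hi_a, lo_a, hi_b, lo_b):
    """Add two 64-bit values held as (high, low) 32-bit halves; return (high, low)."""
    lo = (lo_a + lo_b) & MASK32            # addu $t1, $t3, $t5
    carry = 1 if lo < lo_b else 0          # sltu $t0, $t1, $t5
    hi = (carry + hi_a + hi_b) & MASK32    # the two addu instructions on $t0
    return hi, lo
```

For instance, `add64(0, 0xFFFFFFFF, 0, 1)` wraps the low word to 0 and carries 1 into the high word.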

iv) a)6 MIPS vs 4 ARM instructions

b) check yourself

v) (a) ( 2/3) * (No of MIPS instructions / No of ARM instructions )

(b) find out using the formula above

16.

i.

a. This instruction copies ECX elements, where each element is 2 bytes in size, from an array pointed to by ESI to an array pointed to by EDI.

b. It checks for least significant byte and loops the instructions.

ii.

a.

loop: lh $t0,0($a2)    # each element is 2 bytes, so use halfword load/store
sh $t0,0($a1)
addi $a0,$a0,-1
addi $a1,$a1,2
addi $a2,$a2,2
bnez $a0,loop

b.

loop: lb $t0, 0($a3)

lb $t1,0($a1)

bne $t0,$t1, JMP

bnez $a0,loop

JMP:addi $a1,$a1,1

addi $a0,$a0,1


iii.

speed up = number of MIPS instruction / num of x86 instruction

a. 6 / 5 = 1.2

b. 6/2 or 6/4..take average

iv.

a.

addi $s0,$0,1

addi $sp,$sp,-4

sw $ra,0($sp)

slt $s1,$a1,$a0

addi $s0,$s0,1

bne $s0,$s1,L

move $v0,$a2

L:move $v0,$a3

L2:jr $ra

Use stack. Push values of a, b, c and d onto stack. Now move the values of a and b to some

other registers and compare using gt and pop out c or d according to the conditions.

b.Use stack. Push the array and n onto stack. Use some counter value in register and

decrement it for implementing for loop till it becomes 0. And set all array values be 0 using

move instructions.

MIPS : number of instructions x 4 bytes.

X86 : just count the number of bytes given in code. 25bytes(a) and 31 bytes (b)

v. In MIPS, we fetch the next two consecutive instructions by reading the next 8 bytes from the instruction memory. In x86, we only know where the second instruction begins after we have read and decoded the first one, so it is more difficult to design a processor that executes multiple instructions in parallel.

vi.

speedup = x86 cycles / MIPS cycles

a 11/5

b.calculate yourself

17.

i. Because the array elements are 32-bit integers, the index must be multiplied by 4 before it is added to the array base address.

a. In the first iteration $t0 is 0 and the lw fetches a[0]. After that $t0 is 1, so the lw uses a nonaligned address, which triggers a bus error.

b. In the first iteration $t2 points to the address of a[0], so the lw and sw instructions access a[0] and a[1]. In the second iteration $t2 points to the next byte in a[0] instead of pointing to a[1]. Thus the first lw uses a nonaligned address and causes a bus error. Note that the computation for $t2 (address of a[n]) does not cause a bus error because that address is not actually used to access memory.

ii.

a. Yes, assuming that x is a sign-extended byte value between -128 and 127. If x is simply a byte value between 0 and 255, the procedure only works if neither x nor array a contains values outside the range of 0..127.

b.yes

iii.

a. f:  move $v0, $0          # ret = 0
       move $t0, $a0         # p = a
       sll  $t1, $a1, 2      # n * 4: the count must be scaled to bytes
       add  $t1, $t1, $a0    # &(a[n])
   L:  lw   $t2, 0($t0)      # read *p
       bne  $t2, $a2, S      # skip the increment unless *p == x
       addi $v0, $v0, 1      # ret++
   S:  addi $t0, $t0, 4      # p = p + 1 (one word)
       bne  $t0, $t1, L      # repeat if p != &(a[n])
       jr   $ra              # return ret

b.

   f:  move $t0, $0          # i = 0
       addi $t1, $a1, -1     # n - 1
   L:  sll  $t2, $t0, 2      # i * 4
       add  $t2, $t2, $a0    # address of a[i]
       lw   $t3, 4($t2)      # read a[i + 1]
       sw   $t3, 0($t2)      # a[i] = a[i + 1]
       addi $t0, $t0, 1      # i = i + 1
       bne  $t0, $t1, L      # repeat if i != n - 1
       jr   $ra              # return
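The scaling rule behind both fixes is that element i of a word array lives at base + 4*i, and lw requires that address to be a multiple of 4. A minimal Python sketch of the two checks follows; the names `element_address` and `is_word_aligned` are made up for this illustration.

```python
WORD_BYTES = 4

def element_address(base, index):
    """Byte address of a[index] for an array of 32-bit words starting at base."""
    return base + index * WORD_BYTES

def is_word_aligned(address):
    """True when a lw or sw to this address would be legal on MIPS."""
    return address % WORD_BYTES == 0
```

Adding the unscaled index instead, as in `base + 1`, gives an address that `is_word_aligned` rejects, which is the bus error described in part i.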

iv.

At the exit from my_alloc, the $sp register is moved to “free” the memory that is returned to main. Then my_init() writes to this memory to initialize it. Note that neither my_init nor main accesses the stack memory in any other way until sort() (a stack-based function) is called, so the values at the point where sort() is called are still the same as those written by my_init:

a. 0, 0, 0, 0, 0

b. 5, 4, 3, 2, 1

v.

In main, register $s0 becomes 5, then my_alloc is called. The address of the array v “allocated” by my_alloc is 0xffe8, because in my_alloc $sp was saved at 0xfffc, and then 20 bytes (4 × 5) were reserved for array arr ($sp was decremented by 20 to yield 0xffe8). The elements of array v returned to main are thus a[0] at 0xffe8, a[1] at 0xffec, a[2] at 0xfff0, a[3] at 0xfff4, and a[4] at 0xfff8. After my_alloc returns, $sp is back to 0x10000. The value returned from my_alloc is 0xffe8 and this address is placed into the $s1 register. The my_init function does not modify $sp, $s0, $s1, $s2, or $s3. When sort() begins to execute, $sp is 0x10000, $s0 is 5, $s1 is 0xffe8, and $s2 and $s3 keep their original values of 7 and 1, respectively. The sort() procedure then changes $sp to 0xffec (0x10000 minus 20), and writes $s0 to memory at address 0xffec (this is where a[1] is, so a[1] becomes 5), writes $s1 to memory at address 0xfff0 (this is where a[2] is, so a[2] becomes 0xffe8), writes $s2 to memory address 0xfff4 (this is where a[3] is, so a[3] becomes 7), writes $s3 to memory address 0xfff8 (this is where a[4] is, so a[4] becomes 1), and writes the return address to 0xfffc, which does not affect values in array v. Now the values of array v are:

a. 0 5 0xffe8 7 1

b. 5 5 0xffe8 7 1
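The overwrite can be replayed with a small Python model of the stack region. This is a sketch under assumptions: the function name `clobbered_array` is made up, memory is a plain dictionary keyed by byte address, and the values 7 and 1 for $s2 and $s3 are read off the result lists above.

```python
def clobbered_array(initial, s0, s1, s2, s3):
    """Return array v after sort() saves $s0..$s3 on top of it."""
    base = 0xFFE8                                   # address of a[0]
    memory = {base + 4 * i: v for i, v in enumerate(initial)}
    sp = 0x10000 - 20                               # sort() reserves five words
    for slot, reg in enumerate((s0, s1, s2, s3)):
        memory[sp + 4 * slot] = reg                 # saves land on a[1]..a[4]
    return [memory[base + 4 * i] for i in range(len(initial))]

case_a = clobbered_array([0, 0, 0, 0, 0], 5, 0xFFE8, 7, 1)
case_b = clobbered_array([5, 4, 3, 2, 1], 5, 0xFFE8, 7, 1)
```

Only a[0] keeps the value written by my_init; sorting either resulting list then gives the arrays reported in part vi.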

vi.

When the sort() procedure enters its main loop, the elements of array v are sorted without any interference from

other stack accesses. The resulting sorted array is

a. 0, 1, 5, 7, 0xffe8

b. 1, 5, 5, 7, 0xffe8

Unfortunately, this is not the end of the chaos caused by the original bug in my_alloc. When the sort() function

begins restoring registers, $ra is read from the (luckily) unmodified location where it was saved. Then $s0 is read

from memory at address 0xffec (this is where a[1] is), $s1 is read from address 0xfff0 (this is where a[2] is), $s2

is read from address 0xfff4 (this is where a[3] is), and $s3 is read from address 0xfff8 (this is where a[4] is).

When sort() returns to main(), registers $s0 and $s1 are supposed to keep n and the address of array v. As a

result, after sort() returns to main(), n and v are:

a. n=1, v=5

So v is a 1-element array of integers that begins at address 5

b. n=5, v=5

So v is a 5-element array of integers that begins at address 5

If we were to actually attempt to access (e.g., print out) elements of array v in the main() function after this

point, the first lw would result in a bus error due to non-aligned address. If MIPS were to tolerate non-aligned

accesses, we would print out whatever values were at the address v points to (note that this is not the same

address to which my_init wrote its values).
