hints_sol_2.pdf
Solution to Tutorial
1.
i) a) sub $s0, $s1, $s2
b) addi $t0, $s2, -5 ; $t0 is any free register
add $s0, $s1, $t0
(immediate subtraction is done by addi and taking negative of the number)
ii) a)1
b)2
iii) a) -1 ;$s0=2-3
b) 0 ; $t0=3-5=-2
; $s0=2-2=0
iv) a) $t0=$t0+4
b)$t0=$t1+$t2
$t0=$t3+$t0
v) a)5
b)9
2.
i) a) sub $t0,$t1,$t0
b) addi $t0,$t1, -2
add $t2,$t3,$t0
ii) a)1
b)2
iii) a)1 ; $t0=2-1
b)5 ; $t1=3-2 =1; $t2=4+1=5
iv) a) $t0=$t0+4
b) $t0=$t1+$t2
$t0=$t3-$t0
v) a)5
b)-1
3.
i) a) sub $t0,$zero,$t1 ; $t0=0-$t1=-$t1
sub $t3,$t0,$t3
b) sub $t0,$zero,$t3 ; $t0=0-$t3=-$t3
addi $t0,$t0,-5
add $t3,$t1,$t0
ii) a)2
b)3
iii) a)-3 ; $t0=-2; $t3=-2-1
b)-4 ; $t0=-1; $t0=-1-5=-6; $t3=2+(-6)=-4
iv) a) $t3=$t3- 4
b) $t1=$t0+$t2
$t3=$t1+$t3
v) a)-3
b)6
4.
i) a) 613,566,756
b) 1,073,741,824
ii) a) 613,566,756
b) 1,073,741,824
iii) a) 24924924
b) 5FBE4000
iv) a) 1111
b)0100 0000 0000
v) a) F
b)400
vi) a) 1
b) C00
5.
Hints
(Overflow conditions for addition and subtraction are summarised in the table below:
Operation   Operand a   Operand b   Result indicating overflow
a+b         >=0         >=0         <0
a+b         <0          <0          >=0
a-b         >=0         <0          <0
a-b         <0          >=0         >=0
)
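The sign rules in the table can be checked mechanically. The following is a minimal Python sketch, not part of the original solution: the helper names `to_signed`, `add32` and `sub32` are made up for illustration, and 32-bit words are modelled as Python integers masked to 32 bits.

```python
MASK = 0xFFFFFFFF  # keep values within 32 bits

def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x

def add32(a, b):
    """Return (result pattern, overflow flag) for signed 32-bit a + b."""
    result = (a + b) & MASK
    sa, sb, sr = to_signed(a), to_signed(b), to_signed(result)
    # Overflow only when both operands share a sign and the result's sign differs.
    overflow = (sa >= 0) == (sb >= 0) and (sr >= 0) != (sa >= 0)
    return result, overflow

def sub32(a, b):
    """Return (result pattern, overflow flag) for signed 32-bit a - b."""
    result = (a - b) & MASK
    sa, sb, sr = to_signed(a), to_signed(b), to_signed(result)
    # Overflow only when the operands differ in sign and the result's sign differs from a's.
    overflow = (sa >= 0) != (sb >= 0) and (sr >= 0) != (sa >= 0)
    return result, overflow

print(add32(0x7FFFFFFF, 0x00000001))  # largest positive + 1 overflows
print(sub32(0x80000000, 0x00000001))  # most negative - 1 overflows
```

The operands here are illustrative values, not the ones from the tutorial questions; the answers to parts i) to vi) can be checked the same way once the question's operands are filled in.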
i)a) 50000000, overflow
b) 00000000, no overflow
ii) a) 5, no overflow
b) FFFFFFFE, overflow
iii)a) D0000000 , no overflow
b) 00000001, no overflow
iv) a) 00000000, no overflow
b) 10000000, no overflow
v) a) 80000000, overflow
b) A0000000, no overflow
vi) a) 2FFFFFFF, no overflow
b) FFFFFFFF, overflow
6.
i)a) 6FFFFFFF, no overflow
b) 70000400 , no overflow
ii) a) 7FFFFFFF, no overflow
b) 7FFFFC00, overflow
iii) a) 80000000, overflow
b) 7FFFFBFF, no overflow
iv) a) overflow
b) overflow
v) a) 94924924
b) CFBE4000
vi) a) 2492614948
b) 3485351936
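Assuming parts v) and vi) give the same 32-bit words in hexadecimal and in unsigned decimal, the conversion can be verified with a short Python sketch. The helper names `as_unsigned` and `as_signed` are illustrative only; the signed reading is shown for comparison.

```python
def as_unsigned(hex_word):
    """Read an 8-digit hex string as an unsigned 32-bit value."""
    return int(hex_word, 16)

def as_signed(hex_word):
    """Read the same bit pattern as a two's-complement 32-bit value."""
    value = int(hex_word, 16)
    return value - (1 << 32) if value >= (1 << 31) else value

for word in ("94924924", "CFBE4000"):
    print(word, as_unsigned(word), as_signed(word))
# prints:
# 94924924 2492614948 -1802352348
# CFBE4000 3485351936 -809615360
```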
7. (i) a) The rs, rt, rd and shamt fields must each become 7 bits (2^7 = 128) to
account for the increase in registers.
b) The op and funct fields depend on the number of instructions.
(ii) a) The rs and rt fields will become 7 bits.
b) The op field will increase in size.
(iii) An increase in the number of instructions could reduce the size of an assembly-language program, since it
would otherwise take multiple instructions to do the same work.
However it can also
(iv)
a) 17367058,
b) 2903048210
(v).
a) mflo (R-type, opcode: 000000 followed by funct)
b) sw (I-type)
(vi)
a) sll - R-type (opcode: 000000)
b) sw - I-type (opcode: 101011)
8.
(i)
a) FF005A5A,
b) FFFFFFE7.
(ii)
(a) nor $t1,$t2,$zero
(b) nor $t3,$t3,$zero
or $t1,$t2,$t3
(iii)
(a)100111
(b)100111
100101
(iv) 0x00012340
(v) a) nor $s0,$s0,$zero
or $s0,$s1,$s0
b) lw $t0,0($s2)
sll $s0,$t0,4
where $s0=A, $s1=B, $s2=base address of C.
(vi) Check the bit level representation
9.
(i.) The answer is really the same for all. All of these instructions are either supported by an existing
instruction, or sequence of existing instructions.
(ii). a) I-type
b) I-type
(iii) a)addi $t2, $t3,- 5
b) slti $t1, $t2,0
beq $t1,$0,LAB+4
addi $t2,$t2,1
LAB:
EXIT
(iv) a) 20,
b) 24
(v) a) B = 0;
for (i=0;i<10;i++) B = B+2;
b)B = 0;
for (i=10;i>=0;i--) B = B+2;
(vi) a) 3N
b)5N+ 2
10. (i)
a)
fib:
bgt $a0,1,recurse
move $v0,$a0
jr $ra
recurse:
addi $sp,$sp,-12 # space for 3 words: $ra, n, fib(n-1)
sw $ra,0($sp)
sw $a0,4($sp)
addi $a0,$a0,-1
jal fib # fib(n-1)
sw $v0,8($sp)
lw $a0,4($sp)
addi $a0,$a0,-2
jal fib # fib(n-2)
lw $t0,8($sp)
add $v0,$v0,$t0
lw $ra,0($sp)
addi $sp,$sp,12
jr $ra
b)
positive:
addi $sp, $sp, -4
sw $ra, 0($sp)
add $s0, $a0, $0
add $s1, $a1, $0
jal addit
addi $t1, $0, 1
slt $t2, $0, $v0
bne $t2, $0, exit
addi $t1, $0, 0
exit:
add $v0, $t1, $0
lw $ra, 0($sp)
addi $sp, $sp, 4
jr $ra
addit:
add $v0, $a0, $a1
jr $ra
(ii)
a)
Due to the recursive nature of the code, it is not possible for the compiler to in-line the
function call.
b)
positive:
addi $sp, $sp, -4
sw $ra, 0($sp)
add $t0, $a0, $a1
slt $t1, $0, $t0
add $v0, $t1, $0
lw $ra, 0($sp)
addi $sp, $sp, 4
jr $ra
(iii)
a)
after calling function compare:
old $sp => 0x7ffffffc   ???
$sp =>     -4           contents of register $ra
after calling function sub:
old $sp => 0x7ffffffc   ???
           -4           contents of register $ra
$sp =>     -8           contents of register $ra #return to compare
b) Similar as in part a
(iv) Reserve stack space for the values that must survive the calls (using addi and sw).
Use move instructions to set up the arguments for function func.
Call func using jal.
Set up the arguments again using move instructions.
Call func again.
Then restore the saved values from the stack into registers using lw.
a) f: addi $sp,$sp,-8
sw $ra,4($sp)
sw $s0,0($sp)
add $s0,$a2,$a3
jal func
move $a0,$v0
move $a1,$s0
jal func
lw $ra,4($sp)
lw $s0,0($sp)
addi $sp,$sp,8
jr $ra
b) f: addi $sp,$sp,-4
sw $ra,0($sp)
add $t0, $a0, $a1
add $t1, $a2, $a3
bgt $t0, $t1, exit
add $s0, $a2, $a3
add $s1, $a0, $a1
jal func
exit:
add $s0, $a0, $a1
add $s1, $a2, $a3
jal func
lw $ra, 0($sp)
addi $sp, $sp, 4
jr $ra
(v) We can use the tail-call optimization for the second call to func, but then we must restore $ra and
$sp before that call.
(vi) Register $ra is equal to the return address in the caller function, registers
$sp and $s3 have the same values they had when function f was called, and register
$t5 can have an arbitrary value. For register $t5, note that although our function
f does not modify it, function func is allowed to modify it so we cannot assume
anything about the value of $t5 after function func has been called.
11.
(i.)
a)
After entering the function main
Old $sp=> 0x7ffffffc ???
$sp=> -4 contents of register $ra
After entering function my_function
Old $sp => 0x7ffffffc ???
-4 contents of register $ra
$sp=> -8 contents of register $ra(return to main)
Global pointers:
0x10008000 100 my_global
b)
After entering the function main
Old $sp=> 0x7ffffffc ???
$sp=> -4 contents of register $ra
After entering function my_function
Old $sp => 0x7ffffffc ???
-4 contents of register $ra
$sp=> -8 contents of register $ra(return to main)
Global pointers:
0x10008000 100 my_global
(ii)
a)
addi $sp, $sp, -4
sw $ra, 0($sp)
addi $a0, $0, 10
addi $a1, $0, 20
jal FUNC
add $t2, $v0, $0
lw $ra, 0($sp)
addi $sp, $sp, 4
jr $ra
FUNC: lw $a2, 0($s0)
sub $t0, $a0, $a1
add $v0, $t0, $a2
jr $ra
b)
addi $sp, $sp, -4
sw $ra, 0($sp)
lw $a0, 0($s0)
addi $a0, $a1, 1
jal FUNC
add $t0, $v0, $0
lw $ra, 0($sp)
addi $sp, $sp, 4
jr $ra
FUNC: addi $v0, $a0, 1
jr $ra
(iii)
Same as (ii)
(iv)
a)
check yourself
b).
addi $sp,$sp,-4
sw $ra,($sp)
add $a2, $a3, $a2
slt $a2, $a2, $a0
move $v0, $a1
beqz $a2, L
j L2
L:
move $a0, $a1
jal g
L2:
lw $ra,($sp)
addi $sp,$sp,+4
jr $ra
(v)
a) int func(int a, int b, int d)
{
int c = a + b;
if (d == 0)
c = b - a;
return c;
}
b)
//assume $a0 is a, $a1 is b, $a2 is c, $a3 is d, $v0 is retValue
//equivalent C code is:
c=c+d;
if(c<a) c=1;
else c=0;
retValue =b;
if(c==0)
{
a=b;
retValue = 500;
}
return retValue;
(vi)
a)
find the value
b)
value returned by the function is : 500
12. (i)
a) lw $t1,4($t0), where $t0 points to the base of a record or structure
b) j 10000
Both j and jal are J-type instructions. J-type instructions use 6 bits for the opcode and 26 bits
for the immediate value (called the target). The semantics of the j instruction are:
PC <- PC[31:28] || IR[25:0] || 00
The new address is computed by taking the upper 4 bits of the PC, concatenated with the 26-bit
immediate value, with the lower two bits set to 00, so the address created remains word-aligned.
This is called pseudo-direct addressing. Direct addressing would specify all 32 bits. It's
called pseudo-direct because some bits of the PC are used to compute the address. j allows you to access 1/16 of all possible word-aligned addresses.
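The address formation described above can be sketched in a few lines of Python. This is an illustration only: the function name `jump_target` is made up, and the sketch simply uses whatever PC value it is given (on real MIPS hardware the upper bits come from the address of the instruction following the jump).

```python
def jump_target(pc, target26):
    """Form a pseudo-direct jump address: PC[31:28] || target26 || 00."""
    assert 0 <= target26 < (1 << 26), "target field is 26 bits wide"
    upper = pc & 0xF0000000       # keep the top 4 bits of the PC
    return upper | (target26 << 2)  # shift left by 2 to append the 00 bits

# A target field of 2500 encodes byte address 10000 when the upper PC bits are 0.
print(jump_target(0x00000040, 2500))       # 10000
print(hex(jump_target(0x10001000, 2500)))  # 0x10002710

# Fraction of word-aligned addresses reachable by a single j: 2^26 out of 2^30.
print((1 << 26) / (1 << 30))               # 0.0625, i.e. 1/16
```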
(ii)
a) I-type
b)J-type
(iii)
a) Adv: + Saves an instruction for incrementing the array index. No extra hardware is needed.
eg- add $10, $20, $13 ; $20 is base, $13 is index
lw $5, 0($10)
+ Can jump to any 32-bit address
Disadv: - Requires that two registers be written at the same time, which needs more hardware.
eg- lw $10, 4($13) ; $10 <- Mem[$13+4]
addi $13, $13, 4 ; $13 <- $13+4
- Need to load a register with a 32-bit address, which could take multiple cycles
b)Adv: + 26 bits of the address is embedded as the immediate, and is used as the instruction
offset within the current 256MB (64MWord) region defined by the MS 4 bits of the PC.
eg- j Label
+allows the PC to be set to the current PC +4 +/- BranchAddr, supporting quick forward and
backward branches
Disadv: - not directly supported in hardware
eg- blt $s0, $s1, Else is implemented as
slt $at, $s0, $s1
bne $at, $zero, Else
-range of branches is smaller than large programs
(iv)
a)0x1 000 3100
0x0 001 0020
b)0xx 000 0010
0x0 001 0020
(v)
Now, for I type, we can only move to +/-(2^8) and for J type we can only move to +/-(2^18). So we are
restricted to particular range.
(vi)
If the place where we have to jump is lying beyond the allowed area, we will have to use one more
jump instruction.
13.
LL $t1,0($s1): this loads the content of the memory location pointed to by $s1 into $t1.
SC $t0,0($s1): this tries to store $t0 to the memory location pointed to by $s1. If that location has
not changed since the LL, the store succeeds and $t0 is set to 1;
otherwise the store fails and $t0 is set to 0.
Together, LL and SC are used to build atomic operations.
(i) 4 instructions
(ii) One of the locations specified by the LL instruction has no corresponding SC instruction.
(iii) try: MOV R3,R4
LL R2,0(R2)
SC R3,0(R2)
BEQZ R3,try
MOV R4,R2
a)
Processor 1      Processor 2       Cycle   Processor 1   MEM     Processor 2
                                           $t1   $t0     ($s1)   $t1   $t0
                                   0       1     2       99      30    40
                 ll $t1,0($s1)     1       1     2       99      99    40
ll $t1,0($s1)                      2       99    2       99      99    40
                 sc $t0,0($s1)     3       99    2       40      99    1
sc $t0,0($s1)                      4       99    0       40      99    1
b)
Processor 1      Processor 2       Cycle   Processor 1   MEM     Processor 2
                                           $t1   $t0     ($s1)   $t1   $t0
                                   0       1     2       99      30    40
ll $t1,0($s1)                      1       99    2       99      30    40
                 ll $t1,0($s1)     2       99    2       99      99    40
                 addi $t1,$t1,1    3       99    2       99      100   40
                 sc $t0,0($s1)     4       99    2       40      100   1
sc $t0,0($s1)                      5       99    0       40      100   1
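The trace in part a) can be replayed with a small Python model of LL/SC. This is a sketch under simplifying assumptions, not a description of real hardware: the class `Memory`, the per-processor link table, and the address `ADDR` standing in for the value of $s1 are all hypothetical.

```python
class Memory:
    """Toy shared memory with one load-link reservation per processor."""

    def __init__(self):
        self.words = {}
        self.links = {}  # processor id -> address it has linked

    def ll(self, cpu, addr):
        """Load the word and remember that this processor linked the address."""
        self.links[cpu] = addr
        return self.words[addr]

    def sc(self, cpu, addr, value):
        """Store only if the link is still intact; return 1 on success, 0 on failure."""
        if self.links.get(cpu) != addr:
            return 0
        self.words[addr] = value
        # A successful store breaks every reservation on this address.
        for other in [c for c, a in self.links.items() if a == addr]:
            del self.links[other]
        return 1

ADDR = 0x1000  # hypothetical address held in $s1
mem = Memory()
mem.words[ADDR] = 99
p1 = {"t1": 1, "t0": 2}
p2 = {"t1": 30, "t0": 40}

p2["t1"] = mem.ll("P2", ADDR)            # cycle 1
p1["t1"] = mem.ll("P1", ADDR)            # cycle 2
p2["t0"] = mem.sc("P2", ADDR, p2["t0"])  # cycle 3: succeeds, memory becomes 40
p1["t0"] = mem.sc("P1", ADDR, p1["t0"])  # cycle 4: fails, link was broken

print(p1, mem.words[ADDR], p2)
```

The final state matches the last row of trace a): processor 1 ends with $t0 = 0 because its store-conditional failed, and memory holds the value written by processor 2.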
14.
i)
a) swap: lw $t0,0($a0)
lw $t1,0($a1)
sw $t0,0($a1)
sw $t1,0($a0)
b) swap:
lw $t0,0($a0)
lw $t1,0($a1)
add $t0,$t0,$t1
sub $t1,$t0,$t1
sub $t0,$t0,$t1
sw $t0,0($a0)
sw $t1,0($a1)
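The add/sub sequence in version b) can be checked with a short Python model. This is only a sketch: the function name `swap_add_sub` is made up, and the arithmetic is modelled as wrapping modulo 2^32. On real MIPS hardware, add and sub raise an exception on signed overflow, so addu and subu would be needed for the wraparound case shown in the second example.

```python
MASK = 0xFFFFFFFF  # model 32-bit registers

def swap_add_sub(a, b):
    """Swap two 32-bit values without a third temporary, as in swap version b)."""
    t0, t1 = a & MASK, b & MASK   # lw $t0,0($a0) ; lw $t1,0($a1)
    t0 = (t0 + t1) & MASK         # add $t0,$t0,$t1 -> a + b
    t1 = (t0 - t1) & MASK         # sub $t1,$t0,$t1 -> original a
    t0 = (t0 - t1) & MASK         # sub $t0,$t0,$t1 -> original b
    return t0, t1                 # stored to 0($a0) and 0($a1)

print(swap_add_sub(3, 7))                    # (7, 3)
print(swap_add_sub(0xFFFFFFFF, 2))           # (2, 4294967295)
```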
ii) We need to pass the addresses of the elements to be swapped, so we have:
add $a0,$s2,$t2
addi $t2,$t2,4
add $a1,$s2,$t2
$a0 and $a1 will then contain the addresses of the elements that are to be swapped.
iii)
The offset gets reduced to 1, need to use byte instructions
iv) We need to use offsets corresponding to a byte
v)
a) The swap function is never called. The inner loop runs once on each outer iteration.
b) same.
vi)
a) The swap function is called every time by the inner loop, i-1 times.
b)same
15.
i) a) register operand
b) register+offset and update register
ii) a) lw $s0, 4($s1)
addi $s1,$s1,4
b) lw $s1,0($s0)
lw $s1,4($s0)
lw $s1,8($s0)
addi $s0,$s0,12
iii) $t1 holds r1 and $t0 holds r0
a) addi $t0,$zero,10
loop:
add $t1,$t1,$t0
addi $t0,$t0,-1
slti $t2,$t0,1
bne $t2,$zero,exit
j loop
exit:
b) add $t0,$t0,$t1
# add the register pair $t2:$t3 to $t4:$t5, result in $t0:$t1
addu $t1, $t3, $t5 # add least significant word
sltu $t0, $t1, $t5 # set carry-in bit
addu $t0, $t0, $t2 # add in first most significant word
addu $t0, $t0, $t4
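The carry handling in the addu/sltu sequence can be confirmed with a Python model. The sketch below is illustrative: the function name `add64` and its argument names are assumptions, and each register is modelled as an integer masked to 32 bits.

```python
MASK = 0xFFFFFFFF  # model 32-bit registers

def add64(hi_a, lo_a, hi_b, lo_b):
    """Add two 64-bit values held as (high word, low word) pairs."""
    lo = (lo_a + lo_b) & MASK       # addu $t1,$t3,$t5  (low words)
    carry = 1 if lo < lo_b else 0   # sltu $t0,$t1,$t5  (sum wrapped => carry out)
    hi = (carry + hi_a) & MASK      # addu $t0,$t0,$t2  (carry + first high word)
    hi = (hi + hi_b) & MASK         # addu $t0,$t0,$t4  (add second high word)
    return hi, lo

print(add64(0, 0xFFFFFFFF, 0, 1))            # (1, 0)
print(add64(1, 0x80000000, 2, 0x80000000))   # (4, 0)
```

The unsigned comparison works because the low-word sum is smaller than either operand exactly when the addition wrapped past 2^32.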
iv) a)6 MIPS vs 4 ARM instructions
b) check yourself
v) (a) ( 2/3) * (No of MIPS instructions / No of ARM instructions )
(b) find out using the formula above
16.
i.
a. This instruction copies ECX elements, where each element is 2 bytes in size, from an array pointed to by ESI to an array pointed to by EDI.
b. It checks for the least significant byte and loops the instructions.
ii.
a.
loop: lh $t0,0($a2)
sh $t0,0($a1)
addi $a0,$a0,-1
addi $a1,$a1,2
addi $a2,$a2,2
bnez $a0,loop
b.
loop: lb $t0, 0($a3)
lb $t1,0($a1)
bne $t0,$t1, JMP
bnez $a0,loop
JMP:addi $a1,$a1,1
addi $a0,$a0,1
iii.
speed up = number of MIPS instruction / num of x86 instruction
a. 6 / 5 = 1.2
b. 6/2 or 6/4..take average
iv.
a.
addi $s0,$0,1
addi $sp,$sp,-4
sw $ra,0($sp)
slt $s1,$a1,$a0
addi $s0,$s0,1
bne $s0,$s1,L
move $v0,$a2
L: move $v0,$a3
L2: jr $ra
Use stack. Push values of a, b, c and d onto stack. Now move the values of a and b to some
other registers and compare using gt and pop out c or d according to the conditions.
b.Use stack. Push the array and n onto stack. Use some counter value in register and
decrement it for implementing for loop till it becomes 0. And set all array values be 0 using
move instructions.
MIPS : number of instructions x 4 bytes.
X86 : just count the number of bytes given in code. 25bytes(a) and 31 bytes (b)
v. In MIPS, we fetch the next two consecutive instructions by reading the next 8 bytes from the instruction memory. In x86, we only know where the second
instruction begins after we have read and decoded the first one, so it is more difficult
to design a processor that executes multiple instructions in parallel.
vi.
speedup = x86 cycles / MIPS cycles
a. 11/5
b. calculate yourself
17.
i. To index an array of 32-bit integers, we have to multiply the index by 4 before adding it to the array base address.
a. In the first iteration $t0 is 0 and the lw fetches a[0]. After that $t0 is 1, so the lw uses a nonaligned
address, which triggers a bus error.
b. In the first iteration $t2 points to the address of a[0], so the lw and sw instructions access a[0] and a[1]. In the
second iteration $t2 points to the next byte in a[0] instead of pointing to a[1]. Thus the first lw uses a nonaligned
address and causes a bus error. Note that the computation for $t2 (address of a[n]) does not cause a bus error
because that address is not actually used to access memory.
ii.
a. Yes, assuming that x is a sign-extended byte value between -128 and 127. If x is simply a byte value between 0 and 255, the procedure only works if neither x nor array a contains
values outside the range of 0..127.
b. Yes
iii.
a. f: move $v0, $0 # ret = 0
move $t0, $a0 # ptr = a
sll $t1, $a1, 2 # n*4: the index must be multiplied by 4 before it is added to a to form the address
add $t1, $t1, $a0 # &(a[n])
L: lw $t2, 0($t0) # read *p
bne $t2, $a2, S # if (*p == x)
addi $v0, $v0, 1 # ret++
S: addi $t0, $t0, 4 # p = p+1 (advance by one word)
bne $t0, $t1, L # repeat if p != &(a[n])
jr $ra # return ret
b.
f: move $t0, $0 # i = 0
addi $t1, $a1, -1 # n - 1
L: sll $t2, $t0, 2 # i*4
add $t2, $t2, $a0 # address of a[i]
lw $t3, 4($t2) # read a[i + 1]
sw $t3, 0($t2) # a[i] = a[i + 1]
addi $t0, $t0, 1 # i = i+1
bne $t0, $t1, L # repeat if i != n-1
jr $ra # return
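The effect of stepping a word pointer by 1 instead of 4 can be reproduced with a toy memory model in Python. This is a sketch, not the tutorial's code: the class `WordMemory`, the function `count_matches`, the base address 0x1000 and the array contents are all hypothetical, and a Python exception stands in for the bus error.

```python
class WordMemory:
    """Toy memory holding 32-bit words; lw rejects unaligned addresses."""

    def __init__(self, base, values):
        self.base = base
        self.values = list(values)

    def lw(self, addr):
        if addr % 4 != 0:
            raise ValueError(f"bus error: unaligned address {addr:#x}")
        return self.values[(addr - self.base) // 4]

def count_matches(mem, a, n, x, step):
    """Count elements equal to x, advancing the pointer by `step` bytes each time."""
    ret = 0
    p = a
    end = a + 4 * n          # &(a[n]): index scaled by 4
    while p != end:
        if mem.lw(p) == x:   # read *p
            ret += 1
        p += step            # step must be 4 to reach the next word
    return ret

mem = WordMemory(0x1000, [7, 1, 7, 3, 7])
print(count_matches(mem, 0x1000, 5, 7, step=4))  # 3

try:
    count_matches(mem, 0x1000, 5, 7, step=1)
except ValueError as err:
    print(err)  # the second lw lands on 0x1001 and fails
```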
iv.
At the exit from my_alloc, the $sp register is moved to “free” the memory that is returned to main. Then
my_init() writes to this memory to initialize it. Note that neither my_init nor main access the stack memory in
any other way until sort() is called (a stack-based function), so the values at the point where sort() is called are still
the same as those written by my_init:
a. 0, 0, 0, 0, 0
b. 5, 4, 3, 2, 1
v.
In main, register $s0 becomes 5, then my_alloc is called. The address of the array v “allocated” by my_alloc is
0xffe8, because in my_alloc $sp was saved at 0xfffc, and then 20 bytes (4 × 5) were reserved for array arr ($sp
was decremented by 20 to yield 0xffe8). The elements of array v returned to main are thus a[0] at 0xffe8, a[1] at
0xffec, a[2] at 0xfff0, a[3] at 0xfff4, and a[4] at 0xfff8. After my_alloc returns, $sp is back to 0x10000. The
value returned from my_alloc is 0xffe8 and this address is placed into the $s1 register. The my_init function
does not modify $sp, $s0, $s1, $s2, or $s3. When sort() begins to execute, $sp is 0x10000, $s0 is 5, $s1 is 0xffe8,
and $s2 and $s3 keep their original values of −10 and
1, respectively. The sort() procedure then changes $sp to 0xffec (0x10000 minus 20), and writes $s0 to memory
at address 0xffec (this is where a[1] is, so a[1] becomes 5), writes $s1 to memory at address 0xfff0 (this is where
a[2] is, so a[2] becomes 0xffe8), writes $s2 to memory address 0xfff4 (this is where a[3] is, so a[3] becomes
−10), writes $s3 to memory address 0xfff8 (this is where a[4] is, so a[4] becomes 1), and writes the return address
to 0xfffc, which does not affect values in array v. Now the values of array v are:
a. 0 5 0xffe8 7 1
b. 5 5 0xffe8 7 1
vi.
When the sort() procedure enters its main loop, the elements of array v are sorted without any interference from
other stack accesses. The resulting sorted array is
a. 0, 1, 5, 7, 0xffe8
b. 1, 5, 5, 7, 0xffe8
Unfortunately, this is not the end of the chaos caused by the original bug in my_alloc. When the sort() function
begins restoring registers, $ra is read from the (luckily) unmodified location where it was saved. Then $s0 is read
from memory at address 0xffec (this is where a[1] is), $s1 is read from address 0xfff0 (this is where a[2] is), $s2
is read from address 0xfff4 (this is where a[3] is), and $s3 is read from address 0xfff8 (this is where a[4] is).
When sort() returns to main(), registers $s0 and $s1 are supposed to keep n and the address of array v. As a
result, after sort() returns to main(), n and v are:
a. n=1, v=5
So v is a 1-element array of integers that begins at address 5
b. n=5, v=5
So v is a 5-element array of integers that begins at address 5
If we were to actually attempt to access (e.g., print out) elements of array v in the main() function after this
point, the first lw would result in a bus error due to a non-aligned address. If MIPS were to tolerate non-aligned
accesses, we would print out whatever values were at the address v points to (note that this is not the same
address to which my_init wrote its values).