cs/coe0447 computer organization & assembly language
DESCRIPTION
CS/COE0447 Computer Organization & Assembly Language. Chapter 2 Part 1. Topics to cover in Chapter 2. MIPS operations and operands MIPS registers Memory view Instruction encoding Arithmetic operations Logic operations Memory transfer operations. MIPS Operations/Operands. - PowerPoint PPT PresentationTRANSCRIPT
1
CS/COE0447
Computer Organization & Assembly Language
Chapter 2 Part 1
2
Topics to cover in Chapter 2
• MIPS operations and operands• MIPS registers• Memory view• Instruction encoding
• Arithmetic operations• Logic operations• Memory transfer operations
3
MIPS Operations/Operands
• “Operation” (instruction)• “Operand”
• MIPS operations– Arithmetic operations (integer/floating-point) (add, sub,…)– Logical operations (and, or,…)– Shift operations (shift a certain number of bits to the left or right)– Compare operations (do something if one operand is less than another,…)– Load/stores to transfer data from/to memory– Branch/jump operations– System control operations/coprocessor operations
• MIPS operands– General-purpose registers– Fixed registers, e.g., HI/LO registers– Memory location– Immediate value
4
MIPS Arithmetic
• <op> <rdestination> <rsource1> <rsource2>
• All arithmetic instructions have 3 operands– Operand order is fixed: destination first– 32 registers (page 2 of green card)
• Examples– add $t0, $s0, $s2 # $t0 = $s0 + $s2– sub $s0, $t0, $t1 # $s0 = $t0 – $t1
rd rs rt
5
MIPS Registers
r0r1r2r3r4r5r6r7r8r9
r10r11r12r13r14r15
r16r17r18r19r20r21r22r23r24r25r26r27r28r29r30r31
$zero$at$v0$v1$a0$a1$a2$a3$t0$t1$t2$t3$t4$t5$t6$t7
HILO
PC
$s1$s2$s3$s4$s5$s6$s7$t8$t9$k0$k1$gp$sp$fp$ra
$s0
General-Purpose Registers Special-Purpose Registers
32 bits 32 bits 32 bits
6
General-Purpose Registers
• GPR: all can be used as operands in instructions
• Still, conventions and limitations exist to keep GPRs from being used arbitrarily
– r0, termed $zero, always has a value “0”– r31, termed $ra (return address), is reserved for subroutine call/return– Etc. (we’ll see others later)– Register usage and related software conventions are summarized in “application
binary interface” (ABI), which is important when writing system software such as an assembler and a compiler
7
Instruction Encoding
• Instructions are encoded in binary numbers– Assembler translates assembly programs into binary numbers– Machine decodes binary numbers to figure out what the instruction is– MIPS has “fixed” 32-bit instruction encoding
• MIPS has several instruction formats– R-format: arithmetic instructions– I-format: transfer/branch/immediate format– J-format: jump instruction format– (FI/FR-format: floating-point instruction format)(later chapter)
8
MIPS Instruction Formats
Name Fields Comments
6 bits 5 bits 5 bits 5 bits 5 bits 6 bits All MIPS instructions 32 bits
R-format op rs rt rd shamt funct Arithmetic/logic instruction format
I-format op rs rt address/immediate Transfer, branch, immediate format
J-format op target address Jump instruction format
9
R-Format Instructions• Define “fields” of the following number of bits each: 6 + 5 +
5 + 5 + 5 + 6 = 32
6 5 5 5 65
opcode rs rt rd functshamt
• For simplicity, each field has a name:
For shift instructions:“shift amount”
10
R-Format Example
• MIPS Instruction:add $8,$9,$10
Binary number per field representation:
Decimal number per field representation:
hex representation:
decimal representation:
On Green Card: Format in column 1, opcodes in column 3(Let’s look and then come back)
11
M I P S Reference Data: CORE INSTRUCTION SET
NAME MNE-MON-IC
FOR-MAT
OPERATION (in Verilog)
OPCODE/ FUNCT (hex)
Add add R R[rd] = R[rs] + R[rt] (1)
0 / 20hex
Add Immediate
addi I R[rt] = R[rs] + SignExtImm (1)(2)
8hex
Branch On Equal
beq I if(R[rs]==R[rt]) PC=PC+4+ BranchAddr (4)
4hex
(1) May cause overflow exception(2) SignExtImm = { 16{immediate[15]}, immediate }(3) ZeroExtImm = { 16{1b’0}, immediate }(4) BranchAddr = { 14{immediate[15]}, immediate, 2’b0}
Later
12
R-Format Instructions (REMINDER)
• Define “fields” of the following number of bits each: 6 + 5 + 5 + 5 + 5 + 6 = 32
6 5 5 5 65
opcode rs rt rd functshamt
• For simplicity, each field has a name:
13
R-Format Example
• MIPS Instruction:add $8,$9,$10
Binary number per field representation:
Decimal number per field representation:
Now let’s fill this in
14
R-Format Example
• MIPS Instruction:add $8,$9,$10
0 9 10 8 320Binary number per field representation:
Decimal number per field representation:
hex representation: 012A 4020hex
decimal representation: 19,546,144ten
000000 01001 01010 01000 10000000000hex
15
I-Format Instructions• Define “fields” of the following number of bits each: 6 + 5 +
5 + 16 = 32
6 5 5 16
opcode rs rt immediate
• For simplicity, each field has a name:
Let’s do an example using addi
16
M I P S Reference Data: CORE INSTRUCTION SET
NAME MNE-MON-IC
FOR-MAT
OPERATION (in Verilog)
OPCODE/ FUNCT (hex)
Add add R R[rd] = R[rs] + R[rt] (1)
0 / 20hex
Add Immediate
addi I R[rt] = R[rs] + SignExtImm (1)(2)
8hex
Branch On Equal
beq I if(R[rs]==R[rt]) PC=PC+4+ BranchAddr (4)
4hex
(1) May cause overflow exception(2) SignExtImm = { 16{immediate[15]}, immediate }(3) ZeroExtImm = { 16{1b’0}, immediate }(4) BranchAddr = { 14{immediate[15]}, immediate, 2’b0}
17
I-Format Example
• MIPS Instruction:addi $8,$9,7
Binary number per field representation:
Decimal number per field representation:
18
I-Format Example
• MIPS Instruction:addi $8,$9,7
Binary number per field representation:
Decimal number per field representation:
hex representation: hex
FILL IN THESE VALUES!!!!!
19
M I P S Reference Data: CORE INSTRUCTION SET
NAME MNE-MON-IC
FOR-MAT
OPERATION (in Verilog)
OPCODE/ FUNCT (hex)
Add add R R[rd] = R[rs] + R[rt] (1)
0 / 20hex
Add Immediate
addi I R[rt] = R[rs] + SignExtImm (1)(2)
8hex
Branch On Equal
beq I if(R[rs]==R[rt]) PC=PC+4+ BranchAddr (4)
4hex
(1) May cause overflow exception(2) SignExtImm = { 16{immediate[15]}, immediate }(3) ZeroExtImm = { 16{1b’0}, immediate }(4) BranchAddr = { 14{immediate[15]}, immediate, 2’b0}
20
addi $8,$9,7
Executing the addi instruction
$8
$9 0x00000023Suppose
Instruction from earlier slide: format.Immediate = 0x0007 FINISH THIS!!!!
So, you add 00000023 00000007
21
Logic Instructions
• Bit-wise logic operations• <op> <rdestination> <rsource1> <rsource2>
• Examples– and $t0, $s0, $s2 # $t0 = $s0 ^ $s2– or $s0, $t0, $t1 # $s0 = $t0 | $t1– nor $s0, $t0, $t1 # $s0 = ~($t0 | $t1)– xor $s0, $t0, $t1 # $s0 = $t0 ^ $t1
Name Fields Comments
R-format op rs rt rd shamt funct Logic instruction format
22
Dealing with Immediates
• Many operations involve small “immediate value”– a = a + 1;– b = b – 4;– c = d | 0x04;
• Some frequently used arithmetic/logic instructions have their “immediate” version
– addi $t0, $t1, 16 # t0 <- $t1 + 16– andi $t2, $t3, 254 # t2 <- $t3 & 0xfe– ori $t5, $t6, 4 # t5 <- $t6 | 0x04– xori $t7, $t7, 16 # t7 <- $t7 ^ 0x10
• There is no “subi”; Why?
Name Fields Comments
I-format op rs rt 16-bit immediate Transfer, branch, immediate format
23
Long Immediates
• Sometimes we need a long immediate, e.g., 32 bits– Do we need an instruction to load a 32-bit constant value to a register?
• MIPS requires that we use two instructions– lui $t0, 1010101001010101
• Then we get lower-order 16 bits– ori $t0, $t0, 1100110000110011
1010101001010101 0000000000000000
1010101001010101 1100110000110011$t0
$t0
24
Memory Transfer Instructions
• Only two instructions provided– LOAD: move data from memory to register
• lw $t3, 4($t2) # $t3 <- M[$t2 + 4]– STORE: move data from register to memory
• sw $t4, 16($t1) # M[$t1 + 16] <- $t4
• Support for other data types than 32-bit word needed– 16-bit half-word
• “short” type in C• 16-bit processing is common in signal processing• lh/sh (load half/store half)
– 8-bit byte• “char” type in C• 8-bit processing is common in controller applications• lb/sb (load byte/store byte)
25
Address Calculation
• Memory address is specified with (register, constant)– Register to keep “base address”– Constant field to keep “offset”– Address is “base address” + “offset”, i.e., register + constant
• The offset can be positive or negative– It is written in terms of number of bytes for load/store instructions
• If no register needs to be used then use register 0– Register 0 always stores value 0 (“hardwired”)
• If no offset, then offset is 0
• MIPS uses this simple address calculation method, but other architectures such as PowerPC or x86 have different methods
26
Machine Code Example
swap:sll $t0, $a1, 2add $t1, $a0, $t0lw $t3, 0($t1)lw $t4, 4($t1)sw $t4, 0($t1)sw $t3, 4($t1)jr $ra
void swap(int v[], int k){
int temp;temp = v[k];v[k] = v[k+1];v[k+1] = temp;
}
27
Memory View
• Viewed as a large, single-dimension 8-bit array with an address (“byte address”)
• A memory address is an index into the array
BYTE
BYTE
BYTE
BYTE
BYTE
BYTE
0
1
2
3
4
5
…
28
Memory Organization
• 232 bytes with byte addresses from 0 to 232 – 1
• 230 words with byte addresses 0, 4, 8, …, 232 – 4
• Words are aligned– 2 least significant bits of address are 0s
• Half-words are aligned– LSB of address is 0
• Addressing within a word– Which byte is first and which byte is last?– Big-endian vs. Little-endian
WORD
WORD
WORD
WORD
WORD
WORD
0
4
8
12
16
20
…
0 1 2 3
0123
0
0
LSBMSB
LSBMSB
29
What C code properly fills in the blank in loop on right?
1: A[i++] >= 10 2: A[i++] >= 10 | A[i] < 0 3: A[i] >= 10 || A[i++] < 0 4: A[i++] >= 10 || A[i] < 0 5: A[i] >= 10 && A[i++] < 0 6 None of the above
Peer Instruction: $s3=i, $s4=j, $s5=@A
do j = j + 1 while (______);
Loop: addiu $s4,$s4,1 # j = j + 1sll $t1,$s3,2 # $t1 = 4 * iaddu $t1,$t1,$s5 # $t1 = @ A[i]lw $t0,0($t1) # $t0 = A[i]slti $t1,$t0,10 # $t1 = $t0 < 10beq $t1,$0, Loop # goto Loop [not delayed]addiu $s3,$s3,1 # i = i + 1 slti $t1,$t0, 0 # $t1 = $t0 < 0bne $t1,$0, Loop # goto Loop [not delayed]
31
Peer InstructionWhich instruction has same representation as 35ten?A. add $0, $0, $0
B. subu $s0,$s0,$s0
C. lw $0, 0($0)
D. addi $0, $0, 35
E. subu $0, $0, $0
F. Trick question! Instructions are not numbers
• Use Green Card handout to answer
34
More on Alignment
• A misaligned access– lw $t1, 3($zero)
• How do we define a word at address 3?– Data 0, 1, 2, 3
• If you meant this, use address 0 instead of 3
– Data 3, 4, 5, 6
• If you meant this, it is misaligned!
• Certain hardware may support this, but usually not
• Get a byte from address 3, a word from address 4 and manipulate data to get what you want
• Alignment issue does not exist for byte accesses
…
0 1 2 3
111098
0
4 7654
8
35
Shift Instructions
• Bit-wise logic operations• <op> <rdestination> <rsource> <shamt>
• Examples– sll $t0, $s0, 4 # $t0 = $s0 << 4– srl $s0, $t0, 2 # $s0 = $t0 >> 2
• Shift amount can be in a register (“shamt” field not used)• Shift right arithmetic (SRA) keeps the sign of a number
– sra $s0, $t0, 4
Name Fields Comments
R-format opNOT
USEDrt rd shamt funct shamt is “shift amount”
36
Control
• Decision-making instructions– Alter the control flow, i.e., change the “next” instruction to be executed
• MIPS conditional branch instructions– bne $t0, $t1, LABEL– beq $t0, $t1, LABEL
• Example
bne $s0, $s1, LABELadd $s3, $s0, $s1LABEL: …
if (i == h) h =i+j;
(i == h)?
h=i+j;
LABEL:
YES NO
37
Control, cont’d
• MIPS unconditional branch instruction (jump)– j LABEL
• Example– f, g, and h are in registers $s3, $s4, and $s5
bne $s4, $s5, ELSEadd $s3, $s4, $s5j EXITELSE: sub $s3, $s4, $s5EXIT: …
if (i == h) f=g+h;else f=g–h;
(i == h)?
f=g+h;
YES NO
f=g–h
EXIT
38
Control, cont’d
• Loops– “while” loops
• Example on page 74
– “for” loops
39
Control, cont’d
• We have beq and bne; what about branch-if-less-than?– slt
• Can you make a “pseudo” instruction “blt $s1, $s2, LABEL”?
• Assembler needs a temporary register to do this– $at is reserved for this purpose
slt $t0, $s1, $s2if (s1<s2) t0=1;else t0=0;
40
Instruction Format
• Address in the instruction is not a 32-bit number – it’s only 16-bit– The 16-bit immediate value is in signed, 2’s complement form
• Addressing in branch instructions– The 16-bit number in the instruction specifies the number of “instructions” to be skipped– Memory address is obtained by adding this number to the PC– Next address = PC + 4 + sign_extend(16-bit immediate << 2)
• Example– beq $s1, $s2, 100
Branch/
Immediateop rs rt 16-bit immediate
4 17 18 25
41
Instruction Format, cont’d
• The address of next instruction is obtained by concatenating with PC
– Next address = {PC[31:28],IMM[25:0],00}– Address boundaries of 256MB
• Example– j 10000
Jump op 26-bit immediate
2 2500
42
MIPS Addressing Modes
• Immediate addressing (I-format)
• Register addressing (R-/I-format)
• Base addressing (load/store) [register + offset]
• PC-relative addressing (beq/bne) [PC + 4 + offset]
• Pseudo-direct addressing (j) [concatenation w/ PC]
op rs rt
funct
immediate
26-bit address
op rs rt immediate
op rs rt rd shamt
op rs rt offset
op rs rt offset
op
43
Summary
44
Register Usage Convention
NAME REG. NUMBER USAGE
$zero 0 zero
$at 1 reserved for assembler usage
$v0~$v1 2~3 values for results and expression eval.
$a0~$a3 4~7 arguments
$t0~$t7 8~15 temporaries
$s0~$s7 16~23 temporaries, saved
$t8~$t9 24~25 more temporaries
$k0~$k1 26~27 reserved for OS kernel
$gp 28 global pointer
$sp 29 stack pointer
$fp 30 frame pointer
$ra 31 return address
45
Stack and Frame Pointers
• Stack pointer ($sp)– Keeps the address to the top of the s
tack– $29 is reserved for this purpose– Stack grows from high address to low– Typical stack operations are push/po
p
• Procedure frame– Contains saved registers and local va
riables– “Activation record”
• Frame pointer ($fp)– Points to the first word of a frame– Offers a stable reference pointer– $30 is reserved for this– Some compilers don’t use $fp
46
“C Program” Down to “Numbers”
swap:muli $2, $5, 4add $2, $4, $2lw $15, 0($2)lw $16, 4($2)sw $16, 0($2)sw $15, 4($2)jr $31
void swap(int v[], int k){
int temp;temp = v[k];v[k] = v[k+1];v[k+1] = temp;
}
00000000101000010…00000000000110000…10001100011000100…10001100111100100…10101100111100100…10101100011000100…00000011111000000…
com
piler
assemble
r
47
To Produce an Executable
source file.asm/.s
source file.asm/.s
source file.asm/.s
object file.obj/.o
object file.obj/.o
object file.obj/.o
library.lib/.a
executable.exe
assembler
assembler
assembler
linker
48
An Assembler
• Expands macros– Macro is a sequence of operations conveniently defined by a user– A single macro can expand to many instructions
• Determines addresses and translates source into binary numbers– Start from address 0– Record in “symbol table” addresses of labels– Resolve branch targets and complete branch instructions’ encoding– Record instructions that need be fixed after linkage
• Packs everything in an object file
• “Two-pass assembler”– To handle forward references
49
An Object File
• Header– Size and position of other pieces of the file
• Text segment– Machine codes
• Data segment– Binary representation of the data in the source
• Relocation information– Identifies instructions and data words that depend on absolute addresses
• Symbol table– Keeps addresses of global labels– Lists unresolved references
• Debugging information– Contains a concise description of the way in which the program was compiled
50
Assembler Directives
• Guides the assembler to properly handle following codes with certain considerations
• .text– Tells assembler that codes follow
• .data– Tells assembler that data follow
• .align– Directs aligning the following items
• .global– Tells to treat the following symbol as global
• .asciiz– Tells to handle the following as a “string”
51
Code Example.text.align 2.globl main
main:subu $sp, $sp, 32…
loop:lw $t6, 28($sp)…la $a0, strlw $a1, 24($sp)jal printf…jr $ra
.data
.align 0str:
.asciiz “The sum from 0 … 100 is %d\n”
52
Macro Example
.dataint_str:
.asciiz “%d”
.text.macro print_int($arg)
la $a0, int_strmov $a1, $argjal printf.end_macro
…print_int($7)
la $a0, int_strmov $a1, $7jal printf
53
Linker
54
Procedure Calls
• Argument passing– First 4 arguments are passed through $a0~$a3– More arguments are passed through stack
• Result passing– First 2 results are passed through $v0~$v1– More results can be passed through stack
• Stack manipulations can be tricky and error-prone
• More will be discussed in recitations
55
SPIM
56
High-Level Lang. vs. Assembly
High-level language Assembly
Example C, Fortran, Java, … -
+High productivity
– Short description & readability
Portability
Low productivity
– Long description & low readability
Not portable
– Limited optimization capability in certain
cases With proper knowledge and experiences,
fully optimized codes can be written
57
Interpreter vs. Compiler
Interpreter Compiler
Concept Line-by-line translation and executionWhole program translation and native
execution
Example Java, BASIC, … C, Fortran, …
+Interactive
Portable (e.g., Java Virtual Machine)High performance
– Low performance Binary not portable to other machines or
generations
58
Compiler Structure
• Compiler is a software with a variety of functions implemented inside it– Front-end
• Deals with high-level language constructs and translates them into more relevant tree-like or list-format internal representation (IR)
• Symbols (e.g., a[i+8*j]) are still available– Back-end
• Back-end IR that more or less correspond to machine instructions
• With more steps, IR becomes machine instructions
59
Compiler Structure, cont’d
• Front-end– Scanning
• Takes the input source program and chops the programs into recognizable “tokens”
• Also known as “lexical analysis” and “LEX” tool is used– Parsing
• Takes the token stream, checks the syntax, and produces abstract syntax trees
• “YACC” or “BISON” tools are used– Semantic analysis
• Takes the abstract syntax trees, performs type checking, and builds a symbol table
– IR generation• Similar to assembly language program, but assumes unlimited
registers, etc.
60
Compiler Structure, cont’d
• Back-end– Local optimization
• Optimizations within a basic block• Common Sub-expression Elimination (CSE), copy propagation, constant
propagation, dead code elimination, …– Global optimization
• Optimizations that deal with multiple basic blocks– Loop optimizations
• Loops are so important that many compiler and architecture optimizations target them
• Induction variable removal, loop invariant removal, strength reduction, …– Register allocation
• Try to keep as many variables in registers as possible– Machine-dependent optimizations
• Utilize any useful instructions provided by the target machine
61
Code Example (AST)
while (save[i] == k)i+=1;
62
Code Example (IR)
63
Code Example (CFG)
64
Code Example (CFG), cont’d
65
Code Example (Reg. Alloc.)
66
Compiler Example
• gcc (GNU open-source C Compiler)– cpp: C pre-processor
• Macro expansion (#define)• File expansion (#include)• Handles other directives (#if, #else, #pragma)
– cc1: C compiler (front-end & back-end)• Compiles your C program
– gas: assembler• Assembles compiler-generated assembly codes
– ld: linker• Links all the object files and library files to generate an execu
table
67
Optimization and Compile Time
• Optimizations are a must– Code quality is much improved - 2 ~ 10 times improvement is not
uncommon
• Turning on optimizations can incur a significant compile time overhead– For a big, complex software project, compile time is an issue
– Not that serious in many cases as compiling is a rare event compared to program execution
– Considering the resulting code quality, you pay this fee
– Now, computers became fast!
68
Building a Compiler
• A modern compiler is a very complex software tool– Need a very careful design of data structures– All good software engineering techniques applied
• Traditional tools to create a compiler– LEX– YACC/BISON
• Start from existing open-source compilers– LCC– GCC– There are many other research prototypes!
• If you are interested in compiler technologies– CS1621 – Structure of Programming Languages– CS1622 – Introduction to Compiler Design
69
Summary
• Operations and operands
• In MIPS assembly language– Registers replace C variables– One instruction (simple operation) per line
• Some lessons– Simpler is better
• e.g., A few instruction formats vs. many complex instruction formats– Smaller is faster
• e.g., 32 GPRs vs. 128 GPRs
• Instructions– Arithmetic– Logic– Memory transfer