Assembly Language Part 2
Professor Jennifer Rexford, COS 217
2
Goals of Today’s Lecture
• Machine language
  – Encoding the operation and the operands
  – Simpler MIPS instruction set as an example
• More on IA32 assembly language
  – Different sizes of data
  – Example instructions
  – Addressing modes
• Layout of assembly language program
3
Machine Language
Using MIPS Architecture as an Example
(since it has a simpler instruction set than IA32)
4
Three Levels of Languages
• High-level languages (e.g., Java and C)
  – Easier programming by describing operations in a more natural notation
  – Increased portability of the code
• Assembly language (e.g., IA32 and MIPS)
  – Tied to the specifics of the underlying machine
  – Instructions and names to make code human readable
• Machine language
  – Also tied to the specifics of the underlying machine
  – In binary format the computer can read and execute
  – Every instruction is a sequence of one or more numbers
Machine-Language Instructions
An ADD instruction: add r1 = r2 + r3 (assembly)
(add is the opcode; r1, r2, and r3 are the operands)
Parts of the instruction:
• Opcode (verb) – what operation to perform
• Operands (nouns) – what to operate upon
• Source operands – where values come from
• Destination operand – where to deposit data values
6
Machine-Language Instruction
• Opcode
  – What to do
• Source operand(s)
  – Immediate (in the instruction itself)
  – Register
  – Memory location
  – I/O port
• Destination operand
  – Register
  – Memory location
  – I/O port
• Assembly syntax
  – Opcode source1, [source2,] destination
7
MIPS Has Three Kinds of 32-bit Instructions
• R: Registers
  – Two source registers (rs and rt)
  – One destination register (rd)
  – E.g., “rd = rs + rt” or “rd = rs & rt” or “rd = rs xor rt”
Field layout (32 bits total):
  op (6 bits) | rs (5) | rt (5) | rd (5) | shamt (5) | funct (6)
  – op and funct: operation and specific variant
  – shamt: shift amount
8
MIPS Has Three Kinds of 32-bit Instructions
• I: Immediate, transfer, branch
  – One source register (rs) and one 16-bit constant (imm)
  – One destination register (rt; the I format has no rd field)
  – E.g., “rt = rs + imm” or “rt = rs & imm”
  – E.g., “rt = MEM[rs + imm]” (treating rs+imm as address)
  – E.g., “jump to address contained in rs” (rs as address)
  – E.g., “jump to word imm if rs is 0” (i.e., change instruction pointer)
Field layout (32 bits total):
  op (6 bits) | rs (5) | rt (5) | address/immediate (16)
9
MIPS Has Three Kinds of 32-bit Instructions
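• J: Jump
  – One 26-bit constant (imm) giving the target in units of 32-bit words (a 28-bit byte address once multiplied by 4)
  – E.g., “jump to word imm” (i.e., change the instruction pointer)
Field layout (32 bits total):
  op (6 bits) | target address (26)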
10
MIPS “Add” Instruction Encoding
Add registers 18 and 19, and store result in register 17.
  op = 0 | rs = 18 | rt = 19 | rd = 17 | shamt = 0 | funct = 32
add is an R-format instruction
11
MIPS “Subtract” Instruction Encoding
Subtract register 19 from register 18 and store in register 17.
  op = 0 | rs = 18 | rt = 19 | rd = 17 | shamt = 0 | funct = 34
sub is an R-format instruction
13
Earlier Example
C code (n is kept in %edx, count in %ecx):
    count=0;
    while (n>1) {
        count++;
        if (n&1)
            n = n*3+1;
        else
            n = n/2;
    }
Assembly:
        movl $0, %ecx
    .loop:
        cmpl $1, %edx
        jle .endloop
        addl $1, %ecx
        movl %edx, %eax
        andl $1, %eax
        je .else
        movl %edx, %eax
        addl %eax, %edx
        addl %eax, %edx
        addl $1, %edx
        jmp .endif
    .else:
        sarl $1, %edx
    .endif:
        jmp .loop
    .endloop:
14
Size of Variables
• Data types in high-level languages vary in size
  – Character: 1 byte
  – Short, int, and long: varies, depending on the computer
  – Pointers: typically 4 bytes (on IA32)
  – Struct: arbitrary size, depending on the elements
• Implications
  – Need to be able to store and manipulate data in multiple sizes
  – Byte (1 byte), word (2 bytes), and long (4 bytes, the “extended” size)
  – Separate assembly-language instructions, e.g., addb, addw, addl
  – Separate ways to access (parts of) a 4-byte register
15
Four-Byte Memory Words
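[Figure: memory drawn as a column of 32-bit words, from address 0 up to 2^32 – 1. Each word spans bit positions 31–24, 23–16, 15–8, and 7–0; the first word holds bytes 0–3 and the next word holds bytes 4–7.]
Byte order is little endian (the lowest-addressed byte of a word is its least significant byte)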
16
IA32 General Purpose Registers
General-purpose registers
  32-bit (bits 31–0)   16-bit (bits 15–0)   8-bit high (bits 15–8)   8-bit low (bits 7–0)
  EAX                  AX                   AH                       AL
  EBX                  BX                   BH                       BL
  ECX                  CX                   CH                       CL
  EDX                  DX                   DH                       DL
  ESI                  SI
  EDI                  DI
17
Arithmetic Instructions
• Simple instructions
  – add{b,w,l} source, dest        dest = source + dest
  – sub{b,w,l} source, dest        dest = dest – source
  – inc{b,w,l} dest                dest = dest + 1
  – dec{b,w,l} dest                dest = dest – 1
  – neg{b,w,l} dest                dest = –dest
  – cmp{b,w,l} source1, source2    computes source2 – source1 (sets flags only)
• Multiply
  – mul (unsigned) or imul (signed)
  – mull %ebx     # edx:eax = eax * ebx
• Divide
  – div (unsigned) or idiv (signed)
  – idivl %ebx    # eax = edx:eax / ebx (quotient), edx = remainder
• Many more in Intel manual (volume 2)
  – adc, sbb, decimal arithmetic instructions
18
Bitwise Logic Instructions
• Simple instructions
  – and{b,w,l} source, dest    dest = source & dest
  – or{b,w,l} source, dest     dest = source | dest
  – xor{b,w,l} source, dest    dest = source ^ dest
  – not{b,w,l} dest            dest = ~dest
  – sal{b,w,l} source, dest    (arithmetic) dest = dest << source
  – sar{b,w,l} source, dest    (arithmetic) dest = dest >> source
• Many more in Intel manual (volume 2)
  – Logical shift
  – Rotation shift
  – Bit scan
  – Bit test
  – Byte set on conditions
19
Branch Instructions
• Conditional jump
  – j{l,g,e,ne,...} target    if (condition) {eip = target}
• Unconditional jump
  – jmp target
  – jmp *register

  Comparison           Signed   Unsigned
  =                    e        e          “equal”
  ≠                    ne       ne         “not equal”
  >                    g        a          “greater, above”
  ≥                    ge       ae         “...-or-equal”
  <                    l        b          “less, below”
  ≤                    le       be         “...-or-equal”
  overflow/carry       o        c
  no overflow/carry    no       nc
20
Setting the EFLAGS Register
• Comparison
  – cmpl compares two integers
  – Done by subtracting the first number from the second, discarding the result but setting the EFLAGS register
  – Example:
      cmpl $1, %edx    (computes %edx – 1)
      jle .endloop     (looks at the zero, sign, and overflow flags)
• Logical operation
  – andl computes the bitwise AND of two integers and sets the EFLAGS register
  – Example:
      andl $1, %eax    (bit-wise AND of %eax with 1)
      je .else         (looks at the zero flag)
• Unconditional branch
  – jmp
  – Example: jmp .endif and jmp .loop
21
EFLAGS Register & Condition Codes
  Bit(s)   Flag   Meaning
  0        CF     Carry flag
  1        –      Reserved (set to 1)
  2        PF     Parity flag
  3        –      Reserved (set to 0)
  4        AF     Auxiliary carry flag or adjust flag
  5        –      Reserved (set to 0)
  6        ZF     Zero flag
  7        SF     Sign flag
  8        TF     Trap flag
  9        IF     Interrupt enable flag
  10       DF     Direction flag
  11       OF     Overflow flag
  12–13    IOPL   I/O privilege level
  14       NT     Nested task flag
  15       –      Reserved (set to 0)
  16       RF     Resume flag
  17       VM     Virtual 8086 mode
  18       AC     Alignment check
  19       VIF    Virtual interrupt flag
  20       VIP    Virtual interrupt pending
  21       ID     Identification flag
  22–31    –      Reserved (set to 0)
22
Data Transfer Instructions
• mov{b,w,l} source, dest
  – General move instruction
• push{w,l} source
    pushl %ebx           # equivalent instructions:
    subl $4, %esp
    movl %ebx, (%esp)
• pop{w,l} dest
    popl %ebx            # equivalent instructions:
    movl (%esp), %ebx
    addl $4, %esp
• Many more in Intel manual (volume 2)
  – Type conversion, conditional move, exchange, compare and exchange, I/O port, string move, etc.
[Figure: stack diagrams showing where %esp points before and after the push and the pop.]
24
Ways to Read and Write Data
• Processors have many ways to access data
  – Known as “addressing modes”
• Two simplest ways (used in earlier example)
  – Immediate addressing: movl $0, %ecx
      – Data embedded in the instruction
      – Initialize register ECX with zero
  – Register addressing: movl %edx, %ecx
      – Data stored in a register
      – Copy value in register EDX into register ECX
• The others all deal with memory addresses
  – To read and write data from main memory
  – E.g., to get data from memory into a register
  – E.g., to write data from a register back into memory
25
Direct vs. Indirect Addressing
• Read or write from a particular memory location
  – Essentially dereferencing a pointer
• Direct addressing: movl 2000, %ecx
  – Address embedded in the instruction
  – E.g., address 2000 corresponds to a global variable
  – Load ECX register with the long located at address 2000
• Indirect addressing: movl (%eax), %ebx
  – Address stored in a register
  – E.g., EAX register is a pointer
  – Load EBX register with the long located at the address in EAX
26
More Complex Addressing Modes
• Base pointer addressing: movl 4(%eax), %ebx
  – Extends indirect addressing by allowing an offset
  – E.g., add “4” to the register EAX to get the address
  – Allows access to a particular field in a structure
  – E.g., if “age” starts at byte offset 4 of a record
• Indexed addressing: movl 2000(,%ecx,1), %ebx
  – Starts from a base address (e.g., 2000)
  – Adds an offset from a register (e.g., ECX)
  – With a multiplier of 1, 2, 4, or 8 (here 1, so ECX is used as a byte offset)
  – Allows the register to be an index into a byte, word, or long array
27
Effective Address
• Displacement: movl foo, %ebx
• Base: movl (%eax), %ebx
• Base + displacement: movl foo(%eax), %ebx and movl 1(%eax), %ebx
• (Index * scale) + displacement: movl (,%eax,4), %ebx
• Base + (index * scale) + displacement: movl foo(%edx,%eax,4), %ebx

Offset = Base + (Index * scale) + displacement
  – Base: eax, ebx, ecx, edx, esp, ebp, esi, or edi
  – Index: eax, ebx, ecx, edx, ebp, esi, or edi (esp cannot be used as an index)
  – Scale: 1, 2, 4, or 8
  – Displacement: none, 8-bit, 16-bit, or 32-bit
28
Data Access Methods: Summary
• Immediate addressing: data stored in the instruction itself
  – movl $10, %ecx
• Register addressing: data stored in a register
  – movl %eax, %ecx
• Direct addressing: address stored in instruction
  – movl 2000, %ecx
• Indirect addressing: address stored in a register
  – movl (%eax), %ebx
• Base pointer addressing: includes an offset as well
  – movl 4(%eax), %ebx
• Indexed addressing: instruction contains base address, and specifies an index register and a multiplier (1, 2, 4, or 8)
  – movl 2000(,%ecx,1), %ebx
30
A Simple Assembly Program
        .section .text
        .globl _start
_start:
        # Program starts executing here

        # Body of the program goes here

        # Program ends with an "exit()" system call
        # to the operating system
        movl $1, %eax
        movl $0, %ebx
        int $0x80

        .section .data
        # pre-initialized variables go here

        .section .bss
        # zero-initialized variables go here

        .section .rodata
        # pre-initialized constants go here
31
Main Parts of the Program
• Break program into sections (.section)
  – Data, BSS, RoData, and Text
• Starting the program
  – Making _start a global (.globl _start)
      – Tells the assembler to remember the symbol _start
      – … because the linker will need it
  – Identifying the start of the program (_start:)
      – Defines the value of the label _start
32
Main Parts of the Program
• Exiting the program
  – Specifying the exit() system call (movl $1, %eax)
      – Linux expects the system call number in the EAX register
  – Specifying the status code (movl $0, %ebx)
      – Linux expects the status code in the EBX register
  – Interrupting the operating system (int $0x80)