
Assembly Language Part 2

Professor Jennifer Rexford

COS 217

2

Goals of Today’s Lecture

• Machine language
  – Encoding the operation and the operands
  – Simpler MIPS instruction set as an example

• More on IA32 assembly language
  – Different sizes of data
  – Example instructions
  – Addressing modes

• Layout of assembly language program

3

Machine Language

Using MIPS Architecture as an Example

(since it has a simpler instruction set than IA32)

4

Three Levels of Languages

• High-level languages (e.g., Java and C)
  – Easier programming by describing operations in a natural language
  – Increased portability of the code

• Assembly language (e.g., IA32 and MIPS)
  – Tied to the specifics of the underlying machine
  – Instructions and names to make code human readable

• Machine language
  – Also tied to the specifics of the underlying machine
  – In binary format the computer can read and execute
  – Every instruction is a sequence of one or more numbers

5

Machine-Language Instructions

An ADD instruction: add r1 = r2 + r3 (assembly), where "add" is the opcode and r1, r2, r3 are the operands

Parts of the instruction:

• Opcode (verb) – what operation to perform
• Operands (nouns) – what to operate upon
• Source operands – where values come from
• Destination operand – where to deposit data values

6

Machine-Language Instruction

• Opcode
  – What to do

• Source operand(s)
  – Immediate (in the instruction itself)
  – Register
  – Memory location
  – I/O port

• Destination operand
  – Register
  – Memory location
  – I/O port

• Assembly syntax
  – opcode source1, [source2,] destination

7

MIPS Has Three Kinds of 32-bit Instructions

• R: Registers
  – Two source registers (rs and rt)
  – One destination register (rd)
  – E.g., “rd = rs + rt” or “rd = rs & rt” or “rd = rs xor rt”

Field layout (32 bits): op (6) | rs (5) | rt (5) | rd (5) | shamt (5) | funct (6)
  – op and funct: operation and specific variant
  – shamt: shift amount

8

MIPS Has Three Kinds of 32-bit Instructions

• I: Immediate, transfer, branch
  – One source register (rs) and one 16-bit constant (imm)
  – One destination register (rd)
  – E.g., “rd = rs + imm” or “rd = rs & imm”
  – E.g., “rd = MEM[rs + imm]” (treating rs+imm as address)
  – E.g., “jump to address contained in rs” (rs as address)
  – E.g., “jump to word imm if rs is 0” (i.e., change instruction pointer)

Field layout (32 bits): op | rs | rd | address/immediate

9

MIPS Has Three Kinds of 32-bit Instructions

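• J: Jump
  – One 26-bit constant (imm) for # of 32-bit words to jump (the 6-bit opcode leaves 26 bits; multiplied by 4, this gives a 28-bit byte offset)
  – E.g., “jump by imm words” (i.e., change the instruction pointer)

Field layout (32 bits): op | target address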

10

MIPS “Add” Instruction Encoding

Add registers 18 and 19, and store result in register 17.

op = 0 | rs = 18 | rt = 19 | rd = 17 | shamt = 0 | funct = 32

add is an R inst

11

MIPS “Subtract” Instruction Encoding

Subtract register 19 from register 18 and store in register 17

op = 0 | rs = 18 | rt = 19 | rd = 17 | shamt = 0 | funct = 34

sub is an R inst
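
As a rough illustration of the two encodings above, the following C sketch packs the six R-format fields into a single 32-bit word. The function name mips_encode_r is made up for this example, and the field widths (6, 5, 5, 5, 5, 6 bits, in the order op, rs, rt, rd, shamt, funct) are the standard MIPS R-format widths, assumed here rather than taken from the slide.

```c
#include <stdint.h>

/* Pack the six R-format fields into one 32-bit instruction word.
   Hypothetical helper; each field is masked to its assumed width. */
uint32_t mips_encode_r(uint32_t op, uint32_t rs, uint32_t rt,
                       uint32_t rd, uint32_t shamt, uint32_t funct)
{
    return ((op    & 0x3Fu) << 26) |   /* bits 31-26: opcode             */
           ((rs    & 0x1Fu) << 21) |   /* bits 25-21: first source       */
           ((rt    & 0x1Fu) << 16) |   /* bits 20-16: second source      */
           ((rd    & 0x1Fu) << 11) |   /* bits 15-11: destination        */
           ((shamt & 0x1Fu) <<  6) |   /* bits 10-6:  shift amount       */
            (funct & 0x3Fu);           /* bits 5-0:   operation variant  */
}
```

With the field values shown on the slides, mips_encode_r(0, 18, 19, 17, 0, 32) yields 0x02538820 for the add and mips_encode_r(0, 18, 19, 17, 0, 34) yields 0x02538822 for the sub; only the funct field differs.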

12

Greater Detail on IA32 Assembly: Instruction Set and Data Sizes

13

Earlier Example

C code:

    count=0;
    while (n>1) {
      count++;
      if (n&1)
        n = n*3+1;
      else
        n = n/2;
    }

Register assignment: n is in %edx, count is in %ecx

Assembly code:

            movl $0, %ecx
    .loop:
            cmpl $1, %edx
            jle .endloop
            addl $1, %ecx
            movl %edx, %eax
            andl $1, %eax
            je .else
            movl %edx, %eax
            addl %eax, %edx
            addl %eax, %edx
            addl $1, %edx
            jmp .endif
    .else:
            sarl $1, %edx
    .endif:
            jmp .loop
    .endloop:
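
To make the correspondence between the C loop and the instructions easier to follow, here is a sketch of the same loop written as a C function, with each statement annotated by the instruction(s) that appear to implement it. The function name count_steps is an assumption for this example; n stands in for %edx and count for %ecx.

```c
/* C rendering of the earlier example, shaped like the assembly code.
   Hypothetical function name; assumes n stays small enough not to overflow. */
int count_steps(int n)
{
    int count = 0;               /* movl $0, %ecx                          */
    while (n > 1) {              /* cmpl $1, %edx ; jle .endloop           */
        count++;                 /* addl $1, %ecx                          */
        if (n & 1)               /* andl $1, %eax ; je .else               */
            n = n + n + n + 1;   /* two addl of the saved copy, then add 1 */
        else
            n = n >> 1;          /* sarl $1, %edx (n is positive here)     */
    }
    return count;                /* value left in %ecx                     */
}
```

For instance, count_steps(6) walks through 6, 3, 10, 5, 16, 8, 4, 2, 1 and returns 8.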

14

Size of Variables

• Data types in high-level languages vary in size
  – Character: 1 byte
  – Short, int, and long: varies, depending on the computer
  – Pointers: typically 4 bytes (on a 32-bit machine such as IA32)
  – Struct: arbitrary size, depending on the elements

• Implications
  – Need to be able to store and manipulate in multiple sizes
  – Byte (1 byte), word (2 bytes), and extended or “long” (4 bytes)
  – Separate assembly-language instructions, e.g., addb, addw, addl
  – Separate ways to access (parts of) a 4-byte register

15

Four-Byte Memory Words

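[Figure: memory drawn as a column of four-byte words, from address 0 up to address 2^32 - 1. Each word spans bit positions 31-24, 23-16, 15-8, and 7-0. Bytes 0 through 3 make up the first word and bytes 4 through 7 the next.]

Byte order is little endian (the byte at the lowest address holds bits 7-0 of the word)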
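
The little-endian layout can be sketched in C by writing the four bytes of a word out explicitly. The helper names store_le32 and load_le32 are invented for this example; the code spells out the byte order by hand so that it behaves the same on any host.

```c
#include <stdint.h>

/* Store a 32-bit value into four consecutive bytes,
   least significant byte at the lowest address (little endian). */
void store_le32(uint8_t *mem, uint32_t value)
{
    mem[0] = (uint8_t)(value & 0xFFu);          /* bits 7-0   */
    mem[1] = (uint8_t)((value >> 8) & 0xFFu);   /* bits 15-8  */
    mem[2] = (uint8_t)((value >> 16) & 0xFFu);  /* bits 23-16 */
    mem[3] = (uint8_t)((value >> 24) & 0xFFu);  /* bits 31-24 */
}

/* Reassemble the 32-bit value from the same four bytes. */
uint32_t load_le32(const uint8_t *mem)
{
    return (uint32_t)mem[0]
         | ((uint32_t)mem[1] << 8)
         | ((uint32_t)mem[2] << 16)
         | ((uint32_t)mem[3] << 24);
}
```

Storing 0x12345678 this way places 0x78 in the first byte and 0x12 in the fourth.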

16

IA32 General Purpose Registers

General-purpose registers

32-bit (bits 31-0)   16-bit (bits 15-0)   8-bit high (bits 15-8)   8-bit low (bits 7-0)
EAX                  AX                   AH                       AL
EBX                  BX                   BH                       BL
ECX                  CX                   CH                       CL
EDX                  DX                   DH                       DL
ESI                  SI
EDI                  DI
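
One way to picture how AX, AH, and AL overlap EAX is as masks and shifts on a 32-bit value. The sketch below uses hypothetical helper names (reg_ax, reg_ah, reg_al, write_al); it models the register as a plain integer and is not how a program would actually touch the hardware registers.

```c
#include <stdint.h>

/* AX is the low 16 bits of EAX. */
uint16_t reg_ax(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }

/* AH is bits 15-8 of EAX. */
uint8_t reg_ah(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); }

/* AL is bits 7-0 of EAX. */
uint8_t reg_al(uint32_t eax) { return (uint8_t)(eax & 0xFFu); }

/* Writing AL replaces only the low byte; the upper 24 bits are kept. */
uint32_t write_al(uint32_t eax, uint8_t value)
{
    return (eax & 0xFFFFFF00u) | value;
}
```

With EAX holding 0x12345678, AX is 0x5678, AH is 0x56, and AL is 0x78.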

17

Arithmetic Instructions

• Simple instructions
    add{b,w,l} source, dest        dest = source + dest
    sub{b,w,l} source, dest        dest = dest - source
    inc{b,w,l} dest                dest = dest + 1
    dec{b,w,l} dest                dest = dest - 1
    neg{b,w,l} dest                dest = -dest
    cmp{b,w,l} source1, source2    source2 - source1 (sets flags, discards result)

• Multiply
  – mul (unsigned) or imul (signed)
    mull %ebx    # edx:eax = eax * ebx

• Divide
  – div (unsigned) or idiv (signed)
    idiv %ebx    # eax = edx:eax / ebx, edx = remainder

• Many more in Intel manual (volume 2)
  – adc, sbb, decimal arithmetic instructions
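
The way mul and div use the EDX and EAX registers as a 64-bit pair can be sketched in C with a 64-bit intermediate. The names reg_pair, sim_mull, and sim_divl are invented for this illustration, and the sketch ignores the faults that the real divide instruction raises for a zero divisor or a quotient that does not fit in 32 bits.

```c
#include <stdint.h>

/* The edx:eax register pair, modeled as two 32-bit fields. */
typedef struct {
    uint32_t edx;   /* high half, or remainder after a divide */
    uint32_t eax;   /* low half, or quotient after a divide   */
} reg_pair;

/* Like "mull operand": edx:eax = eax * operand (unsigned). */
reg_pair sim_mull(uint32_t eax, uint32_t operand)
{
    uint64_t product = (uint64_t)eax * operand;
    reg_pair r;
    r.edx = (uint32_t)(product >> 32);
    r.eax = (uint32_t)product;
    return r;
}

/* Like "divl operand": eax = edx:eax / operand, edx = remainder.
   Assumes operand != 0 and a quotient that fits in 32 bits. */
reg_pair sim_divl(uint32_t edx, uint32_t eax, uint32_t operand)
{
    uint64_t dividend = ((uint64_t)edx << 32) | eax;
    reg_pair r;
    r.eax = (uint32_t)(dividend / operand);
    r.edx = (uint32_t)(dividend % operand);
    return r;
}
```

For example, multiplying 0xFFFFFFFF by 2 leaves 1 in edx and 0xFFFFFFFE in eax, and dividing 17 by 5 leaves the quotient 3 in eax and the remainder 2 in edx.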

18

Bitwise Logic Instructions

• Simple instructions
    and{b,w,l} source, dest    dest = source & dest
    or{b,w,l}  source, dest    dest = source | dest
    xor{b,w,l} source, dest    dest = source ^ dest
    not{b,w,l} dest            dest = ~dest
    sal{b,w,l} source, dest    (arithmetic) dest = dest << source
    sar{b,w,l} source, dest    (arithmetic) dest = dest >> source

• Many more in Intel Manual (volume 2)
  – Logic shift
  – Rotation shift
  – Bit scan
  – Bit test
  – Byte set on conditions
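
The difference between the arithmetic right shift (sar) listed above and the logical right shift (shr) mentioned under "Logic shift" is what fills the vacated high bits. The sketch below works on raw 32-bit patterns so that it does not depend on how a C compiler shifts negative numbers; the names shr32 and sar32 are made up, and the count is masked to 5 bits as IA32 does for 32-bit shifts.

```c
#include <stdint.h>

/* Logical right shift: vacated high bits are filled with zeros. */
uint32_t shr32(uint32_t value, unsigned count)
{
    return value >> (count & 31u);
}

/* Arithmetic right shift: vacated high bits are filled with copies
   of the original sign bit (bit 31). */
uint32_t sar32(uint32_t value, unsigned count)
{
    uint32_t shifted;
    count &= 31u;
    shifted = value >> count;
    if ((value & 0x80000000u) != 0 && count > 0)
        shifted |= (uint32_t)~(UINT32_C(0xFFFFFFFF) >> count);
    return shifted;
}
```

Shifting the pattern 0xFFFFFFF8 (which is -8 as a signed long) right by 1 gives 0xFFFFFFFC (-4) with sar32 but 0x7FFFFFFC with shr32.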

19

Branch Instructions

• Conditional jump
    j{l,g,e,ne,...} target    if (condition) {eip = target}

• Unconditional jump
    jmp target
    jmp *register

Comparison        Signed   Unsigned
=                 e        e          “equal”
≠                 ne       ne         “not equal”
>                 g        a          “greater, above”
≥                 ge       ae         “...-or-equal”
<                 l        b          “less, below”
≤                 le       be         “...-or-equal”
overflow/carry    o        c
no ovf/carry      no       nc

20

Setting the EFLAGS Register

• Comparison
  – cmpl compares two integers
  – Done by subtracting the first number from the second
      – Discarding the result, but setting the eflags register
  – Example:
      – cmpl $1, %edx (computes %edx – 1)
      – jle .endloop (looks at the sign, overflow, and zero flags)

• Logical operation
  – andl computes the bit-wise AND of two integers, storing the result in the second operand and setting the eflags register
  – Example:
      – andl $1, %eax (bit-wise AND of %eax with 1)
      – je .else (looks at the zero flag)

• Unconditional branch
  – jmp
  – Example:
      – jmp .endif and jmp .loop
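
To show how a comparison feeds a conditional jump, the following C sketch computes the four condition flags that a 32-bit cmpl would set and then evaluates the conditions for jle (signed) and jbe (unsigned). The type and function names (cond_flags, sim_cmpl, takes_jle, takes_jbe) are invented for this example; the flag rules are the usual IA32 ones for subtraction.

```c
#include <stdint.h>
#include <stdbool.h>

/* The four condition flags relevant to conditional jumps. */
typedef struct {
    bool zf;   /* zero flag: result was 0                 */
    bool sf;   /* sign flag: bit 31 of the result         */
    bool of;   /* overflow flag: signed overflow occurred */
    bool cf;   /* carry flag: unsigned borrow occurred    */
} cond_flags;

/* Like "cmpl src, dst": compute dst - src, keep only the flags. */
cond_flags sim_cmpl(uint32_t src, uint32_t dst)
{
    uint32_t result = dst - src;
    cond_flags f;
    f.zf = (result == 0);
    f.sf = (result >> 31) != 0;
    f.cf = dst < src;
    /* Signed overflow: operands differ in sign and the result's
       sign differs from dst's sign. */
    f.of = (((dst ^ src) & (dst ^ result)) >> 31) != 0;
    return f;
}

/* jle: signed "less or equal" is ZF set, or SF different from OF. */
bool takes_jle(cond_flags f) { return f.zf || (f.sf != f.of); }

/* jbe: unsigned "below or equal" is CF set or ZF set. */
bool takes_jbe(cond_flags f) { return f.cf || f.zf; }
```

Comparing the bit pattern 0xFFFFFFFF against 1 illustrates why the signed and unsigned jumps differ: as a signed value it is -1, so takes_jle is true, while as an unsigned value it is far above 1, so takes_jbe is false.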

21

EFLAGS Register & Condition Codes

Bit      Flag   Meaning
0        CF     Carry flag
1        1      (always 1)
2        PF     Parity flag
3        0      (always 0)
4        AF     Auxiliary carry flag or adjust flag
5        0      (always 0)
6        ZF     Zero flag
7        SF     Sign flag
8        TF     Trap flag
9        IF     Interrupt enable flag
10       DF     Direction flag
11       OF     Overflow flag
12-13    IOPL   I/O privilege level
14       NT     Nested task flag
15       0      (always 0)
16       RF     Resume flag
17       VM     Virtual 8086 mode
18       AC     Alignment check
19       VIF    Virtual interrupt flag
20       VIP    Virtual interrupt pending
21       ID     Identification flag
22-31           Reserved (set to 0)

22

Data Transfer Instructions

• mov{b,w,l} source, dest
  – General move instruction

• push{w,l} source
    pushl %ebx              # equivalent instructions:
        subl $4, %esp
        movl %ebx, (%esp)

• pop{w,l} dest
    popl %ebx               # equivalent instructions:
        movl (%esp), %ebx
        addl $4, %esp

• Many more in Intel manual (volume 2)
  – Type conversion, conditional move, exchange, compare and exchange, I/O port, string move, etc.

[Figure: stack diagrams showing where esp points before and after the push and the pop.]
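
The two-instruction expansions of pushl and popl can be mimicked in C with a small byte array standing in for memory and an index standing in for %esp. Everything here (the machine struct, sim_pushl, sim_popl, the 64-byte size) is made up for illustration, and the sketch does no bounds checking.

```c
#include <stdint.h>
#include <string.h>

/* A toy machine: a little memory plus a stack pointer into it. */
typedef struct {
    uint8_t  mem[64];   /* stand-in for main memory                 */
    uint32_t esp;       /* offset into mem, playing the role of esp */
} machine;

/* Like pushl: subl $4, %esp ; movl value, (%esp) */
void sim_pushl(machine *m, uint32_t value)
{
    m->esp -= 4;
    memcpy(&m->mem[m->esp], &value, 4);
}

/* Like popl: movl (%esp), dest ; addl $4, %esp */
uint32_t sim_popl(machine *m)
{
    uint32_t value;
    memcpy(&value, &m->mem[m->esp], 4);
    m->esp += 4;
    return value;
}
```

Starting with esp at 64 (the top of the toy memory), two pushes move esp down to 56, and two pops return the values in reverse order and bring esp back to 64, which is the last-in, first-out behavior of the stack.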

23

Greater Detail on IA32 Assembly: Addressing Modes

24

Ways to Read and Write Data

• Processors have many ways to access data
  – Known as “addressing modes”

• Two simplest ways (used in earlier example)
  – Immediate addressing: movl $0, %ecx
      – Data embedded in the instruction
      – Initialize register ECX with zero
  – Register addressing: movl %edx, %ecx
      – Data stored in a register
      – Copy value in register EDX into register ECX

• The others all deal with memory addresses
  – To read and write data from main memory
  – E.g., to get data from memory into a register
  – E.g., to write data from a register back into memory

25

Direct vs. Indirect Addressing

• Read or write from a particular memory location
  – Essentially dereferencing a pointer

• Direct addressing: movl 2000, %ecx
  – Address embedded in the instruction
  – E.g., address 2000 corresponds to a global variable
  – Load ECX register with the long located at address 2000

• Indirect addressing: movl (%eax), %ebx
  – Address stored in a register
  – E.g., EAX register is a pointer
  – Load EBX register with long located at address in EAX

26

More Complex Addressing Modes

• Base pointer addressing: movl 4(%eax), %ebx
  – Extends indirect addressing by allowing an offset
  – E.g., add “4” to the register EAX to get the address
  – Allows access to a particular field in a structure
  – E.g., if “age” starts at byte offset 4 of a record

• Indexed addressing: movl 2000(,%ecx,1), %ebx
  – Starts from a base address (e.g., 2000)
  – Adds an offset from a register (e.g., ECX)
  – With a multiplier of 1, 2, 4, or 8 (1 in this example)
  – Allows a register to serve as the index into a byte, word, or long array
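
These two modes correspond closely to struct field access and array indexing in C. The sketch below assumes a hypothetical record with a 4-byte id followed by a 4-byte age, so that age sits at byte offset 4; the struct and the helper names load_base_disp and load_indexed are inventions for this example.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record layout: age starts at byte offset 4. */
struct record {
    int32_t id;
    int32_t age;
};

/* Like "movl disp(%eax), %ebx": load the long at address base + disp. */
int32_t load_base_disp(const void *base, size_t disp)
{
    int32_t value;
    memcpy(&value, (const unsigned char *)base + disp, sizeof value);
    return value;
}

/* Like "movl table(,%ecx,4), %ebx": load element index of a long
   array, scaling the index by the 4-byte element size. */
int32_t load_indexed(const int32_t *table, size_t index)
{
    return load_base_disp(table, index * sizeof(int32_t));
}
```

Given a record whose age field holds 42, load_base_disp(&r, 4) returns 42, and load_indexed(table, 2) returns the third element of table. A C compiler targeting IA32 would typically emit these same addressing modes for r.age and table[i].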

27

Effective Address

• Displacement
    movl foo, %ebx

• Base
    movl (%eax), %ebx

• Base + displacement
    movl foo(%eax), %ebx
    movl 1(%eax), %ebx

• (Index * scale) + displacement
    movl (,%eax,4), %ebx

• Base + (index * scale) + displacement
    movl foo(%edx,%eax,4), %ebx

Offset = Base + (Index * scale) + displacement

  Base:          eax, ebx, ecx, edx, esp, ebp, esi, edi
  Index:         eax, ebx, ecx, edx, ebp, esi, edi (esp cannot be used as an index)
  Scale:         1, 2, 4, 8
  Displacement:  none, 8-bit, 16-bit, 32-bit
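
The offset formula on this slide is small enough to write out directly. The following C sketch uses made-up names (effective_address, scale_is_valid) and treats all quantities as unsigned 32-bit values, so the arithmetic wraps modulo 2^32 as address arithmetic does on IA32.

```c
#include <stdint.h>
#include <stdbool.h>

/* Only these four scale factors can be encoded in an instruction. */
bool scale_is_valid(uint32_t scale)
{
    return scale == 1 || scale == 2 || scale == 4 || scale == 8;
}

/* Offset = base + (index * scale) + displacement.
   Pass 0 for any component the addressing mode leaves out. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, uint32_t displacement)
{
    return base + index * scale + displacement;
}
```

For example, an operand written 8(%edx,%eax,4) with edx holding 1000 and eax holding 3 refers to address 1000 + 3*4 + 8 = 1020.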

28

Data Access Methods: Summary

• Immediate addressing: data stored in the instruction itself
    movl $10, %ecx

• Register addressing: data stored in a register
    movl %eax, %ecx

• Direct addressing: address stored in instruction
    movl 2000, %ecx

• Indirect addressing: address stored in a register
    movl (%eax), %ebx

• Base pointer addressing: includes an offset as well
    movl 4(%eax), %ebx

• Indexed addressing: instruction contains base address, and specifies an index register and a multiplier (1, 2, 4, or 8)
    movl 2000(,%ecx,1), %ebx

29

Layout of an Assembly Language Program

30

A Simple Assembly Program

        .section .text
        .globl _start
    _start:
        # Program starts executing
        # here

        # Body of the program goes
        # here

        # Program ends with an
        # "exit()" system call
        # to the operating system
        movl $1, %eax
        movl $0, %ebx
        int $0x80

        .section .data
        # pre-initialized
        # variables go here

        .section .bss
        # zero-initialized
        # variables go here

        .section .rodata
        # pre-initialized
        # constants go here

31

Main Parts of the Program

• Break program into sections (.section)
  – Data, BSS, RoData, and Text

• Starting the program
  – Making _start a global (.globl _start)
      – Tells the assembler to remember the symbol _start
      – … because the linker will need it
  – Identifying the start of the program (_start)
      – Defines the value of the label _start

32

Main Parts of the Program

• Exiting the program
  – Specifying the exit() system call (movl $1, %eax)
      – Linux expects the system call number in EAX register
  – Specifying the status code (movl $0, %ebx)
      – Linux expects the status code in EBX register
  – Interrupting the operating system (int $0x80)
      – A software interrupt that transfers control to the kernel, which then carries out the system call

33

Conclusions

• Machine code
  – Binary representation of instructions
  – What operation to do, and on what data

• IA32 instructions
  – Manipulate bytes, words, or longs
  – Numerous kinds of operations
  – Wide variety of addressing modes

• Next time
  – Calling functions, using the stack