Computer Science 210 s1c Computer Systems 1
2010 Semester 1
Lecture Notes
James Goodman
Credits: Slides prepared by Gregory T. Byrd, North Carolina State University
Assembly Language
Lecture 14, 31Mar10:
31Mar10 CS210 173
Human-Readable Machine Language
Computers like ones and zeros…
Humans like symbols…
Assembler is a program that turns symbols into machine instructions.
- ISA-specific: close correspondence between symbols and instruction set
  • mnemonics for opcodes
  • labels for memory locations
- additional operations for allocating storage and initializing data
ADD R6,R2,R6 ; increment index reg.
0001110010000110
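The correspondence between the symbolic instruction and its bit pattern can be checked with a short sketch. This is only an illustration in Python, not part of the LC-3 tools; the function name `encode_add_reg` is hypothetical. It assumes the register-mode ADD layout: opcode 0001 in bits [15:12], DR in [11:9], SR1 in [8:6], zeros in [5:3], SR2 in [2:0].

```python
def encode_add_reg(dr: int, sr1: int, sr2: int) -> str:
    """Encode LC-3 'ADD DR, SR1, SR2' (register mode) as a 16-bit binary string."""
    for reg in (dr, sr1, sr2):
        if not 0 <= reg <= 7:
            raise ValueError("register number must be in 0..7")
    opcode = 0b0001  # ADD
    # Bits [5:3] stay 000 in register mode.
    word = (opcode << 12) | (dr << 9) | (sr1 << 6) | sr2
    return format(word, "016b")


# ADD R6,R2,R6 from the slide
print(encode_add_reg(6, 2, 6))
```

The printed string can be compared bit by bit with the machine word shown on the slide.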
Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
An Assembly Language Program
;
; Program to multiply a number by the constant 6
;
        .ORIG x3050
        LD    R1, SIX
        LD    R2, NUMBER
        AND   R3, R3, #0    ; Clear R3. It will
                            ; contain the product.
; The inner loop
;
AGAIN   ADD   R3, R3, R2
        ADD   R1, R1, #-1   ; R1 keeps track of
        BRp   AGAIN         ; the iteration.
;
        HALT
;
NUMBER  .BLKW 1
SIX     .FILL x0006
;
        .END
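As a rough cross-check of what the loop computes, here is a Python sketch that mirrors the three registers. It is an illustration only; the function name `multiply_by_six` is hypothetical, and 16-bit wraparound is modelled with a mask.

```python
def multiply_by_six(number: int) -> int:
    """Mirror the LC-3 loop: add NUMBER to R3 six times, counting down in R1."""
    r1 = 6                    # LD R1, SIX
    r2 = number & 0xFFFF      # LD R2, NUMBER
    r3 = 0                    # AND R3, R3, #0
    while True:
        r3 = (r3 + r2) & 0xFFFF   # AGAIN ADD R3, R3, R2
        r1 -= 1                   # ADD R1, R1, #-1
        if r1 <= 0:               # BRp AGAIN falls through once R1 is not positive
            break
    return r3


print(multiply_by_six(7))
```

Note that the loop body always runs at least once, just as in the assembly version, because the test comes after the decrement.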
LC-3 Assembly Language Syntax
Each line of a program is one of the following:
- an instruction
- an assembler directive (or pseudo-op)
- a comment

Whitespace (between symbols) and case are ignored. Comments (beginning with “;”) are also ignored.

An instruction has the following format:
LABEL OPCODE OPERANDS ; COMMENTS
LABEL and COMMENTS are optional; OPCODE and OPERANDS are mandatory.
Opcodes and Operands
Opcodes
- reserved symbols that correspond to LC-3 instructions
- listed in Appendix A
  • ex: ADD, AND, LD, LDR, …

Operands
- registers -- specified by Rn, where n is the register number
- numbers -- indicated by # (decimal) or x (hex)
- label -- symbolic name of memory location
- separated by comma
- number, order, and type correspond to instruction format
  • ex:
    ADD R1,R1,R3
    ADD R1,R1,#3
    LD R6,NUMBER
    BRz LOOP
Labels and Comments
Label
- placed at the beginning of the line
- assigns a symbolic name to the address corresponding to line
  • ex:
    LOOP ADD R1,R1,#-1
         BRp LOOP

Comment
- anything after a semicolon is a comment
- ignored by assembler
- used by humans to document/understand programs
- tips for useful comments:
  • avoid restating the obvious, as “decrement R1”
  • provide additional insight, as in “accumulate product in R6”
  • use comments to separate pieces of program
Assembler Directives
Pseudo-operations
- do not refer to operations executed by program
- used by assembler
- look like instruction, but “opcode” starts with a full stop

Opcode     Operand              Meaning
.ORIG      address              starting address of program
.END                            end of program
.BLKW      n                    allocate n words of storage
.FILL      n                    allocate one word, initialize with value n
.STRINGZ   n-character string   allocate n+1 locations, initialize w/characters and null terminator
Trap Codes
LC-3 assembler provides “pseudo-instructions” for each trap code, so you don’t have to remember them.
Code   Equivalent   Description
HALT   TRAP x25     Halt execution and print message to console.
IN     TRAP x23     Print prompt on console, read (and echo) one character from keybd. Character stored in R0[7:0].
OUT    TRAP x21     Write one character (in R0[7:0]) to console.
GETC   TRAP x20     Read one character from keyboard. Character stored in R0[7:0].
PUTS   TRAP x22     Write null-terminated string to console. Address of string is in R0.
Style Guidelines
Use the following style guidelines to improve the readability and understandability of your programs:
1. Provide a program header, with author’s name, date, etc., and purpose of program.
2. Start labels, opcode, operands, and comments in same column for each line. (Unless entire line is a comment.)
3. Use comments to explain what each register does.
4. Give explanatory comment for most instructions.
5. Use meaningful symbolic names.
  • Mixed upper and lower case for readability.
  • ASCIItoBinary, InputRoutine, SaveR1
6. Provide comments between program sections.
7. Each line must fit on the page -- no wraparound or truncations.
  • Long statements split in aesthetically pleasing manner.
Sample Program
Count the occurrences of a character in a file. Remember this?
;
; Program to count occurrences of a character in a file.
; Character to be input from the keyboard.
; Result to be displayed on the monitor.
; Program only works if no more than 9 occurrences are found.
;
;
; Initialization
;
        .ORIG x3000
        AND   R2, R2, #0    ; R2 is counter, initially 0
        LD    R3, PTR       ; R3 is pointer to characters
        GETC                ; R0 gets character input
        LDR   R1, R3, #0    ; R1 gets first character
;
; Test character for end of file
;
TEST    ADD   R4, R1, #-4   ; Test for EOT (ASCII x04)
        BRz   OUTPUT        ; If done, prepare the output
;
; Test character for match. If a match, increment count.
;
        NOT   R1, R1
        ADD   R1, R1, R0    ; If match, R1 = xFFFF
        NOT   R1, R1        ; If match, R1 = x0000
        BRnp  GETCHAR       ; If no match, do not increment
        ADD   R2, R2, #1
;
; Get next character from file.
;
GETCHAR ADD   R3, R3, #1    ; Point to next character.
        LDR   R1, R3, #0    ; R1 gets next char to test
        BRnzp TEST
;
; Output the count.
;
OUTPUT  LD    R0, ASCII     ; Load the ASCII template
        ADD   R0, R0, R2    ; Convert binary count to ASCII
        OUT                 ; ASCII code in R0 is displayed.
        HALT                ; Halt machine
;
; Storage for pointer and ASCII template
;
ASCII   .FILL x0030
PTR     .FILL x4000
        .END
Recommended Homework (no credit—do not turn in)
Download the LC-3 simulator package from the resources page <http://www.cs.auckland.ac.nz/compsci210s1c/resources/>. For running on Windows, read the document LC3WinGuide.pdf. (You may also run the simulator under Linux: read the document LC3_unix.pdf.)
Follow the instructions for running a programme: create the files described in the example and execute the programme.
Create a source file from the text of the programme discussed in the lecture (figures 5.16 & 7.2 in the book).
Create a “file” starting in memory at location x4000. Assemble the programme. Execute the programme, typing different characters, and make sure the programme prints the correct result.
What goes wrong if the character you enter occurs more than 9 times in the file?
The Assembly Process
Assembly Process
Convert assembly language file (.asm) into an executable file (.obj) for the LC-3 simulator.

First Pass:
- scan program file
- find all labels and calculate the corresponding addresses; this is called the symbol table

Second Pass:
- convert instructions to machine language, using information from symbol table
First Pass: Constructing the Symbol Table
1. Find the .ORIG statement, which tells us the address of the first instruction.
  • Initialize location counter (LC), which keeps track of the current instruction.
2. For each non-empty line in the program:
  a) If line contains a label, add label and LC to symbol table.
  b) Increment LC.
    – NOTE: If statement is .BLKW or .STRINGZ, increment LC by the number of words allocated.
3. Stop when .END statement is reached.

NOTE: A line that contains only a comment is considered an empty line.
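The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the real LC-3 assembler: the names `first_pass`, `parse_number` and `is_opcode` are hypothetical, the opcode list is abbreviated to what is needed here, and it assumes that strings given to .STRINGZ contain no semicolons.

```python
OPCODES = {"ADD", "AND", "NOT", "LD", "LDI", "LDR", "LEA", "ST", "STI", "STR",
           "JMP", "JSR", "JSRR", "RET", "RTI", "TRAP",
           "GETC", "OUT", "PUTS", "IN", "HALT",
           ".ORIG", ".END", ".FILL", ".BLKW", ".STRINGZ"}


def is_opcode(token):
    t = token.upper()
    # BR, BRn, BRz, BRp, BRnz, ... are all branch opcodes.
    return t in OPCODES or (t.startswith("BR") and set(t[2:]) <= set("NZP"))


def parse_number(token):
    # x3000 is hex; #10 or 10 is decimal.
    if token[0] in "xX":
        return int(token[1:], 16)
    return int(token.lstrip("#"))


def first_pass(lines):
    """Return the symbol table {label: address} for the given source lines."""
    symbols, lc = {}, None
    for raw in lines:
        text = raw.split(";", 1)[0].strip()      # a comment-only line is empty
        if not text:
            continue
        tokens = text.replace(",", " ").split()
        if tokens[0].upper() == ".ORIG":
            lc = parse_number(tokens[1])         # step 1: initialize LC
            continue
        if tokens[0].upper() == ".END":
            break                                # step 3
        if not is_opcode(tokens[0]):
            symbols[tokens[0]] = lc              # step 2a: record label
            tokens = tokens[1:]
            if not tokens:
                continue                         # label alone on a line
        op = tokens[0].upper()
        if op == ".BLKW":
            lc += parse_number(tokens[1])        # n words
        elif op == ".STRINGZ":
            lc += len(text.split('"')[1]) + 1    # characters plus null
        else:
            lc += 1                              # step 2b
    return symbols


source = """
        .ORIG x3000
        AND   R2, R2, #0
        LD    R3, PTR
        GETC
        LDR   R1, R3, #0
TEST    ADD   R4, R1, #-4
        BRz   OUTPUT
        NOT   R1, R1
        ADD   R1, R1, R0
        NOT   R1, R1
        BRnp  GETCHAR
        ADD   R2, R2, #1
GETCHAR ADD   R3, R3, #1
        LDR   R1, R3, #0
        BRnzp TEST
OUTPUT  LD    R0, ASCII
        ADD   R0, R0, R2
        OUT
        HALT
ASCII   .FILL x0030
PTR     .FILL x4000
        .END
""".splitlines()

for label, addr in first_pass(source).items():
    print(f"{label:8s} x{addr:04X}")
```

Running it on the character-count program lists each label with the address the location counter held when the label was seen.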
Second Pass: Generating Machine Language
For each executable assembly language statement, generate the corresponding machine language instruction.
- If operand is a label, look up the address from the symbol table.

Potential problems:
- Improper number or type of arguments
  • ex: NOT R1,#7
        ADD R1,R2
        ADD R3,R3,NUMBER
- Immediate argument too large
  • ex: ADD R1,R2,#1023
- Address (associated with label) too far from instruction: a 9-bit PC offset only reaches -256 to +255 locations from the incremented PC
  • can’t use PC-relative addressing mode
Practice
Using the symbol table constructed earlier, translate these statements into LC-3 machine language.
Statement        Machine Language
LD R3,PTR        0010 0110 0001 0001
ADD R4,R1,#-4    0001 1000 0111 1100
LDR R1,R3,#0     0110 0010 1100 0000
BRnp GETCHAR     0000 1010 0000 0001
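The two PC-relative entries in this table can be reproduced with a short Python sketch. It is an illustration under stated assumptions: the helper names are hypothetical, and the addresses used (LD at x3001, BRnp at x3009, PTR at x3013, GETCHAR at x300B) are obtained by counting lines of the character-count listing from .ORIG x3000.

```python
def pc_offset(instr_addr, target_addr, bits=9):
    """Offset from the incremented PC to the target, as a `bits`-wide field."""
    offset = target_addr - (instr_addr + 1)      # PC was incremented at fetch
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= offset <= high:
        raise ValueError(f"offset {offset} does not fit in {bits} bits")
    return offset & ((1 << bits) - 1)            # two's complement field


def encode_ld(dr, instr_addr, target_addr):
    return (0b0010 << 12) | (dr << 9) | pc_offset(instr_addr, target_addr)


def encode_br(n, z, p, instr_addr, target_addr):
    return (n << 11) | (z << 10) | (p << 9) | pc_offset(instr_addr, target_addr)


def show(word):
    bits = format(word, "016b")
    return " ".join(bits[i:i + 4] for i in range(0, 16, 4))


print(show(encode_ld(3, 0x3001, 0x3013)))        # LD R3,PTR
print(show(encode_br(1, 0, 1, 0x3009, 0x300B)))  # BRnp GETCHAR
```

A label that lies outside the 9-bit range makes `pc_offset` raise an error, which corresponds to the "can't use PC-relative addressing mode" problem listed on the previous slide.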
LC-3 Assembler
Using “assemble” (Unix) or LC3Edit (Windows), the assembler generates several different output files.
The .obj file is the one that gets loaded into the simulator.
Object File Format
LC-3 object file contains
- Starting address (location where program must be loaded), followed by…
- Machine instructions

Example
- Beginning of “count character” object file looks like this:

0011000000000000    .ORIG x3000
0101010010100000    AND R2, R2, #0
0010011000010001    LD R3, PTR
1111000000100011    TRAP x23
.
.
.
Multiple Object Files
An object file is not necessarily a complete program.
- system-provided library routines
- code blocks written by multiple developers

For LC-3 simulator, can load multiple object files into memory, then start executing at a desired address.
- system routines, such as keyboard input, are loaded automatically
  • loaded into “system memory,” below x3000
  • user code should be loaded between x3000 and xFDFF
- each object file includes a starting address
- be careful not to load overlapping object files
Linking and Loading
Loading is the process of copying an executable image into memory.
- more sophisticated loaders are able to relocate images to fit into available memory
- must readjust branch targets, load/store addresses

Linking is the process of resolving symbols between independent object files.
- suppose we define a symbol in one module, and want to use it in another
- some notation, such as .EXTERNAL, is used to tell assembler that a symbol is defined in another module
- linker will search symbol tables of other modules to resolve symbols and complete code generation before loading
Input & Output
News from the NYTimes (June ’96)
“When a computer runs out of [RAM memory], modern operating systems automatically use the memory on the hard drive. But today’s hard drives retrieve data at speeds of about 10 milliseconds (millionths of a second). That seems fast until you consider that modern RAM can do this at 60 nanoseconds (billionths of a second), more than 150 times as fast.”
What’s wrong with this statement??
I/O Device Examples
Device             Behavior          Partner   Data Rate (KB/sec)
Keyboard           Input             Human     0.01
Mouse              Input             Human     0.02
Laser Printer      Output            Human     1000
Graphics Display   Output            Human     30,000
Network-LAN        Input or Output   Machine   200-1,000,000
Floppy disk        Storage           Machine   50
CD-ROM (1x)        Storage           Machine   150
DVD-ROM (1x)       Storage           Machine   1352
Magnetic Disk      Storage           Machine   100,000
Flash Memory       Storage           Machine   30,000
Time Line
[Figure: time line on a logarithmic scale from 10^-10 to 10^0, with markers labelled 1 second, 1 minute, 1 hour, 1 day, 1 month and 1 year. Scale by 31,557,600.]
Speed Line
[Figure: events marked on a logarithmic time axis from 10^-10 to 10^0:
- Time for light to travel 30 cm
- One clock period at 2 GHz
- Execute one instruction (best case)
- Cache hit time
- Cache miss time (memory access time)
- Read 1 byte from disk
- Transfer 1 char at 56K baud
- Time for sound to travel 30 cm
- One disk revolution (6-8 ms)
- Total disk access time]
Synchronisation
- What happens if you try to print a file on a printer already in use?
- What happens if you try to read a character before it’s typed?
- What happens to a sequence of characters you type in before you read them?
- What happens if you send characters to a printer faster than it can accept them?
I/O Devices are Cantankerous
- Many I/O devices have a mechanical component
- They are very slow relative to electronic speeds
- They respond when they’re ready, not necessarily when it’s convenient
- They may not be willing to wait forever for their input (overrun)
- The CPU is the slave: it must synchronize
Input & Output
Lecture 17, 21Apr10:
I/O: Connecting to Outside World
So far, we’ve learned how to:
- compute with values in registers
- load data from memory to registers
- store data from registers to memory

But where does data in memory come from?
And how does data get out of the system so that humans can use it?
I/O: Connecting to the Outside World
Types of I/O devices characterized by:
- behavior: input, output, storage
  • input: keyboard, motion detector, network interface
  • output: monitor, printer, network interface
  • storage: disk, CD-ROM
- data rate: how fast can data be transferred?
  • keyboard: 100 bytes/sec
  • disk: 30 MB/s
  • network: 1 Mb/s - 1 Gb/s
I/O Controller
Control/Status Registers
- CPU tells device what to do -- write to control register
- CPU checks whether task is done -- read status register

Data Registers
- CPU transfers data to/from device

Device electronics
- performs actual operation
  • pixels to screen, bits to/from disk, characters from keyboard

[Figure: CPU connected to a graphics controller (Control/Status register, Output Data register, electronics), which drives the display.]
Programming Interface
How are device registers identified?
- Memory-mapped vs. special instructions

How is timing of transfer managed?
- Asynchronous vs. synchronous

Who controls transfer?
- CPU (polling) vs. device (interrupts)
Memory-Mapped vs. I/O Instructions
Instructions
- designate opcode(s) for I/O
- register and operation encoded in instruction

Memory-mapped
- assign a memory address to each device register
- use data movement instructions (LD/ST) for control and data transfer
Transfer Timing
I/O events generally happen much slower than CPU cycles.

Synchronous
- data supplied at a fixed, predictable rate
- CPU reads/writes every X cycles

Asynchronous
- data rate less predictable
- CPU must synchronize with device, so that it doesn’t miss data or write too quickly
Transfer Control
Who determines when the next data transfer occurs?

Polling
- CPU keeps checking status register until new data arrives OR device ready for next data
- “Are we there yet? Are we there yet? Are we there yet?”

Interrupts
- Device sends a special signal to CPU when new data arrives OR device ready for next data
- CPU can be performing other tasks instead of polling device.
- “Wake me when we get there.”
LC-3
Memory-mapped I/O (Table A.3)

Location   I/O Register                    Function
xFE00      Keyboard Status Reg (KBSR)      Bit [15] is one when keyboard has received a new character.
xFE02      Keyboard Data Reg (KBDR)        Bits [7:0] contain the last character typed on keyboard.
xFE04      Display Status Register (DSR)   Bit [15] is one when device ready to display another char on screen.
xFE06      Display Data Register (DDR)     Character written to bits [7:0] will be displayed on screen.

Asynchronous devices
- synchronized through status registers

Polling and Interrupts
- the details of interrupts will be discussed in Chapter 10
Input from Keyboard
When a character is typed:
- its ASCII code is placed in bits [7:0] of KBDR (bits [15:8] are always zero)
- the “ready bit” (KBSR[15]) is set to one
- keyboard is disabled -- any typed characters will be ignored

When KBDR is read:
- KBSR[15] is set to zero
- keyboard is enabled

[Figure: KBDR holds the keyboard data in bits [7:0]; KBSR holds the ready bit in bit [15].]
Basic Input Routine
[Flowchart: Polling. Test “new char?”; on NO, test again; on YES, read character.]

POLL    LDI   R0, KBSRPtr
        BRzp  POLL
        LDI   R0, KBDRPtr
        ...
KBSRPtr .FILL xFE00
KBDRPtr .FILL xFE02
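The handshake between the polling loop and the keyboard registers can be modelled in Python. This is a sketch only: the class `Keyboard` and the function `poll_read` are hypothetical names, and the poll limit is an addition so that the example terminates when no key is pressed (the LC-3 loop itself spins forever).

```python
class Keyboard:
    """Toy model of the memory-mapped keyboard registers."""
    KBSR, KBDR = 0xFE00, 0xFE02

    def __init__(self):
        self.ready = False
        self.data = 0

    def key_press(self, ch):
        if self.ready:
            return                     # keyboard disabled: character ignored
        self.data = ord(ch) & 0xFF     # ASCII code in bits [7:0]
        self.ready = True              # ready bit KBSR[15] set

    def read(self, addr):
        if addr == self.KBSR:
            return 0x8000 if self.ready else 0x0000
        if addr == self.KBDR:
            self.ready = False         # reading KBDR clears the ready bit
            return self.data
        raise ValueError("not a keyboard register")


def poll_read(kbd, max_polls=1000):
    """Mirror of POLL: spin on KBSR[15], then read KBDR."""
    for _ in range(max_polls):
        if kbd.read(Keyboard.KBSR) & 0x8000:   # BRzp POLL falls through
            return kbd.read(Keyboard.KBDR)
    return None                                # gave up (sketch only)


kbd = Keyboard()
kbd.key_press("a")
kbd.key_press("b")          # ignored: 'a' has not been read yet
print(chr(poll_read(kbd)))
```

Testing bit 15 with a mask plays the role of the BRzp test: a word with bit 15 set is negative in two's complement.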
Simple Implementation: Memory-Mapped Input
Address Control Logic determines whether MDR is loaded from Memory or from KBSR/KBDR.
Output to Monitor
When Monitor is ready to display another character:
- the “ready bit” (DSR[15]) is set to one

When data is written to Display Data Register:
- DSR[15] is set to zero
- character in DDR[7:0] is displayed
- any other character data written to DDR is ignored (while DSR[15] is zero)

[Figure: DDR holds the output data in bits [7:0]; DSR holds the ready bit in bit [15].]
Basic Output Routine
[Flowchart: Polling. Test “screen ready?”; on NO, test again; on YES, write character.]

POLL    LDI   R1, DSRPtr
        BRzp  POLL
        STI   R0, DDRPtr
        ...
DSRPtr  .FILL xFE04
DDRPtr  .FILL xFE06
Simple Implementation: Memory-Mapped Output
Sets LD.DDR or selects DSR as input.
Keyboard Echo Routine
Usually, input character is also printed to screen.
- User gets feedback on character typed and knows it’s OK to type the next character.

[Flowchart: test “new char?” until YES, then read character; test “screen ready?” until YES, then write character.]

POLL1   LDI   R0, KBSRPtr
        BRzp  POLL1
        LDI   R0, KBDRPtr
POLL2   LDI   R1, DSRPtr
        BRzp  POLL2
        STI   R0, DDRPtr
        ...
KBSRPtr .FILL xFE00
KBDRPtr .FILL xFE02
DSRPtr  .FILL xFE04
DDRPtr  .FILL xFE06
Interrupt-Driven I/O
External device can:
(1) Force currently executing program to stop;
(2) Have the processor satisfy the device’s needs; and
(3) Resume the stopped program as if nothing happened.

Why?
- Polling consumes a lot of cycles, especially for rare events – these cycles can be used for more computation.
- Example: Process previous input while collecting current input. (See Example 8.1 in text.)
Interrupt-Driven I/O
To implement an interrupt mechanism, we need:
- A way for the I/O device to signal the CPU that an interesting event has occurred.
- A way for the CPU to test whether the interrupt signal is set and whether its priority is higher than the current program.

Generating Signal
- Software sets “interrupt enable” bit in device register.
- When ready bit is set and IE bit is set, interrupt is signaled.

[Figure: KBSR with the ready bit in bit [15] and the interrupt enable bit in bit [14]; together they produce the interrupt signal to the processor.]
Priority
Every instruction executes at a stated level of urgency.
LC-3: 8 priority levels (PL0-PL7)
- Example:
  • Payroll program runs at PL0.
  • Nuclear power correction program runs at PL6.
- It’s OK for PL6 device to interrupt PL0 program, but not the other way around.

Priority encoder selects highest-priority device, compares to current processor priority level, and generates interrupt signal if appropriate.
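The decision made by the priority encoder can be written down as a small Python sketch. It only illustrates the rule stated on these slides (a device requests an interrupt when its ready and interrupt-enable bits are both set, and the request is granted only if its priority is higher than the running program's). The function name and the tuple layout are hypothetical.

```python
def interrupt_level(devices, current_pl):
    """Return the priority level of the interrupt to take, or None.

    devices: iterable of (ready_bit, interrupt_enable_bit, priority_level)
    current_pl: priority level (0..7) of the running program
    """
    # A device signals only when both its ready bit and IE bit are set.
    pending = [pl for ready, ie, pl in devices if ready and ie]
    if not pending:
        return None
    highest = max(pending)                 # priority encoder picks the highest
    return highest if highest > current_pl else None


# PL6 device may interrupt a PL0 program, but not the other way around.
print(interrupt_level([(True, True, 6)], current_pl=0))
print(interrupt_level([(True, True, 0)], current_pl=6))
```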
Testing for Interrupt Signal
CPU looks at signal between STORE and FETCH phases.
If not set, continues with next instruction.
If set, transfers control to interrupt service routine.

[Figure: instruction cycle F, D, EA, OP, EX, S. After S, test “interrupt signal?”: NO continues with F; YES transfers to ISR.]

More details in Chapter 10.
Full Implementation of LC-3 Memory-Mapped I/O
Because of interrupt enable bits, status registers (KBSR/DSR) must be written, as well as read.
Subroutines & Traps Lecture 19, 26Apr10:
System Calls
Certain operations require specialized knowledge and protection:
- specific knowledge of I/O device registers and the sequence of operations needed to use them
- I/O resources shared among multiple users/programs; a mistake could affect lots of other users!

Not every programmer knows (or wants to know) this level of detail

Provide service routines or system calls (part of operating system) to safely and conveniently perform low-level, privileged operations
System Call
1. User program invokes system call.
2. Operating system code performs operation.
3. Returns control to user program.

In LC-3, this is done through the TRAP mechanism.
LC-3 TRAP Mechanism
1. A set of service routines.
  - part of operating system -- routines start at arbitrary addresses (convention is that system code is “below” x3000)
  - up to 256 routines
2. Table of starting addresses.
  - stored at x0000 through x00FF in memory
  - called System Control Block in some architectures
3. TRAP instruction.
  - used by program to transfer control to operating system
  - 8-bit trap vector names one of the 256 service routines
4. A linkage back to the user program.
  - want execution to resume immediately after the TRAP instruction
TRAP Instruction
Trap vector
- identifies which system call to invoke
- 8-bit index into table of service routine addresses
  • in LC-3, this table is stored in memory at x0000 – x00FF
  • 8-bit trap vector is zero-extended into 16-bit memory address

Where to go
- lookup starting address from table; place in PC

How to get back
- save address of next instruction (current PC) in R7
TRAP
NOTE: PC has already been incremented during instruction fetch stage.
RET (JMP R7)
How do we transfer control back to instruction following the TRAP?
We saved old PC in R7.
- JMP R7 gets us back to the user program at the right spot.
- LC-3 assembly language lets us use RET (return) in place of “JMP R7”.

Must make sure that service routine does not change R7, or we won’t know where to return.
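The TRAP and RET steps can be traced with a small Python sketch. It is an illustration under stated assumptions: the machine state is a plain dictionary, the function names are hypothetical, and the table entry (vector x21 pointing at x0430) is taken from the output service routine example in these notes.

```python
def trap(state, memory, trapvect8):
    """TRAP: save return address in R7, then load PC from the vector table."""
    state["R7"] = state["PC"]                 # PC already points past the TRAP
    state["PC"] = memory[trapvect8 & 0xFF]    # zero-extended 8-bit vector


def ret(state):
    """RET is JMP R7: resume right after the TRAP instruction."""
    state["PC"] = state["R7"]


memory = {0x21: 0x0430}               # table entry for the OUT service routine
state = {"PC": 0x3001, "R7": 0}       # TRAP x21 at x3000 has just been fetched

trap(state, memory, 0x21)
print(hex(state["PC"]), hex(state["R7"]))
ret(state)
print(hex(state["PC"]))
```

If the service routine overwrote R7 between the two calls, `ret` would jump to the wrong place, which is why the routine saves and restores it.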
TRAP Mechanism Operation
1. Lookup starting address.
2. Transfer to service routine.
3. Return (JMP R7).
Example: Using the TRAP Instruction
        .ORIG x3000
        LD    R2, TERM      ; Load negative ASCII ‘7’
        LD    R3, ASCII     ; Load ASCII difference
AGAIN   TRAP  x23           ; input character
        ADD   R1, R2, R0    ; Test for terminate
        BRz   EXIT          ; Exit if done
        ADD   R0, R0, R3    ; Change to lowercase
        TRAP  x21           ; Output to monitor...
        BRnzp AGAIN
TERM    .FILL xFFC9         ; negative ASCII ‘7’ (-x37)
ASCII   .FILL x0020         ; lowercase bit
EXIT    TRAP  x25           ; halt
        .END
Example: Output Service Routine
          .ORIG x0430         ; syscall address (stored in table, location x21)
          ST    R7, SaveR7    ; save R7 & R1
          ST    R1, SaveR1
          …
; ----- Write character
TryWrite  LDI   R1, CRTSR     ; get status
          BRzp  TryWrite      ; look for bit 15 on
WriteIt   STI   R0, CRTDR     ; write char
          …
; ----- Return from TRAP
Return    LD    R1, SaveR1    ; restore R1 & R7
          LD    R7, SaveR7
          RET                 ; back to user
CRTSR     .FILL xF3FC
CRTDR     .FILL xF3FF
SaveR1    .FILL 0
SaveR7    .FILL 0
          .END
TRAP Routines and their Assembler Names
vector   symbol   routine
x20      GETC     read a single character (no echo)
x21      OUT      output a character to the monitor
x22      PUTS     write a string to the console
x23      IN       print prompt to console, read and echo character from keyboard
x25      HALT     halt the program
Saving and Restoring Registers
Must save the value of a register if:
- Its value will be destroyed by service routine, and
- We will need to use the value after that action.

Who saves?
- caller of service routine?
  • knows what it needs later, but may not know what gets altered by called routine
- called service routine?
  • knows what it alters, but does not know what will be needed later by calling routine
Example
        LEA   R3, Binary
        LD    R6, ASCII     ; char->digit template
        LD    R7, COUNT     ; initialize to 10
AGAIN   TRAP  x23           ; Get char
        ADD   R0, R0, R6    ; convert to number
        STR   R0, R3, #0    ; store number
        ADD   R3, R3, #1    ; incr pointer
        ADD   R7, R7, #-1   ; decr counter
        BRp   AGAIN         ; more?
        BRnzp NEXT
ASCII   .FILL xFFD0
COUNT   .FILL #10
Binary  .BLKW #10

What’s wrong with this routine? What happens to R7?
Saving and Restoring Registers
Called routine -- “callee-save”
- Before start, save any registers that will be altered (unless altered value is desired by calling program!)
- Before return, restore those same registers

Calling routine -- “caller-save”
- Save registers destroyed by own instructions or by called routines (if known), if values needed later
  • save R7 before TRAP
  • save R0 before TRAP x23 (input character)
- Or avoid using those registers altogether

Values are saved by storing them in memory.
Question
Can a service routine call another service routine?
If so, is there anything special the calling service routine must do?
What about User Code?
Service routines provide three main functions:
1. Shield programmers from system-specific details.
2. Write frequently-used code just once.
3. Protect system resources from malicious/clumsy programmers.

Are there any reasons to provide the same functions for non-system (user) code?
Subroutines
A subroutine is a program fragment that:
- lives in user space
- performs a well-defined task
- is invoked (called) by another user program
- returns control to the calling program when finished

Like a service routine, but not part of the OS
- not concerned with protecting hardware resources
- no special privilege required

Reasons for subroutines:
- reuse useful (and debugged!) code without having to keep typing it in
- divide task among multiple programmers
- use vendor-supplied library of useful routines
JSR Instruction
Jumps to a location (like a branch but unconditional), and saves current PC (addr of next instruction) in R7.
- saving the return address is called “linking”
- target address is PC-relative (PC + Sext(IR[10:0]))
- bit 11 specifies addressing mode
  • if =1, PC-relative: target address = PC + Sext(IR[10:0])
  • if =0, register: target address = contents of register IR[8:6]
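The address computation for both modes can be sketched in Python. This is an illustration only; `sext` and `jsr_target` are hypothetical helper names, and `pc` is taken to be the already incremented PC.

```python
def sext(value, bits):
    """Sign-extend the low `bits` bits of value to a Python int."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def jsr_target(ir, pc, regs):
    """Target address of JSR/JSRR given the instruction word and incremented PC."""
    if (ir >> 11) & 1:                              # bit 11 = 1: PC-relative (JSR)
        return (pc + sext(ir & 0x7FF, 11)) & 0xFFFF
    return regs[(ir >> 6) & 0x7]                    # bit 11 = 0: base register (JSRR)


regs = [0] * 8
regs[2] = 0x5000
print(hex(jsr_target(0x4FFE, 0x3005, regs)))   # JSR with offset -2
print(hex(jsr_target(0x4080, 0x3005, regs)))   # JSRR R2
```

An 11-bit signed offset spans -1024 to +1023, which is where the "within 1024 instructions" restriction on JSR comes from.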
JSR
NOTE: PC has already been incremented during instruction fetch stage.
JSRR Instruction
Just like JSR, except Register addressing mode.
- target address is Base Register
- bit 11 specifies addressing mode

What important feature does JSRR provide that JSR does not?
JSRR
NOTE: PC has already been incremented during instruction fetch stage.
Returning from a Subroutine
RET (JMP R7) gets us back to the calling routine.
- just like TRAP
Example: Negate the value in R0
2sComp  NOT   R0, R0        ; flip bits
        ADD   R0, R0, #1    ; add one
        RET                 ; return to caller

To call from a program (within 1024 instructions):

        ; need to compute R4 = R1 - R3
        ADD   R0, R3, #0    ; copy R3 to R0
        JSR   2sComp        ; negate
        ADD   R4, R1, R0    ; add to R1
        ...

Note: Caller should save R0 if we’ll need it later!
Passing Information to/from Subroutines
Arguments
- A value passed in to a subroutine is called an argument.
- This is a value needed by the subroutine to do its job.
- Examples:
  • In 2sComp routine, R0 is the number to be negated.
  • In OUT service routine, R0 is the character to be printed.
  • In PUTS routine, R0 is address of string to be printed.

Return Values
- A value passed out of a subroutine is called a return value.
- This is the value that you called the subroutine to compute.
- Examples:
  • In 2sComp routine, negated value is returned in R0.
  • In GETC service routine, character read from the keyboard is returned in R0.
Using Subroutines
In order to use a subroutine, a programmer must know:
- its address (or at least a label that will be bound to its address)
- its function (what does it do?)
  • NOTE: The programmer does not need to know how the subroutine works, but what changes are visible in the machine’s state after the routine has run.
- its arguments (where to pass data in, if any)
- its return values (where to get computed data, if any)
Saving and Restoring Registers
Since subroutines are just like service routines, we also need to save and restore registers, if needed.

Generally use “callee-save” strategy, except for return values.
- Save anything that the subroutine will alter internally that shouldn’t be visible when the subroutine returns.
- It’s good practice to restore incoming arguments to their original values (unless overwritten by return value).

Remember: You MUST save R7 if you call any other subroutine or service routine (TRAP).
- Otherwise, you won’t be able to return to caller.
Library Routines
Vendor may provide object files containing useful subroutines
- don’t want to provide source code -- intellectual property
- assembler/linker must support EXTERNAL symbols (or starting address of routine must be supplied to user)

        ...
        .EXTERNAL SQRT
        ...
        LD    R2, SQAddr    ; load SQRT addr
        JSRR  R2
        ...
SQAddr  .FILL SQRT

Using JSRR, because we don’t know whether SQRT is within 1024 instructions.
Chapter 10 And, Finally... The Stack
Stack: An Abstract Data Type
An important abstraction that you will encounter in many applications.
We will describe three uses:

Interrupt-Driven I/O
- The rest of the story…

Evaluating arithmetic expressions
- Store intermediate results on stack instead of in registers

Data type conversion
- 2’s comp binary to ASCII strings
Stacks
A LIFO (last-in first-out) storage structure.
- The first thing you put in is the last thing you take out.
- The last thing you put in is the first thing you take out.

This means of access is what defines a stack, not the specific implementation.

Two main operations:
PUSH: add an item to the stack
POP: remove an item from the stack
A Physical Stack
Coin rest in the arm of an automobile
First quarter out is the last quarter in.
State                      Quarters (top first)
Initial State              (empty)
After One Push             1995
After Three More Pushes    1996, 1998, 1982, 1995
After One Pop              1998, 1982, 1995
A Hardware Implementation
Data items move between registers

State                      Empty?   Stack contents (TOP first)
Initial State              Yes      (none)
After One Push             No       #18
After Three More Pushes    No       #12, #5, #31, #18
After Two Pops             No       #31, #18
A Software Implementation
Data items don't move in memory, just our idea about where the TOP of the stack is.

State                      Stack contents (TOP first)   R6 (TOP)
Initial State              (empty)                      x4000
After One Push             #18                          x3FFF
After Three More Pushes    #12, #5, #31, #18            x3FFC
After Two Pops             #31, #18                     x3FFE

After the two pops, #12 and #5 are still in memory at x3FFC and x3FFD, but they are no longer part of the stack.
By convention, R6 holds the Top of Stack (TOS) pointer.
Basic Push and Pop Code
For our implementation, stack grows downward (when item added, TOS moves closer to 0)

Push
        ADD   R6, R6, #-1   ; decrement stack ptr
        STR   R0, R6, #0    ; store data (R0)

Pop
        LDR   R0, R6, #0    ; load data from TOS
        ADD   R6, R6, #1    ; increment stack ptr
Pop with Underflow Detection
If we try to pop too many items off the stack, an underflow condition occurs.
- Check for underflow by checking TOS before removing data.
- Return status code in R5 (0 for success, 1 for underflow)

POP     LD    R1, EMPTY     ; EMPTY = -x4000
        ADD   R2, R6, R1    ; Compare stack pointer
        BRz   FAIL          ; with x4000
        LDR   R0, R6, #0
        ADD   R6, R6, #1
        AND   R5, R5, #0    ; SUCCESS: R5 = 0
        RET
FAIL    AND   R5, R5, #0    ; FAIL: R5 = 1
        ADD   R5, R5, #1
        RET
EMPTY   .FILL xC000
Push with Overflow Detection
If we try to push too many items onto the stack, an overflow condition occurs.
- Check for overflow by checking TOS before adding data.
- Return status code in R5 (0 for success, 1 for overflow)

PUSH    LD    R1, MAX       ; MAX = -x3FFB
        ADD   R2, R6, R1    ; Compare stack pointer
        BRz   FAIL          ; with x3FFB
        ADD   R6, R6, #-1
        STR   R0, R6, #0
        AND   R5, R5, #0    ; SUCCESS: R5 = 0
        RET
FAIL    AND   R5, R5, #0    ; FAIL: R5 = 1
        ADD   R5, R5, #1
        RET
MAX     .FILL xC005
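The behaviour of the two checked routines can be modelled in Python. This is a sketch, not the LC-3 code itself: the class name `Stack` is hypothetical, memory is a dictionary, and the bounds are the ones implied by the constants above (R6 = x4000 when empty, R6 = x3FFB when full, so five slots from x3FFF down to x3FFB).

```python
class Stack:
    """Toy model of the R6-based stack with overflow/underflow status codes."""
    BASE = 0x4000    # value of R6 when the stack is empty
    LIMIT = 0x3FFB   # value of R6 when the stack is full

    def __init__(self):
        self.mem = {}
        self.r6 = self.BASE

    def push(self, value):
        """Return 0 on success, 1 on overflow (like R5 in PUSH)."""
        if self.r6 == self.LIMIT:
            return 1
        self.r6 -= 1                       # ADD R6, R6, #-1
        self.mem[self.r6] = value & 0xFFFF # STR R0, R6, #0
        return 0

    def pop(self):
        """Return (value, status); status 1 means underflow (like R5 in POP)."""
        if self.r6 == self.BASE:
            return None, 1
        value = self.mem[self.r6]          # LDR R0, R6, #0
        self.r6 += 1                       # ADD R6, R6, #1
        return value, 0


s = Stack()
for item in (18, 31, 5, 12):
    s.push(item)
print(hex(s.r6))     # stack pointer after four pushes
print(s.pop())       # last item pushed comes out first
```

As in the software implementation slide, popping does not erase memory; only the stack pointer moves.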
Stacks Lecture 21, 29Apr10:
Stack Implementation Details
In the example, the first location (largest address) is never used
• Push: Decrement SP, then store
• Pop: Load using SP, then increment
Notice that SP always points to the top element on the stack
• Unless it is empty
Alternative implementation
• Push: Store using SP, then decrement SP
• Pop: Increment SP, then load
In the first scheme, SP points to the top element
• But it points to an invalid address when the stack is empty
• That location is never used!
In the second scheme, SP points to the first free location in the stack
• But it points to an invalid address when the stack is full
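A short Python sketch makes the difference between the two schemes concrete. The function name `run`, the constant `BASE`, and the dictionary memory are hypothetical names chosen for illustration.

```python
BASE = 0x4000  # hypothetical initial SP (largest address of the stack region)

def run(scheme, values):
    """Push values under scheme 1 or scheme 2; return (memory, final SP)."""
    mem, sp = {}, BASE
    for v in values:
        if scheme == 1:      # decrement SP, then store
            sp -= 1
            mem[sp] = v
        else:                # store using SP, then decrement
            mem[sp] = v
            sp -= 1
    return mem, sp

mem1, sp1 = run(1, [18, 31])
mem2, sp2 = run(2, [18, 31])
print(hex(sp1), sorted(hex(a) for a in mem1))  # BASE itself is never written
print(hex(sp2), sorted(hex(a) for a in mem2))  # SP is the first free location
```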
Interrupt-Driven I/O (Part 2)
Interrupts were introduced in Chapter 8.
1. External device signals need to be serviced.
2. Processor saves state and starts service routine.
3. When finished, processor restores state and resumes program.
Chapter 8 didn’t explain how (2) and (3) occur, because it involves a stack.
Now, we’re ready…
Interrupt is an unscripted subroutine call, triggered by an external event.
Processor State
What state is needed to completely capture the state of a running process?
Processor Status Register
• Privilege [15], Priority Level [10:8], Condition Codes [2:0]
Program Counter
• Pointer to next instruction to be executed.
Registers
• All temporary state of the process that’s not stored in memory.
Where to Save Processor State?
Can’t use registers.
• Programmer doesn’t know when interrupt might occur, so she can’t prepare by saving critical registers.
• When resuming, need to restore state exactly as it was.
Memory allocated by service routine?
• Must save state before invoking routine, so we wouldn’t know where.
• Also, interrupts may be nested – that is, an interrupt service routine might also get interrupted!
Use a stack!
• Location of stack “hard-wired”.
• Push state to save, pop to restore.
Supervisor Stack
A special region of memory used as the stack for interrupt service routines.
• Initial Supervisor Stack Pointer (SSP) stored in Saved.SSP.
• Another register for storing User Stack Pointer (USP): Saved.USP.
Want to use R6 as stack pointer.
• So that our PUSH/POP routines still work.
When switching from User mode to Supervisor mode (as result of interrupt), save R6 to Saved.USP.
Invoking the Service Routine – The Details
1. If Priv = 1 (user), Saved.USP = R6, then R6 = Saved.SSP.
2. Push PSR and PC to Supervisor Stack.
3. Set PSR[15] = 0 (supervisor mode).
4. Set PSR[10:8] = priority of interrupt being serviced.
5. Set PSR[2:0] = 0.
6. Set MAR = x01vv, where vv = 8-bit interrupt vector provided by interrupting device (e.g., keyboard = x80).
7. Load memory location (M[x01vv]) into MDR.
8. Set PC = MDR; now first instruction of ISR will be fetched.
Note: This all happens between the STORE RESULT of the last user instruction and the FETCH of the first ISR instruction.
Returning from Interrupt
Special instruction – RTI – that restores state.
1. Pop PC from supervisor stack. (PC = M[R6]; R6 = R6 + 1)
2. Pop PSR from supervisor stack. (PSR = M[R6]; R6 = R6 + 1)
3. If PSR[15] = 1, R6 = Saved.USP. (If going back to user mode, need to restore User Stack Pointer.)
RTI is a privileged instruction.
• Can only be executed in Supervisor Mode.
• If executed in User Mode, causes an exception. (More about that later.)
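The entry sequence and RTI can be traced with a small Python model. This is a sketch, not the LC-3 microarchitecture: the class `LC3`, the initial stack pointers (xFE00 and x3000), the PSR value, and the vectors xB0 and xC0 are all assumptions chosen for illustration. The program addresses follow the nested-interrupt scenario used in these slides (Program A at x3006, service routines at x6200 and x6300).

```python
USER = 0x8000  # PSR[15] = 1 means user mode

class LC3:
    def __init__(self, pc, psr, usp, ssp):
        self.mem = {}                         # sparse memory: address -> word
        self.pc, self.psr, self.r6 = pc, psr, usp
        self.saved_usp, self.saved_ssp = 0, ssp

    def push(self, word):
        self.r6 -= 1
        self.mem[self.r6] = word

    def pop(self):
        word = self.mem[self.r6]
        self.r6 += 1
        return word

    def interrupt(self, vector, priority):
        if self.psr & USER:                   # 1. switch to supervisor stack
            self.saved_usp, self.r6 = self.r6, self.saved_ssp
        self.push(self.psr)                   # 2. push PSR, then PC
        self.push(self.pc)
        self.psr &= 0x7FFF                    # 3. PSR[15] = 0
        self.psr = (self.psr & 0xF8FF) | (priority << 8)  # 4. PSR[10:8]
        self.psr &= 0xFFF8                    # 5. PSR[2:0] = 0
        self.pc = self.mem[0x0100 | vector]   # 6-8. PC = M[x01vv]

    def rti(self):
        self.pc = self.pop()                  # 1. pop PC
        self.psr = self.pop()                 # 2. pop PSR
        if self.psr & USER:                   # 3. returning to user mode
            self.r6 = self.saved_usp

cpu = LC3(pc=0x3007, psr=0x8002, usp=0xFE00, ssp=0x3000)
cpu.mem[0x01B0] = 0x6200          # hypothetical vector entry for Device B
cpu.mem[0x01C0] = 0x6300          # hypothetical vector entry for Device C
cpu.interrupt(0xB0, priority=4)   # Device B interrupts Program A
cpu.pc = 0x6203                   # ISR B is executing the AND at x6202
cpu.interrupt(0xC0, priority=5)   # Device C interrupts ISR B
cpu.rti()                         # RTI in ISR C: back in ISR B
cpu.rti()                         # RTI in ISR B: back in Program A
print(hex(cpu.pc), hex(cpu.psr), hex(cpu.r6))
```

Because PSR is pushed before PC, RTI has to pop PC first and PSR second.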
Example (1)
Executing ADD at location x3006 when Device B interrupts.
[Figure: Program A with ADD at x3006; PC = x3006; supervisor stack empty, with Saved.SSP marking its top]
Example (2)
Saved.USP = R6. R6 = Saved.SSP. Push PSR and PC onto stack, then transfer to Device B service routine (at x6200).
[Figure: supervisor stack holds PSR for A and x3007, with R6 pointing at x3007; PC = x6200; ISR for Device B runs from x6200 to the RTI at x6210]
Example (3)
Executing AND at x6202 when Device C interrupts.
[Figure: supervisor stack unchanged (PSR for A, x3007), with R6 pointing at x3007; PC = x6203]
Example (4)
Push PSR and PC onto stack, then transfer to Device C service routine (at x6300).
[Figure: supervisor stack holds PSR for A, x3007, PSR for B, x6203, with R6 pointing at x6203; PC = x6300; ISR for Device C runs from x6300 to the RTI at x6315]
Example (5)
Execute RTI at x6315; pop PC and PSR from stack.
[Figure: PC = x6203; R6 points at x3007 again; the popped values x6203 and PSR for B are still in memory beyond the top of the stack]
Example (6)
Execute RTI at x6210; pop PC and PSR from stack. Restore R6. Continue Program A as if nothing happened.
[Figure: PC = x3007; supervisor stack is empty again, with Saved.SSP marking its top]
Real Processors: Alpha, MIPS & the X86
Lecture 22, 3May10:
Jun-3-10 CS210 340
What’s So Great About the ALPHA?
1. It’s real (well, it once was)
2. It’s the best/fastest/cleanest (really!)
3. “A design to last 25 years” (uhhh…)
Ideas Same in LC-3 & MIPS & Alpha
• von Neumann computer
  – Implemented with finite-state machines
  – Performs same basic fetch/execute cycle
• Fixed-length instructions (32 bits) [see x86]
• General-purpose registers
  – 2^n registers
  – Load/Store architecture [see x86]
• JSR/RET
• TRAP (CALLSYS [Alpha], SYSCALL [MIPS])
History of Digital Equipment Corporation (DEC)
• Founded in 1957
• PDP-8 (1964) 12-bit computer
• PDP-11 (1970) 16-bit computer
• VAX (1976) 32-bit computer
• Alpha
  – EV4: 1992; 192MHz
  – EV5: 1995; 333MHz
  – EV6: 1998; 450MHz (eventually 1.25GHz)
  – EV7: 2003; 1.15GHz
• DEC bought by Compaq (later bought by HP): 1998
• Alpha IP sold to Intel: 2001
• Intel phased out Alpha in favour of Itanium: 2004
Beyond a Byte
• The Alpha is a 64-bit computer
• Registers (32) are 64 bits wide
• Instructions are 32 bits
• Addresses can be up to 55 bits (2^54 ≈ 18 quadrillion bytes of memory)
• Operate instructions exist for
  – Bytes (8 bits)
  – Words (16 bits)
  – Longwords (32 bits)
  – Quadwords (64 bits)
• Load/store instructions exist for different sized operands
  – lb/stb (byte)
  – lw/sw (word)
  – ll/stl (longword)
  – lq/stq (quadword)
  – Smaller operands go into least significant bits of register
• Floating point: 64 more registers; more operations
Instruction Format
The MIPS Architecture
References
Good starting point for the MIPS architecture: http://en.wikipedia.org/wiki/MIPS_architecture
• Very nice summary of architecture
• Lots of pointers to other material
Read (more) about the MIPS architecture
• http://www.mrc.uidaho.edu/mrc/people/jff/digital/MIPSir.html
  – MIPS Instruction reference
• http://www.xs4all.nl/~vhouten/mipsel/r3000-isa.html
  – Student paper summarizing MIPS Instruction Set
• http://www.langens.eu/tim/ea/mips_en.php
  – Lots of MIPS documentation
• http://chortle.ccsu.edu/AssemblyTutorial/TutorialContents.html
  – Tutorial on MIPS Assembly Language
• http://www.cs.wisc.edu/~larus/HP_AppA.pdf
  – Patterson & Hennessy (CS 313 textbook) Appendix A: SPIM, a MIPS simulator (pdf)
The LC-3 Instructions
The Alpha Instructions
The MIPS Computer
[Figure: block diagram of the MIPS computer. Labels: CPU, PC, IR, CTRL, Memory, MAR, MDR, Read, Write, Input, Output, I/O, OP, ALU, RF, Hi/Lo, Flt.Pt. Regs]
Registers
• 32 general registers
  – $0 - $31; also names
  – $0 is special
    – when read, gives zero
    – writing has no effect
  – $31 sometimes implicit in instruction
• 16/32 floating-point registers
  – $fgr0-$fgr31 32-bit floating-point registers
  – Can be configured as 16 64-bit registers
• Special-purpose registers
  – Hi/Lo (multiplication/division)
  – Floating-point control/status registers
Pseudoinstructions
Some “instructions” are not implemented in the hardware, but are synthesised from two or more real instructions. These instructions are recognized by the assembler and automatically synthesised.
For purposes of this class, we will generally ignore the distinction.
Categories of Instructions
1. Arithmetic/Logical [LC-3: Operate Instructions]
   a. Arithmetic
   b. Logical
   c. Shift
   d. Compare [LC-3 equivalent?]
2. Control
   a. Branch on condition
   b. Jump
   c. Special
3. Data transfer
   a. Load
   b. Store
   c. Move (copy)
   d. Load address
1a. Arithmetic Instructions
ADD, SUB, MUL, DIV, REM
Two sources, one destination (can be common)
• Form: add D,S1,S2    D ← S1 + S2
  – D, S1 are registers.
  – S2 can be a register or an immediate, i.e., a value contained in the instruction.
• Multiple operand sizes (8, 16, 32, 64 bits)
• Signed and unsigned arithmetic
  – add (signed)
  – addu (unsigned)
  – Difference: unsigned never overflows
• Overflow
  – Addition & subtraction: only one bit
  – Multiplication: none because result is twice as big
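The overflow rules can be checked with a short Python sketch on 32-bit patterns. The helper names `add32` and `mult32` are hypothetical; they model the arithmetic only, not the MIPS instructions' trap behaviour.

```python
MASK32 = 0xFFFFFFFF

def add32(a, b):
    """Add two 32-bit patterns; return (result, signed_overflow)."""
    result = (a + b) & MASK32              # keep the low 32 bits
    sa, sb, sr = a >> 31 & 1, b >> 31 & 1, result >> 31 & 1
    # Signed overflow: operands share a sign and the result's sign differs.
    overflow = (sa == sb) and (sr != sa)
    return result, overflow

def mult32(a, b):
    """Unsigned 32 x 32 multiply; the 64-bit product is split as (hi, lo)."""
    product = a * b
    return product >> 32, product & MASK32

print(add32(0x7FFFFFFF, 1))   # largest positive + 1: signed overflow
print(add32(0xFFFFFFFF, 1))   # -1 + 1 = 0: no signed overflow
print(mult32(0xFFFFFFFF, 0xFFFFFFFF))
```

The product of two 32-bit values always fits in 64 bits, which is why a multiply that keeps both halves cannot overflow.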
1b. Logical Instructions
Instructions: AND, OR, XOR, NOR, NOT
Two sources (one for NOT), one destination
Form: and D,S1,S2    D ← S1 AND S2
• D, S1 are registers.
• S2 can be a register or an immediate, i.e., a value contained in the instruction.
Multiple operand sizes (8, 16, 32, 64 bits)
Overflow: none
LC-3 Logical Operations
Possible Functions of A, B
A B | f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15
0 0 |  0  1  0  1  0  1  0  1  0  1   0   1   0   1   0   1
0 1 |  0  0  1  1  0  0  1  1  0  0   1   1   0   0   1   1
1 0 |  0  0  0  0  1  1  1  1  0  0   0   0   1   1   1   1
1 1 |  0  0  0  0  0  0  0  0  1  1   1   1   1   1   1   1
LC-3 provides AND (f8) and NOT (f3 is NOT A, f5 is NOT B).
MIPS Logical Operations
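Possible Functions of A, B
A B | f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15
0 0 |  0  1  0  1  0  1  0  1  0  1   0   1   0   1   0   1
0 1 |  0  0  1  1  0  0  1  1  0  0   1   1   0   0   1   1
1 0 |  0  0  0  0  1  1  1  1  0  0   0   0   1   1   1   1
1 1 |  0  0  0  0  0  0  0  0  1  1   1   1   1   1   1   1
MIPS provides AND (f8), NOT (f3 is NOT A, f5 is NOT B), OR (f14), NOR (f1), and XOR (f6).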
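To see where each named operation sits among the sixteen possible functions, the columns can be indexed from 0 at the left, so that the index is the binary number formed by the four outputs with the A = 0, B = 0 row as the least significant bit. The Python below is a sketch of that bookkeeping; `ROWS`, `column_index`, and `named` are illustrative names.

```python
ROWS = [(0, 0), (0, 1), (1, 0), (1, 1)]   # (A, B) in the table's row order

def column_index(f):
    """Index of the table column whose four outputs match f(A, B)."""
    return sum(f(a, b) << row for row, (a, b) in enumerate(ROWS))

named = {
    "AND":   lambda a, b: a & b,
    "OR":    lambda a, b: a | b,
    "XOR":   lambda a, b: a ^ b,
    "NOR":   lambda a, b: 1 - (a | b),
    "NOT A": lambda a, b: 1 - a,
}

for name, f in named.items():
    print(name, column_index(f))
```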
1c. Shift Operations
Form: sll D,S,AMT
AMT is a count, equivalent to AMT shifts by 1 place.
There are three types of shift operations:
• logical (srl, sll)
• arithmetic (sra, sll)
• rotate (rr)
Shift Operations
Right Rotate Operation:
[Figure: bits move from msb toward lsb; the bit leaving the lsb re-enters at the msb]
• No information lost
• For N-bit word, rotate right N positions has no effect
• Rotate right i positions is same as rotate left N – i positions
• Not implemented in MIPS (why not?)
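One way to see why a rotate instruction is not essential is that a rotate can be synthesised from two shifts and an OR. The Python sketch below assumes a 32-bit word by default; `rotate_right` and `rotate_left` are hypothetical helper names.

```python
def rotate_right(x, i, n=32):
    """Rotate the n-bit value x right by i positions."""
    i %= n
    mask = (1 << n) - 1
    return ((x >> i) | (x << (n - i))) & mask

def rotate_left(x, i, n=32):
    """Rotate the n-bit value x left by i positions."""
    i %= n
    mask = (1 << n) - 1
    return ((x << i) | (x >> (n - i))) & mask

x = 0x80000001
print(hex(rotate_right(x, 1)))                   # lsb wraps around to the msb
print(rotate_right(x, 32) == x)                  # N positions: no effect
print(rotate_right(x, 4) == rotate_left(x, 28))  # right i == left N - i
```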
Logical Shift Operations
Right Logical Shift Operation:
[Figure: 0 enters at the msb; the bit shifted out of the lsb is discarded]
MIPS instruction: srl
Java equivalent: >>>
Logical Shift Operations
Left Logical Shift Operation:
[Figure: 0 enters at the lsb; the bit shifted out of the msb is discarded]
MIPS instruction: sll
Java equivalent: <<
Arithmetic Shift Operations
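Right Arithmetic Shift Operation
• Signed integer division by power of 2
  – Round down (toward negative infinity)
• MIPS instruction: sra
• Java equivalent: >>
  – same as integer division by power of 2???
[Figure: the msb (sign bit) is copied back into the msb as bits move toward the lsb; the bit shifted out of the lsb is discarded]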
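The question of whether an arithmetic right shift matches integer division can be explored in Python, whose `>>` on integers is an arithmetic shift. This is a sketch: `srl`, `sra`, and `trunc_div` are hypothetical helpers, and `trunc_div` stands in for a division that rounds toward zero, as Java's `/` does.

```python
def srl(x, amt, n=32):
    """Logical right shift of an n-bit value: zeros enter at the msb."""
    return (x & ((1 << n) - 1)) >> amt

def sra(x, amt):
    """Arithmetic right shift: the sign is preserved, result rounds down."""
    return x >> amt

def trunc_div(a, b):
    """Integer division that rounds toward zero."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

print(sra(-7, 1), trunc_div(-7, 2))   # the two disagree for negative odd values
print(sra(7, 1), trunc_div(7, 2))     # and agree for non-negative values
print(hex(srl(-8, 1)))                # logical shift of a negative 32-bit pattern
```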
Arithmetic Shift Operations
Left Arithmetic Shift Operation
• Signed integer multiplication by power of 2
  – Overflow if MSB changes
MIPS instruction: sll (no sla)
Java equivalent: ‘* 2^i’
[Figure: 0 enters at the lsb; the bit shifted out of the msb is discarded]
Same as logical left shift!
2a. Control Instructions
Basic instruction for choosing alternate instruction path:
• Branch on condition: bne R1,R2,L1
  – True if R1, R2 are unequal
• Possible tests
  – beq : R1 = R2 ?
  – bne : R1 ≠ R2 ?
  – bgt : R1 > R2 ?
  – blt : R1 < R2 ?
  – bge : R1 ≥ R2 ?
  – ble : R1 ≤ R2 ?
  – b : Unconditional
Other MIPS Control Instructions
2b. Jump
j        Unconditional, large/unlimited range
jal      Unconditional, but save address for return
2c. Special
syscall  Invoke operating system
break    Invoke operating system
rfe      Return from exception
LC-3: 4 Load Instructions
Really 4 addressing modes:
• LEA Rd, Label ; Rd = PC + SEXT(PCoffset9)
• LD Rd, Label ; Rd = mem[PC + SEXT(PCoffset9)]
• LDI Rd, Label ; Rd = mem[mem[PC + SEXT(PCoffset9)]]
• LDR Rd, Rb, offset6 ; Rd = mem[Rb + SEXT(offset6)]
MIPS: 1 Load Instruction (others are synthesized)
• LW Rd, offset16(Rb) ; Rd = mem[Rb + SEXT(offset16)]
Load Reg, Disp(Base)
Effective address: (Base) + Displacement
• Base specifies the content of a register
• Displacement is a 16-bit signed constant, sign-extended
• Displacement defines position relative to Base
[Figure: instruction layout: lw (6 bits) | Base (5 bits) | Dest (5 bits) | Displacement (16 bits)]
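The effective-address rule can be written out as a short Python sketch. The helper names `sext16` and `effective_address` are hypothetical, and a 32-bit address space is assumed.

```python
def sext16(value):
    """Sign-extend a 16-bit field to a Python integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def effective_address(base, disp16):
    """(Base) + sign-extended Displacement, wrapped to 32 bits."""
    return (base + sext16(disp16)) & 0xFFFFFFFF

print(hex(effective_address(0x10000000, 0x0008)))  # 8 bytes above Base
print(hex(effective_address(0x10000000, 0xFFFC)))  # displacement of -4
```

Because the displacement is signed, a load can reach roughly 32 KB on either side of the address held in the base register.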
Real Processors: Alpha, MIPS & the X86
Lecture 23, 5May10:
MIPS: Variations of Load: Byte, Halfword, Word, Longword
• LB Rd, offset16(Rb) ; Rd = byte at mem[Rb + SEXT(offset16)]
• LH Rd, offset16(Rb) ; Rd = halfword at mem[Rb + SEXT(offset16)]
• LW Rd, offset16(Rb) ; Rd = word at mem[Rb + SEXT(offset16)]
• LL Rd, offset16(Rb) ; Rd = longword at mem[Rb + SEXT(offset16)]
• LUI Rd, constant ; Rd = constant<<16
Insert Constant
LC-3: Can insert 5-bit constant with AND immediate
MIPS: Can insert 16-bit constant with AND or OR immediate
• But MIPS has 32/64 bits
• How to insert constant in upper bits?
Load Upper Immediate: LUI
• Places the 16-bit immediate in the upper 16 bits and clears the lower 16 bits
Matched with another instruction, LUI provides an arbitrary 32-bit load address with two instructions (but not a 64-bit load)
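The two-instruction idiom can be modelled in Python. This sketch pairs LUI with an OR immediate, which is one common choice for the second instruction; the function names `lui` and `ori` simply mirror the mnemonics and model the register values only.

```python
def lui(imm16):
    """Load upper immediate: imm16 goes to bits 31..16, low 16 bits are cleared."""
    return (imm16 & 0xFFFF) << 16

def ori(reg, imm16):
    """OR immediate: the 16-bit immediate is zero-extended, then ORed in."""
    return reg | (imm16 & 0xFFFF)

upper = lui(0x1234)          # 0x12340000
value = ori(upper, 0x5678)   # fills in the low 16 bits
print(hex(upper), hex(value))
```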
3. Memory Instructions
3a. Load byte, half-word, word, longword
lb … Load sign-extended byte
lbu … Load zero-extended byte
lh … Load sign-extended two bytes (halfword)
…
3b. Store byte, half-word, word, longword
sb … Store byte
sh … Store halfword
sw … Store word
…
Views of Memory
MIPS/Alpha/X86 View of Memory
Also true for longwords (64 bits)
[Figure: the same memory viewed as an array of bytes (addresses 000, 001, 002, …), halfwords (000, 002, 004, …), or words (000, 004, 008, …)]
Size of box: 8 bits
Byte Order
Little Endian
[Figure: the same three views of memory. Byte numbering from most to least significant: 11 10 01 00 within a word, 1 0 within a halfword]
Byte Order
Big Endian
[Figure: the same three views of memory. Byte numbering from most to least significant: 00 01 10 11 within a word, 0 1 within a halfword]
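The two byte orders can be compared directly in Python with `int.to_bytes`, which lays an integer out in memory order. This is only an illustration; the value 0x11223344 is an arbitrary example.

```python
word = 0x11223344

little = word.to_bytes(4, "little")   # least significant byte at the lowest address
big = word.to_bytes(4, "big")         # most significant byte at the lowest address

print([f"{b:02x}" for b in little])
print([f"{b:02x}" for b in big])

# Reading the same four bytes back with the wrong byte order scrambles the value.
print(hex(int.from_bytes(little, "big")))
```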
Views of Memory
An array of longwords (little-endian)
[Figure: memory as an array of longwords at addresses 000, 008, 010, 018, …, marking an aligned doubleword at 008 and an unaligned doubleword at 01f]
Unaligned bytes
[Figure: three 64-bit values with byte boundaries marked at bits 63, 55, 47, 39, 31, 23, 15, 7, and 0]
Instruction Format
R-Type: Operate (3 registers)
    | 0 0 0 0 0 0 | Src Reg 1 | Src Reg 2 | Dest Reg | 0 0 0 0 0 | Operation |
I-Type: Operate (2 registers + Immediate)
    | Op code | Src Reg | Dest Reg | Immediate |
I-Type: Branch on condition
    | Op code | Reg 1 | Reg 2 | Offset |
J-Type: Jump
    | Op code | Target |
I-Type: Load/Store
    | Op code | Base | Dest Reg | Offset |
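Packing the fields into a 32-bit instruction word can be sketched in Python. The field widths used here are the standard MIPS ones (6-bit op code, 5-bit register fields, 5-bit shift amount, 6-bit operation, 16-bit immediate, 26-bit target); the helper names and register constants are chosen for illustration.

```python
def encode_r(rs, rt, rd, funct, shamt=0):
    """R-Type: op code 0, two sources, destination, shift amount, operation."""
    return (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

def encode_i(op, rs, rt, imm):
    """I-Type: op code, two registers, 16-bit immediate or offset."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

def encode_j(op, target):
    """J-Type: op code and 26-bit target."""
    return (op << 26) | (target & 0x03FFFFFF)

T0, T1, T2, SP = 8, 9, 10, 29             # MIPS register numbers

add = encode_r(T1, T2, T0, funct=0x20)    # add $t0, $t1, $t2
lw = encode_i(0x23, SP, T0, 4)            # lw  $t0, 4($sp)
print(f"{add:08x} {lw:08x}")
```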
Interesting ideas in MIPS/Alpha not in the LC-3
• Shift instruction
• Subtract, multiply, divide/mod
  – Also square root (floating point)
• Logical operations
• Operands of size other than 16 bits
• Operands of size other than register
• Alignment issues (big- vs. little-endian)
• Branch test using register value or compare
• No condition code
• “Zero” register
• Fewer addressing modes (!)
• Clean separation between instructions and data
• Virtual memory
From LC-3 to x86
• Appendix B from Introduction to Computing Systems: from bits & gates to C and beyond, by Yale Patt and Sanjay Patel, 2nd Edition (2004), McGraw-Hill
• The material in Appendix B.1, pp. 547-557 will not be included in the test, but will be covered on the final exam.
X-86 History
• 1974: Intel i8080 (8 bits)
• 1979: 8086/8088 (16 bits)
• 1982: 80286 (16 bits)
• 1985: 80386 (32 bits)
• 1989: 80486 (32 bits)
• 1992: Pentium (32 bits)
• 1995: PentiumPro (32 bits)
• 1997: Pentium II (32 bits)
• 1999: Pentium III (32 bits)
• 2001: Pentium 4 (32 bits)
• 2006: Xeon Woodcrest (64-bit)
• 2006: Dual-core Xeon (32-bit)
• 2006: Quad-core Clovertown (32-bit)
• 2008: Nehalem (64-bit, 4 cores)
Data Types
• Integer
  – 2’s complement
  – Unsigned
• BCD Integer (string of 4-bit digits stored in bytes)
• Packed BCD Integer (string of 4-bit digits)
• Floating point (IEEE standard)
• Bit string
• MMX
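The difference between the two BCD formats is easiest to see by encoding the same number both ways. The Python below is a sketch that assumes digit 0 (the least significant decimal digit) occupies the lowest bits; `encode_bcd` is a hypothetical helper.

```python
def encode_bcd(n, bits_per_digit):
    """Place each decimal digit of n in its own field of bits_per_digit bits."""
    value, shift = 0, 0
    while True:
        value |= (n % 10) << shift   # digit 0 lands in the lowest bits
        n //= 10
        shift += bits_per_digit
        if n == 0:
            return value

unpacked = encode_bcd(1234, 8)   # BCD Integer: one 4-bit digit per byte
packed = encode_bcd(1234, 4)     # Packed BCD: two digits per byte
print(hex(unpacked), hex(packed))
```

In the packed form the hexadecimal rendering of the encoded value reads the same as the decimal number.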
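[Figure: x86 data formats]
Integer: 8, 16, or 32 bits, with the sign bit S in the most significant position
Unsigned Integer: 8, 16, or 32 bits
Floating Point: three formats, each with S, exponent, and fraction fields, whose most significant bits are bit 31, bit 63, and bit 79
Bit String: starts at bit 0 of address X and runs to the last bit through X + 1, X + 2, X + 3, X + 4, …; the length of the bit string is given separately
MMX Data Type: 64 bits, viewed as four 16-bit elements (element 3 to element 0) or eight 8-bit elements (7 to element 0)
BCD Integer: one digit per byte, with digit 0 starting at bit 0, digit 1 at bit 8, digit 2 at bit 16, up to digit N
Packed BCD: digit 0 starting at bit 0, digit 1 at bit 4, digit 2 at bit 8, up to digit N
Also 64-bit!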
Opcodes
• Several hundred
• Usually one byte; sometimes two bytes
• Variable-length instructions
• Many formats
• Two operands: one may come from memory
• Many, inconsistent addressing modes
Instruction Fields
• Prefix
  – Indicates some form of modification of the instruction
• Opcode
• Mode
  – Indicates addressing mode(s) to follow
• SIB (scale, index, base)
  – Optional
  – Indicates memory addressing information
• Displacement
  – Optional
  – Indicates offset for memory address
• Immediate
  – Optional
  – Contains value