languages and the machine
![Page 1: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/1.jpg)
Languages and the Machine
Chapter 5
CS221
![Page 2: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/2.jpg)
Topics
• The Compilation Process
• The Assembly Process
• Linking and Loading
• Macros
• We will skip
    – Case Study: Extensions to the Instruction Set – The Intel MMX™ and Motorola AltiVec™ SIMD Instructions
![Page 3: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/3.jpg)
Compilation Process
• Translating assembly to machine code is fairly straightforward, but compilation is not
• Compilation translates a program written in a high-level language into a functionally equivalent program in assembly language
• Consider a simple high-level language assignment statement:
    – Foo = Bar + Zot + 15;
• Steps involved in compiling this statement into assembly code:
    – Lexical analysis: separate the statement into tokens: Foo, =, +, etc.
    – Syntactic analysis / parsing: determine that we are performing an assignment, VAR = EXPRESSION
    – Semantic analysis: determine that Foo, Bar, Zot are names and 15 is an integer
    – Code generation: determine the proper assembly code to perform the action
        • ld [Bar], %r0, %r1
        • ld [Zot], %r0, %r2
        • addcc %r1, %r2, %r1
        • addcc %r1, 15, %r2
        • st %r2, %r0, [Foo]
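The lexical analysis step can be illustrated with a minimal sketch in C. The names `tokenize`, `struct token`, and the three token kinds are hypothetical, chosen only for this illustration; a real compiler's lexer handles many more cases (multi-character operators, comments, string literals).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds for this sketch only. */
enum token_kind { TOK_NAME, TOK_INT, TOK_SYMBOL };

struct token {
    enum token_kind kind;
    char text[16];
};

/* Split src into at most max tokens; returns the number of tokens found. */
size_t tokenize(const char *src, struct token *out, size_t max)
{
    size_t n = 0;
    while (*src != '\0' && n < max) {
        if (isspace((unsigned char)*src)) {
            src++;                       /* whitespace only separates tokens */
            continue;
        }
        size_t len = 0;
        struct token *t = &out[n++];
        if (isalpha((unsigned char)*src)) {
            t->kind = TOK_NAME;          /* a name such as Foo, Bar, Zot */
            while (isalnum((unsigned char)src[len])) len++;
        } else if (isdigit((unsigned char)*src)) {
            t->kind = TOK_INT;           /* an integer literal such as 15 */
            while (isdigit((unsigned char)src[len])) len++;
        } else {
            t->kind = TOK_SYMBOL;        /* single-character symbol: = + ; */
            len = 1;
        }
        assert(len < sizeof t->text);    /* sketch assumes short tokens */
        memcpy(t->text, src, len);
        t->text[len] = '\0';
        src += len;
    }
    return n;
}
```

Calling `tokenize("Foo = Bar + Zot + 15;", toks, 16)` yields eight tokens: three names, one integer, and four symbols. The parser would then recognize the VAR = EXPRESSION shape from that token stream.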
![Page 4: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/4.jpg)
Compiler Issues
• Each compiler is specific to a particular ISA
    – E.g., an int on one machine may be 32 bits, on another may be 64 bits
        • Cause of an error in a networking library ported to Alpha
        • The int issue is not a problem in Java; the JVM specifies 32 bits
    – E.g., in the previous example, if the ISA allowed the operands of addcc to be memory addresses, we could have done
        • addcc [Bar], [Zot], %r1
        • addcc %r1, 15, [Foo]
    – Hopefully the compiler generates efficient code, but optimization is a tough issue!
• Cross compiler: one that runs on one ISA but generates code for a different ISA (example: CodeWarrior)
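The int-size issue can be made concrete with a small C sketch. The function names `int_bits` and `fixed_bits` are hypothetical; the point is only that the width of plain `int` depends on the compiler and ISA, while `int32_t` from `<stdint.h>` is 32 bits wherever it is provided.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Width in bits of plain int for the machine this is compiled for.
   The C standard only guarantees at least 16. */
size_t int_bits(void)
{
    return sizeof(int) * CHAR_BIT;
}

/* Width in bits of int32_t: exactly 32 on every platform that has the type. */
size_t fixed_bits(void)
{
    return sizeof(int32_t) * CHAR_BIT;
}
```

Code that must exchange data between machines (as a networking library does) is safer using the fixed-width types than assuming a particular size for `int`.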
![Page 5: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/5.jpg)
Mapping Variables to Memory
• Global variables
    – Accessible from anywhere in the program, given a fixed address
    – E.g., global variable X at memory address 400
• Local variables
    – Also called automatic variables
    – Defined inside a function or method, e.g.
        void foo()
        {
            int a,b;
            …
        }
    – These variables are created when foo is invoked and destroyed when foo exits
    – They are created by pushing them on the stack when the function is invoked, and are popped off when the function exits
![Page 6: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/6.jpg)
Local Variables and the Stack
• Recall that the stack typically grows downward in memory
• Here we start with 1234 stored on the top of the stack

Mem before the push (SP = 8)
    Addr 0:
    Addr 4:
    Addr 8: 1234    <- SP
    …

Push FFFF

Mem after the push (SP = 4)
    Addr 0:
    Addr 4: FFFF    <- SP
    Addr 8: 1234
    …
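The push shown here can be simulated in a few lines of C. This is a sketch under assumptions: `struct machine`, `push`, and `pop` are hypothetical names, memory is modeled as a tiny array of 32-bit words, and the values 1234 and FFFF are treated as hexadecimal.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 4                 /* models byte addresses 0, 4, 8, 12 */

struct machine {
    uint32_t mem[MEM_WORDS];        /* mem[i] is the word at byte address 4*i */
    uint32_t sp;                    /* byte address of the top of the stack */
};

/* Push: the stack grows downward, so move SP down one word, then store. */
void push(struct machine *m, uint32_t value)
{
    assert(m->sp >= 4);             /* no room left below address 0 */
    m->sp -= 4;
    m->mem[m->sp / 4] = value;
}

/* Pop: read the word at the top of the stack, then move SP back up. */
uint32_t pop(struct machine *m)
{
    assert(m->sp / 4 < MEM_WORDS);  /* stack must not be empty */
    uint32_t value = m->mem[m->sp / 4];
    m->sp += 4;
    return value;
}
```

Starting with 0x1234 at address 8 and SP = 8, `push(&m, 0xFFFF)` leaves SP = 4 with 0xFFFF at address 4, and 0x1234 is untouched at address 8.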
![Page 7: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/7.jpg)
Local Variables and the Stack
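• In our case, local variables are “pushed” on the stack upon entering the function
    – void foo() { int a; }
• On entry, SP is copied into the Frame Pointer FP (also called the Base Pointer, or BP) before the locals are pushed, so FP keeps the value SP had on entry

Mem before Foo (SP = 8)
    Addr 0:
    Addr 4:
    Addr 8: 1234    <- SP
    …

Mem in Foo (SP = 4, FP = 8)
    Addr 0:
    Addr 4: Var a   <- SP
    Addr 8: 1234    <- FP
    …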
![Page 8: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/8.jpg)
Accessing Stack Variables
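• These variables are referenced as offsets from the frame pointer, called based addressing
• To access a: [%fp - 4]

Mem in Foo (SP = 4, FP = 8)
    Addr 0:
    Addr 4: Var a   <- SP
    Addr 8: 1234    <- FP
    …

• Why not use [%sp]? Consider pushing lots of stuff on the stack, or data structures: SP moves with every push and pop, so the offset of a from SP keeps changing, while its offset from FP stays fixed for the whole function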
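A small C simulation shows why the frame pointer gives a stable reference. It is only a sketch: `struct frame_demo`, `enter_foo`, `addr_of_a`, and `push_word` are hypothetical names, and it starts with SP = 16 rather than 8 so that there is room to push an extra word after entering the function.

```c
#include <assert.h>
#include <stdint.h>

struct frame_demo {
    uint32_t mem[8];    /* mem[i] models the word at byte address 4*i */
    uint32_t sp;        /* stack pointer (byte address) */
    uint32_t fp;        /* frame pointer (byte address) */
};

/* Function entry: copy SP into FP, then reserve one word for local a. */
void enter_foo(struct frame_demo *m)
{
    assert(m->sp >= 4 && m->sp <= 4 * 8);
    m->fp = m->sp;
    m->sp -= 4;
}

/* Local a always lives at [%fp - 4], no matter what is pushed later. */
uint32_t addr_of_a(const struct frame_demo *m)
{
    return m->fp - 4;
}

/* Push an extra word, moving SP but not FP. */
void push_word(struct frame_demo *m, uint32_t value)
{
    assert(m->sp >= 4);
    m->sp -= 4;
    m->mem[m->sp / 4] = value;
}
```

Right after `enter_foo`, SP and `addr_of_a` agree (both 12). After one `push_word`, SP has dropped to 8, so `[%sp]` no longer names a, while `[%fp - 4]` still does.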
![Page 9: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/9.jpg)
C to ASM Example on x86
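C source:

    #include <stdio.h>
    int c;
    int main()
    {
        int a,b;
        a=3;
        b=4;
        c=a*b;
    }

Generated x86 assembly (AT&T syntax):

    pushl %ebp              # save the caller's frame pointer
    movl %esp, %ebp         # set up the frame pointer for main
    subl $8, %esp           # reserve 8 bytes for a and b
    movl $3, -4(%ebp)       # a = 3
    movl $4, -8(%ebp)       # b = 4
    movl -4(%ebp), %eax     # eax = a
    imull -8(%ebp), %eax    # eax = a * b
    movl %eax, c            # c = eax
    …
    .comm c,4,4             # reserve 4 bytes, 4-byte aligned, for global c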
![Page 10: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/10.jpg)
Arrays in Memory
• Arrays may be allocated on the stack or allocated from the heap, a pool of memory from which portions may be dynamically allocated. Accessing the elements of an array is a bit different from accessing regular variables.
• int A[10]; Array of 10 integers

Mem allocated for A (A (Base) = 4)
    Addr 0:
    Addr 4:  A[0]
    Addr 8:  A[1]
    …
    Addr 40: A[9]

ElementAddr = A + (Index*Size)
e.g. A[2] is at 4 + (2*4) = 12
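The address formula can be written directly as a C function. The name `element_addr` is hypothetical; the arithmetic is the same ElementAddr = A + (Index*Size) computation, which is what the compiler generates for an indexing expression such as A[2].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte address of element `index` in an array that starts at `base`
   and whose elements are `size` bytes each. */
uintptr_t element_addr(uintptr_t base, size_t index, size_t size)
{
    return base + index * size;
}
```

With the numbers from the slide, `element_addr(4, 2, 4)` is 12 and `element_addr(4, 9, 4)` is 40, the address of the last element A[9].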
![Page 11: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/11.jpg)
If-Statements
• Conditional statements map to a comparison and a branch instruction
• C
    – if (x==y) statement1; else statement2;
• Assembly (assume X in r1, Y in r2)
                    subcc %r1, %r2, %r0     ! Zero flag set if res=0; result discarded in %r0
                    bne Statement2          ! Branch if zero flag is not set
                    ! Statement1 code
                    ba StatementNext        ! Branch always
    Statement2:     ! Statement2 code
    StatementNext:
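The same control flow can be written in C using goto, which mirrors the branch structure of the assembly one-for-one. This is an illustrative sketch; the function `if_lowered` and its return codes are hypothetical stand-ins for statement1 and statement2.

```c
#include <assert.h>

/* Returns 1 if the "statement1" path ran, 2 if the "statement2" path ran. */
int if_lowered(int x, int y)
{
    int which;
    if (x != y)                 /* subcc sets the flags; bne branches when x - y != 0 */
        goto statement2;
    which = 1;                  /* statement1 code */
    goto statement_next;        /* ba: skip over the else part */
statement2:
    which = 2;                  /* statement2 code */
statement_next:
    return which;
}
```

Note that the test is inverted relative to the C source: the code branches when the condition is false, so that the "then" part can fall through.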
![Page 12: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/12.jpg)
Loops
• While, Do-While, and For loops are implemented using the same conditional check and branch as the if-then statement
    – The difference is that the branch jumps back to earlier code instead of jumping forward over code
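As a sketch of this, here is a while loop written in C with explicit gotos, so the backward branch is visible. The function `sum_below` is a hypothetical example that computes 0 + 1 + … + (n-1), equivalent to `while (i < n) { sum += i; i++; }`.

```c
#include <assert.h>

int sum_below(int n)
{
    int i = 0;
    int sum = 0;
loop_top:
    if (!(i < n))               /* conditional check: leave the loop when it fails */
        goto loop_done;
    sum += i;                   /* loop body */
    i++;
    goto loop_top;              /* branch back to earlier code */
loop_done:
    return sum;
}
```

The only structural difference from the if-statement pattern is the `goto loop_top`, which targets a label that appears earlier in the code.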
![Page 13: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/13.jpg)
Production Level Assemblers
• Allow the programmer to specify the location of data and code
• Provide mnemonics for all instructions and addressing modes
• Permit the use of symbolic labels to represent addresses and constants
• Provide a means to specify the starting address of the program
• Include a way to share variables between different assembled programs
• Support macros
![Page 14: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/14.jpg)
Assembly Example
![Page 15: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/15.jpg)
Assembled Code
![Page 16: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/16.jpg)
Two Pass Assemblers
• Most assemblers are “two-pass”
    – First pass
        • Determine addresses of all data and instructions
        • Perform any assembly-time arithmetic
        • Put definitions and constants into the symbol table
    – Second pass
        • Generate machine code
        • Insert actual addresses and values of symbols which are known from the symbol table
    – Two passes are useful for forward references, i.e., references to symbols defined later on in the program
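A toy version of the two passes can be sketched in C. Everything here is a simplifying assumption made for illustration: `struct line`, `struct symbol`, `pass1`, and `pass2` are hypothetical names, every line is assumed to occupy one 4-byte word, and "generating machine code" is reduced to resolving the address each line refers to.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct line {
    const char *label;     /* label defined on this line, or NULL */
    const char *target;    /* label this line refers to, or NULL */
};

struct symbol {
    const char *name;
    unsigned addr;
};

/* Pass 1: assign an address to every line and record each label in the
   symbol table. table must have room for n entries. Returns entry count. */
size_t pass1(const struct line *prog, size_t n, unsigned origin,
             struct symbol *table)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (prog[i].label != NULL) {
            table[count].name = prog[i].label;
            table[count].addr = origin + 4u * (unsigned)i;
            count++;
        }
    }
    return count;
}

/* Pass 2: replace each referenced label with its address from the table.
   resolved[i] is 0 for lines with no reference. Returns -1 if a label is
   undefined, 0 otherwise. */
int pass2(const struct line *prog, size_t n, const struct symbol *table,
          size_t count, unsigned *resolved)
{
    for (size_t i = 0; i < n; i++) {
        resolved[i] = 0;
        if (prog[i].target == NULL)
            continue;
        size_t s = 0;
        while (s < count && strcmp(table[s].name, prog[i].target) != 0)
            s++;
        if (s == count)
            return -1;
        resolved[i] = table[s].addr;
    }
    return 0;
}
```

If the first line of a three-line program refers to a label `done` that is defined on the third line, a single pass could not fill in the address when it reaches line one. With two passes, pass 1 records `done` at origin + 8, and pass 2 then resolves the forward reference.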
![Page 17: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/17.jpg)
Forward Reference
![Page 18: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/18.jpg)
Symbol Table
• Generated during the first pass
• Maps identifiers to values; the table is filled in as values are encountered and the program is parsed from top to bottom
• .org 2048 ; Says assemble code starting at 2048
• const .equ value ; Defines const equal to value
![Page 19: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/19.jpg)
![Page 20: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/20.jpg)
Assembled Program
![Page 21: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/21.jpg)
Final Tasks of the Assembler
• Linking and Loading
• We need the following additional info
    – Module name and size
    – Address of the start symbol
    – Information about global and external symbols
    – Information about any library routines
    – Values of constants
    – Relocation information
![Page 22: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/22.jpg)
Location of Programs in Memory
• We have been using .org to specify a fixed start location
• Typically we will want programs capable of running in arbitrary locations
    – If we are concatenating together different modules, the addresses for identifiers in the different modules must be relocated
• Linker: software that combines separately assembled modules
• Loader: software that loads another program into memory and may modify addresses if the program is loaded in a location different from the origin
    – Must also set appropriate registers, e.g. %SP
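The address adjustment a loader performs can be sketched in C as follows. This assumes a very simple, hypothetical module format: an array of assembled words plus a parallel array of flags saying which words hold relocatable addresses. The function name `relocate` is likewise an assumption for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Adjust every relocatable word by the distance between the address the
   module was assembled for (origin) and where it was actually loaded. */
void relocate(unsigned *words, const unsigned char *relocatable, size_t n,
              unsigned origin, unsigned load_addr)
{
    /* Unsigned arithmetic wraps, so this also works when load_addr < origin. */
    unsigned delta = load_addr - origin;
    for (size_t i = 0; i < n; i++) {
        if (relocatable[i])
            words[i] += delta;   /* an address: moves with the module */
        /* otherwise a constant or fixed address: left unchanged */
    }
}
```

For a module assembled at 2048 and loaded at 4096, a word holding the address 2056 becomes 4104, while a word holding the constant 15 is left alone.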
![Page 23: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/23.jpg)
Linking: .global and .extern
• .global is used in the module in which a symbol is defined, and .extern is used in every other module that refers to it
![Page 24: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/24.jpg)
Linking and Loading
• Symbol tables for previous example
• Symbols whose address might change are marked relocatable (not all addresses! Some may be fixed)
![Page 25: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/25.jpg)
DLLs
• Windows uses Dynamic Link Libraries, or DLLs
• Linking a common routine into many programs results in duplicate code from that common routine in each program
• In a DLL, commonly used routines (e.g. memory management, graphics) are present in only one place, the DLL
    – Smaller program sizes; each program does not need to have its own copy
    – All programs share the exact same code while executing
    – Programs don’t need recompiling or relinking when the DLL is updated
• Disadvantages
    – Deletion of a shared DLL by mistake can cause problems
    – Versions must be the same
    – DLL code file can live in many places in Windows
    – “DLL Hell”
![Page 26: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/26.jpg)
Macros
• An assembly macro looks kind of like a subroutine definition
• For example, say that there is no PUSH instruction to push data on the stack. We can make a macro for push:
![Page 27: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/27.jpg)
Macro Expansion
• Given the previous macro, we could now write the following code:
    push %r15   ! Push r15 on the stack
    push %r20   ! Push r20 on the stack
• Upon assembly, these macros are expanded to generate the following actual code:
    addcc %r14, -4, %r14
    st %r15, %r14
    addcc %r14, -4, %r14
    st %r20, %r14
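The C preprocessor offers a rough analogy to assembly macro expansion. In this sketch, `PUSH`, `stack_mem`, `r14`, and `push_two` are hypothetical names; `r14` plays the role of the stack pointer register, and each use of `PUSH` is textually replaced by the decrement-and-store pair before the code is compiled, just as the assembler expands each push into addcc and st.

```c
#include <assert.h>
#include <stdint.h>

uint32_t stack_mem[16];             /* hypothetical memory: 16 words */
uint32_t r14 = sizeof stack_mem;    /* stack pointer, starts just past the last word */

/* Each use expands in place to: decrement r14 by 4, store value at [r14]. */
#define PUSH(value)                      \
    do {                                 \
        r14 -= 4;                        \
        stack_mem[r14 / 4] = (value);    \
    } while (0)

void push_two(uint32_t a, uint32_t b)
{
    assert(r14 >= 8);                    /* room for two more words */
    PUSH(a);                             /* expands to two statements */
    PUSH(b);                             /* expands to two more: four in total */
}
```

After `push_two(15, 20)` on a fresh stack, `r14` has dropped from 64 to 56, with 15 stored at byte address 60 and 20 at byte address 56. No call or return takes place for the pushes themselves; the expanded code is simply repeated at each use.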
![Page 28: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/28.jpg)
Macros vs. Subroutines
• Later we will see how to write actual subroutines we can call
    – Only one copy of the shared code in a subroutine
• Tradeoffs
    – Subroutines
        • Take up less memory since there is only one copy of the code
        • But slower than macros; subroutines have the overhead of invoking and returning
    – Macros
        • Take up more space than a subroutine call due to macro expansion for each occurrence of the macro
        • Faster than subroutines; no overhead to invoke/return
![Page 29: Languages and the Machine](https://reader036.vdocuments.net/reader036/viewer/2022062517/5681332e550346895d9a2b6a/html5/thumbnails/29.jpg)
Skipping for now
• Discussion on Pentium MMX
• We may return to this later if time permits