C Programming and Assembly Language
Janakiraman V – jramaanv@yahoo.com
NITK Surathkal
2nd August 2014
Motivation
Do you know how all this is implemented in assembly?
Agenda
•Brief introduction to the 8086 processor architecture
•Describe commonly used assembly instructions
•Use of stack and related instructions
•Translate high level function calls into low level assembly language
•Become familiar with the calling conventions
•Explain how variables are passed and accessed
8086 Architecture
•ALU – Arithmetic and Logical unit – The heart of the processor
•Control Unit – Decodes instructions, Controls the execution flow
•Registers – Implicit memory locations within the processor
•Registers – Serve as arguments to most operations
•Flags – Most ALU operations set or clear particular flag bits after execution
Registers
•EAX – Stores integer return values
•ECX – Stores the counters for loops and also holds the "this" pointer in C++ member function calls
•EIP –Instruction pointer. Stores the address of the next instruction to be executed
•ESP – The Stack pointer. Implicitly changed during Call/ Ret instructions.
•EBP – Base pointer. Used to access local variables and function parameters.
Registers Contd…
•EBX – A general purpose register
•ESI– The source index register for string instructions
•EDI - The destination index registers for string instructions
•EFL – Flag register. Stores the flag bits of various flags like Carry, Zero, etc.
•Segment registers point to a segment of memory. They are 16 bits wide: DS, SS, ES, CS (plus FS and GS on the 80386 and later)
•EDX – Stores high 32 bits of 64 bit values
Instruction Set
•Data transfer
•Arithmetic and logical
•Stack Operations
•Branching and Looping
•Function calls
•String Instructions
•Prefix to instructions
Data transfer instructions
MOV Destination, Source – Format
» Data transfer is always from RIGHT to LEFT.
» Source register is unaffected.
LEA – Load effective address.
» Loads the offset address of the specified variable into the destination.
» Equivalent of int* y = &x;
Arithmetic and Logical instructions
•Operation destination, source – Format
»ADD AX, BX
»SUB AX, [BX]
»OR AX, [BX+4]
»XOR AX, AX – Fastest way to clear registers
Exercise 1
Write an assembly program to evaluate the following expression. (All variables are 32 bit integers)
» EAX = x*y + a – b
» EBX =( x^y) | ( a&b)
int x=4, y=6, a=3, b=2;
__asm
{
MOV EAX, x
MUL y
ADD EAX, a
SUB EAX, b
MOV EBX, x
XOR EBX, y
MOV ECX, a
AND ECX, b
OR EBX, ECX
}
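As a cross-check for the inline assembly above, the same two expressions can be written in plain C. This is only a sketch and not part of the original slides; the names `Exercise1Result` and `exercise1` are made up for illustration. It uses unsigned 32-bit arithmetic to mirror what ends up in EAX and EBX (MUL also writes the high half of the product to EDX, which is ignored here).

```c
#include <stdint.h>

/* Hypothetical helper type: holds the values the assembly leaves in EAX and EBX. */
typedef struct {
    uint32_t eax;
    uint32_t ebx;
} Exercise1Result;

/* Plain-C equivalent of Exercise 1 (names are illustrative only). */
static Exercise1Result exercise1(uint32_t x, uint32_t y, uint32_t a, uint32_t b)
{
    Exercise1Result r;
    r.eax = x * y + a - b;       /* MOV / MUL / ADD / SUB; low 32 bits only */
    r.ebx = (x ^ y) | (a & b);   /* XOR, AND, then OR */
    return r;
}
```

With the slide's values (x=4, y=6, a=3, b=2), `exercise1(4, 6, 3, 2)` gives `eax == 25` and `ebx == 2`.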
Branching and Looping
•JMP Addr – Loads EIP with Addr
•Conditional Jumps
» Transfers control based on a condition
» Based on state of one or more flags
» ALU operation sets flags
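To make the link between ALU operations, flags and conditional jumps concrete, here is a small C model of what `CMP a, b` computes. It is an illustration only: the `Flags` struct and the helper functions are hypothetical names, and only three flags (Zero, Sign, Carry) are modeled.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical model of three bits of the flag register. */
typedef struct {
    bool zf;  /* Zero flag: the result was zero */
    bool sf;  /* Sign flag: the top bit of the result is set */
    bool cf;  /* Carry flag: an unsigned borrow occurred */
} Flags;

/* Models CMP a, b: computes a - b, discards the result, keeps the flags. */
static Flags cmp_flags(uint32_t a, uint32_t b)
{
    uint32_t result = a - b;              /* wraps modulo 2^32, as on the CPU */
    Flags f;
    f.zf = (result == 0);
    f.sf = ((result >> 31) & 1u) != 0;
    f.cf = (a < b);
    return f;
}

/* JZ/JE is taken when ZF is set; JNZ/JNE when it is clear. */
static bool jz_taken(Flags f)  { return f.zf; }
static bool jnz_taken(Flags f) { return !f.zf; }

/* JB (unsigned "below") is taken when CF is set. */
static bool jb_taken(Flags f)  { return f.cf; }
```

For example, `jz_taken(cmp_flags(5, 5))` is true, while `jb_taken(cmp_flags(3, 7))` is true because 3 is below 7 when both are treated as unsigned.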
Exercise 2
Write an assembly program to evaluate the expression "z = x * y" using
» Repeated addition
» MUL instruction
Write an assembly program to calculate the string length of a constant string
Multiplication by repeated addition.
int x =9, y=10, z=0;
__asm
{
XOR EAX, EAX
MOV EBX, y
MULT: ADD EAX, x
DEC EBX
JNZ MULT
MOV z, EAX
}
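The loop above has the shape of a C `do ... while`: the body runs once before the counter is tested. A rough C equivalent (the name `mul_by_addition` is made up for this sketch) shows one consequence. With `y == 0` the assembly would decrement EBX to 0xFFFFFFFF and run the loop 2^32 times before EBX reaches zero again, so the sketch adds a guard that the slide's code does not have.

```c
#include <stdint.h>

/* Sketch of "MULT: ADD EAX, x / DEC EBX / JNZ MULT" in C (illustrative name). */
static uint32_t mul_by_addition(uint32_t x, uint32_t y)
{
    uint32_t eax = 0;        /* XOR EAX, EAX */
    uint32_t ebx = y;        /* MOV EBX, y   */

    if (ebx == 0) {          /* guard NOT present in the slide's assembly */
        return 0;
    }
    do {
        eax += x;            /* ADD EAX, x */
        ebx--;               /* DEC EBX    */
    } while (ebx != 0);      /* JNZ MULT   */

    return eax;              /* MOV z, EAX */
}
```

With the slide's values, `mul_by_addition(9, 10)` returns 90.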
String length of a constant string
char* pChar = "Test data";
int len = 0;
__asm
{
MOV EDI, pChar
XOR ECX, ECX
COMPARE: CMP BYTE PTR [EDI], 0 ; compare one byte with the terminator
JNZ INCREASE
JMP DONE
INCREASE: INC ECX
INC EDI
JMP COMPARE
DONE: MOV len, ECX
}
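For comparison, the same pointer walk in plain C. This is a sketch with a made-up function name (`const_strlen`); the local variables are named after the registers they stand in for.

```c
#include <stddef.h>

/* C rendering of the COMPARE / INCREASE / DONE loop (illustrative name). */
static size_t const_strlen(const char *p_char)
{
    const char *edi = p_char;   /* MOV EDI, pChar */
    size_t ecx = 0;             /* XOR ECX, ECX   */

    while (*edi != '\0') {      /* CMP [EDI], 0 (one byte) / JNZ INCREASE */
        ecx++;                  /* INC ECX */
        edi++;                  /* INC EDI */
    }
    return ecx;                 /* MOV len, ECX */
}
```

For the string in the slide, `const_strlen("Test data")` returns 9.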
Stack Operations
PUSH: PUSH EAX
» ESP decreases by 4 (32-bit operand) or 2 (16-bit operand)
» Data is moved on to top of stack
» Used extensively to pass parameters to functions.
POP: POP EAX
» ESP increases by 4 (32-bit operand) or 2 (16-bit operand)
» Data is copied to the destination
» Complement of PUSH
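The effect of PUSH and POP on ESP can be modeled with a byte array and an offset. The sketch below is a toy, not how a compiler or CPU is implemented: `ToyStack`, `toy_push`, `toy_pop` and `toy_swap` are hypothetical names, only 32-bit operands are handled, and there is no overflow checking. `toy_swap` uses the push, push, pop, pop pattern that the next exercise relies on.

```c
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64u

/* Hypothetical toy machine: a byte array for stack memory plus an ESP offset. */
typedef struct {
    uint8_t  mem[STACK_BYTES];
    uint32_t esp;               /* offset of the current top of stack */
} ToyStack;

static void toy_init(ToyStack *s)
{
    memset(s->mem, 0, sizeof s->mem);
    s->esp = STACK_BYTES;       /* empty stack: ESP sits just past the end */
}

/* PUSH: ESP decreases by 4, then the DWORD is stored at [ESP]. */
static void toy_push(ToyStack *s, uint32_t value)
{
    s->esp -= 4;
    memcpy(&s->mem[s->esp], &value, 4);
}

/* POP: the DWORD at [ESP] is copied out, then ESP increases by 4. */
static uint32_t toy_pop(ToyStack *s)
{
    uint32_t value;
    memcpy(&value, &s->mem[s->esp], 4);
    s->esp += 4;
    return value;
}

/* PUSH x, PUSH y, POP x, POP y: the last value pushed is the first popped. */
static void toy_swap(ToyStack *s, uint32_t *x, uint32_t *y)
{
    toy_push(s, *x);
    toy_push(s, *y);
    *x = toy_pop(s);            /* x receives the old y */
    *y = toy_pop(s);            /* y receives the old x */
}
```

After `toy_swap` the ESP offset is back where it started, which is the property every balanced PUSH/POP sequence must have.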
Exercise 3
Write an assembly program to swap two integers x and y.
Write a C program to swap two numbers using a function Swap(int* pX, int* pY). Implement the Swap function directly in assembly language
Swap two integers.
int x=4, y=5;
__asm
{
PUSH x
PUSH y
POP x
POP y
}
Function to swap variables
void swap(int* pX, int* pY)
{
__asm
{
MOV EAX, pX
MOV EBX, pY
PUSH DWORD PTR [EAX]
PUSH DWORD PTR [EBX]
POP DWORD PTR [EAX]
POP DWORD PTR [EBX]
}
}
Function calls
CALL – CALL ADDR
» Used for function calls.
» Implicitly pushes the return address (the EIP of the next instruction) on to the stack.
» Loads EIP with the address specified (ADDR).
RET – RET n
» Used to return to the calling function.
» Implicitly pops the DWORD on the TOS into EIP.
» 'n' specifies the number of bytes to be added to ESP after the return address is popped. Used for stack clean up.
Compile the C program!!
int g_iVar = 5;
int Fn(int x, int y);
void main()
{
int z=0;
z = Fn(2,4);
g_iVar = z;
}
int Fn(int x, int y)
{
int z=0;
z = x + y;
return z;
}
C and assembly language - FAQ
•How are function calls in ‘C’ translated into assembly?
•How are parameters passed to the function?
•What does it mean to say local variables are stored on stack? Scope of local variables!
•How are global variables accessed?
C and Assembly language Contd….
•Cannot pass many parameters in registers
•Scope – Desirable feature
•Stack – Ideal to store local variables
•ESP changes with every PUSH/ POP, so it is not a stable base for accessing the local variables
•EBP is used to access them!!!
Parameters, Local and Global variables
•Before a function is called, parameters are pushed onto the stack (right to left)
•Parameters are accessed by [EBP +n]
•Local variables are accessed by [EBP –n]
•Integers are returned in EAX
•Global variables are accessed by direct address values
Compile the C program Contd…
void main()
{
int z=0;
MOV z, 0
z = Fn(2,4);
PUSH 0x00000004
PUSH 0x00000002
CALL Fn
MOV z, EAX
g_iVar = z;
MOV [g_iVar], EAX
}
int Fn(int x, int y)
{
int z=0;
MOV z, 0
z = x+ y;
MOV EAX, x
ADD EAX, y
MOV z, EAX
return z;
RET
}
Compile the C Program Contd….
CODE SEGMENT – Function – main()
.
int z = 0;
C100 MOV [EBP-4], 0
z = Fn(2,4);
C101 PUSH 0x00000004
C102 PUSH 0x00000002
C103 Call C200
C104 MOV [EBP-4], EAX
g_iVar = z;
C105 MOV [g_iVar], EAX
.
.
STACK SEGMENT
0x00000004
0x00000002
C104 (return address; ESP moves down with each PUSH and with the Call)
0x00000000 local var Z (at [EBP-4])
Compile the C Program Contd….
CODE SEGMENT – Function – Fn()
C200 MOV EBP, ESP
C201 SUB ESP, 0x40
int z=0;
C202 MOV [EBP-4], 0
z = x+ y;
C203 MOV EAX, [EBP+4]
C204 ADD EAX, [EBP+8]
C205 MOV [EBP-4], EAX
return z;
C206 ADD ESP, 0x40
C207 RET
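STACK SEGMENT
0x00000004
0x00000002
C104 (return address; EBP and ESP both point here after C200)
Local variable space (0x40 bytes reserved by C201; ESP points to its end)
0x00000006 Z of Fn() at [EBP-4] (initially 0x00000000)
0x00000000 local var Z of main()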
CODE SEGMENT – Function – main()
.
int z = 0;
C100 MOV [EBP-4], 0
z = Fn(2,4);
C101 PUSH 0x00000004
C102 PUSH 0x00000002
C103 Call C200
C104 MOV [EBP-4], EAX
g_iVar = z;
C105 MOV [g_iVar], EAX
C106 RET
STACK SEGMENT
0x00000004
0x00000002
C104
0x00000000 Local var Z
0x00000006 (written by C104 through [EBP-4])
Stack corruption!!!!!
Fn() changed EBP without saving it, so main() has accessed the stack of the function "Fn()"
Your computer will now REBOOT!!!!!
Compile the C Program Contd….
CODE SEGMENT – Function – Fn()
C200 PUSH EBP
C202 MOV EBP, ESP
C203 SUB ESP, 0x40
int z=0;
C204 MOV [EBP-4], 0
z = x+ y;
C205 MOV EAX, [EBP+8]
C206 ADD EAX, [EBP+12]
C207 MOV [EBP-4], EAX
return z;
C208 ADD ESP, 0x40
C209 POP EBP
C20A RET 8
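STACK SEGMENT
0x00000004
0x00000002
C104 (return address)
EBP - main() (saved by C200; EBP points here after C202)
Local variable space (0x40 bytes reserved by C203; ESP points to its end)
0x00000006 Z of Fn() at [EBP-4] (initially 0x00000000)
0x00000000 local var Z of main()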
CODE SEGMENT – Function – main()
.
int z = 0;
C100 MOV [EBP-4], 0
z = Fn(2,4);
C101 PUSH 0x00000004
C102 PUSH 0x00000002
C103 Call C200
C104 MOV [EBP-4], EAX
g_iVar = z;
C105 MOV [g_iVar], EAX
C106 Epilogue
STACK SEGMENT
0x00000004
0x00000002
C104
0x00000006 Local var Z (initially 0x00000000)
ESP and EBP are back at the values they had before the call
Function calls in C - Summary
Function call gets translated to CALL addr
Prologue
» Store the current EBP on stack
» Set up the stack - Initialize the EBP
» Allocate space for local variables.
Execute the function accordingly
Epilogue
» Set the ESP to its original value
» Set the EBP back to its original value
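The prologue, body and epilogue can be traced step by step with a toy model of the stack. The sketch below is an assumption-laden illustration, not real compiler output: `ToyCpu` and all function names are hypothetical, memory is a small byte array, and addresses are offsets into that array. It follows the slides' `Fn(2, 4)` example, including the 0x40 bytes of local space and the `RET 8` clean up.

```c
#include <stdint.h>
#include <string.h>

#define MEM_BYTES 256u

/* Hypothetical toy CPU state: stack memory plus the three registers involved. */
typedef struct {
    uint8_t  mem[MEM_BYTES];
    uint32_t esp, ebp, eax;
} ToyCpu;

static void toy_cpu_init(ToyCpu *c)
{
    memset(c->mem, 0, sizeof c->mem);
    c->esp = MEM_BYTES;         /* empty stack */
    c->ebp = MEM_BYTES;         /* stands in for the caller's frame base */
    c->eax = 0;
}

static void store32(ToyCpu *c, uint32_t addr, uint32_t v) { memcpy(&c->mem[addr], &v, 4); }

static uint32_t load32(const ToyCpu *c, uint32_t addr)
{
    uint32_t v;
    memcpy(&v, &c->mem[addr], 4);
    return v;
}

static void push32(ToyCpu *c, uint32_t v) { c->esp -= 4; store32(c, c->esp, v); }

static uint32_t pop32(ToyCpu *c)
{
    uint32_t v = load32(c, c->esp);
    c->esp += 4;
    return v;
}

/* Callee: mirrors the prologue, body and epilogue of Fn(). */
static uint32_t toy_fn(ToyCpu *c)
{
    push32(c, c->ebp);                    /* PUSH EBP           (prologue) */
    c->ebp = c->esp;                      /* MOV EBP, ESP                  */
    c->esp -= 0x40;                       /* SUB ESP, 0x40                 */

    store32(c, c->ebp - 4, 0);            /* MOV [EBP-4], 0     (body)     */
    c->eax  = load32(c, c->ebp + 8);      /* MOV EAX, [EBP+8]              */
    c->eax += load32(c, c->ebp + 12);     /* ADD EAX, [EBP+12]             */
    store32(c, c->ebp - 4, c->eax);       /* MOV [EBP-4], EAX              */

    c->esp += 0x40;                       /* ADD ESP, 0x40      (epilogue) */
    c->ebp = pop32(c);                    /* POP EBP                       */
    uint32_t return_address = pop32(c);   /* RET 8: pop the return address */
    c->esp += 8;                          /* ... then drop the 8 arg bytes */
    return return_address;
}

/* Caller: pushes the arguments right to left, then "calls" toy_fn. */
static uint32_t toy_call_fn(ToyCpu *c, uint32_t x, uint32_t y, uint32_t return_address)
{
    push32(c, y);                         /* PUSH y */
    push32(c, x);                         /* PUSH x */
    push32(c, return_address);            /* CALL pushes the return address */
    return toy_fn(c);
}
```

Calling `toy_call_fn(&c, 2, 4, 0xC104)` on an initialized `ToyCpu` leaves 6 in `c.eax`, returns 0xC104, and restores both `c.esp` and `c.ebp` to their values from before the call. Removing the `push32(c, c->ebp)` / `pop32` pair reproduces the corrupted-EBP situation.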
Stack clean up
•When?
» Happens as part of returning from a function (RET N) or right after the call returns (ADD ESP, N)
•Why?
» Undo the effect of pushing parameters
•How?
» RET N in the callee, or ADD ESP, N in the caller
C Program Assembly Contd…
void main()
{
int z = 0;
z = Function(2, 4);
}
/*Contd……*/
Prologue
MOV [EBP-4], 0
PUSH 0x00000004
PUSH 0x00000002
CALL Function
MOV [EBP-4], EAX
Epilogue
Contd……
C Program Assembly Contd…
int Function(int a, int b)
{
int c=0;
c = a + b;
return c;
}
PUSH EBP
MOV EBP, ESP --------- Prologue
SUB ESP, N
MOV [EBP-4], 0
MOV EAX, [EBP + 8] --- Body
ADD EAX, [EBP+12]
MOV [EBP-4], EAX
ADD ESP, N
POP EBP ----------------- Epilogue
RET 8
Calling conventions
__cdecl
» Default calling convention of C functions
» Needed for variable argument list
» Caller cleans the stack - ADD ESP, N instruction
__stdcall
» Slightly smaller code than the __cdecl call: the clean up appears once in the callee instead of after every call
» Callee cleans the stack - RET N instruction
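Why caller clean up is needed for a variable argument list can be seen from an ordinary variadic C function. The sketch below is portable C with a made-up name (`sum_ints`); it does not use any calling-convention keyword. The point is that the function body is compiled once, while each call site may pass a different number of arguments, so only the caller knows the N for its `ADD ESP, N`.

```c
#include <stdarg.h>

/* Illustrative variadic function: the callee cannot know at compile time
   how many arguments were pushed, so it cannot hard-code a "RET N". */
static int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);   /* read the next int argument */
    }
    va_end(args);
    return total;
}
```

On 32-bit x86 with caller clean up, a call such as `sum_ints(3, 1, 2, 3)` would push four DWORDs and be followed by `ADD ESP, 16`, while `sum_ints(2, 2, 4)` would be followed by `ADD ESP, 12`.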
Contd……
Back to Exercise 3
Write a C program to swap two numbers using a function Swap(int* pX, int* pY). Implement the Swap function directly in assembly language
Function to swap variables
void swap(int* pX, int* pY)
{
__asm
{
PUSH DWORD PTR [pX]
PUSH DWORD PTR [pY]
POP DWORD PTR [pX]
POP DWORD PTR [pY]
}
}
Function to swap variables
void swap(int* pX, int* pY)
{
__asm
{
MOV EAX, DWORD PTR [EBP+8] ; pX (first parameter, above saved EBP and return address)
MOV EBX, DWORD PTR [EBP+12] ; pY (second parameter)
PUSH DWORD PTR [EAX]
PUSH DWORD PTR [EBX]
POP DWORD PTR [EAX]
POP DWORD PTR [EBX]
}
}
Function to swap variables
void swap(int* pX, int* pY)
{
__asm
{
PUSH DWORD PTR [[EBP+8]]
PUSH DWORD PTR [[EBP+12]]
POP DWORD PTR [[EBP+8]]
POP DWORD PTR [[EBP+12]]
}
}
Double indirection is not a valid addressing mode: the pointer must first be loaded into a register
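In C the two steps are hidden inside `*pX`, but they can be written out separately. The following sketch (function and variable names are illustrative) loads each pointer into a local that plays the role of a register and only then reads or writes through it, which is the order of operations the assembly needs.

```c
/* Portable C version of swap: each pointer is first loaded into a local
   (standing in for a register), and only then dereferenced. */
static void swap_ints(int *p_x, int *p_y)
{
    int *eax = p_x;     /* MOV EAX, pX : load the pointer itself    */
    int *ebx = p_y;     /* MOV EBX, pY                              */

    int old_x = *eax;   /* PUSH DWORD PTR [EAX] : read through it   */
    int old_y = *ebx;   /* PUSH DWORD PTR [EBX]                     */

    *eax = old_y;       /* POP DWORD PTR [EAX] : last pushed value  */
    *ebx = old_x;       /* POP DWORD PTR [EBX]                      */
}
```

With `int x = 4, y = 5;`, calling `swap_ints(&x, &y)` leaves `x == 5` and `y == 4`.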
What about C++?
struct stTest
{
int x;
int y;
};
void FnTest(stTest* pSt)
{
pSt->x = 0;
pSt->y = 1;
}
void main()
{
stTest obj;
FnTest(&obj);
}
class clsTest
{
int x;
int y;
public:
void FnTest()
{
x = 0; y = 1;
}
};
void main()
{
clsTest obj;
obj.FnTest();
}
Calling convention Contd…
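__thiscall – The calling convention for C++ member functions
» Behaves like the __stdcall call in most ways: the callee cleans the stack (member functions with a variable argument list fall back to caller clean up)
» The this pointer is passed in the ECX register
» Unoptimized code typically stores the this pointer in a local slot on the stack, such as [EBP-4]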
String Instructions
•Uses ESI, EDI as its operands.
•After the operation ESI and EDI are automatically Incremented/ Decremented depending on the direction flag.
•Usually used with the Prefix instructions.
•Very efficient for standard looping operations such as copying, filling and scanning memory.
Prefix to instructions
REP – REP MOVSB
» Used to repeat instructions unconditionally
» Implicitly decrements ECX by 1 after each execution
» Stops once ECX = 0
REPNE/ REPE – REPE SCASB
» Used to repeat instructions conditionally
» Implicitly decrements ECX by 1 after each execution
» Stops once ECX = 0, or once the ZERO flag is set (REPNE) or cleared (REPE)
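The behaviour of the prefixes can be written out as ordinary C loops. This is a model under stated assumptions, not library source: the function names are made up, the direction flag is assumed clear (pointers move upward), and `strlen_via_scasb` assumes a terminating zero exists within `max_bytes`.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of REP MOVSB with the direction flag clear (illustrative name). */
static void rep_movsb(uint8_t *edi, const uint8_t *esi, size_t ecx)
{
    while (ecx != 0) {
        *edi++ = *esi++;   /* MOVSB: copy one byte, advance ESI and EDI */
        ecx--;             /* REP: decrement ECX, stop at zero */
    }
}

/* Model of REPNE SCASB: scan for `al`, at most `ecx` bytes.
   Returns the value ECX would hold when the scan stops. */
static size_t repne_scasb(const uint8_t *edi, uint8_t al, size_t ecx)
{
    while (ecx != 0) {
        int zero_flag = (*edi++ == al);  /* SCASB: compare AL with [EDI], advance EDI */
        ecx--;                           /* ECX is decremented on every iteration */
        if (zero_flag) {                 /* REPNE stops once ZF is set */
            break;
        }
    }
    return ecx;
}

/* String length built on the scan: bytes consumed, minus the terminator.
   Assumes a zero byte is found within max_bytes. */
static size_t strlen_via_scasb(const char *s, size_t max_bytes)
{
    size_t remaining = repne_scasb((const uint8_t *)s, 0, max_bytes);
    return max_bytes - remaining - 1;
}
```

For example, `strlen_via_scasb("Test data", 64)` returns 9: the scan consumes ten bytes including the terminator, leaving 54 in the counter.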
Optimized C functions
•memcpy
•strlen
•memset