Lecture 2: Basic Instructions
CS 2011
Fall 2014, Dr. Rozier
PROBLEM SETS
Consider the following processors, P1, P2, and P3, executing the same instruction set with clock rates and CPI as indicated.
Processor  Clock Rate  CPI
P1         3 GHz       1.5
P2         2.5 GHz     1.0
P3         4 GHz       2.2
1. Which processor has the highest performance in terms of instructions per second?
2. If the processors each execute a program in 10s, find the number of cycles and the number of instructions.
3. We are trying to reduce the execution time by 30%, but this leads to an increase in CPI of 20%. What clock rate should we have to get this reduction?
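The first two questions follow from two relations: instructions per second = clock rate / CPI, and cycles = execution time × clock rate. The sketch below applies them to the table; the dictionary layout and function names are illustrative assumptions, not part of the lecture.

```python
# Clock rates (Hz) and CPI values copied from the processor table.
processors = {
    "P1": {"clock_hz": 3.0e9, "cpi": 1.5},
    "P2": {"clock_hz": 2.5e9, "cpi": 1.0},
    "P3": {"clock_hz": 4.0e9, "cpi": 2.2},
}


def instructions_per_second(clock_hz, cpi):
    """Instruction throughput = clock rate / cycles per instruction."""
    return clock_hz / cpi


def cycles_and_instructions(clock_hz, cpi, seconds):
    """Cycles = time x clock rate; instructions = cycles / CPI."""
    cycles = clock_hz * seconds
    return cycles, cycles / cpi


ips = {name: instructions_per_second(**p) for name, p in processors.items()}
fastest = max(ips, key=ips.get)

for name, p in processors.items():
    cycles, instructions = cycles_and_instructions(p["clock_hz"], p["cpi"], 10.0)
    print(f"{name}: {ips[name]:.3e} instr/s, {cycles:.3e} cycles, {instructions:.3e} instructions")
print("highest throughput:", fastest)
```

Note that the processor with the highest clock rate is not the one with the highest throughput, because its CPI is also the highest.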
Consider a computer running code with four main routines, A, B, C, and D.
           Routine A  Routine B  Routine C  Routine D  Total Time
Exec Time  40s        90s        60s        20s        210s
1. How much is the total time reduced if the time for Routine A is reduced by 20%?
2. How much is the time for Routine B reduced if the total time is reduced by 20%?
3. Can the total time be reduced by 20% by only reducing the time for Routine D?
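These questions all come down to how much of the total time one routine accounts for. A minimal sketch, assuming that in each question the entire saving comes from the single routine named (the function names are hypothetical):

```python
# Per-routine execution times in seconds, from the table.
times = {"A": 40.0, "B": 90.0, "C": 60.0, "D": 20.0}
total = sum(times.values())  # 210 s


def total_after_reduction(routine, fraction):
    """New total time if one routine's time shrinks by `fraction` (0..1)."""
    return total - times[routine] * fraction


def required_reduction(routine, total_fraction):
    """Fraction by which one routine must shrink to cut the total by
    `total_fraction`. A value above 1 means the target is unreachable,
    since a routine cannot take less than zero time."""
    return total_fraction * total / times[routine]


new_total = total_after_reduction("A", 0.20)
print("total after A is 20% faster:", new_total, "s")
print("fraction of total saved:", (total - new_total) / total)
print("B must shrink by:", required_reduction("B", 0.20))
print("D must shrink by:", required_reduction("D", 0.20))
```

A required reduction greater than 1 for Routine D indicates that no improvement to D alone can deliver a 20% cut in total time.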
Consider a computer running code with four main routines, A, B, C, and D.
              Routine A  Routine B  Routine C  Routine D  Total Time
Exec Time     40s        90s        60s        20s        210s
Instructions  50×10^6    110×10^6   80×10^6    16×10^6    -
Avg CPI       1          1          4          2          -
1. How much must we improve the CPI of Routine A if we want the program to run twice as fast?
2. How much must we improve the CPI of Routine C if we want the program to run twice as fast?
3. How much is the execution time improved if the CPI of routines A and B are reduced by 40%, and the CPI of routines C and D are reduced by 30%?
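One way to approach these questions is to assume that instruction counts and clock rate stay fixed, so each routine's execution time scales in proportion to its CPI. Under that assumption (which the slide does not state explicitly), a sketch looks like this; the helper names are illustrative.

```python
# Per-routine execution times in seconds, from the table.
times = {"A": 40.0, "B": 90.0, "C": 60.0, "D": 20.0}
total = sum(times.values())  # 210 s


def scaled_total(cpi_reduction):
    """Total time after reducing each routine's CPI by the given fraction.
    Assumes time is proportional to CPI (fixed instruction count and clock)."""
    return sum(t * (1.0 - cpi_reduction.get(r, 0.0)) for r, t in times.items())


def cpi_reduction_needed(routine, target_total):
    """Fractional CPI reduction one routine needs to reach `target_total`.
    A value above 1 means the target cannot be met by that routine alone."""
    return (total - target_total) / times[routine]


new_total = scaled_total({"A": 0.4, "B": 0.4, "C": 0.3, "D": 0.3})
print("new total:", new_total, "s, speedup:", total / new_total)
print("A alone, for 2x speedup:", cpi_reduction_needed("A", total / 2))
print("C alone, for 2x speedup:", cpi_reduction_needed("C", total / 2))
```

Halving the total means saving 105s, which is more than either Routine A or Routine C takes on its own.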
REPRESENTING NUMBERS
Networking and Communication
• What if we encode the signal into pulses?
• Detect if the value is above or below some threshold, and decide it represents a 1, or a 0.
• Strings of 1’s and 0’s can be interpreted as a number.
Some simple things we can represent with 1’s and 0’s
• True or false…
  – 1 – true
  – 0 – false
  – We were already doing this with pure signals.
Some simple things we can represent with 1’s and 0’s
• Integers
• Examples
  – 00000000 – 0
  – 00000001 – 1
  – 00000010 – 2
  – 00000011 – 3
  – 00001010 – 10
  – 10010011 – 147
Unsigned Binary Integers
• Given an n-bit number:
  $x = x_{n-1} 2^{n-1} + x_{n-2} 2^{n-2} + \dots + x_1 2^1 + x_0 2^0$
• Range: 0 to $+2^n - 1$
• Example:
  0000 0000 0000 0000 0000 0000 0000 1011 (base 2)
  $= 0 + \dots + 1 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 1 \times 2^0$
  $= 0 + \dots + 8 + 0 + 2 + 1 = 11_{10}$
• Using 32 bits: 0 to +4,294,967,295
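The positional sum for unsigned integers can be checked directly. This is a small sketch; the function name `unsigned_value` is made up for illustration.

```python
def unsigned_value(bits):
    """Interpret a string of '0'/'1' characters (spaces ignored) as an
    unsigned integer by summing x_i * 2^i over all bit positions."""
    digits = bits.replace(" ", "")
    value = 0
    for i, bit in enumerate(reversed(digits)):
        value += int(bit) * 2**i
    return value


print(unsigned_value("0000 0000 0000 0000 0000 0000 0000 1011"))  # 11
print(unsigned_value("1" * 32))  # 4294967295, the largest 32-bit value
```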
Hexadecimal
• Base 16
  – Compact representation of bit strings
  – 4 bits per hex digit
0 0000   4 0100   8 1000   c 1100
1 0001   5 0101   9 1001   d 1101
2 0010   6 0110   a 1010   e 1110
3 0011   7 0111   b 1011   f 1111
Example: eca8 6420 = 1110 1100 1010 1000 0110 0100 0010 0000
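Because each hex digit maps to exactly four bits, the conversion can be done one digit at a time. A short sketch (the helper name `hex_to_bits` is hypothetical):

```python
def hex_to_bits(hex_string):
    """Expand each hex digit into its 4-bit binary group (spaces ignored)."""
    digits = hex_string.replace(" ", "")
    return " ".join(format(int(d, 16), "04b") for d in digits)


print(hex_to_bits("eca8 6420"))
# 1110 1100 1010 1000 0110 0100 0010 0000
```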
BASIC INSTRUCTIONS
Instruction Set
• The repertoire of instructions of a computer
• Different computers have different instruction sets
  – But with many aspects in common
• Early computers had very simple instruction sets
  – Simplified implementation
• Many modern computers also have simple instruction sets
MIPS vs ARMv6
• The book uses the MIPS instruction set.
• We will be using ARMv6 in our labs.
• Both are RISC (reduced instruction set computer) architectures.
  – Many similarities.
MIPS
• Used in many embedded systems
  – Routers, gateways
  – PlayStation 2 and PSP
• Developed by Prof. John Hennessy's group at Stanford; one of the earliest RISC architectures.
ARM
• Introduced in 1985
• Focused on low-power operation.
• Since 2005, over 98% of all mobile phones have had at least one ARM processor.
• Over 37 billion ARM processors in use in 2013.
• Rapidly becoming the dominant processor architecture in the world.
Instructions
• C code:
  – f = (g + h) - (i + j);
• Compiled ARM code (assuming g, h, i, j are in r3 to r6 and f is in r2):
  – add r0, r3, r4 @ temp t0 = g + h
  – add r1, r5, r6 @ temp t1 = i + j
  – sub r2, r0, r1 @ f = t0 - t1
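To see how the three instructions cooperate, here is a rough Python model of the register file. It assumes g, h, i, j start in r3 to r6 and f ends up in r2, as the slide's comments suggest; the `add`, `sub`, and `compute_f` helpers are illustrative stand-ins, not a real emulator.

```python
MASK32 = 0xFFFFFFFF  # registers hold 32-bit words


def add(regs, rd, rn, rm):
    """Model of `add rd, rn, rm`: rd = rn + rm, wrapped to 32 bits."""
    regs[rd] = (regs[rn] + regs[rm]) & MASK32


def sub(regs, rd, rn, rm):
    """Model of `sub rd, rn, rm`: rd = rn - rm, wrapped to 32 bits."""
    regs[rd] = (regs[rn] - regs[rm]) & MASK32


def compute_f(g, h, i, j):
    """Run the three-instruction sequence for f = (g + h) - (i + j)."""
    regs = {"r3": g, "r4": h, "r5": i, "r6": j}
    add(regs, "r0", "r3", "r4")  # t0 = g + h
    add(regs, "r1", "r5", "r6")  # t1 = i + j
    sub(regs, "r2", "r0", "r1")  # f = t0 - t1
    return regs["r2"]


print(compute_f(10, 20, 3, 4))  # 23
```

Each instruction names one destination and two source registers, which is why the single C statement needs two temporaries.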
Register Operands
• Instructions use registers for operands.
• Registers are extremely fast SRAM locations that are directly accessible by the processor.
  – Very fast, but very expensive, so very small.
Registers
• Each register holds a word (4 bytes).
• Registers r0-r12 are general purpose.
Name Function Name Function
r0 General Purpose r8 General Purpose
r1 General Purpose r9 General Purpose
r2 General Purpose r10 General Purpose
r3 General Purpose r11 General Purpose
r4 General Purpose r12 General Purpose
r5 General Purpose r13 Stack Pointer
r6 General Purpose r14 Link Register
r7 General Purpose r15 Program Counter
Registers
• Registers r13 to r15 have special purposes.
• The PC, r15, is very dangerous: writing to it changes which instruction executes next.
Registers
• The register r13 holds the stack pointer
  – Also called sp
  – Points to a special part of memory called the stack.
  – More about this later.
Registers
• The register r14 holds the link register
  – Also called lr
  – Holds the value of a return address that allows for fast and efficient implementation of subroutines.
Registers
• The register r15 holds the program counter
  – Also called pc
  – Holds an address of an instruction. Keeps track of where your program is in its execution of machine code.
  – PC holds the address of the instruction to be fetched next.
Registers
• One additional register, the “current program status register” (CPSR)
• Four most significant bits hold flags which indicate the presence or absence of certain conditions.
31 30 29 28 27…8 7 6 5 4…0
N Z C V Reserved I F T MODE
Registers
• N – negative flag
• Z – zero flag
• C – carry flag
• V – overflow flag
Registers
• N – set by an instruction if the result is negative (set equal to the two’s complement sign bit)
Registers
• Z – set by an instruction if the result of the instruction is zero.
Registers
• C – set by an instruction if the result of an unsigned operation overflows the 32-bit register. The carry can be passed into a following instruction (for example adc) to build 64-bit arithmetic.
Registers
• V – works like the C flag, but for signed operations: set if the result does not fit in the 32-bit two’s complement range.
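The four flags can be illustrated with a small model of a flag-setting 32-bit addition. This is a sketch under the assumption that operands are given as unsigned 32-bit register values; `adds_flags` is a hypothetical helper, not an ARM API.

```python
MASK32 = 0xFFFFFFFF


def adds_flags(a, b):
    """Return (result, flags) for a 32-bit add of register values a and b."""
    full = a + b              # exact sum, may need 33 bits
    result = full & MASK32    # what actually fits in the register
    flags = {
        "N": (result >> 31) & 1,        # sign bit of the result
        "Z": int(result == 0),          # result is zero
        "C": int(full > MASK32),        # unsigned carry out of bit 31
        # Signed overflow: both operands have a sign that differs from the result's.
        "V": (((a ^ result) & (b ^ result)) >> 31) & 1,
    }
    return result, flags


print(adds_flags(0xFFFFFFFF, 1))  # wraps to 0: Z and C set
print(adds_flags(0x7FFFFFFF, 1))  # largest positive + 1: N and V set
```

The two calls show why C and V are separate: the first overflows only when the operands are read as unsigned, the second only when they are read as signed.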
MORE ABOUT THESE LATER…
The Memory Hierarchy
Load-Store Architecture
• RISC architectures, like ARM and MIPS, utilize a load-store architecture.
• Memory cannot be part of arithmetic operations.
  – Only registers can do this
• Memory is accessed only through loads and stores.
Register Memory Architecture
• Featured on many CISC architectures, like x86
• Allows direct access to memory by instructions.
Load Store and ARM
• Register space is pretty cramped!
• LoaD to a Register with LDR
• SToRe to memory with STR
• ldr <register>, [<base>{,<offset>}]
  – Loads a word from <base>+<offset> into <register>
• str <register>, [<base>{,<offset>}]
  – Stores a word from <register> into <base>+<offset>
Load Store and ARM
• Example
  – ldr r0, [r1,r2]
    • Load data from location r1+r2 into r0.
  – ldr r0, =string
    • Load the address of the label string into r0.
• Special cases exist, see ARM manual
  – Example: ldrb loads a single byte, zero-extended to 32 bits.
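A rough model of these accesses: ldr and str move a full 32-bit word between a register and the address <base>+<offset>, while ldrb reads one byte and zero-extends it. The sketch assumes a little-endian memory (the common ARM configuration) and a made-up 64-byte `memory`; the function names are illustrative.

```python
memory = bytearray(64)  # hypothetical 64-byte memory, all zeros


def str_word(value, base, offset=0):
    """Model of `str`: store a 32-bit word at address base + offset."""
    addr = base + offset
    memory[addr:addr + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")


def ldr_word(base, offset=0):
    """Model of `ldr`: load the 32-bit word at address base + offset."""
    addr = base + offset
    return int.from_bytes(memory[addr:addr + 4], "little")


def ldrb(base, offset=0):
    """Model of `ldrb`: load one byte; upper 24 bits of the result are zero."""
    return memory[base + offset]


str_word(0x12345678, 8, 4)      # like: str r0, [r1, r2] with r1 = 8, r2 = 4
print(hex(ldr_word(8, 4)))      # 0x12345678
print(hex(ldrb(8, 4)))          # 0x78, the lowest-addressed byte (little-endian)
```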
Constants or Immediates
• Operands can be registers or immediate values.
  – An immediate is like a constant
  – Represent immediates as follows:
    • #20
    • add r0, r1, #20 – adds 20 to the value of r1 and stores the result in r0.
Arithmetic Instructions
• Addition
  – add, adc, adds, etc
• Subtraction
  – sub, sbc, rsb, subs, etc
• Multiply
  – mul, mla, etc
Move Instruction
• mov <destination>, <operand>
  – mov r0, r1 – copy the contents of r1 into r0.
  – mov r0, #20 – copy an immediate value of 20 into r0.
• mvn <destination>, <operand>
  – Move NOT: takes the bitwise complement (one’s complement) of the operand before copying it. This is not the same as arithmetic negation.
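In the ARM instruction set, mvn writes the bitwise complement of its operand, which differs by one from the two's complement negative. A two-function sketch of mov and mvn on 32-bit values (the Python names simply mirror the mnemonics):

```python
MASK32 = 0xFFFFFFFF


def mov(operand):
    """Model of `mov rd, operand`: copy the value unchanged."""
    return operand & MASK32


def mvn(operand):
    """Model of `mvn rd, operand`: copy the bitwise NOT of the value."""
    return ~operand & MASK32


print(hex(mov(20)))  # 0x14
print(hex(mvn(0)))   # 0xffffffff
print(hex(mvn(20)))  # 0xffffffeb, i.e. -21 in two's complement, not -20
```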
Compare Instructions
• cmp <operand1>, <operand2>
• cmn <operand1>, <operand2>
• Neither instruction changes its operands; they only update the status register flags.
• cmp – subtracts operand2 from operand1 and discards the result.
• cmn – adds operand2 to operand1 and discards the result.
Status Register Flags
• Compare instructions and the special “S” versions of instructions (adds, subs, movs) set the status register flags.
• Can be used with conditional suffixes to make conditionally executed instructions.
Conditional Execution
• Just as the special “S” suffix can be added to set status flags, other suffixes can be added to act on status flags.
EQ: Equal Z=1
• Using the EQ suffix on an instruction will cause it to only be executed if the zero flag is set.
cmp r0, r1 @ Set flags based on r0-r1
adds r0, r1, r2 @ Set flags based on r0 = r1 + r2
movs r0, r1 @ Set flags based on r0 = r1
Example
cmp r0, r1 @ Set flags based on r0-r1
addeq r2, r0, r1 @ Conditional addition
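The cmp / addeq pair behaves like an if statement around the addition. A minimal Python sketch of that pair, with a hypothetical `run` function standing in for the two instructions:

```python
MASK32 = 0xFFFFFFFF


def run(r0, r1, r2):
    """Model of: cmp r0, r1 ; addeq r2, r0, r1. Returns the final r2."""
    z = int(((r0 - r1) & MASK32) == 0)  # cmp sets Z when r0 - r1 is zero
    if z == 1:                          # EQ suffix: execute only if Z = 1
        r2 = (r0 + r1) & MASK32
    return r2


print(run(5, 5, 0))   # 10: operands equal, so the add executes
print(run(5, 6, 99))  # 99: operands differ, so r2 is left unchanged
```

When the condition fails, the instruction is skipped and the destination register keeps its old value.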
NE: Not Equal Z=0
• Using the NE suffix on an instruction will cause it to only be executed if the zero flag is not set.
Other conditional suffixes
• VS – overflow set, V=1
• VC – overflow clear, V=0
• MI – minus set, N=1
• PL – minus clear, N=0
• CS – carry set, C=1
• CC – carry clear, C=0
• AL – always, unconditional
• NV – never, unconditional
Multiple Conditional Suffixes
• HI – higher (unsigned), C=1 AND Z=0
  – Unsigned greater than
• LS – lower or same (unsigned), C=0 OR Z=1
  – Unsigned less than or equal to
• GE – greater or equal (signed), N=V (N=1, V=1 OR N=0, V=0)
  – Signed greater than or equal to
• LT – less than (signed), N≠V (N=1, V=0 OR N=0, V=1)
  – Signed less than
Multiple Conditional Suffixes
• GT – greater than (signed), Z=0 AND N=V (N=1, V=1 OR N=0, V=0)
  – Signed greater than
• LE – less than or equal (signed), Z=1 OR N≠V (N=1, V=0 OR N=0, V=1)
  – Signed less than or equal to
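The compound suffixes can be checked against ordinary comparisons by modeling the flags that cmp produces. The sketch below uses the standard ARM flag conditions; it assumes register values are passed as unsigned 32-bit integers, and `cmp_flags`, `CONDITIONS`, `to_signed`, and `holds` are illustrative names.

```python
MASK32 = 0xFFFFFFFF


def cmp_flags(a, b):
    """Flags (N, Z, C, V) produced by `cmp a, b`, i.e. by computing a - b."""
    result = (a - b) & MASK32
    n = (result >> 31) & 1
    z = int(result == 0)
    c = int(a >= b)  # on ARM, C is set when the subtraction does NOT borrow
    # Signed overflow: operands differ in sign and the result's sign differs from a's.
    v = (((a ^ b) & (a ^ result)) >> 31) & 1
    return n, z, c, v


CONDITIONS = {
    "EQ": lambda n, z, c, v: z == 1,
    "NE": lambda n, z, c, v: z == 0,
    "HI": lambda n, z, c, v: c == 1 and z == 0,
    "LS": lambda n, z, c, v: c == 0 or z == 1,
    "GE": lambda n, z, c, v: n == v,
    "LT": lambda n, z, c, v: n != v,
    "GT": lambda n, z, c, v: z == 0 and n == v,
    "LE": lambda n, z, c, v: z == 1 or n != v,
}


def to_signed(x):
    """Reinterpret a 32-bit register value as a two's complement integer."""
    return x - (1 << 32) if x & 0x80000000 else x


def holds(cond, a, b):
    """True if an instruction with suffix `cond` would execute after `cmp a, b`."""
    return CONDITIONS[cond](*cmp_flags(a, b))


print(holds("HI", 0xFFFFFFFF, 1))  # True: 4294967295 > 1 as unsigned
print(holds("GT", 0xFFFFFFFF, 1))  # False: the same bits mean -1 as signed
```

The same register contents compare differently under HI and GT, which is why unsigned and signed comparisons need different suffixes.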
For next time
Continue discussion on basic instructions.