Chap. 3 ARM CPU Architecture
Outline
3.1 Registers
3.2 Memory
3.3 Exceptions
3.1 Registers
Introduction
ARM Processor Core
Processor Modes
Register Organization
Accessing Registers
The Program Status Registers (CPSR and SPSRs)
Condition Flags
Conditional Execution
Introduction
ARM has 37 registers in total, all of which are 32 bits long.
1 dedicated program counter
1 dedicated current program status register
5 dedicated saved program status registers
30 general purpose registers
Introduction (cont’d)
However, these are arranged into several banks, with the accessible bank being governed by the processor mode. Each mode can access
  a particular set of r0-r12 registers
  a particular r13 (the stack pointer) and r14 (the link register)
  r15 (the program counter)
  cpsr (the current program status register)
and privileged modes can also access
  a particular spsr (saved program status register)
ARM Processor Core
Architecture
  Versions 1 and 2: Acorn RISC, 26-bit address
  Version 3: 32-bit address, CPSR, and SPSR
  Version 4: half-word, Thumb
  Version 5: BLX, CLZ and BKPT instructions
Processor cores
  ARM7TDMI (Thumb, debug, multiplier, ICE): version 4T, low-end ARM core, 3-stage pipeline, 50-100MHz
  ARM9TDMI: 5-stage pipeline, 130MHz or 200MHz
  ARM10TDMI: version 5, 300MHz
CPU Core: co-processor, MMU, AMBA
  ARM 710, 720, 740
  ARM 920, 940
Processor Modes
The ARM has six operating modes:
  User (unprivileged mode under which most tasks run)
  FIQ (entered when a high priority (fast) interrupt is raised)
  IRQ (entered when a low priority (normal) interrupt is raised)
  Supervisor (entered on reset and when a Software Interrupt instruction is executed)
  Abort (used to handle memory access violations)
  Undef (used to handle undefined instructions)
ARM Architecture Version 4 adds a seventh mode:
  System (privileged mode using the same registers as user mode)
Register Organization
[Figure: general registers, program counter and program status registers visible in each mode. Ref. [8]]
  Mode              General registers and Program Counter                      Program Status Registers
  User32 / System   r0-r12, r13 (sp), r14 (lr), r15 (pc)                       cpsr
  FIQ32             r0-r7, r8_fiq-r12_fiq, r13_fiq, r14_fiq, r15 (pc)          cpsr, spsr_fiq
  Supervisor32      r0-r12, r13_svc, r14_svc, r15 (pc)                         cpsr, spsr_svc
  Abort32           r0-r12, r13_abt, r14_abt, r15 (pc)                         cpsr, spsr_abt
  IRQ32             r0-r12, r13_irq, r14_irq, r15 (pc)                         cpsr, spsr_irq
  Undefined32       r0-r12, r13_undef, r14_undef, r15 (pc)                     cpsr, spsr_undef
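The banking shown in the figure can be sketched in a few lines of Python. This is only an illustration: the function names `physical_register` and `status_registers` and the lowercase mode labels are made up for this sketch, not ARM terminology. It shows how a register number named by an instruction maps to a physical register depending on the mode, and recovers the total of 37 registers.

```python
# Banked register numbers per mode, as listed in the register organization figure.
# Mode labels and function names are hypothetical, chosen for this sketch.
BANKED = {
    "fiq": set(range(8, 15)),  # r8_fiq .. r14_fiq
    "svc": {13, 14},
    "abt": {13, 14},
    "irq": {13, 14},
    "undef": {13, 14},
}
MODES = ("user", "system", "fiq", "svc", "abt", "irq", "undef")


def physical_register(mode, n):
    """Name of the physical register selected by register number n in a mode."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if not 0 <= n <= 15:
        raise ValueError("register number must be 0..15")
    if n in BANKED.get(mode, ()):
        return f"r{n}_{mode}"
    return f"r{n}"  # shared with user mode


def status_registers(mode):
    """Status registers visible in a mode (user and system have no spsr)."""
    if mode in BANKED:
        return ["cpsr", f"spsr_{mode}"]
    return ["cpsr"]


all_general = {physical_register(m, n) for m in MODES for n in range(16)}
all_status = {s for m in MODES for s in status_registers(m)}
print(len(all_general), len(all_status))  # 31 general (incl. pc) + 6 status = 37
```

Here r15 is counted among the 31, which matches the slide's split into 30 general purpose registers plus one program counter.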
Register Example: User to FIQ Mode
[Figure: registers in use before and after the EXCEPTION. Ref. [8]]
  User Mode, registers in use: r0-r12, r13 (sp), r14 (lr), r15 (pc), cpsr
  FIQ Mode, registers in use: r0-r7, r8_fiq-r12_fiq, r13_fiq, r14_fiq, r15 (pc), cpsr, spsr_fiq
User mode CPSR copied to FIQ mode SPSR
Return address calculated from User mode PC value and stored in FIQ mode LR
Accessing Registers
Instructions make no distinction among the currently accessible registers.
  All instructions can access r0-r14 directly.
  Most instructions also allow use of the PC.
Specific instructions allow access to the CPSR and SPSR.
Note: When in a privileged mode, it is also possible to load / store the (banked out) user mode registers to or from memory.
The Program Status Registers (CPSR and SPSRs)
Bit layout (bit 31 down to bit 0)
  Bits 31-28: N Z C V
  Bit 7: I
  Bit 6: F
  Bit 5: T
  Bits 4-0: Mode
Condition Code Flags: copies of the ALU status flags (latched if the instruction has the "S" bit set).
  N = Negative result from ALU
  Z = Zero result from ALU
  C = ALU operation Carried out
  V = ALU operation oVerflowed
Interrupt Disable bits
  I = 1 disables the IRQ.
  F = 1 disables the FIQ.
T Bit: processor in ARM (0) or Thumb (1)
Mode Bits: specify the processor mode
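As a rough illustration of the layout, the following Python sketch splits a 32-bit status register value into its fields. The function name `decode_psr` is made up for this example, and the mode-field encodings are an assumption taken from the ARM Architecture Reference Manual [2] rather than from the slides.

```python
# Mode-field encodings (bits 4-0); assumed from the ARM Architecture
# Reference Manual, they do not appear on the slides.
MODE_NAMES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}


def decode_psr(value):
    """Split a 32-bit CPSR/SPSR value into flags, disable bits, T bit and mode."""
    return {
        "N": (value >> 31) & 1,
        "Z": (value >> 30) & 1,
        "C": (value >> 29) & 1,
        "V": (value >> 28) & 1,
        "I": (value >> 7) & 1,
        "F": (value >> 6) & 1,
        "T": (value >> 5) & 1,
        "mode": MODE_NAMES.get(value & 0x1F, "unknown"),
    }


# Example value: Z and C set, IRQ and FIQ disabled, ARM state, Supervisor mode.
print(decode_psr(0x600000D3))
```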
Condition Flags
Negative (N=‘1’)
  Logical Instruction: No meaning
  Arithmetic Instruction: Bit 31 of the result has been set. Indicates a negative number in signed operations
Zero (Z=‘1’)
  Logical Instruction: Result is all zeroes
  Arithmetic Instruction: Result of operation was zero
Carry (C=‘1’)
  Logical Instruction: After Shift operation, ‘1’ was left in carry flag
  Arithmetic Instruction: Result was greater than 32 bits
oVerflow (V=‘1’)
  Logical Instruction: No meaning
  Arithmetic Instruction: Result was greater than 31 bits. Indicates a possible corruption of the sign bit in signed numbers
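The arithmetic column can be made concrete with a small Python model of a flag-setting 32-bit addition. This is a sketch under the definitions above; the function name `adds` simply echoes the mnemonic and is not a real API.

```python
MASK32 = 0xFFFFFFFF


def adds(a, b):
    """Return (result, flags) for a 32-bit addition that updates N, Z, C and V."""
    a &= MASK32
    b &= MASK32
    full = a + b              # unbounded Python integer, may need 33 bits
    result = full & MASK32
    flags = {
        "N": result >> 31,                 # bit 31 of the result
        "Z": int(result == 0),             # result is zero
        "C": int(full > MASK32),           # result was greater than 32 bits
        # Signed overflow: both operands have the same sign and the result
        # has the opposite sign, i.e. the sign bit was corrupted.
        "V": ((a ^ result) & (b ^ result)) >> 31,
    }
    return result, flags


print(adds(0x7FFFFFFF, 1))  # largest positive value plus one
print(adds(0xFFFFFFFF, 1))  # wraps around to zero
```

In the first call the result no longer fits in 31 bits, so V and N are set; in the second the result no longer fits in 32 bits, so C and Z are set.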
The Condition Field of Instruction Set
The Cond field occupies bits 31-28 of every instruction.
  Code  Suffix  Flags         Description
  0000  EQ      Z=1           Equal
  0001  NE      Z=0           Not equal
  0010  CS/HS   C=1           Unsigned higher or same
  0011  CC/LO   C=0           Unsigned lower
  0100  MI      N=1           Minus
  0101  PL      N=0           Positive or Zero
  0110  VS      V=1           Overflow
  0111  VC      V=0           No overflow
  1000  HI      C=1 & Z=0     Unsigned higher
  1001  LS      C=0 or Z=1    Unsigned lower or same
  1010  GE      N=V           Greater or equal
  1011  LT      N!=V          Less than
  1100  GT      Z=0 & N=V     Greater than
  1101  LE      Z=1 or N!=V   Less than or equal
  1110  AL      none          Always
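The table translates directly into a lookup of predicates over the four flags. The Python sketch below is one way to write it; `CONDITIONS` and `condition_passed` are names invented for this example.

```python
# One predicate per condition suffix, transcribed from the table above.
CONDITIONS = {
    "EQ": lambda f: f["Z"] == 1,
    "NE": lambda f: f["Z"] == 0,
    "CS": lambda f: f["C"] == 1,
    "CC": lambda f: f["C"] == 0,
    "MI": lambda f: f["N"] == 1,
    "PL": lambda f: f["N"] == 0,
    "VS": lambda f: f["V"] == 1,
    "VC": lambda f: f["V"] == 0,
    "HI": lambda f: f["C"] == 1 and f["Z"] == 0,
    "LS": lambda f: f["C"] == 0 or f["Z"] == 1,
    "GE": lambda f: f["N"] == f["V"],
    "LT": lambda f: f["N"] != f["V"],
    "GT": lambda f: f["Z"] == 0 and f["N"] == f["V"],
    "LE": lambda f: f["Z"] == 1 or f["N"] != f["V"],
    "AL": lambda f: True,
}
CONDITIONS["HS"] = CONDITIONS["CS"]  # alternative spellings in the table
CONDITIONS["LO"] = CONDITIONS["CC"]


def condition_passed(suffix, flags):
    """True if an instruction carrying this suffix would execute."""
    return CONDITIONS[suffix.upper()](flags)


# Flags with only N set, e.g. after comparing a smaller signed value to a larger one.
flags = {"N": 1, "Z": 0, "C": 0, "V": 0}
print(condition_passed("LT", flags), condition_passed("GE", flags))
```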
Conditional Execution
Most instruction sets only allow branches to be executed conditionally.
However, by reusing the condition evaluation hardware, ARM effectively increases the number of instructions.
  All instructions contain a condition field which determines whether the CPU will execute them.
  Non-executed instructions soak up 1 cycle.
    They still have to complete the cycle so as to allow fetching and decoding of the following instructions.
Conditional Execution (cont’d)
This removes the need for many branches, which stall the pipeline (3 cycles to refill).
  Allows very dense in-line code, without branches.
  The time penalty of not executing several conditional instructions is frequently less than the overhead of the branch or subroutine call that would otherwise be needed.
Using and updating the Condition Field
To execute an instruction conditionally, simply postfix it with the appropriate condition:
  For example an add instruction takes the form:
    ADD r0,r1,r2 ; r0 = r1 + r2 (ADDAL)
  To execute this only if the zero flag is set:
    ADDEQ r0,r1,r2 ; If zero flag set then…
                   ; ... r0 = r1 + r2
Using and updating the Condition Field (cont’d)
By default, data processing operations do not affect the condition flags (apart from the comparisons where this is the only effect).
To cause the condition flags to be updated, the S bit of the instruction needs to be set by postfixing the instruction (and any condition code) with an “S”.
  For example to add two numbers and set the condition flags:
    ADDS r0,r1,r2 ; r0 = r1 + r2
                  ; ... and set flags
Conditional Execution Example
Greatest Common Divisor
Normal Assembler
  gcd   cmp r0, r1        ;reached the end?
        beq stop
        blt less          ;if r0 > r1
        sub r0, r0, r1    ;subtract r1 from r0
        bal gcd
  less  sub r1, r1, r0    ;subtract r0 from r1
        bal gcd
  stop
ARM Conditional Assembler
  gcd   cmp r0, r1        ;if r0 > r1
        subgt r0, r0, r1  ;subtract r1 from r0
        sublt r1, r1, r0  ;else subtract r0 from r1
        bne gcd           ;reached the end?
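For readers who want to trace the conditional version, here is a Python sketch that mirrors it step by step. The function name `gcd_conditional` is made up, and the check for positive inputs is an addition of this sketch (the assembler assumes it). The comparison results are computed once per pass, just as the flags set by cmp are reused by both predicated subtracts and by the branch.

```python
def gcd_conditional(r0, r1):
    """Subtractive GCD following the four-instruction conditional loop."""
    if r0 <= 0 or r1 <= 0:
        raise ValueError("both inputs must be positive")
    while True:
        # cmp r0, r1: the flags are set once and reused below
        gt = r0 > r1
        lt = r0 < r1
        if gt:           # subgt r0, r0, r1
            r0 -= r1
        if lt:           # sublt r1, r1, r0
            r1 -= r0
        if not (gt or lt):
            return r0    # bne gcd falls through when r0 == r1


print(gcd_conditional(54, 24))
```

Each pass executes all four instructions; the subtract whose condition fails simply costs one cycle instead of a pipeline refill.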
The Barrel Shifter
ARM has a barrel shifter which provides a mechanism to carry out shifts as part of other instructions.
The second operand can be:
  Immediate value
    An 8 bit number that can be rotated right through an even number of positions.
    The assembler will calculate the rotate for you from the constant.
  Register, optionally with a shift operation applied.
    The shift value can be either:
      a 5 bit unsigned integer
      specified in the bottom byte of another register.
[Figure: Operand 1 goes straight to the ALU; Operand 2 passes through the Barrel Shifter on its way to the ALU, which produces the Result.]
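The immediate rule (an 8 bit number rotated right by an even amount) decides which constants can be written directly in an instruction. The Python sketch below searches for such an encoding in the way an assembler might; `ror32` and `encode_immediate` are hypothetical helper names, and representing the rotation as a count `rot` of 2-bit steps (0 to 15) is an assumption of this sketch.

```python
MASK32 = 0xFFFFFFFF


def ror32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def encode_immediate(constant):
    """Return (imm8, rot) with constant == ror32(imm8, 2 * rot), or None."""
    constant &= MASK32
    for rot in range(16):
        # Rotating right by (32 - 2*rot) undoes a right rotation by 2*rot.
        imm8 = ror32(constant, 32 - 2 * rot)
        if imm8 <= 0xFF:
            return imm8, rot
    return None  # not expressible as a rotated 8 bit value


print(encode_immediate(0xFF000000))  # fits: 0xFF rotated right by 8
print(encode_immediate(0x102))       # would need an odd rotation
```

A constant that returns None has to be built another way, for example loaded from memory or composed from two instructions.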
Second Operand: Shifted Register
Using a multiplication instruction to multiply by a constant means that you must
  load the constant into a register
  wait for a number of internal cycles for the multiplication to complete
A better solution can often be found by using some combination of MOVs, ADDs, SUBs and RSBs with shifts.
  Multiplications by a constant equal to a ((power of 2) ± 1) can be done in one cycle.
Example: r0 = r1 * 5
Example: r0 = r1 + (r1 * 4)
  ADD r0, r1, r1, LSL #2
Example: r2 = r3 * 105
Example: r2 = r3 * 15 * 7
Example: r2 = r3 * (16 - 1) * (8 - 1)
  RSB r2, r3, r3, LSL #4 ; r2 = r3 * 15
  RSB r2, r2, r2, LSL #3 ; r2 = r2 * 7
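The two examples can be checked with a short Python model of the instructions, assuming 32-bit wraparound arithmetic. The helper names `lsl`, `times5` and `times105` are invented for this sketch.

```python
MASK32 = 0xFFFFFFFF


def lsl(value, amount):
    """Logical shift left within 32 bits, as the barrel shifter would do."""
    return (value << amount) & MASK32


def times5(r1):
    # ADD r0, r1, r1, LSL #2   ; r0 = r1 + (r1 * 4)
    return (r1 + lsl(r1, 2)) & MASK32


def times105(r3):
    # RSB r2, r3, r3, LSL #4   ; r2 = (r3 * 16) - r3 = r3 * 15
    r2 = (lsl(r3, 4) - r3) & MASK32
    # RSB r2, r2, r2, LSL #3   ; r2 = (r2 * 8) - r2 = r2 * 7
    return (lsl(r2, 3) - r2) & MASK32


print(times5(7), times105(3))
```

RSB (reverse subtract) is what makes the (power of 2) - 1 case a single instruction, because the shifted operand has to be the one subtracted from.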
3.2 Memory
Introduction
Word and half-word alignment: word addresses end in binary 00 (xxxx00) and half-word addresses end in 0 (xxxxx0).
ARM can be set up to access data in either little-endian or big-endian format, though it defaults to little-endian.
The ARM uses a pipeline in order to increase the speed of the flow of instructions to the processor.
  Allows several operations to be undertaken simultaneously, rather than serially.
  Rather than pointing to the instruction being executed, the PC points to the instruction being fetched.
Memory Organization
[Figure: placement of bytes, half-words and words within 32-bit memory words at byte addresses 0-23, drawn with bit 31 on the left and bit 0 on the right.]
(a) Little-endian memory organization: byte addresses run 3 2 1 0 across a word from bit 31 to bit 0, so the byte at the lowest address is the least significant byte of the word.
(b) Big-endian memory organization: byte addresses run 0 1 2 3 across a word from bit 31 to bit 0, so the byte at the lowest address is the most significant byte of the word.
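The difference between the two organizations is easy to see by storing the same word both ways. The Python sketch below uses the standard `struct` module; the value 0x11223344 and the helper `read_halfword` are just illustrations.

```python
import struct

word = 0x11223344

# "<" is little-endian, ">" is big-endian; "I" is an unsigned 32-bit word.
little = struct.pack("<I", word)  # lowest address holds the least significant byte
big = struct.pack(">I", word)     # lowest address holds the most significant byte

print([hex(b) for b in little])
print([hex(b) for b in big])


def read_halfword(memory, address, little_endian=True):
    """Read an aligned 16-bit half-word from a bytes object."""
    if address % 2:
        raise ValueError("half-word address must be 2-byte aligned")
    fmt = "<H" if little_endian else ">H"
    return struct.unpack_from(fmt, memory, address)[0]


print(hex(read_halfword(little, 0)), hex(read_halfword(big, 0, little_endian=False)))
```

The half-word at address 0 is the low half of the word in the little-endian layout and the high half in the big-endian layout.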
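Pipeline
3 stages (ARM7) and 5 stages (ARM9TDMI)
  ARM7: fetch, decode, execute
  ARM9TDMI: fetch, decode, execute, memory, write-back
[Figure: pipeline stages with the address of the instruction in each stage.]
  fetch (PC): load an instruction from memory
  decode (PC-4)
  execute (PC-8)
  memory: access memory if needed
  write-back: write result to register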
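The PC offsets in the figure can be written down as a tiny Python helper, assuming 4-byte ARM instructions and the 3-stage pipeline. The names `pipeline_snapshot` and `pc_seen_by` are hypothetical.

```python
ARM_INSTRUCTION_SIZE = 4  # bytes per ARM (not Thumb) instruction


def pipeline_snapshot(pc):
    """Addresses of the instructions in each stage when the PC holds pc."""
    return {
        "fetch": pc,
        "decode": pc - ARM_INSTRUCTION_SIZE,
        "execute": pc - 2 * ARM_INSTRUCTION_SIZE,
    }


def pc_seen_by(instruction_address):
    """Value of the PC while the instruction at this address is executing."""
    return instruction_address + 2 * ARM_INSTRUCTION_SIZE


print(pipeline_snapshot(0x8008))
print(hex(pc_seen_by(0x8000)))
```

This is why an instruction that reads r15 sees its own address plus 8.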
Memory Access
The ARM7/9 is a Von Neumann, load/store architecture, i.e.,
  Only a 32 bit data bus for both instructions and data.
  Only the load/store instructions (and SWP) access memory.
Memory is addressed as a 32 bit address space.
Data types can be 8 bit bytes, 16 bit half-words, or 32 bit words, and memory may be seen as a byte line folded into 4-byte words.
Words must be aligned to 4 byte boundaries, and half-words to 2 byte boundaries.
Always ensure that the memory controller supports all three access sizes.
3.3 Exceptions
Introduction
Types of ARM exceptions
Exception and Interrupt
Entering an Exception
Returning from an Exception
Exception Entry/Exit
Exceptions and the Vector Table Address
Introduction
Exceptions and interrupts break the sequential flow of a program, jumping to architecturally defined memory locations.
In ARM, Software Interrupt (SWI) is the “system call” exception.
Types of ARM exceptions
  reset: when the CPU reset pin is asserted
  undefined instruction: when the CPU tries to execute an undefined op-code
  software interrupt: when the CPU executes the SWI instruction
  prefetch abort: when the CPU tries to execute an instruction pre-fetched from an illegal address
  data abort: when a data transfer instruction tries to read or write at an illegal address
  IRQ: when the CPU's external interrupt request pin is asserted
  FIQ: when the CPU's external fast interrupt request pin is asserted
Exception and Interrupt
The terms exception and interrupt are often confused.
Exception usually refers to an internal CPU event such as
  floating point overflow
  MMU fault (e.g., page fault)
  trap (SWI)
Interrupt usually refers to an external I/O event such as
  I/O device request
  Timer interrupt
In the ARM architecture manuals, the two terms are mixed together.
Entering an Exception
When an exception is generated, the processor takes the following actions:
  Copy the CPSR to the SPSR for the mode in which the exception is to be handled.
  Change the appropriate CPSR mode bits in order to
    Change to the appropriate mode, and map in the appropriate banked registers for that mode.
    Disable interrupts.
      IRQs are disabled when any exception occurs.
      FIQs are disabled when a FIQ occurs, and on reset.
  Set lr_mode to the return address.
  Set PC to the vector address for the exception.
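The entry sequence can be sketched as a Python function acting on a dictionary that stands in for the processor state. Everything here is illustrative: the `cpu` dictionary layout and the name `enter_exception` are made up, the mode encodings are assumed from the ARM Architecture Reference Manual [2], and the sketch ignores the T bit and the fact that saved state has no meaning on reset.

```python
I_BIT = 1 << 7
F_BIT = 1 << 6
MODE_MASK = 0x1F

# kind: (vector address, mode suffix, mode bits, also disable FIQ?)
# Mode bits are assumed from the architecture manual, not from the slides.
EXCEPTIONS = {
    "reset":          (0x00, "svc",   0b10011, True),
    "undefined":      (0x04, "undef", 0b11011, False),
    "swi":            (0x08, "svc",   0b10011, False),
    "prefetch_abort": (0x0C, "abt",   0b10111, False),
    "data_abort":     (0x10, "abt",   0b10111, False),
    "irq":            (0x18, "irq",   0b10010, False),
    "fiq":            (0x1C, "fiq",   0b10001, True),
}


def enter_exception(cpu, kind, return_address):
    """Apply the exception entry steps to a dict-based CPU model."""
    vector, mode, mode_bits, disable_fiq = EXCEPTIONS[kind]
    cpu["spsr"][mode] = cpu["cpsr"]                   # copy CPSR to SPSR_mode
    cpsr = (cpu["cpsr"] & ~MODE_MASK) | mode_bits     # switch mode (banks registers)
    cpsr |= I_BIT                                     # IRQs off on any exception
    if disable_fiq:
        cpsr |= F_BIT                                 # FIQs off on FIQ and reset
    cpu["cpsr"] = cpsr
    cpu["lr"][mode] = return_address                  # set lr_mode
    cpu["pc"] = vector                                # jump to the vector


cpu = {"cpsr": 0x60000010, "pc": 0x8000, "spsr": {}, "lr": {}}
enter_exception(cpu, "irq", 0x8004)
print(hex(cpu["cpsr"]), hex(cpu["pc"]), hex(cpu["spsr"]["irq"]))
```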
Returning from an Exception
The actions taken by the processor
  Restore the CPSR from SPSR_mode
  Restore the PC using the return address stored in lr_mode
The way to return depends on whether a stack is used when entering the handler
  Without a stack
    Perform a data processing instruction with the S flag set and the PC as the destination register.
  With a stack
    Restore the saved registers by performing
      LDMFD sp!, {r0-r12, pc}^
    ^ indicates that the CPSR is restored from the SPSR
Returning from SWI and Undefined Instruction Handlers
SWI and UI exceptions are generated by the instruction itself, so the PC is not updated when the exception is taken. Thus, storing (PC-4) in lr_mode makes lr_mode point to the next instruction to be executed.
  SWI xxx   being executed   PC-8
  INST-1    being decoded    PC-4
  INST-2    being fetched    PC
Restoring the PC from lr
  Without a stack
    MOVS pc, lr
  With a stack
    STMFD sp!, {reglist, lr}
    …
    LDMFD sp!, {reglist, pc}^
Returning from FIQ and IRQ
IRQ and FIQ are checked at the end of executing each instruction.
  INST-1   being executed   PC-12, IRQ or FIQ checked
  INST-2   being decoded    PC-8
  INST-3   being fetched    PC-4
  INST-4                    PC
Restoring the PC from lr
  Without a stack
    SUBS pc, lr, #4
  With a stack
    SUB lr, lr, #4
    STMFD sp!, {reglist, lr}
    …
    LDMFD sp!, {reglist, pc}^
Returning from Prefetch Abort
A prefetch abort is generated when the instruction reaches the execution stage.
  INST-1   being executed   PC-8, Aborted
  INST-2   being decoded    PC-4
  INST-3   being fetched    PC
Restoring the PC from lr
  Without a stack
    SUBS pc, lr, #4
  With a stack
    SUB lr, lr, #4
    STMFD sp!, {reglist, lr}
    …
    LDMFD sp!, {reglist, pc}^
Returning from Data Abort
When a data abort occurs, the program counter has been updated.
  INST-1   being executed   PC-12, Aborted
  INST-2   being decoded    PC-8
  INST-3   being fetched    PC-4
  INST-4                    PC
Restoring the PC from lr
  Without a stack
    SUBS pc, lr, #8
  With a stack
    SUB lr, lr, #8
    STMFD sp!, {reglist, lr}
    …
    LDMFD sp!, {reglist, pc}^
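The return adjustments for the different exceptions can be collected in one place. The Python sketch below combines the PC positions from the pipeline tables with the subtraction used by each return instruction. The names are invented, and it assumes lr_mode always receives PC-4 on entry; the slides state this explicitly only for SWI and undefined instructions.

```python
# Distance from the instruction being executed to the PC at exception entry,
# read from the pipeline tables for each exception type.
PC_AHEAD = {
    "swi": 8, "undefined": 8, "prefetch_abort": 8,
    "irq": 12, "fiq": 12, "data_abort": 12,
}

# Amount the return instruction subtracts from lr_mode.
RETURN_ADJUST = {
    "swi": 0, "undefined": 0,              # MOVS pc, lr
    "irq": 4, "fiq": 4, "prefetch_abort": 4,   # SUBS pc, lr, #4
    "data_abort": 8,                       # SUBS pc, lr, #8
}


def lr_on_entry(kind, executing_address):
    """lr_mode at entry, assuming lr_mode = PC - 4 for every exception type."""
    return executing_address + PC_AHEAD[kind] - 4


def resume_address(kind, lr):
    """Address the handler returns to."""
    return lr - RETURN_ADJUST[kind]


for kind in PC_AHEAD:
    lr = lr_on_entry(kind, 0x8000)
    print(kind, hex(lr), hex(resume_address(kind, lr)))
```

For SWI, undefined, IRQ and FIQ the handler resumes at the instruction after the one at 0x8000; for the two aborts it resumes at 0x8000 itself so the aborted instruction can be retried.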
Exceptions and the Vector Table Address
Exception Vectors
  Address      Exception type          Exception Mode   Priority (1=high, 6=low)
  0x00000000   Reset                   Supervisor       1
  0x00000004   Undefined instruction   Undefined        6
  0x00000008   Software Interrupt      Supervisor       6
  0x0000000C   Abort (prefetch)        Abort            5
  0x00000010   Abort (data)            Abort            2
  0x00000014   Reserved                Reserved         Not applicable
  0x00000018   IRQ                     IRQ              4
  0x0000001C   FIQ                     FIQ              3
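When several exceptions are pending at once, the priority column decides which is taken first. A minimal Python sketch of that selection, using the table above as data; `VECTOR_TABLE` and `highest_priority` are names made up for this example.

```python
# (address, exception, mode, priority) rows copied from the vector table.
# The reserved entry at 0x14 is omitted because it has no priority.
VECTOR_TABLE = [
    (0x00, "reset", "Supervisor", 1),
    (0x04, "undefined", "Undefined", 6),
    (0x08, "swi", "Supervisor", 6),
    (0x0C, "prefetch_abort", "Abort", 5),
    (0x10, "data_abort", "Abort", 2),
    (0x18, "irq", "IRQ", 4),
    (0x1C, "fiq", "FIQ", 3),
]


def highest_priority(pending):
    """Return the table row of the pending exception with the best priority."""
    candidates = [row for row in VECTOR_TABLE if row[1] in pending]
    if not candidates:
        return None
    return min(candidates, key=lambda row: row[3])  # 1 is the highest priority


address, name, mode, priority = highest_priority({"irq", "fiq", "data_abort"})
print(name, hex(address), mode)
```

Undefined instruction and SWI share priority 6, which is harmless because a single instruction cannot raise both.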
References
[1] Andrew Sloss, Dominic Symes and Chris Wright, “ARM System Developer's Guide”, published by Morgan Kaufmann, 2004
[2] David Seal, “ARM Architecture Reference Manual”, published by Addison-Wesley, 2000
[3] ARM DUI 0021A “Programming Techniques”, 1995
[4] http://www.samsung.com/Products/Semiconductor/SystemLSI/Networks/PersonalNTASSP/CommunicationProcessor/S3C4510B/um_s3c4510b_rev1.pdf
[5] www.arm.com
[6] http://www.arm.com/pdfs/DUI0056D_ADS1_2_Dev.pdf
[7] http://nthucad.cs.nthu.edu.tw/~wcyao/
[8] www-courses.cs.uiuc.edu/~cs433/Processors/ARM/ARMInstV1.0.ppt
Exercise
1. How many registers does ARM have, and what are they used for?
2. Describe the processor modes of ARM in detail.
3. What is the meaning of the ARM condition flags in logical instructions and in arithmetic instructions?
4. What is the benefit of conditional execution?
5. What is the difference between little-endian and big-endian format?
6. Please describe the ARM memory cycle types and how to decide which type is used.
7. List all types of ARM exceptions and describe their addresses and priorities.