02 arm architecture
ARM7TDMI-S CPU
MultiMarket Semiconductors, BL Standard ICs - Microcontrollers, February 2004
Semiconductors 2
ARM Architecture
• Thumb state
• Instruction set
• Processor Modes
• Register usage
• Interrupt Handling
• 3-stage Pipeline
ARM7TDMI-S
The ARM7TDMI-S is based on the ARM7 core
– 3-stage pipeline
– Von Neumann architecture
– CPI ~1.9 (average clock cycles per instruction)
– T: Thumb instruction set
– D: includes debug extensions
– M: enhanced multiplier (32x8) with instructions for 64-bit results
– I: core has EmbeddedICE logic extensions
– S: fully synthesisable (soft IP)
Thumb state
• The ARM architecture is 32-bit. Thumb is a 16-bit encoding of a subset of the ARM instructions that still works on 32-bit data and registers.
• Set of instructions re-coded into 16 bits
– Improves code density by ~30%
– Saves program memory space
• In Thumb state only the program code is 16 bits wide
– after the 16-bit instructions are fetched from memory, they are decompressed to 32-bit instructions before they are decoded and executed
– all operations are still 32-bit operations
ARM and Thumb Interworking
• Switch between ARM state and Thumb state using the BX instruction
– In ARM state: BX<condition> Rn
– In Thumb state: BX Rn
– Rn: any register r0-r15
– Bits [31:1] of Rn: destination address
– Bit [0] of Rn: ARM / Thumb selection (0: ARM state, 1: Thumb state)
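The state selection made by BX can be sketched in a few lines of Python. This is only an illustrative model, not ARM tooling: the function name `bx` and its return format are assumptions made for the example.

```python
def bx(rn_value):
    """Model BX Rn: return (new state, destination address)."""
    thumb = bool(rn_value & 1)            # bit 0 selects ARM (0) or Thumb (1)
    target = rn_value & 0xFFFFFFFE        # bits [31:1] hold the destination
    return ("Thumb" if thumb else "ARM"), target

state, target = bx(0x00008001)
print(state, hex(target))                 # Thumb 0x8000
```

Setting bit 0 of the address in Rn is therefore how a caller asks for Thumb state at the destination.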
Instruction Set
ARM Instruction Set
• All instructions are 32 bits long
• Many instructions execute in a single cycle
• Most ARM instructions can be conditionally executed
• Instructions fall into six broad classes
– Branch instructions
– Data Processing instructions
– Status register transfer instructions
– Load and Store instructions
– Coprocessor instructions
– Exception-generating instructions
Thumb Instruction Set
• All instructions are 16 bits long
• Most Thumb instructions cannot be conditionally executed
• The Thumb instruction set is a subset of the ARM instruction set
• The same job usually takes more instructions in Thumb than in ARM, which results in a performance penalty
Processor Modes
Processor Modes (1)
ARM has seven operating modes
– User: unprivileged mode under which most applications run
– FIQ: entered when a high priority (fast) interrupt is raised
– IRQ: general purpose interrupt handling
– Supervisor: protected mode for the operating system, entered on reset or on a software interrupt instruction
– System: privileged mode using the same registers as User mode
– Abort: used to handle memory access violations
– Undefined: used to handle undefined instructions
Processor Modes (2)
– Unprivileged mode: User
– Privileged modes: System, FIQ, IRQ, Supervisor, Abort, Undefined
– Exception modes: FIQ, IRQ, Supervisor, Abort, Undefined
Privileged and Exception Modes
• Exception modes (FIQ, IRQ, Supervisor, Abort, Undefined) are entered when a specific exception occurs
• Each exception mode has additional registers to prevent corruption of the interrupted mode's registers
• On reset the ARM core is in Supervisor mode
• Privileged modes have access to system resources
• Privileged modes can change modes freely using ARM instructions
User Mode & System Mode
• User Mode:
– Has access to limited system resources
– Cannot change modes freely
– A user program can make a supervisor call using the SWI instruction (SWI: Software Interrupt, but usually called a supervisor call)
• System Mode:
– Similar to User mode, but used by the OS, which needs access to system resources (privileged)
– Also used during nested interrupt handling
ARM Registers
Registers (1)
An ARM core has 37 registers (each 32 bits wide)
• General purpose registers
– 1 program counter
– 30 general purpose registers
• Status registers
– 1 current program status register (CPSR)
– 5 saved program status registers (SPSR)
These registers are not all accessible at the same time. The processor state and operating mode determine which registers are available to the programmer.
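The total of 37 can be checked by adding up the register banks. The sketch below is only arithmetic; the per-mode breakdown (r8-r14 banked for FIQ, r13 and r14 banked for the other exception modes) is taken from the register banking diagram in this deck, and the dictionary keys are illustrative labels.

```python
# General purpose registers per bank (excluding the program counter)
GENERAL_PURPOSE = {
    "user/system": 15,   # r0-r14, shared by User and System modes
    "fiq": 7,            # r8_fiq-r14_fiq
    "irq": 2,            # r13_irq, r14_irq
    "svc": 2,            # r13_svc, r14_svc
    "abt": 2,            # r13_abt, r14_abt
    "und": 2,            # r13_und, r14_und
}
PROGRAM_COUNTER = 1
CPSR = 1
SPSR = 5                 # one per exception mode

general_purpose = sum(GENERAL_PURPOSE.values())
total = general_purpose + PROGRAM_COUNTER + CPSR + SPSR
print(general_purpose, total)   # 30 37
```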
Registers (2)
• Depending on processor mode one of several banks is accessible. Each mode can access
– the program counter r15 (PC)
– a particular r13 (stack pointer SP) and r14 (subroutine link register, LR)
– a particular set of r0-r12 registers
– the current program status register (CPSR)
• Privileged modes (except System mode) can also access
– a particular SPSR (saved program status register)
Register Banking
User and System: r0-r12, r13 (SP), r14 (LR), r15 (PC), CPSR
Banked registers:
– FIQ: r8_fiq, r9_fiq, r10_fiq, r11_fiq, r12_fiq, r13_fiq (SP), r14_fiq (LR), SPSR_fiq
– IRQ: r13_irq (SP), r14_irq (LR), SPSR_irq
– Supervisor: r13_svc (SP), r14_svc (LR), SPSR_svc
– Abort: r13_abt (SP), r14_abt (LR), SPSR_abt
– Undefined: r13_und (SP), r14_und (LR), SPSR_und
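Register banking amounts to a lookup from (mode, register number) to a physical register. A minimal sketch of that lookup follows; the mode suffixes match the diagram, while the function name `physical_register` and the table `BANKED` are hypothetical names chosen for the example.

```python
# Register numbers that each exception mode replaces with its own copy
BANKED = {
    "fiq": range(8, 15),    # r8-r14
    "irq": range(13, 15),   # r13-r14
    "svc": range(13, 15),
    "abt": range(13, 15),
    "und": range(13, 15),
}

def physical_register(mode, n):
    """Return the name of the register that rN refers to in the given mode."""
    if n in BANKED.get(mode, ()):
        return f"r{n}_{mode}"
    return f"r{n}"          # shared with User and System modes

print(physical_register("fiq", 8))    # r8_fiq
print(physical_register("irq", 8))    # r8
```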
Registers in Thumb State
• The Thumb state register set is a subset of the ARM state set. The programmer has direct access to:
– eight general registers r0 - r7
– the program counter PC
– a Stack pointer SP
– a Link register LR
– the current program status register CPSR
• In Thumb state, the high registers (r8 - r12) are not part of the standard register set. The assembly language programmer has limited access to them, but can use them for fast temporary storage
Thumb vs. ARM
– ARM state: r0-r12, r13 (SP), r14 (LR), r15 (PC), CPSR, SPSR
– Thumb state: r0-r7, r13 (SP), r14 (LR), r15 (PC), CPSR, SPSR
– Thumb state low registers: r0-r7
– Thumb state high registers: r8-r12 (limited access)
Register Overview
Registers visible in each mode:
– User and System: r0-r12, r13 (SP), r14 (LR), r15 (PC), CPSR
– FIQ: r0-r7, r8_fiq-r12_fiq, r13_fiq (SP), r14_fiq (LR), r15 (PC), CPSR, SPSR_fiq
– IRQ: r0-r12, r13_irq (SP), r14_irq (LR), r15 (PC), CPSR, SPSR_irq
– Supervisor: r0-r12, r13_svc (SP), r14_svc (LR), r15 (PC), CPSR, SPSR_svc
– Abort: r0-r12, r13_abt (SP), r14_abt (LR), r15 (PC), CPSR, SPSR_abt
– Undefined: r0-r12, r13_und (SP), r14_und (LR), r15 (PC), CPSR, SPSR_und
Thumb state low registers: r0-r7. Thumb state high registers: r8-r12.
Program Status Register (1)
• Condition Code Flags
– N: Negative or less than
– Z: Zero
– C: Carry or borrow or extend
– V: Overflow
To avoid corrupting reserved bits, use a read-modify-write sequence when changing PSR bits.
PSR bit layout:
– Bits 31-28: condition code flags N, Z, C, V
– Bit 27: Q
– Bit 24: J
– Bits 7-5: control bits I, F, T
– Bits 4-0: control bits, mode
– All other bits: reserved
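The read-modify-write rule can be illustrated on a plain integer standing in for a PSR value. This is a sketch only: on the core the read and write would be done with the MRS and MSR instructions, and the helper names `set_mode` and `disable_irq` are made up for the example.

```python
MODE_MASK = 0x1F        # bits 4-0
I_BIT = 1 << 7          # IRQ disable

def set_mode(psr, mode_bits):
    """Replace only the mode field, leaving every other bit as read."""
    return (psr & ~MODE_MASK & 0xFFFFFFFF) | (mode_bits & MODE_MASK)

def disable_irq(psr):
    """Set only the I bit, leaving every other bit as read."""
    return psr | I_BIT

psr = 0x600000D3                      # value read from the PSR
print(hex(set_mode(psr, 0b11111)))    # 0x600000df
```

Writing a freshly built constant instead of the modified read value would overwrite the flags and the reserved bits, which is what the rule is meant to prevent.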
Program Status Register (2)
PSR bit layout: condition code flags N, Z, C, V (bits 31-28), Q (bit 27), J (bit 24), reserved bits, control bits I (bit 7), F (bit 6), T (bit 5), mode (bits 4-0)
• Interrupt Disable Bits
– I: IRQ interrupts disable
– F: FIQ interrupts disable
• T Bit
– Thumb state (when set)
– ARM state (when cleared)
• Mode Bits
10000 User
10001 FIQ
10010 IRQ
10011 Supervisor
10111 Abort
11011 Undefined
11111 System
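The bit assignments above are enough to decode a PSR value. The following sketch does so in Python; `decode_psr` and the dictionary layout are illustrative choices, and the mode encodings are the ones listed on this slide.

```python
MODES = {
    0b10000: "User", 0b10001: "FIQ", 0b10010: "IRQ",
    0b10011: "Supervisor", 0b10111: "Abort",
    0b11011: "Undefined", 0b11111: "System",
}

def decode_psr(psr):
    """Split a 32-bit PSR value into flags, control bits and mode name."""
    bit = lambda n: bool((psr >> n) & 1)
    return {
        "N": bit(31), "Z": bit(30), "C": bit(29), "V": bit(28),
        "I": bit(7), "F": bit(6), "T": bit(5),
        "mode": MODES.get(psr & 0x1F, "invalid"),
    }

info = decode_psr(0x600000D3)
print(info["mode"], info["I"], info["F"], info["T"])   # Supervisor True True False
```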
Program Counter (r15)
• When the processor is executing in ARM state
– all instructions are 32 bits wide
– all instructions must be word aligned
– bits [31:2] contain the PC, bits [1:0] are zero
(instructions cannot be halfword or byte aligned)
• When the processor is executing in Thumb state
– all instructions are 16 bits wide
– all instructions must be halfword aligned
– bits [31:1] contain the PC, bit [0] is zero
(instructions cannot be byte aligned)
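The alignment rules reduce to a mask test on the low address bits. A small sketch, with `is_valid_pc` as a hypothetical helper name:

```python
def is_valid_pc(address, thumb):
    """Check instruction alignment: word in ARM state, halfword in Thumb state."""
    mask = 0b1 if thumb else 0b11     # low bits that must be zero
    return (address & mask) == 0

print(is_valid_pc(0x8002, thumb=False))   # False
print(is_valid_pc(0x8002, thumb=True))    # True
```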
Interrupt Handling
ARM Exception Vectors and processor mode
Exception: processor mode entered
– Reset: Supervisor
– Undefined Instruction: Undefined
– Software Interrupt (SWI): Supervisor
– Prefetch Abort: Abort
– Data Abort: Abort
– Interrupt (IRQ): IRQ
– Fast Interrupt (FIQ): FIQ
Exception Vectors table
0x00 Reset
0x04 Undefined Instruction
0x08 Software Interrupt
0x0C Prefetch Abort
0x10 Data Abort
0x14 (Reserved)
0x18 IRQ
0x1C FIQ
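The vector table, combined with the mode that each exception enters, can be written as a single lookup. The sketch below only restates the two slides as data; the name `VECTORS` is illustrative.

```python
# exception name -> (vector address, mode entered)
VECTORS = {
    "Reset":                 (0x00, "Supervisor"),
    "Undefined Instruction": (0x04, "Undefined"),
    "Software Interrupt":    (0x08, "Supervisor"),
    "Prefetch Abort":        (0x0C, "Abort"),
    "Data Abort":            (0x10, "Abort"),
    # 0x14 is reserved
    "IRQ":                   (0x18, "IRQ"),
    "FIQ":                   (0x1C, "FIQ"),
}

address, mode = VECTORS["IRQ"]
print(hex(address), mode)   # 0x18 IRQ
```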
Exception Handling
• On entering an exception, the ARM core
– saves the address of the next instruction in the appropriate LR: r15 (PC) -> r14_<mode> (LR)
– copies the CPSR into the appropriate SPSR: CPSR -> SPSR_<mode>
– sets the appropriate CPSR control bits (I: bit 7, F: bit 6, T: bit 5, mode: bits 4-0)
  • interrupt disable bits
  • mode field bits
  • if running in Thumb state, enter ARM state*
– forces the PC to fetch the next instruction from the relevant exception vector
*: all exceptions switch to ARM state!
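The entry sequence can be traced on a small dictionary standing in for the register file. This is a sketch under stated assumptions, not a model of the real core: `enter_exception` and its parameters are invented for the example, the caller supplies the value to be saved in the LR, and the sketch assumes that I is always set while F is set only when `disable_fiq` is passed (as for FIQ and Reset).

```python
def enter_exception(cpu, mode_bits, suffix, vector, return_address,
                    disable_fiq=False):
    """Apply the exception entry steps to the cpu dictionary in place."""
    cpu["r14_" + suffix] = return_address      # save return address in LR
    cpu["spsr_" + suffix] = cpu["cpsr"]        # copy CPSR into SPSR
    cpsr = (cpu["cpsr"] & ~0x1F) | mode_bits   # mode field bits
    cpsr |= 1 << 7                             # I: disable IRQ
    if disable_fiq:
        cpsr |= 1 << 6                         # F: disable FIQ
    cpsr &= ~(1 << 5)                          # T cleared: enter ARM state
    cpu["cpsr"] = cpsr & 0xFFFFFFFF
    cpu["pc"] = vector                         # fetch from exception vector

# User mode, Thumb state, taking an IRQ
cpu = {"pc": 0x8000, "cpsr": 0x60000030}
enter_exception(cpu, 0b10010, "irq", 0x18, return_address=0x8004)
print(hex(cpu["cpsr"]), hex(cpu["pc"]))        # 0x60000092 0x18
```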
Leaving Exception (1)
• To leave an exception, the exception handler must
– copy the SPSR back into the CPSR: SPSR_<mode> -> CPSR
(this also restores I, F and T automatically)
– move the contents of the current LR minus an offset* to the PC: r14_<mode> (LR) - offset -> r15 (PC)
*: the offset varies with the type of exception: 0 (SWI, Undefined Instruction), 4 (IRQ, FIQ, Prefetch Abort) or 8 (Data Abort)
Leaving Exception (2): Example
• After servicing an IRQ, execute the following instruction
SUBS PC,R14_irq,#4
• This restores both the PC and the CPSR: r14_irq (LR) - 4 -> r15 (PC), SPSR_irq -> CPSR
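The effect of the IRQ return instruction can be shown on a dictionary standing in for the register file. This is an illustrative sketch only; `return_from_irq` and the dictionary keys are names made up for the example.

```python
def return_from_irq(cpu):
    """Model SUBS PC,R14_irq,#4: restore PC and CPSR in one step."""
    cpu["pc"] = (cpu["r14_irq"] - 4) & 0xFFFFFFFF   # LR minus offset -> PC
    cpu["cpsr"] = cpu["spsr_irq"]                   # SPSR_irq -> CPSR

# State while the IRQ handler is running
cpu = {"pc": 0x0120, "cpsr": 0x60000092,
       "r14_irq": 0x8004, "spsr_irq": 0x60000030}
return_from_irq(cpu)
print(hex(cpu["pc"]), hex(cpu["cpsr"]))             # 0x8000 0x60000030
```

Because the CPSR is restored from the SPSR, the interrupt disable bits, the mode and the ARM or Thumb state of the interrupted code all come back together.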
Multiple Exceptions
• Exception priorities
– When multiple exceptions arise at the same time, a fixed priority system determines the order in which they are handled
1. Reset (highest priority)
2. Data Abort (data memory access cannot be completed)
3. FIQ
4. IRQ
5. Prefetch Abort (instruction memory access cannot be completed)
6. Undefined Instruction
7. SWI, Software Interrupt (to enter Supervisor mode) (lowest priority)
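The fixed priority scheme can be expressed as picking the pending exception that appears earliest in the list. A minimal sketch, with `PRIORITY` and `next_exception` as illustrative names:

```python
# Highest priority first, in the order given on the slide
PRIORITY = ["Reset", "Data Abort", "FIQ", "IRQ",
            "Prefetch Abort", "Undefined Instruction", "SWI"]

def next_exception(pending):
    """Return the pending exception that is handled first."""
    return min(pending, key=PRIORITY.index)

print(next_exception(["IRQ", "Data Abort"]))   # Data Abort
print(next_exception(["SWI", "IRQ", "FIQ"]))   # FIQ
```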
FIQ: Why Is It Called Fast?
• FIQ mode has its own banked registers r8-r12 (besides its banked SP and LR), so few or no stack operations are required
• FIQ is the last vector in the vector table, so the handler can start at the vector itself and no jump is needed to reach the ISR
• ARM recommends classifying only one interrupt source as FIQ
Interrupt Latency
• Latency ranges from 5 to 27 processor clock cycles
• Refer customers to the ARM7TDMI-S Technical Reference Manual for details
Instruction Pipeline
• The ARM7TDMI-S core uses a pipeline to increase the speed of the flow of instructions to the processor. This enables several operations to take place simultaneously
• The Program Counter (PC) points to the instruction being fetched rather than to the instruction being executed
• During normal operation, while one instruction is being executed, its successor is being decoded, and a third instruction is being fetched from memory
3-Stage Instruction Pipeline
Stage    ARM     Thumb   Operation
Fetch    PC      PC      Instruction fetched from memory
Decode   PC - 4  PC - 2  Thumb only: Thumb instruction decompressed to ARM instruction. Instruction decoded
Execute  PC - 8  PC - 4  Registers read from register bank, shift and ALU operations performed, registers written back to register bank
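The PC offsets per stage follow directly from the instruction size. A short sketch that computes which address occupies each stage for a given PC value; `pipeline_addresses` is a hypothetical helper name.

```python
def pipeline_addresses(pc, thumb=False):
    """Return the instruction address in each pipeline stage for a given PC."""
    size = 2 if thumb else 4          # instruction size in bytes
    return {
        "fetch": pc,
        "decode": pc - size,
        "execute": pc - 2 * size,
    }

stages = pipeline_addresses(0x8008)
print(hex(stages["execute"]))         # 0x8000
```

This is why code that reads the PC in ARM state sees the address of the executing instruction plus 8, and plus 4 in Thumb state.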
Optimal Pipelining
Cycle 1       2       3       4       5       6       7       8
ADD   Fetch   Decode  Execute
SUB           Fetch   Decode  Execute
MOV                   Fetch   Decode  Execute
AND                           Fetch   Decode  Execute
ORR                                   Fetch   Decode  Execute
EOR                                           Fetch   Decode  Execute
CMP                                                   Fetch   Decode
RSB                                                           Fetch
– In this example it takes 6 clock cycles to execute 6 instructions
– All operations are on registers (single cycle instructions)
– Clock cycles per instruction (CPI) = 1
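The CPI figure can be reproduced by laying out the schedule for single-cycle instructions. This is a simplified sketch that assumes every instruction spends exactly one cycle in each stage; `schedule` is an illustrative name.

```python
def schedule(instructions):
    """Map each instruction to its (fetch, decode, execute) cycle numbers."""
    return {name: (i + 1, i + 2, i + 3)
            for i, name in enumerate(instructions)}

program = ["ADD", "SUB", "MOV", "AND", "ORR", "EOR"]
timing = schedule(program)

first_execute = timing["ADD"][2]            # cycle 3
last_execute = timing["EOR"][2]             # cycle 8
cycles = last_execute - first_execute + 1   # cycles spent executing
cpi = cycles / len(program)
print(cycles, cpi)                          # 6 1.0
```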
Branch Pipeline Example
– Branches break the pipeline
– Example in ARM state
Cycle       1       2       3       4       5       6       7
0x8000 BL   Fetch   Decode  Execute Linkret Adjust
0x8004 X            Fetch   Decode
0x8008 X                    Fetch
0x8FEC ADD                          Fetch   Decode  Execute
0x8FF0 SUB                                  Fetch   Decode  Execute
0x8FF4 MOV                                          Fetch   Decode
0x8FF8 AND                                                  Fetch
(X: instruction fetched after the branch and then discarded)
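The cost of the branch in this example can be estimated with a very small timing model. The sketch assumes, as in the diagram, that a branch holds the execute stage for three cycles (execute, link return, adjust) and that every other instruction takes one. `execute_cycles` and `branch_cost` are names invented for the example, and the program is listed in the order the instructions are actually executed.

```python
def execute_cycles(program, branch_cost=3, first_cycle=3):
    """Return (instruction, cycle in which its execute stage starts)."""
    cycle = first_cycle               # the first instruction executes in cycle 3
    result = []
    for name in program:
        result.append((name, cycle))
        cycle += branch_cost if name in ("B", "BL") else 1
    return result

print(execute_cycles(["BL", "ADD", "SUB"]))
# [('BL', 3), ('ADD', 6), ('SUB', 7)]
```

Compared with the optimal case, the instruction after the branch starts executing two cycles later, which is the pipeline refill penalty.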
Reference
• ARM Architecture Reference Manual
– Available with ARM tools
– Also available as a PDF
• ARM System-on-Chip Architecture
– By Steve Furber