Chap. 3 ARM CPU Architecture

Upload: gloria-glenn

Post on 31-Dec-2015


Outline

3.1 Registers

3.2 Memory

3.3 Exceptions

3.1 Registers

Introduction

ARM Processor Core

Processor Modes

Register Organization

Accessing Registers

The Program Status Registers (CPSR and SPSRs)

Condition Flags

Conditional Execution

Introduction

ARM has 37 registers in total, all of which are 32 bits long:

    1 dedicated program counter
    1 dedicated current program status register
    5 dedicated saved program status registers
    30 general purpose registers

Introduction (cont’d)

However, these are arranged into several banks, with the accessible bank being governed by the processor mode. Each mode can access:

    a particular set of r0-r12 registers
    a particular r13 (the stack pointer) and r14 (the link register)
    r15 (the program counter)
    cpsr (the current program status register)

Privileged modes (except System) can also access a particular spsr (saved program status register).
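The banking scheme can be checked with a small model. This is only a sketch: the mode abbreviations and the helper names `visible_registers` and `physical_register_count` are made up for illustration, and System mode is left out because it shares the User bank.

```python
# Illustrative model of ARM register banking (names are hypothetical).
MODES = ["user", "fiq", "irq", "svc", "abt", "und"]  # System shares the User bank

# Registers that each mode replaces with its own private (banked) copy.
BANKED = {
    "user": [],
    "fiq": [f"r{i}" for i in range(8, 15)],  # r8_fiq .. r14_fiq
    "irq": ["r13", "r14"],
    "svc": ["r13", "r14"],
    "abt": ["r13", "r14"],
    "und": ["r13", "r14"],
}


def visible_registers(mode):
    """Return the physical register names visible in the given mode."""
    regs = []
    for i in range(15):
        name = f"r{i}"
        regs.append(f"{name}_{mode}" if name in BANKED[mode] else name)
    regs += ["r15", "cpsr"]            # shared by every mode
    if mode != "user":
        regs.append(f"spsr_{mode}")    # exception modes also get an SPSR
    return regs


def physical_register_count():
    """Count the distinct physical registers across all modes."""
    seen = set()
    for mode in MODES:
        seen.update(visible_registers(mode))
    return len(seen)
```

Counting the distinct names reproduces the total of 37 quoted above: 31 general purpose registers including the PC, one CPSR and five SPSRs.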

ARM Processor Core

Architecture
    Versions 1 and 2 – Acorn RISC, 26-bit address
    Version 3 – 32-bit address, CPSR, and SPSR
    Version 4 – half-word, Thumb
    Version 5 – BLX, CLZ and BKPT instructions

Processor cores
    ARM7TDMI (Thumb, debug, multiplier, ICE) – version 4T, low-end ARM core, 3-stage pipeline, 50-100MHz
    ARM9TDMI – 5-stage pipeline, 130MHz or 200MHz
    ARM10TDMI – version 5, 300MHz

CPU Core: co-processor, MMU, AMBA
    ARM 710, 720, 740
    ARM 920, 940

Processor Modes

The ARM has six operating modes:
    User (unprivileged mode under which most tasks run)
    FIQ (entered when a high priority (fast) interrupt is raised)
    IRQ (entered when a low priority (normal) interrupt is raised)
    Supervisor (entered on reset and when a Software Interrupt instruction is executed)
    Abort (used to handle memory access violations)
    Undef (used to handle undefined instructions)

ARM Architecture Version 4 adds a seventh mode:
    System (privileged mode using the same registers as user mode)

Register Organization

General registers and Program Counter

User32 / System   FIQ32             Supervisor32   Abort32    IRQ32      Undefined32
r0-r7             r0-r7             r0-r7          r0-r7      r0-r7      r0-r7
r8-r12            r8_fiq-r12_fiq    r8-r12         r8-r12     r8-r12     r8-r12
r13 (sp)          r13_fiq           r13_svc        r13_abt    r13_irq    r13_undef
r14 (lr)          r14_fiq           r14_svc        r14_abt    r14_irq    r14_undef
r15 (pc)          r15 (pc)          r15 (pc)       r15 (pc)   r15 (pc)   r15 (pc)

Program Status Registers

User32 / System   FIQ32             Supervisor32   Abort32    IRQ32      Undefined32
cpsr              cpsr              cpsr           cpsr       cpsr       cpsr
                  spsr_fiq          spsr_svc       spsr_abt   spsr_irq   spsr_undef

Ref. [8]

Register Example: User to FIQ Mode

[Figure: registers in use before and after an FIQ exception. In User mode, r0-r14, r15 (pc) and cpsr are in use. In FIQ mode, r0-r7, r8_fiq-r14_fiq, r15 (pc), cpsr and spsr_fiq are in use, while the User mode r8-r14 are banked out.]

On the exception:
    User mode CPSR copied to FIQ mode SPSR
    Return address calculated from User mode PC value and stored in FIQ mode LR

Ref. [8]

Accessing Registers

The currently accessible registers are not subdivided: all instructions can access r0-r14 directly.

Most instructions also allow use of the PC.

Specific instructions (MRS and MSR) allow access to the CPSR and SPSR.

Note: When in a privileged mode, it is also possible to load / store the (banked out) user mode registers to or from memory.

The Program Status Registers (CPSR and SPSRs)

    bit   31  30  29  28 | 27 ... 8 | 7   6   5 | 4 ... 0
          N   Z   C   V  |          | I   F   T |  Mode

Condition Code Flags: copies of the ALU status flags (latched if the instruction has the "S" bit set).
    N = Negative result from ALU
    Z = Zero result from ALU
    C = ALU operation Carried out
    V = ALU operation oVerflowed

Interrupt Disable bits
    I = 1, disables the IRQ.
    F = 1, disables the FIQ.

T Bit: processor in ARM (0) or Thumb (1) state

Mode Bits: specify the processor mode
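The field positions above can be illustrated by decoding a CPSR value in software. This is a sketch only: `decode_cpsr` is a made-up helper, and the mode-bit encodings come from the ARM Architecture Reference Manual rather than from this slide.

```python
# Mode field encodings (bits 4..0), taken from the ARM Architecture
# Reference Manual; they are not listed on the slide itself.
MODE_NAMES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}


def decode_cpsr(value):
    """Split a 32-bit CPSR value into its flag, control and mode fields."""
    return {
        "N": (value >> 31) & 1,
        "Z": (value >> 30) & 1,
        "C": (value >> 29) & 1,
        "V": (value >> 28) & 1,
        "I": (value >> 7) & 1,   # 1 = IRQ disabled
        "F": (value >> 6) & 1,   # 1 = FIQ disabled
        "T": (value >> 5) & 1,   # 0 = ARM state, 1 = Thumb state
        "mode": MODE_NAMES.get(value & 0x1F, "unknown"),
    }
```

For example, `decode_cpsr(0x600000D3)` describes Supervisor mode in ARM state with both interrupts disabled and the Z and C flags set.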

Condition Flags

Negative (N=‘1’)
    Logical instruction: no meaning
    Arithmetic instruction: bit 31 of the result has been set; indicates a negative number in signed operations

Zero (Z=‘1’)
    Logical instruction: result is all zeroes
    Arithmetic instruction: result of operation was zero

Carry (C=‘1’)
    Logical instruction: after a shift operation, ‘1’ was left in the carry flag
    Arithmetic instruction: result was greater than 32 bits

oVerflow (V=‘1’)
    Logical instruction: no meaning
    Arithmetic instruction: result was greater than 31 bits; indicates a possible corruption of the sign bit in signed numbers
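To make the arithmetic column concrete, the following sketch models how a 32-bit addition with the S bit set would produce the four flags. The function name `adds` and the dictionary layout are illustrative choices, not anything defined in the slides.

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide


def adds(a, b):
    """Model a 32-bit ADDS: return (result, flags) for a + b."""
    a &= MASK
    b &= MASK
    full = a + b                 # unbounded Python integer
    result = full & MASK         # what fits in the 32-bit register
    flags = {
        "N": result >> 31,                 # bit 31 of the result
        "Z": int(result == 0),             # result is zero
        "C": int(full > MASK),             # result needed more than 32 bits
        # Signed overflow: both operands have the same sign bit and the
        # result's sign bit differs from it.
        "V": int(((a ^ result) & (b ^ result) & 0x80000000) != 0),
    }
    return result, flags
```

Adding 1 to 0x7FFFFFFF sets N and V (the sign bit was corrupted), while adding 1 to 0xFFFFFFFF sets Z and C (the unsigned result wrapped to zero).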

The Condition Field of Instruction Set

Bits 31-28 of every instruction hold the condition field (Cond).

Code   Suffix   Flags           Description
0000   EQ       Z=1             Equal
0001   NE       Z=0             Not equal
0010   CS/HS    C=1             Unsigned higher or same
0011   CC/LO    C=0             Unsigned lower
0100   MI       N=1             Minus
0101   PL       N=0             Positive or Zero
0110   VS       V=1             Overflow
0111   VC       V=0             No overflow
1000   HI       C=1 & Z=0       Unsigned higher
1001   LS       C=0 or Z=1      Unsigned lower or same
1010   GE       N=V             Greater or equal
1011   LT       N!=V            Less than
1100   GT       Z=0 & N=V       Greater than
1101   LE       Z=1 or N!=V     Less than or equal
1110   AL       none            Always
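The condition table translates directly into a lookup of predicates over the flags. The sketch below is illustrative only; `CONDITIONS` and `condition_passed` are hypothetical names, and the flags are passed as a plain dictionary.

```python
# One predicate per condition suffix, following the table above.
CONDITIONS = {
    "EQ": lambda f: f["Z"] == 1,
    "NE": lambda f: f["Z"] == 0,
    "CS": lambda f: f["C"] == 1,
    "CC": lambda f: f["C"] == 0,
    "MI": lambda f: f["N"] == 1,
    "PL": lambda f: f["N"] == 0,
    "VS": lambda f: f["V"] == 1,
    "VC": lambda f: f["V"] == 0,
    "HI": lambda f: f["C"] == 1 and f["Z"] == 0,
    "LS": lambda f: f["C"] == 0 or f["Z"] == 1,
    "GE": lambda f: f["N"] == f["V"],
    "LT": lambda f: f["N"] != f["V"],
    "GT": lambda f: f["Z"] == 0 and f["N"] == f["V"],
    "LE": lambda f: f["Z"] == 1 or f["N"] != f["V"],
    "AL": lambda f: True,
}
CONDITIONS["HS"] = CONDITIONS["CS"]  # alternative spellings
CONDITIONS["LO"] = CONDITIONS["CC"]


def condition_passed(suffix, flags):
    """Return True if an instruction with this suffix would execute."""
    return CONDITIONS[suffix](flags)
```

With the flags left by comparing 5 against 3 (N=0, Z=0, C=1, V=0), the GT and HI conditions pass while EQ and LT fail.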

Conditional Execution

Most instruction sets only allow branches to be executed conditionally.

However, by reusing the condition evaluation hardware, ARM effectively increases the number of instructions.

All instructions contain a condition field which determines whether the CPU will execute them.

Non-executed instructions soak up 1 cycle.
    They still have to complete the cycle so as to allow fetching and decoding of the following instructions.

Conditional Execution (cont’d)

This removes the need for many branches, which stall the pipeline (3 cycles to refill).

Allows very dense in-line code, without branches.

The time penalty of not executing several conditional instructions is frequently less than the overhead of the branch or subroutine call that would otherwise be needed.

Using and updating the Condition Field

To execute an instruction conditionally, simply postfix it with the appropriate condition.

For example, an add instruction takes the form:
    ADD r0,r1,r2    ; r0 = r1 + r2 (ADDAL)

To execute this only if the zero flag is set:
    ADDEQ r0,r1,r2  ; If zero flag set then...
                    ; ... r0 = r1 + r2

Using and updating the Condition Field (cont’d)

By default, data processing operations do not affect the condition flags (apart from the comparisons, where this is the only effect).

To cause the condition flags to be updated, the S bit of the instruction needs to be set by postfixing the instruction (and any condition code) with an “S”.

For example, to add two numbers and set the condition flags:
    ADDS r0,r1,r2   ; r0 = r1 + r2
                    ; ... and set flags

Conditional Execution Example: Greatest Common Divisor

Normal Assembler

    gcd     cmp r0, r1      ;reached the end?
            beq stop
            blt less        ;if r0 > r1
            sub r0, r0, r1  ;subtract r1 from r0
            bal gcd
    less    sub r1, r1, r0  ;subtract r0 from r1
            bal gcd
    stop

ARM Conditional Assembler

    gcd     cmp   r0, r1      ;if r0 > r1
            subgt r0, r0, r1  ;subtract r1 from r0
            sublt r1, r1, r0  ;else subtract r0 from r1
            bne   gcd         ;reached the end?
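The conditional version can be traced with a small software model. This is a sketch under the assumption that both inputs are positive integers; `gcd_conditional` is an illustrative name. Note that the comparison is made once per loop iteration, and all three conditional instructions then use those same flags.

```python
def gcd_conditional(r0, r1):
    """Mirror the four-instruction conditional GCD loop."""
    if r0 <= 0 or r1 <= 0:
        raise ValueError("this sketch assumes positive inputs")
    while True:
        # cmp r0, r1: the flags are set once and reused below
        greater = r0 > r1
        less = r0 < r1
        if greater:          # subgt r0, r0, r1
            r0 -= r1
        if less:             # sublt r1, r1, r0
            r1 -= r0
        if not (greater or less):
            return r0        # bne gcd falls through when r0 == r1
```

Calling `gcd_conditional(48, 18)` walks through (30, 18), (12, 18), (12, 6) and (6, 6) before returning 6.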

The Barrel Shifter

ARM has a barrel shifter which provides a mechanism to carry out shifts as part of other instructions.

Operand 1 goes directly to the ALU. Operand 2 passes through the barrel shifter before reaching the ALU, which produces the Result.

Operand 2 can be:

Immediate value
    8 bit number
    Can be rotated right through an even number of positions.
    Assembler will calculate rotate for you from constant.

Register, optionally with shift operation applied.
    Shift value can be either:
        a 5 bit unsigned integer
        specified in the bottom byte of another register.
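The immediate rule (an 8 bit value rotated right by an even amount) determines which constants can be written directly in an instruction. The sketch below shows roughly the search an assembler might perform; `ror32` and `encode_immediate` are hypothetical helpers, not part of any real toolchain.

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def encode_immediate(constant):
    """Find (imm8, rotate) such that ror32(imm8, rotate) == constant.

    rotate is always even. Returns None if the constant cannot be
    expressed as a rotated 8-bit value.
    """
    constant &= 0xFFFFFFFF
    for rotate in range(0, 32, 2):
        # Rotating right by (32 - rotate) undoes a right rotation by rotate.
        imm8 = ror32(constant, 32 - rotate)
        if imm8 <= 0xFF:
            return imm8, rotate
    return None
```

For example, 0xFF000000 is 0xFF rotated right by 8, so it is encodable, whereas 0x101 needs nine significant bits and is not.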

Second Operand: Shifted Register

Using a multiplication instruction to multiply by a constant means first loading the constant into a register and then waiting a number of internal cycles for the multiplication to complete.

A better solution can often be found by using some combination of MOVs, ADDs, SUBs and RSBs with shifts.
    Multiplications by a constant equal to a ((power of 2) ± 1) can be done in one cycle.

Example: r0 = r1 * 5
         r0 = r1 + (r1 * 4)

    ADD r0, r1, r1, LSL #2

Example: r2 = r3 * 105
         r2 = r3 * 15 * 7
         r2 = r3 * (16 - 1) * (8 - 1)

    RSB r2, r3, r3, LSL #4 ; r2 = r3 * 15
    RSB r2, r2, r2, LSL #3 ; r2 = r2 * 7
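The two shift-and-add examples can be checked numerically. In this sketch, `add_lsl` and `rsb_lsl` are illustrative stand-ins for ADD and RSB with a left-shifted second operand, with results truncated to 32 bits as a register would.

```python
MASK = 0xFFFFFFFF


def add_lsl(rn, rm, shift):
    """ADD rd, rn, rm, LSL #shift  ->  rd = rn + (rm << shift)"""
    return (rn + (rm << shift)) & MASK


def rsb_lsl(rn, rm, shift):
    """RSB rd, rn, rm, LSL #shift  ->  rd = (rm << shift) - rn"""
    return ((rm << shift) - rn) & MASK


def times_5(r1):
    return add_lsl(r1, r1, 2)      # r1 + r1 * 4


def times_105(r3):
    r2 = rsb_lsl(r3, r3, 4)        # r3 * 16 - r3 = r3 * 15
    return rsb_lsl(r2, r2, 3)      # r2 * 8 - r2 = r2 * 7
```

The results agree with ordinary multiplication modulo 2^32, for instance `times_105(3)` gives 315.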


3.2 Memory

Introduction

Memory Organization

Pipeline

Memory Access

ARM Memory Interface

Introduction

Word and half-word alignment: word addresses end in binary 00 (xxxx00) and half-word addresses end in binary 0 (xxxxx0).

ARM can be set up to access data in either little-endian or big-endian format, though it defaults to little-endian.

The ARM uses a pipeline in order to increase the speed of the flow of instructions to the processor.
    Allows several operations to be undertaken simultaneously, rather than serially.

Rather than pointing to the instruction being executed, the PC points to the instruction being fetched.

Memory Organization

Each number below is a byte address; each row is one 32-bit word.

(a) Little-endian memory organization

    bit 31                   bit 0    contents
      23     22     21     20
      19     18     17     16         word16
      15     14     13     12         half-word14, half-word12
      11     10      9      8         word8
       7      6      5      4         byte6, half-word4
       3      2      1      0         byte3, byte2, byte1, byte0

(b) Big-endian memory organization

    bit 31                   bit 0    contents
      20     21     22     23
      16     17     18     19         word16
      12     13     14     15         half-word12, half-word14
       8      9     10     11         word8
       4      5      6      7         byte5, half-word6
       0      1      2      3         byte0, byte1, byte2, byte3
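The difference between the two layouts shows up when a word is stored to memory and read back byte by byte. The following sketch uses Python's standard `struct` module; the helper name `store_word` is illustrative.

```python
import struct


def store_word(value, little_endian=True):
    """Return the bytes of a 32-bit word in increasing address order."""
    fmt = "<I" if little_endian else ">I"   # "<" little-endian, ">" big-endian
    return list(struct.pack(fmt, value))
```

Storing 0x11223344 places 0x44 at the lowest address in little-endian format and 0x11 at the lowest address in big-endian format.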

Pipeline

3 stages (ARM7) and 5 stages (ARM9TDMI)

    ARM7:      fetch (PC) -> decode (PC-4) -> execute (PC-8)
    ARM9TDMI:  fetch -> decode -> execute -> memory -> write-back

    fetch: load an instruction from memory
    memory: access memory if needed
    write-back: write result to register
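The PC offsets in the 3-stage pipeline can be sketched as follows, assuming ARM state with 4-byte instructions. The helper names `pipeline_snapshot` and `pc_value_read` are hypothetical.

```python
def pipeline_snapshot(executing_address):
    """Addresses held in each stage while one instruction executes.

    Assumes the 3-stage pipeline in ARM state (4-byte instructions).
    """
    return {
        "execute": executing_address,        # PC - 8
        "decode": executing_address + 4,     # PC - 4
        "fetch": executing_address + 8,      # PC
    }


def pc_value_read(executing_address):
    """Value an executing instruction sees if it reads the PC."""
    return pipeline_snapshot(executing_address)["fetch"]
```

An instruction at address 0x8000 that reads the PC therefore sees 0x8008 in this model.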

Memory Access

The ARM7 is a Von Neumann, load/store architecture (the ARM9TDMI is also load/store, but has separate instruction and data interfaces), i.e.,
    Only one 32 bit data bus for both instructions and data.
    Only the load/store instructions (and SWP) access memory.

Memory is addressed as a 32 bit address space.

Data types can be 8 bit bytes, 16 bit half-words, or 32 bit words, and memory may be seen as a byte line folded into 4-byte words.

Words must be aligned to 4 byte boundaries, and half-words to 2 byte boundaries.

Always ensure that the memory controller supports all three access sizes.
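The alignment rule amounts to checking the low address bits. A minimal sketch, with `is_aligned` as an illustrative helper name:

```python
def is_aligned(address, size):
    """Check alignment for a byte (1), half-word (2) or word (4) access."""
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2 or 4 bytes")
    # Words need the two low address bits clear, half-words the lowest bit.
    return address & (size - 1) == 0
```

For instance, address 0x1002 is a valid half-word address but not a valid word address.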


3.3 Exceptions

Introduction

Types of ARM exceptions

Exception and Interrupt

Entering an Exception

Returning from an Exception

Exception Entry/Exit

Exceptions and the Vector Table Address

Introduction

Exceptions and interrupts break the sequential flow of a program, jumping to architecturally defined memory locations.

In ARM, Software Interrupt (SWI) is the “system call” exception.

Types of ARM exceptions

reset
    when CPU reset pin is asserted

undefined instruction
    when CPU tries to execute an undefined op-code

software interrupt
    when CPU executes the SWI instruction

prefetch abort
    when CPU tries to execute an instruction pre-fetched from an illegal address

data abort
    when a data transfer instruction tries to read or write at an illegal address

IRQ
    when CPU's external interrupt request pin is asserted

FIQ
    when CPU's external fast interrupt request pin is asserted

Exception and Interrupt

The terms exception and interrupt are often confused.

Exception usually refers to an internal CPU event such as
    floating point overflow
    MMU fault (e.g., page fault)
    trap (SWI)

Interrupt usually refers to an external I/O event such as
    I/O device request
    Timer interrupt

In the ARM architecture manuals, the two terms are mixed together.

Entering an Exception

When an exception is generated, the processor takes the following actions:

Copy the CPSR to the SPSR for the mode in which the exception is to be handled.

Change the appropriate CPSR mode bits in order to
    change to the appropriate mode, and map in the appropriate banked registers for that mode.
    disable interrupts.
        IRQs are disabled when any exception occurs.
        FIQs are disabled when a FIQ occurs, and on reset.

Set lr_mode to the return address.

Set PC to the vector address for the exception.
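The entry sequence can be sketched as a function over a small CPU state dictionary. Everything here is illustrative: the state layout, the names `enter_exception`, `EXCEPTIONS` and `MODE_BITS` are assumptions, the mode encodings come from the ARM Architecture Reference Manual, and details such as clearing the T bit are omitted.

```python
I_BIT = 1 << 7      # IRQ disable
F_BIT = 1 << 6      # FIQ disable
MODE_MASK = 0x1F

# Mode field encodings (assumed from the ARM Architecture Reference Manual).
MODE_BITS = {"fiq": 0b10001, "irq": 0b10010, "svc": 0b10011,
             "abt": 0b10111, "und": 0b11011}

# exception name -> (vector address, handler mode, disables FIQ?)
EXCEPTIONS = {
    "reset":          (0x00, "svc", True),
    "undefined":      (0x04, "und", False),
    "swi":            (0x08, "svc", False),
    "prefetch_abort": (0x0C, "abt", False),
    "data_abort":     (0x10, "abt", False),
    "irq":            (0x18, "irq", False),
    "fiq":            (0x1C, "fiq", True),
}


def enter_exception(cpu, name, return_address):
    """Apply the exception entry steps to a CPU state dictionary."""
    vector, mode, disable_fiq = EXCEPTIONS[name]
    cpu["spsr"][mode] = cpu["cpsr"]                       # copy CPSR to SPSR_mode
    cpsr = (cpu["cpsr"] & ~MODE_MASK) | MODE_BITS[mode]   # switch mode
    cpsr |= I_BIT                                         # IRQs always disabled
    if disable_fiq:
        cpsr |= F_BIT                                     # FIQs only on FIQ/reset
    cpu["cpsr"] = cpsr
    cpu["lr"][mode] = return_address                      # set lr_mode
    cpu["pc"] = vector                                    # jump to the vector
    return cpu
```

Starting from User mode with CPSR 0x60000010, taking an IRQ in this model leaves the old CPSR in `spsr["irq"]`, sets CPSR to 0x60000092 and moves the PC to 0x18.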

Returning from an Exception

The actions taken by the processor:
    Restore the CPSR from SPSR_mode
    Restore the PC using the return address stored in lr_mode

The way to return depends on whether a stack was used on entry to the handler:
    Without a stack
        Perform a data processing instruction with the S flag set and the PC as the destination register.
    With a stack
        Restore the saved registers by performing
            LDMFD sp!, {r0-r12, pc}^
        ^ indicates that the CPSR is restored from the SPSR

Returning from SWI and Undefined Instruction Handlers

SWI and UI exceptions are generated by the instruction itself, so the PC is not updated when the exception is taken. Thus, storing (PC-4) in lr_mode makes lr_mode point to the next instruction to be executed.

    SWI xxx   being executed   PC-8
    INST-1    being decoded    PC-4
    INST-2    being fetched    PC

Restoring the PC from lr
    Without a stack
        MOVS pc, lr
    With a stack
        STMFD sp!, {reglist, lr}
        …
        LDMFD sp!, {reglist, pc}^

Returning from FIQ and IRQ

IRQ and FIQ are checked at the end of executing each instruction.

    INST-1   being executed   PC-12, IRQ or FIQ checked
    INST-2   being decoded    PC-8
    INST-3   being fetched    PC-4
    INST-4                    PC

Restoring PC from lr
    Without a stack
        SUBS pc, lr, #4
    With a stack
        SUB lr, lr, #4
        STMFD sp!, {reglist, lr}
        …
        LDMFD sp!, {reglist, pc}^

Returning from Prefetch Abort

Prefetch abort is generated when the instruction reaches the execution stage.

    INST-1   being executed   PC-8, Aborted
    INST-2   being decoded    PC-4
    INST-3   being fetched    PC

Restoring PC from lr
    Without a stack
        SUBS pc, lr, #4
    With a stack
        SUB lr, lr, #4
        STMFD sp!, {reglist, lr}
        …
        LDMFD sp!, {reglist, pc}^

Returning from Data Abort

When a data abort occurs, the program counter has been updated.

    INST-1   being executed   PC-12, Aborted
    INST-2   being decoded    PC-8
    INST-3   being fetched    PC-4
    INST-4                    PC

Restoring PC from lr
    Without a stack
        SUBS pc, lr, #8
    With a stack
        SUB lr, lr, #8
        STMFD sp!, {reglist, lr}
        …
        LDMFD sp!, {reglist, pc}^
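The return adjustments from the last four slides can be collected into a single lookup. This is a summary sketch; `RETURN_OFFSET` and `return_pc` are illustrative names.

```python
# Amount subtracted from lr_mode when returning, per exception type.
RETURN_OFFSET = {
    "swi": 0,             # MOVS pc, lr
    "undefined": 0,       # MOVS pc, lr
    "irq": 4,             # SUBS pc, lr, #4
    "fiq": 4,             # SUBS pc, lr, #4
    "prefetch_abort": 4,  # SUBS pc, lr, #4 (retry the aborted instruction)
    "data_abort": 8,      # SUBS pc, lr, #8 (retry the aborted access)
}


def return_pc(exception, lr):
    """Compute the PC value to restore from lr_mode for this exception."""
    return (lr - RETURN_OFFSET[exception]) & 0xFFFFFFFF
```

With lr_mode equal to 0x8010, the handler would resume at 0x8010 after a SWI, 0x800C after an IRQ and 0x8008 after a data abort.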


Exception Entry/Exit

Ref. [4]

Exceptions and the Vector Table Address

Exception Vectors

Address      Exception type          Exception Mode   Priority (1=high, 6=low)
0x00000000   Reset                   Supervisor       1
0x00000004   Undefined instruction   Undefined        6
0x00000008   Software Interrupt      Supervisor       6
0x0000000C   Abort (prefetch)        Abort            5
0x00000010   Abort (data)            Abort            2
0x00000014   Reserved                Reserved         Not applicable
0x00000018   IRQ                     IRQ              4
0x0000001C   FIQ                     FIQ              3
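The vector table can be expressed as data, which makes it easy to look up a vector address or to pick which of several pending exceptions is handled first. The names `VECTORS`, `vector_address` and `highest_priority` are illustrative.

```python
# (address, exception type, mode, priority); 1 is the highest priority.
VECTORS = [
    (0x00, "Reset", "Supervisor", 1),
    (0x04, "Undefined instruction", "Undefined", 6),
    (0x08, "Software Interrupt", "Supervisor", 6),
    (0x0C, "Abort (prefetch)", "Abort", 5),
    (0x10, "Abort (data)", "Abort", 2),
    (0x18, "IRQ", "IRQ", 4),
    (0x1C, "FIQ", "FIQ", 3),
]


def vector_address(exception):
    """Return the vector address for an exception type."""
    for address, name, _mode, _priority in VECTORS:
        if name == exception:
            return address
    raise KeyError(exception)


def highest_priority(pending):
    """Pick the pending exception with the highest priority (lowest number)."""
    candidates = [entry for entry in VECTORS if entry[1] in pending]
    return min(candidates, key=lambda entry: entry[3])[1]
```

If an IRQ, an FIQ and a data abort are pending together, the data abort is selected first, then the FIQ, then the IRQ.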

References

[1] Andrew Sloss, Dominic Symes and Chris Wright, “ARM System Developer's Guide”, Morgan Kaufmann, 2004

[2] David Seal, “ARM Architecture Reference Manual”, Addison-Wesley, 2000

[3] ARM DUI 0021A, “Programming Techniques”, 1995

[4] http://www.samsung.com/Products/Semiconductor/SystemLSI/Networks/PersonalNTASSP/CommunicationProcessor/S3C4510B/um_s3c4510b_rev1.pdf

[5] www.arm.com

[6] http://www.arm.com/pdfs/DUI0056D_ADS1_2_Dev.pdf

[7] http://nthucad.cs.nthu.edu.tw/~wcyao/

[8] www-courses.cs.uiuc.edu/~cs433/Processors/ARM/ARMInstV1.0.ppt

Exercise

1. How many registers does ARM have, and what are they used for?

2. Describe the processor modes of ARM in detail.

3. What is the meaning of the ARM condition flags in logical instructions and in arithmetic instructions?

4. What is the benefit of Conditional Execution?

5. What is the difference between little-endian and big-endian format?

6. Please describe the ARM memory cycle types and how to decide which type applies.

7. List all types of ARM exceptions and describe their addresses and priorities.