Chapter XI Reduced Instruction Set Computing (RISC) CS 147 Li-Chuan Fang

Post on 20-Dec-2015

Introduction

The world of microprocessors and CPUs can be divided into two parts:

complex instruction set computers (CISC processors)

reduced instruction set computers (RISC processors)

CISC processors have larger instruction sets that often include some particularly complex instructions. These instructions usually correspond to specific statements in high-level languages.

RISC processors exclude these instructions, opting for a smaller instruction set with simpler instructions.

Overview

the rationale for RISC processors

RISC instruction sets

instruction pipelines and register windows

RISC Rationale

The first microprocessors ever developed were very simple processors with very simple instruction sets.

Current CISC microprocessor instruction sets may include over 300 instructions.

In general, the greater the number of instructions in an instruction set, the larger the propagation delay within the CPU.

RISC’s Features

Fixed-Length Instructions

Limited Loading and Storing Instructions Access Memory

Fewer Addressing Modes

Instruction Pipeline

Large Number of Registers

RISC’s Features cont.

Hardwired Control Unit

Delayed Loads and Branches

Speculative Execution of Instructions

Optimizing Compiler

Separate Instruction and Data Streams

Fixed-Length Instructions

In RISC processors, every instruction has the same size. For instance, an immediate mode instruction might include an 8-bit operand. Other instructions might use these 8 bits for opcodes or address information.

Limited Loading and Storing Instructions Access Memory

All processors can load data from and store data to memory.

RISC processors limit interaction with memory to loading and storing data. If a value from memory is to be ANDed with the accumulator, the CPU first loads the value into a register and then performs the AND operation.
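
The load-then-operate sequence described above can be sketched in a few lines of Python. This is only an illustrative model: the register names (ACC, R1), the memory address 0x10, and the helper functions are assumptions made for the example, not part of any real instruction set.

```python
# Hypothetical register file and memory, for illustration only.
memory = {0x10: 0b11001010}
registers = {"ACC": 0b10101111, "R1": 0}

def load(reg, addr):
    """LOAD reg, addr: one of the two instructions allowed to touch memory."""
    registers[reg] = memory[addr]

def store(reg, addr):
    """STORE reg, addr: the other instruction allowed to touch memory."""
    memory[addr] = registers[reg]

def and_reg(dst, src):
    """AND dst, src: works on registers only, never on memory."""
    registers[dst] &= registers[src]

# ANDing a memory value with the accumulator takes two instructions:
load("R1", 0x10)       # step 1: bring the operand into a register
and_reg("ACC", "R1")   # step 2: perform the AND between registers

print(bin(registers["ACC"]))
```

A CISC processor might instead offer a single AND instruction with a memory operand; the RISC version trades one complex instruction for two simple ones.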

Fewer Addressing Modes

RISC processors typically allow only a few addressing modes that can be processed quickly, such as register indirect and relative modes.

Instruction Pipeline

A pipeline is like an assembly line in which many products are being worked on simultaneously, each at a different station.

In RISC processors, one instruction is executed while the following instruction is being fetched. By overlapping these operations, the CPU executes one instruction per clock cycle, even though each instruction requires three cycles to be fetched, decoded, and executed.
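
The effect of overlapping can be checked with a little arithmetic. The sketch below assumes an ideal pipeline with no stalls, in which the first instruction needs one cycle per stage and each later instruction completes one cycle after the previous one. The function names are made up for this example.

```python
def unpipelined_cycles(n_instructions, n_stages=3):
    """Cycles needed when each instruction finishes before the next starts."""
    return n_instructions * n_stages

def pipelined_cycles(n_instructions, n_stages=3):
    """Cycles needed in an ideal pipeline with no stalls or hazards."""
    if n_instructions == 0:
        return 0
    # The first instruction fills the pipeline; every later one adds a cycle.
    return n_stages + (n_instructions - 1)

for n in (1, 7, 1000):
    print(n, unpipelined_cycles(n), pipelined_cycles(n))
```

For long instruction sequences the pipelined count approaches one cycle per instruction, which is the sense in which the CPU "executes one instruction per clock cycle".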

Large Number of Registers

Having a large number of registers allows the CPU to store many operands internally.

When the operands are needed, the CPU fetches them from the registers, rather than from memory. This reduces the access time significantly. The registers can also be used to pass parameters to subroutines in an efficient manner; this is accomplished using register windowing.

Hardwired Control Unit

Combinatorial logic generally has a lower propagation delay than a lookup ROM. For this reason, a hardwired control unit can run at a higher clock frequency than its corresponding microcoded control unit.

For RISC processors, the benefit of a higher clock rate outweighs the advantages offered by microcoded control units, such as ease of modification.

Delayed Loads and Branches

RISC processors use delayed loads and delayed branches to avoid waiting time.

The RISC instruction pipeline can encounter hazards during branch instructions or consecutive instructions that use a common operand.
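
The common-operand hazard can be illustrated with a small check over a list of instructions. This is a minimal sketch under simplifying assumptions: instructions are modeled as a hypothetical (op, dest, sources) record, and only back-to-back instructions are compared. Real pipelines track more cases than this.

```python
from typing import List, NamedTuple, Tuple

class Instr(NamedTuple):
    op: str                   # mnemonic, for display only
    dest: str                 # register written by the instruction
    sources: Tuple[str, ...]  # registers read by the instruction

def find_data_hazards(program: List[Instr]):
    """Return (i, i + 1, register) for each consecutive pair in which the
    second instruction reads the register the first one writes."""
    hazards = []
    for i in range(len(program) - 1):
        first, second = program[i], program[i + 1]
        if first.dest in second.sources:
            hazards.append((i, i + 1, first.dest))
    return hazards

program = [
    Instr("LOAD", "R1", ("R5",)),
    Instr("ADD", "R2", ("R1", "R3")),   # needs R1 right after the load
    Instr("SUB", "R4", ("R6", "R7")),   # independent of both
]
print(find_data_hazards(program))

# Moving the independent SUB between LOAD and ADD fills the delay slot.
reordered = [program[0], program[2], program[1]]
print(find_data_hazards(reordered))
```

The reordering at the end is the kind of rearrangement a delayed load relies on: useful, independent work is placed in the slot where the pipeline would otherwise wait.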

Speculative Execution of Instructions

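In speculative execution, the CPU executes an instruction before it knows whether the instruction should be executed (for example, an instruction that follows a conditional branch), but it does not store the result right away. If the instruction turns out to be needed, the result is stored. If not, the result is discarded.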
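
The store-or-discard decision can be modeled very roughly as follows. This sketch assumes a hypothetical helper that is told afterwards whether the instruction was really needed. The register names and the function are illustrative only, not how any particular CPU implements speculation.

```python
def speculate(registers, dest, compute, instruction_is_needed):
    """Run `compute` early, then commit or discard its result.

    `instruction_is_needed` stands in for whatever condition the CPU
    resolves later (an assumption of this sketch).
    """
    tentative = compute(registers)      # executed, but held aside
    if instruction_is_needed:
        registers[dest] = tentative     # commit: the result is stored
        return True
    return False                        # discard: registers are unchanged

regs = {"R1": 2, "R2": 3, "R3": 0}
add = lambda r: r["R1"] + r["R2"]

speculate(regs, "R3", add, instruction_is_needed=False)
print(regs["R3"])   # still the old value
speculate(regs, "R3", add, instruction_is_needed=True)
print(regs["R3"])   # now holds the sum
```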

Optimizing Compiler

An optimizing compiler can arrange instructions to facilitate delayed loads and branches, as well as to optimally assign operands to registers. Fewer instructions make it much simpler to design an optimizing compiler for a RISC processor than for a CISC processor.

Separate Instruction and Data Streams

The instruction pipeline may need to access instructions and operands from memory simultaneously. Separating the instruction and data streams helps to avoid memory access conflicts.

RISC Instruction Sets

The instruction sets of RISC processors are reduced, or smaller in size than those of CISC processors.

A CISC processor might have over 300 instructions in its instruction set, but RISC CPUs typically have fewer than 100.

Instruction Formats for the SPIM (MIPS) CPU

addi Rt, Rs, Imm

  Field:  0x08 (opcode) | Rs | Rt | Immediate value
  Bits:   6             | 5  | 5  | 16

jal label

  Field:  0x03 (opcode) | absolute jump target address
  Bits:   6             | 26
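
The two formats above can be packed into 32-bit words with shifts and masks. The sketch below follows the field widths shown (6/5/5/16 and 6/26). The function names are made up for this example, and it assumes the standard MIPS convention that the 26-bit jump field holds the word address, that is, the byte address shifted right by two.

```python
def encode_addi(rt, rs, imm):
    """I-type layout: opcode (6) | Rs (5) | Rt (5) | immediate (16)."""
    if not (0 <= rs < 32 and 0 <= rt < 32):
        raise ValueError("register numbers must fit in 5 bits")
    if not (-32768 <= imm <= 32767):
        raise ValueError("immediate must fit in 16 signed bits")
    return (0x08 << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

def encode_jal(target_address):
    """J-type layout: opcode (6) | jump target (26)."""
    if target_address % 4 != 0:
        raise ValueError("jump targets must be word aligned")
    return (0x03 << 26) | ((target_address >> 2) & 0x03FFFFFF)

# addi $8, $0, 5 (register 8 receives register 0 plus 5)
print(hex(encode_addi(rt=8, rs=0, imm=5)))
# jal to byte address 0x00400000
print(hex(encode_jal(0x00400000)))
```

Because every instruction is exactly 32 bits wide, the decoder can pull out the opcode and register fields from fixed bit positions without first working out the instruction's length.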

Instruction Pipelines and Register Windows

Two implementation techniques commonly used in RISC processors to improve performance:

instruction pipeline: allows RISC processors to execute one instruction per clock cycle

incorporation of large numbers of registers within the CPU: allows more variables to be stored in registers, rather than memory, which reduces the time needed to access data

Instruction Pipelines

An instruction pipeline is very similar to a manufacturing assembly line.

An instruction pipeline processes an instruction the way the assembly line processes a product.

The first stage fetches the instruction from memory.

The second stage decodes the instruction and fetches any required operands.

Instruction Pipelines cont.

The third stage executes the instruction.

The fourth stage stores the result.

As with the assembly line, each stage processes an instruction simultaneously (after an initial latency, or delay, to fill the pipeline).

This allows the CPU to execute one instruction per clock cycle.

Instruction Pipelines cont.

The IBM 801, the first RISC computer, also uses a four-stage instruction pipeline.

Other processors, such as the RISC II, use only three stages; they combine the execute and store result operations in a single stage.

The MIPS processor uses a five-stage pipeline; it decodes the instruction and selects the operand registers in separate stages.

Note that each stage has a register that latches its data at the end of the stage to synchronize data flow between stages.

Instruction Pipelines cont.

Although we could employ several complete control units to process instructions, a single pipelined control unit offers several advantages.

The primary advantage is the reduced hardware requirements of the pipeline.

A second advantage of instruction pipelines is the reduced complexity of the memory interface.

Three-stage RISC Pipeline

Fetch instruction -> Decode instruction, select registers -> Execute instruction, store result

Four-stage RISC Pipeline

Fetch instruction -> Decode instruction, select registers -> Execute instruction -> Store result

Five-stage RISC Pipeline

Fetch instruction -> Decode instruction -> Select registers -> Execute instruction -> Store result

Data flow through three-stage RISC pipeline

            Clock cycle
Stage    1    2    3    4    5    6    7
  1      I1   I2   I3   I4   I5   I6   I7
  2      --   I1   I2   I3   I4   I5   I6
  3      --   --   I1   I2   I3   I4   I5

Data flow through four-stage RISC pipeline

            Clock cycle
Stage    1    2    3    4    5    6    7
  1      I1   I2   I3   I4   I5   I6   I7
  2      --   I1   I2   I3   I4   I5   I6
  3      --   --   I1   I2   I3   I4   I5
  4      --   --   --   I1   I2   I3   I4

Data flow through five-stage RISC pipeline

            Clock cycle
Stage    1    2    3    4    5    6    7
  1      I1   I2   I3   I4   I5   I6   I7
  2      --   I1   I2   I3   I4   I5   I6
  3      --   --   I1   I2   I3   I4   I5
  4      --   --   --   I1   I2   I3   I4
  5      --   --   --   --   I1   I2   I3
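
The data-flow tables follow one rule: in clock cycle c, stage s holds instruction number c - s + 1, or nothing while the pipeline is still filling. A short sketch that regenerates the tables from this rule is given below. The function name is made up for this example, and the model assumes an ideal pipeline with no stalls.

```python
def pipeline_table(n_stages, n_cycles):
    """Return one row per stage; each cell names the instruction in that
    stage during that clock cycle, or "--" if the stage is still empty."""
    rows = []
    for stage in range(1, n_stages + 1):
        row = []
        for cycle in range(1, n_cycles + 1):
            instr = cycle - stage + 1
            row.append(f"I{instr}" if instr >= 1 else "--")
        rows.append(row)
    return rows

for stage, row in enumerate(pipeline_table(5, 7), start=1):
    print(stage, " ".join(f"{cell:>3}" for cell in row))
```

Calling `pipeline_table(3, 7)` or `pipeline_table(4, 7)` reproduces the three-stage and four-stage tables in the same way.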

Pipelines’ Problems

One problem is memory access.

Another problem is caused by branch statements.

How to solve these problems?

Problem 1 - memory access: As we noted previously, the cache must separate instructions and data to avoid memory conflicts from the different stages of the pipeline.

Problem 2 - branch statements: There is not much that the pipeline can do about this. Instead, an optimizing compiler is needed to reorder the instructions to avoid this problem.

Register Windowing

The CPU can access data in registers more quickly than data in memory, so having more registers makes more data available faster.

Having more registers also helps reduce the number of memory references, particularly when calling and returning from subroutines.

Register Windowing cont.

Although a RISC processor has many registers, it may not be able to access all of them at any given time.

Most RISC CPUs have some global registers, which are always accessible.

The remaining registers are windowed so that only a subset of the registers is accessible at any specific time.

[Figure: Register windowing in the SPARC processor. Global registers (8) are shown alongside the windowed registers. Window #1, Window #2, and Window #3 are each made up of common input registers (8), local registers (8), and common output registers (8).]

Register Windowing cont.

The RISC CPU must keep track of which window is active and which windows contain valid data.

A window pointer register contains the value of the window that is currently active.

A window mask register contains 1 bit per window and denotes which windows contain valid data.

Register Windowing cont.

Register windows provide their greatest benefit when the CPU calls a subroutine.

During the calling process, the register window is moved down one window position.

In the SPARC example, if window 1 is active and the CPU calls a subroutine, the processor activates window 2 by updating the window pointer and window mask registers.

Register Windowing cont.

The CPU can pass parameters to the subroutine via the registers that overlap both windows, instead of through memory; this saves a significant amount of time in accessing data.

The CPU can use the same registers to return results to the calling routine.
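
The call and return mechanism can be modeled with a small Python class. This is a toy sketch, not a real SPARC layout: the class name, the group names ("in", "local", "out"), the four-window size, and the non-circular arrangement are all assumptions made for illustration. The one property it aims to show is that a window's output registers are physically the same registers as the next window's input registers.

```python
class WindowedRegisterFile:
    """Toy model of overlapping register windows (hypothetical layout)."""

    GROUP_SIZE = 8  # registers per group (inputs, locals, outputs)

    def __init__(self, n_windows=4):
        self.n_windows = n_windows
        self.global_regs = [0] * self.GROUP_SIZE  # always accessible
        # Each window adds 16 new registers (inputs + locals); its outputs
        # are the next window's inputs, so 8 more registers end the chain.
        self.windowed = [0] * (2 * self.GROUP_SIZE * n_windows + self.GROUP_SIZE)
        self.window_pointer = 0                          # active window
        self.window_mask = [1] + [0] * (n_windows - 1)   # 1 bit per window

    def _physical(self, group, index):
        if not 0 <= index < self.GROUP_SIZE:
            raise IndexError("register index out of range")
        offset = {"in": 0, "local": 1, "out": 2}[group] * self.GROUP_SIZE
        return 2 * self.GROUP_SIZE * self.window_pointer + offset + index

    def read(self, group, index):
        return self.windowed[self._physical(group, index)]

    def write(self, group, index, value):
        self.windowed[self._physical(group, index)] = value

    def call(self):
        if self.window_pointer + 1 >= self.n_windows:
            raise OverflowError("no free window; a real CPU would spill to memory")
        self.window_pointer += 1
        self.window_mask[self.window_pointer] = 1

    def ret(self):
        if self.window_pointer == 0:
            raise RuntimeError("already in the outermost window")
        self.window_mask[self.window_pointer] = 0
        self.window_pointer -= 1


rf = WindowedRegisterFile()
rf.write("out", 0, 20)            # caller places two parameters
rf.write("out", 1, 22)
rf.call()                         # pointer 0 -> 1, mask 1000 -> 1100
total = rf.read("in", 0) + rf.read("in", 1)
rf.write("in", 0, total)          # result goes back via the shared registers
mask_in_subroutine = list(rf.window_mask)
rf.ret()                          # pointer 1 -> 0, mask back to 1000
result = rf.read("out", 0)
print(mask_in_subroutine, rf.window_mask, result)
```

No memory access is needed to hand over the parameters or the result; only the window pointer and window mask change.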

[Figure: Register windowing in a CPU during execution of the main routine. Window pointer register: 0 0 (first window active). Window mask register: 1 0 0 0 (only first window has valid data). The first window is shown holding Param. #1, Param. #2, and Param. #3. Register labels in the figure: 0, 12, 13, 14, 15, 16, 47.]

[Figure: Register windowing in a CPU while executing a subroutine. Window pointer register: 0 1 (second window active). Window mask register: 1 1 0 0 (first two windows have valid data). The first and second windows are shown, with Param. #1, Param. #2, Param. #3, and Result marked. Register labels in the figure: 0, 12, 13, 14, 15, 16, 27, 47.]

[Figure: Register windowing in a CPU after returning from the subroutine. Window pointer register: 0 0 (first window active). Window mask register: 1 0 0 0 (only first window has valid data). The first window is shown holding Result. Register labels in the figure: 0, 12, 13, 14, 15, 16, 47.]

Register Renaming

More recent processors may use register renaming to add flexibility to the idea of register windowing.

A processor that uses register renaming can select any registers to comprise its working register “window”.

The CPU uses pointers to keep track of which registers are active and which physical register corresponds to each logical register.

Register Renaming cont.

Unlike register windowing, in which only specific registers are active at any given time, register renaming allows any group of physical registers to be active.
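
The pointer bookkeeping behind renaming can be sketched as a mapping from logical to physical registers plus a pool of unused physical registers. The class below is a simplified illustration under stated assumptions: the names are hypothetical, and the old physical register is returned to the pool immediately, whereas a real processor would normally wait until no earlier instruction still needs it.

```python
class RenameTable:
    """Toy logical-to-physical register map (hypothetical design)."""

    def __init__(self, logical_names, n_physical):
        if n_physical < len(logical_names):
            raise ValueError("need at least one physical register per logical one")
        # Initially, logical register i lives in physical register i.
        self.mapping = {name: i for i, name in enumerate(logical_names)}
        self.free = list(range(len(logical_names), n_physical))

    def lookup(self, logical):
        """Physical register currently standing in for `logical`."""
        return self.mapping[logical]

    def rename(self, logical):
        """Point `logical` at a fresh physical register and return it."""
        if not self.free:
            raise RuntimeError("no free physical registers")
        new = self.free.pop(0)
        old = self.mapping[logical]
        self.mapping[logical] = new
        self.free.append(old)   # simplification: freed immediately
        return new

table = RenameTable(["R0", "R1"], n_physical=4)
print(table.lookup("R1"))   # physical register before renaming
print(table.rename("R1"))   # physical register after renaming
print(table.free)           # pool of unused physical registers
```

Any physical register can end up backing any logical register, which is the flexibility that fixed window positions do not offer.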