Datapath Control Unit - UWEE


Post on 14-Jul-2020


Registers and the Register Transfer Level

Introduction

Let's now continue moving down
  From the instruction set level
  To the register level, closer to the physical hardware
  To examine some of the components contributing to instruction execution

In this section
  Will begin with
    Registers
    Register operations
  Move to
    Register transfer
    Register transfer notation
    Register microoperations

Most digital systems (not limited to computers)
  Can be partitioned into two major types of modules
    Datapath
      Performs data processing operations
    Control unit
      Specifies and controls sequencing of data processing operations
      Sends control signals to the datapath to activate operations
      Receives status information from the datapath

Datapath

Comprises data processing logic
Datapath defined by
  Registers
  Operations performed on data stored in registers
  With support logic
Basic register has capability to perform one or more elementary operations
  Such operations called microoperations
  Examples include
    Load or store to / from memory
    Loading contents of one register into another
    Adding contents of two registers
    Incrementing or shifting contents of a register


Control Unit

Comprises logic that determines and controls
  Sequences of data processing steps performed by the datapath
Results of current operation may determine
  Sequence of control signals
  Sequence of future microoperations
Such actions described by register transfer notation (RTN)
  Actions are the microoperations

Registers

We'll begin with registers
Registers and latches are among the fundamental building blocks of
  A computer system
  Specifically the computer system's datapath
Using contemporary logic
  Single latch or flip-flop can store one bit of information
    One logical 1 or logical 0
  Collection of such devices treated as a single entity
    Called a register or latch
    Depending upon mechanism by which data enters and is stored in device
      Register utilizes a strobe or clock to enter data
      Latch utilizes a gate or enable
  Collection of registers with some special properties
    Called a register file
We define 3 basic operations on data in a system
  All involve registers
  Operations
    • Store data
    • Transfer data
    • Operate on data
Instructions are implemented in part
  By movement of data through registers
Using register view of system
  Simplifies and aids understanding


Basic Register Operations

We express basic register operations according to following timing diagrams
Reflected are
  • Read
  • Write
All other operations built on these
Such operations include
  • Load
  • Shift
  • Clear
  • Count

On write operation
  Data changed on inputs to register
  Following delay to allow data to settle on bus
    Write signal asserted
    In drawing signal asserted low
      This is typical
Read follows similarly
  The read signal is asserted
    In drawing asserted low
  Following some delay
    Data appears on output of register
    This will be copy of contents of register

We take several views of a register
  Based upon the level of detail we need
  Simplest view shows simple box with bits numbered
  More complex shows
    Inputs and outputs
    Some control signals
Let's now examine registers in greater detail

[Figure: write and read timing diagrams showing Data, /Write, and /Read.]
[Figure: register views. Simple box with bits numbered 0..n-1; detailed view with inputs and outputs D0..Dn-1, clock, and Output Enable.]

Storage Registers

Registers used to hold data
  Form one small component of the memory system
    In CPLD, FPGA, microprocessor
  Often used for temporary storage during program execution
    On software side
      Frequently used values
      Control variable in a for or while loop
    On hardware side
      Data written to or read from I/O port
No restrictions placed on the size of a register
  Number of flip-flops or latches
  However common practice to design binary sized groups
    4, 8, 16, or 32 bit registers
  Size of a register called its width
Devices comprising a register all have
  • Common clock or gate
  • May have common reset (and preset)
  • Work as a single unit
As noted earlier common parlance refers to the device as
  A latch
    If comprised of gated devices
    Typically such devices are single bit latches themselves
  A register
    If the member devices are clocked or strobed devices
    Typically such devices are flip-flops
The accompanying logic diagram illustrates
  Four bit latch and a similar sized register
  Any values placed on inputs to device
    Clocked or gated to the outputs
  With a simple inversion
    Sense of gate or clock can be modified
Register is sometimes implemented with

[Figure: 4 Bit D Latch (four D latches sharing gate G) and 4 Bit D Register (four D flip-flops sharing a clock).]
[Figure: Register Cell with D, Enable, clock, and clear.]

Registers with Load Control

Register transfers
  Essential to operation of computer
To be able to effect and control such transfers
  Must be able to control when register is loaded
To incorporate such capability
  Will begin with basic D register
  Add load control logic
As diagram for register cell indicates
  When Enable control is
    Logical 0: cell holds its state
    Logical 1: cell D input enabled
      Value on D stored on clock
4 register cells then combined into 4 bit register
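
The hold / load behaviour of the register cell can be sketched behaviourally. This is a minimal Python model, not the gate level design; the class name Register and the method clock are hypothetical names chosen for illustration.

```python
class Register:
    """Behavioural sketch of an n-bit register with synchronous load enable."""

    def __init__(self, width=4):
        self.width = width
        self.q = 0  # current contents

    def clock(self, d, enable):
        """Model one rising clock edge."""
        if enable:
            # Enable = 1: the value on D is stored
            self.q = d & ((1 << self.width) - 1)
        # Enable = 0: the register holds its state
        return self.q


reg = Register(4)
reg.clock(0b1010, enable=1)          # load 1010
held = reg.clock(0b0101, enable=0)   # D changes, but the register holds
```

With Enable at 0 the second clock edge leaves the contents at 1010, which is the hold case described above.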

Shift Right Shift Register

Four D flip flops in the configuration shown
  Implement a four bit shift right shift register
  Illustrate the basic architecture for the family of devices

Parallel In / Serial Out – Serial In / Parallel Out Left Shift Registers

Shift register is also a convenient means of
  Converting between serial and parallel data
First diagram implements a simple four-bit serial to parallel converter
  Four bit word is entered into the shift register
    In serial through the input labeled Data
  After four clock pulses
    Word appears on the four output lines labeled D3..D0
One common extension to the basic design
  Use a tristate buffer on the outputs
  Permit the outputs of several such devices to be multiplexed
    Onto a common bus
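
A serial to parallel conversion of this kind can be sketched in Python. The sketch assumes the serial bit enters at the D3 stage and moves toward D0 on each clock, so the word must be presented least significant bit first; that ordering is an assumption, since the diagram is not reproduced here. The function names are hypothetical.

```python
def shift_right_in(state, bit, width=4):
    """One clock: new bit enters the leftmost stage, others move right."""
    return ((bit & 1) << (width - 1)) | (state >> 1)


def serial_to_parallel(bits, width=4):
    """Clock a sequence of serial bits into an initially cleared register."""
    state = 0
    for bit in bits:
        state = shift_right_in(state, bit, width)
    return state


# Bits of 0b1101 presented LSB first: 1, 0, 1, 1
word = serial_to_parallel([1, 0, 1, 1])
```

After four clocks the register holds 1101 on D3..D0.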

[Figure: four bit shift right shift register. Four D flip-flops A, B, C, D with Data In, Clock, and Data Out.]
[Figure: serial to parallel converter with input Data, Clock, and outputs D3..D0.]
[Figure: 4 Bit D Register With Load Control. Cells with EN inputs, Load, Clock, Clear, and D0..D3.]

As implemented one must count the number of bits entered
  Ensure that the register is not overrun
Addition of one flip-flop to incorporate a marker bit into the design
  Can provide an alternate approach
  Prior to entering the data word
    Reset input is asserted
  Following the reset
    0th stage is in the logical 1 state
    All others are at logical 0
  After four shifts
    Logical 1 is in the last stage
    Complete signal is now asserted

Implementing a parallel in / serial out shift register
  Entails adding a two-to-one multiplexer on the input of each stage
  Plus selector control input
    Select between loading and shifting
  Such a design is presented in the diagram below
  When the Select input is in the logical 1 state
    Circuit acts like a four bit shift right shift register
  When it is in the logical 0 state
    Parallel input data can be stored on the next rising clock edge

Barrel Shifter

Barrel shifter designed to shift word
  Specified number of bit positions in single clock cycle
  Can easily implement with set of multiplexers
Following diagram illustrates 4 bit circular shift implementation
  Shifts one position to right
  Right most bit circulates back to MSB position

[Figure: serial to parallel converter with marker bit flip-flop, Reset, and Complete output.]
[Figure: parallel in / serial out shift register with 2:1 multiplexers on each stage, Select, Clock, Data In, parallel inputs DB0..DB3, and Data Out.]

Design can shift as follows

  S1 S0 | F3 F2 F1 F0
  0  0  | D3 D2 D1 D0
  0  1  | D0 D3 D2 D1
  1  0  | D1 D0 D3 D2
  1  1  | D2 D1 D0 D3

Number of positions to shift (0..3), given by S1 S0
  Enables corresponding input on each mux
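
The table can be checked with a short Python sketch of a circular right shift. The function name barrel_rotate_right is hypothetical; the hardware does this with multiplexers in one clock cycle, while the sketch only reproduces the input to output mapping.

```python
def barrel_rotate_right(d, amount, width=4):
    """Rotate a width-bit word right by 'amount' positions."""
    amount %= width
    mask = (1 << width) - 1
    # Bits shifted off the right end re-enter at the MSB end
    return ((d >> amount) | (d << (width - amount))) & mask


one_place = barrel_rotate_right(0b0001, 1)   # D0 circulates to the MSB
two_places = barrel_rotate_right(0b1011, 2)
```

For D3..D0 = 1011 and S1 S0 = 1 0 the table gives F3..F0 = D1 D0 D3 D2 = 1110, which matches two_places.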

Linear Feedback Shift Registers

Linear feedback shift register (LFSR)
  Finds wide application in many embedded applications
    That utilize pseudo-random sequences
  Such applications include
    Random noise generation
    Development of 'random' vectors in test systems
    Encoding and encryption
    Wireless telecommunication systems utilizing CDMA or spread spectrum techniques
One cannot generate truly random numbers using a finite state machine
  Finite number of states
    Ensures that any path through the sequence of states must repeat eventually
  Best that can be expected is
    Period of the machine is 'very long'
  Thus, such a machine is called pseudo-random
Upper limit to the length of any such sequence is given by 2^n - 1
  Where n is the number of flip-flops in the shift register
  Such a sequence is called a maximal length sequence
  Shift register configuration described as a maximal length shift register
Upper bound is not 2^n as one might expect with n stages
  Because the all zero state is not permitted
  Once the generator enters the all zero state

[Figure: 4 bit barrel shifter built from four 4:1 multiplexers, one per output F3..F0, with data bits D3..D0 on mux inputs 3..0 and shared selectors S1 S0.]

    Will not be able to exit
Because of their application to noise generation
  Maximal length LFSR sequences are often termed
    Pseudo noise sequences or PN sequences
High level block diagram for such a design
  Called a Linear Feedback Shift Register, LFSR
  Given in accompanying figure
Subset of the outputs from the shift register
  Fed back as input data according to the connection polynomial
Length of the generated sequence
  Appearing on the output as a series of 0's and 1's
  Determined by
    The starting value in the shift register
    Which outputs are fed back
      Specified by the values for the v_i
Generator will produce a maximal length sequence
  If connection polynomial is primitive
    A primitive polynomial is irreducible: can't be factored
    Irreducibility alone is not sufficient
  Loosely analogous to a prime number
Example
  input = 1 + X + X^4
  Coefficients v0..v4 with weights 1 2 4 8 16
    1 1 0 0 1 → 19
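
The period claim can be checked with a small Python sketch of a four stage LFSR. It assumes the polynomial input = 1 + X + X^4 means the first and fourth flip-flop outputs are XORed to form the input, with the constant term standing for the input itself; that reading, and the names lfsr_step and lfsr_period, are assumptions for illustration.

```python
def lfsr_step(state):
    """One clock: state is (x1, x2, x3, x4); input = x1 XOR x4."""
    x1, x2, x3, x4 = state
    return (x1 ^ x4, x1, x2, x3)


def lfsr_period(seed):
    """Count clocks until the register returns to its starting value."""
    state, steps = lfsr_step(seed), 1
    while state != seed:
        state = lfsr_step(state)
        steps += 1
    return steps


period = lfsr_period((1, 0, 0, 0))       # non-zero seed
stuck = lfsr_step((0, 0, 0, 0))          # all zero state never exits
```

With a non-zero seed the register visits all 2^4 - 1 = 15 non-zero states before repeating, and the all zero state maps back to itself.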

D flip-flop used in the implementation
  Has both an asynchronous set and an asynchronous reset input
  Since only the reset input is used
    Set input must be defined
    Cannot be left floating
    Variable pullUp serves that purpose

[Figure: LFSR block diagram. N bit shift register (stages 0..n-1) with clock, Input, and Output; Feedback Logic returns selected stage outputs to Input.]

Connection polynomial for the feedback logic

  $\text{input} = v_0 + v_1 X + v_2 X^2 + v_3 X^3 + \dots + v_{n-1} X^{n-1}$

  $v_i$ = 1 or 0
  + implemented as an exclusive OR (XOR)
  $X^i$ represent the flip-flop outputs

An LFSR configured according to the given polynomial
  Illustrated in the following diagram

Register File

Register file: collection of registers in CPU
  Usually implemented as fast SRAM
  Such SRAMs are typically multiported
    Specifically have separate read and write ports
General idea behind register file
  Can read from multiple registers simultaneously
  Can write to one register
  Can provide 2 inputs to ALU and store answer
To be able to do so design must support
  Ability to individually address or select each register
    For read, write, output
High level diagram for MIPS 32 x 32 register file given as
  ARM register file is 64 x 32
Moving inside device
  Read and write portions given as
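
The two read ports and one write port can be sketched behaviourally in Python. This is an illustrative model under the 32 x 32 organization mentioned above; the class name RegisterFile and its method names are hypothetical, and no register is treated specially.

```python
class RegisterFile:
    """Behavioural sketch: two read ports, one strobed write port."""

    def __init__(self, count=32, width=32):
        self.mask = (1 << width) - 1
        self.regs = [0] * count

    def read(self, sel1, sel2):
        """Both read selects are decoded at once, like the two read muxes."""
        return self.regs[sel1], self.regs[sel2]

    def write(self, sel, data, strobe=1):
        """The write select picks one register; the strobe gates the write."""
        if strobe:
            self.regs[sel] = data & self.mask


rf = RegisterFile()
rf.write(3, 7)
rf.write(4, 5)
a, b = rf.read(3, 4)      # two ALU inputs read simultaneously
rf.write(5, a + b)        # ALU result stored back
```

The last three lines mirror the "provide 2 inputs to ALU and store answer" use described above.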

[Figure: LFSR for the given polynomial. Four D flip-flops with outputs X1..X4, Clock, Reset, Output; unused set inputs tied to +Vcc through pullUp.]
[Figure: register file, high level view. Reg 1 Read Select (5), Reg 2 Read Select (5), Write Reg Select (5), Write Data (32), Write Strobe, Reg 1 Read Data (32), Reg 2 Read Data (32).]
[Figure: register file internals. Write portion: decoder on the 5 bit write select gating Write Strobe to Register 0..Register 31. Read portion: two multiplexers over Register 0..Register 31 driving the two 32 bit read data outputs.]

Register Operations

We'll now briefly introduce simple register transfer operations language (RTL)
Design at RTL level
  Serves as specification for detailed design at lower level
Use such notation to
  Represent registers
  Represent operations on contents
RTL notation similar to what is found in HDL
  Dataflow level in Verilog for example

Motivation for RTL

When designing computer system
  Need method for capturing in formal way desired behaviour
Natural language expressions
  Good for
    Describing computer hardware
    Conveying general features and capabilities of machine's design
To formally specify computer need
  • Precise description / specification of its function
  • Unambiguous mathematical notation
  • To be able to express
      Movement of data among registers and memory
      Transformations on such data
Ideas originated with Gordon Bell and Allen Newell
Will begin with some notation
  For describing entities will be working with

Register Notation – An Introduction

Registers, Memory, Basic Symbology
Registers denoted by upper case alphanumeric names
Familiar examples reflecting state of processor include
  PC<31:0>  – 32 bit register named Program Counter
  IR<31:0>  – 32 bit register named Instruction Register
  MAR<31:0> – 32 bit register named Memory Address Register
  R4<31:0>  – 32 bit general purpose register R4
  AX        – 32 bit general purpose register AX
  Run:      – 1-bit run/halt signal


  Strt:     – 1-bit start signal
Memory addresses
  M[x]<7..0>  – memory address of byte of memory
  M[x]<31..0> – memory address of 32 bit word of memory
Register contents assumed numbered n-1 .. 0
  In big endian notation
Commonly used register transfer symbols given in following table

We see data transfer signified by arrow
  Called replacement operator ( ← )
Expression R1 ← R0
  Indicates movement by copy of contents of R0 to R1
    R0 identifies source of transfer
    R1 designates destination
  Expression only indicates copying
    Gives no indication of when
    Suggests continuous operation
      Typically not what is desired

Register Transfer Operations

We have learned registers useful for
  Storing data temporarily
  Performing some simple operations
Also important to be able to move data between registers
  Essential to support internal CPU operations
Such transfers assumed to be synchronous with system clock
Transfer from one register to another illustrated next

Symbol                         Meaning                 Register Transfer – RTN
Letters                        Register                R1, R2, RA, Rb
Parentheses or angle brackets  Subset of register      R1(3), R2(31:0), R1<3>, R2<31:0>
Arrow                          Data transfer           R1 ← R0
Comma                          Simultaneous transfer   R1 ← R0, R2 ← R1
Square brackets                Memory address          R2 ← M[ADX0]

Let's first examine a single bit within a register
  The first thought might be: it's only a simple D type device
  Correct… however we need a bit more control
Would like device
  To be clocked with system clock
  To be able to control when loaded
  Load to be synchronous
Can achieve objectives with accompanying design

Data transfer from one register to another
  Designated using replacement arrow
    R1 ← R2
  Source appears on right
  Target or destination appears on left
Assumed that data path
  Exists for all bits from source to destination
Following diagram illustrates transfer of 32 bit data word
  From register Ri to register Rj
  Each register implemented as above
  Performs
    Load to Ri
    Transfer to Rj
Transfer occurs as follows
  1. Rising clock edge
       Initiates data change
       Generates load enable_i
  2. Rising clock edge
       Stores data into register_i
       Terminates load enable_i
       Generates load enable_j
  3. Rising clock edge
       Stores data into register_j
       Terminates load enable_j

[Figure: single register bit. D flip-flop with clock, load, and data inputs.]
[Figure: 32 bit transfer from Register Ri to Register Rj (D0..D31) with clock, load enable_i, and load enable_j.]
[Timing diagram: clock, Data, load enable_i, load enable_j, Register Ri Out, Register Rj Out across clock edges 1, 2, 3.]

Conditional Transfer

Conditional transfer indicated using if – else construct
  if (K1 = 1) then (R1 ← R2)
    K1 is condition from control
  or shorthand
  K1: R1 ← R2
Conditioned simultaneous transfers (interchange between two registers) follow naturally
  K2: R1 ← R2, R2 ← R1

Multiplexer Control

Multiplexers are integral part of both datapath and control
RTL specification follows
  If – else construct
  Shortcut version supported as well
    if-else
      if (K1 = 1) then R0 ← R1
      else if (K2 = 1) then R0 ← R2
    shortcut
      K1: R0 ← R1, ¬K1•K2: R0 ← R2
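
The conditional and simultaneous transfers can be sketched in Python by computing every right hand side from the old register values before any register is updated, which is what a common clock edge does. The function names mux_edge and swap_edge and the dictionary representation are hypothetical.

```python
def mux_edge(regs, k1, k2):
    """One clock edge of  K1: R0 <- R1,  (not K1) and K2: R0 <- R2."""
    nxt = dict(regs)            # right-hand sides read the old values
    if k1:
        nxt["R0"] = regs["R1"]
    elif k2:
        nxt["R0"] = regs["R2"]
    return nxt


def swap_edge(regs, k2):
    """One clock edge of  K2: R1 <- R2, R2 <- R1  (simultaneous)."""
    nxt = dict(regs)
    if k2:
        nxt["R1"], nxt["R2"] = regs["R2"], regs["R1"]
    return nxt


regs = {"R0": 0, "R1": 5, "R2": 9}
after_mux = mux_edge(regs, k1=0, k2=1)
swapped = swap_edge(regs, k2=1)
```

Because both assignments in swap_edge read the old contents, the interchange needs no temporary register, matching the comma notation.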

Next diagram illustrates transfer
  From either of two registers into third

[Figure: Registers Ri and Rj (D0..D31) feeding a 2:1 Mux (inputs 0 and 1, Sel; Select Ri, Select Rj) into Register Rk, with clock and load enables.]

Design can easily be extended to more than two registers
  Using three multiplexors
    Can transfer from any of three registers
    To any of other two

Such a scheme illustrated in next diagram
With such a design
  Multiplexor controls must be managed independently
As with other designs
  Concept readily extends to more registers

Transfer from Ri to Rk can be controlled as follows
  Rk: S0 S1 ← 0 0      // enable Ri output to Rk input
  load enable_k ← 1    // enable load of register k
  load enable_k ← 0    // disable load of register k
Using register transfer notation transfer specified as
  ¬S0¬S1: Rk ← Ri      // enable Ri output to Rk input
Sequence assumes system clock continuously running

Transfer from Ri to Rk and Rj to Ri can be controlled as follows
  Rk: S0 S1 ← 0 0      // enable Ri output to Rk input
  Ri: S0 S1 ← 0 1      // enable Rj output to Ri input
  load enable_k ← 1    // enable load of register k
  load enable_i ← 1    // enable load of register i
  load enable_k ← 0    // disable load of register k
  load enable_i ← 0    // disable load of register i

[Figure: Registers Ri, Rj, Rk, each loaded through its own 32 bit MUX with selectors S0 S1, sharing clock, with load enable_i, load enable_j, load enable_k.]

Similarly using register transfer notation transfer specified as
  ¬S0¬S1: Rk ← Ri      // enable Ri output to Rk input
  ¬S0S1: Ri ← Rj       // enable Rj output to Ri input

Next design utilizes a tristate bus to interconnect several registers
Diagram illustrates
  Transferring n bit word
  From register A to register B or from B to A
Transfer from A to B can be controlled as follows
  enable ← 0           // disable bus
  direction ← 1        // direction from A to B
  enable ← 1           // enable bus … A drives B receives
  load enable_B ← 1    // enable load of register B
  load enable_B ← 0    // disable load of register B

Sequence assumes system clock continuously running

Arithmetic Operations

Basic arithmetic operations given as follows
Add
  R0 ← R1 + R2
Subtraction
  Usually implemented as addition and 2's complement
  Complement given as ¬
    Normally overstrike character used
    Can also be written as ! or ~
  Register complement … 1's complement
    R1 ← ¬R1
  Two's complement follows by adding 1 to 1's complement
    R0 ← R1 + ¬R2 + 1
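
The transfer R0 ← R1 + ¬R2 + 1 can be tried out in Python on a fixed width register. The 16 bit width and the function name subtract are assumptions for illustration.

```python
def subtract(r1, r2, width=16):
    """R0 <- R1 + (1's complement of R2) + 1, truncated to the register width."""
    mask = (1 << width) - 1
    ones_complement = ~r2 & mask
    return (r1 + ones_complement + 1) & mask


difference = subtract(9, 5)    # positive result
wrapped = subtract(5, 9)       # negative result in 2's complement form
```

For 5 - 9 the register holds 0xFFFC, which is the 16 bit 2's complement pattern for -4.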

[Figure: tristate bus between register A and register B. Bits B0..Bn-1 (in / out) on each side, with direction, enable, load enable_A, load enable_B, and connections to Vcc.]

Multiplication
  R0 ← R1 * R2
Division
  R0 ← R1 / R2
Increment
  R0 ← R0 + 1
Decrement
  R0 ← R0 - 1

Logical and Logical Bitwise Operations

Useful for manipulating bits stored in register
Given as follows
  R0 ← ¬R1        logical bitwise NOT – 1's complement
  R0 ← R1 ∧ R2    logical bitwise AND
  R0 ← R1 ∨ R2    logical bitwise OR
  R0 ← R1 ⊕ R2    logical bitwise XOR
When + used in control operation
  Used to indicate logical OR operation
    (K1+K2): R1 ← R2 ∨ R3
  Indicates R1 gets result of bitwise OR of R2 and R3
    When K1 or K2 is TRUE
Use bitwise operators
  To set, clear, or mask bit patterns within register contents
Let R1 contain pattern
  11110011 00001100
Set
  Set bit 4
  Let R2 contain pattern
    00000000 00010000
  R1 ← R1 ∨ R2
    11110011 00001100
    00000000 00010000
    11110011 00011100
Reset
  Reset bit 8
  R1 now holds result of set: 11110011 00011100
  Let R2 contain pattern
    11111110 11111111
  R1 ← R1 ∧ R2
    11110011 00011100
    11111110 11111111
    11110010 00011100


Mask
  Mask lower 8 bits: keep lower 8 bits, clear upper 8
  R1 now holds result of reset: 11110010 00011100
  Let R2 contain pattern
    00000000 11111111
  R1 ← R1 ∧ R2
    11110010 00011100
    00000000 11111111
    00000000 00011100
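
The set, reset, and mask steps can be replayed in Python, treating them as one sequence applied to the same 16 bit register. That sequential reading is an assumption drawn from the results shown; the variable names are hypothetical.

```python
r1 = 0b11110011_00001100

# Set bit 4: OR with a pattern holding a single 1
r1_set = r1 | 0b00000000_00010000

# Reset bit 8: AND with a pattern holding a single 0
r1_reset = r1_set & 0b11111110_11111111

# Mask: AND with 1's in the lower 8 bits keeps them and clears the rest
r1_mask = r1_reset & 0b00000000_11111111
```

OR can only turn bits on and AND can only turn bits off, which is why set uses ∨ and reset and mask use ∧.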

Shift Operations

Can shift register left or right
  Specify lateral movement of data
Source and destination of operation
  Usually same but can differ
End bit of register designated incoming or outgoing bit
  Left shift
    Incoming – right most bit
    Outgoing – left most bit
  Right shift
    Incoming – left most bit
    Outgoing – right most bit
Value of incoming bit
  Typically assumed to be 0
  However may have different value
Assume R2 contains pattern 10011101
  Left shift
    R1 ←sl R2
    R1 – 00111010
    R2 – 10011101
  Right shift
    R1 ←sr R2
    R1 – 01001110
    R2 – 10011101

Shift Registers

Following basic shift operation
  Now look at more complex version

Shift Register with Parallel Load

Want to be able to shift or load in parallel
  Shift: Q ←sl Q
  ¬Shift●Load: Q ← D


We have three operations
  Shift and load are 0
    Register holds its contents
  Shift is 1 and load is 0
    Contents are shifted to left by 1 bit position
  Shift is 0 and load is 1
    Data loaded in parallel into register

Bidirectional Shift Register

Bidirectional shift register follows as simple extension
For this model
  Support separate left and right shift control
    ¬S1●S0: Q ←sl Q
    S1●¬S0: Q ←sr Q
    S1●S0: Q ← D
We have four operations
  S1 and S0 are 0
    Register holds its contents
  S1 is 0 and S0 is 1
    Contents are shifted to left by 1 bit position
  S1 is 1 and S0 is 0
    Contents are shifted to right by 1 bit position
  S1 is 1 and S0 is 1
    Data loaded in parallel into register
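
The four operations can be sketched as one Python function that returns the next register contents for a clock edge. The incoming bit is assumed to be 0, the width of 8 is chosen to match the earlier shift example, and the name shift_register_step is hypothetical.

```python
def shift_register_step(q, d, s1, s0, width=8):
    """Next contents of a bidirectional shift register with parallel load."""
    mask = (1 << width) - 1
    if (s1, s0) == (0, 1):
        return (q << 1) & mask   # shift left, 0 enters on the right
    if (s1, s0) == (1, 0):
        return q >> 1            # shift right, 0 enters on the left
    if (s1, s0) == (1, 1):
        return d & mask          # parallel load
    return q                     # hold


q = 0b10011101
left = shift_register_step(q, 0, 0, 1)
right = shift_register_step(q, 0, 1, 0)
loaded = shift_register_step(q, 0b1111, 1, 1)
```

Using the pattern 10011101 reproduces the left and right shift results given for the basic shift operations.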

Counters

RTL descriptions for counters follow logically
  From design of counter
  Logical operations expressing intended behaviour

Register Cell Design

Let's now look at several simple register cell designs
Want design to implement following transfers
  AND: R1 ← R1 ∧ R2
  OR:  R1 ← R1 ∨ R2
  XOR: R1 ← R1 ⊕ R2
Assume
  Only one of control signals active at a time
  None active: R1 remains unchanged
Design single cell and replicate to size of register
For single cell from requirements above
  For first cut can write
    DR1i = AND•R1i•R2i + OR•(R1i + R2i) + XOR•(¬R1i R2i + R1i ¬R2i)


Rewriting to expand the inclusive OR
  DR1i = AND•(R1i R2i) + OR•(R1i R2i + ¬R1i R2i + R1i ¬R2i) + XOR•(¬R1i R2i + R1i ¬R2i)
Now to accommodate case when all control signals are off
  LOAD = AND + OR + XOR
  DR1i = LOAD•(AND•R1i R2i + OR•(R1i R2i + ¬R1i R2i + R1i ¬R2i) + XOR•(¬R1i R2i + R1i ¬R2i)) + ¬LOAD•R1i
The control signals
  Shared amongst all bits in register
Individual bits
  Unique to cell
Rearranging to take advantage of shared logic
  DR1i = LOAD•((AND + OR)•(R1i R2i) + (OR + XOR)•(¬R1i R2i + R1i ¬R2i)) + ¬LOAD•R1i
Substituting control signals C0 = AND + OR and C1 = OR + XOR
  DR1i = LOAD•(C0•R1i R2i + C1•(¬R1i R2i + R1i ¬R2i)) + ¬LOAD•R1i
Control signals C0, C1, and LOAD
  Implemented once
  Shared amongst all cells in register
Remaining logic expressions
  Replicated for each cell
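
The rearranged cell equation can be checked exhaustively against the intended transfers with a short Python sketch. It takes LOAD = AND + OR + XOR, C0 = AND + OR, and C1 = OR + XOR as the shared signals; the function names cell_next and cell_spec are hypothetical.

```python
from itertools import product


def cell_next(r1, r2, and_, or_, xor_):
    """D input of one cell from the shared-logic form of the equation."""
    load = and_ | or_ | xor_
    c0 = and_ | or_
    c1 = or_ | xor_
    logic = (c0 & r1 & r2) | (c1 & (r1 ^ r2))
    return (load & logic) | ((1 - load) & r1)


def cell_spec(r1, r2, and_, or_, xor_):
    """Intended behaviour, stated directly from the transfers."""
    if and_:
        return r1 & r2
    if or_:
        return r1 | r2
    if xor_:
        return r1 ^ r2
    return r1  # no control active: hold


# At most one control signal active at a time
controls = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
agree = all(
    cell_next(a, b, *c) == cell_spec(a, b, *c)
    for a, b in product((0, 1), repeat=2)
    for c in controls
)
```

The check covers every bit pair under each allowed control setting, so agree being true confirms the rearrangement preserves the three transfers and the hold case.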

Let's now implement following transfers
  SHL: R1 ← sl R1
  XOR: R1 ← R1 ⊕ R2
  ADD: R1 ← R1 + R2
Assume
  Only one of control signals active at a time
  None active: R1 remains unchanged
Simple solution
  Execute parallel load when any of control signals asserted
Can easily write implementing equations
  LOAD = SHL + XOR + ADD
  DR1i = SHL•R1i-1 + ADD•((R1i ⊕ R2i) ⊕ Ci) + XOR•(R1i ⊕ R2i)
  Ci+1 = (R1i ⊕ R2i)•Ci + R1i R2i
Control signals shared amongst all cells
Logical operations unique to each cell
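
The ADD path of the cell can be sketched in Python as a ripple of identical cells, each producing a sum bit (R1i ⊕ R2i) ⊕ Ci and a carry (R1i ⊕ R2i)•Ci + R1i R2i. The 8 bit width, a carry in of 0, and the name add_cells are assumptions for illustration.

```python
def add_cells(r1, r2, width=8):
    """Add two registers one cell at a time, LSB first."""
    carry, result = 0, 0
    for i in range(width):
        a, b = (r1 >> i) & 1, (r2 >> i) & 1
        result |= ((a ^ b) ^ carry) << i          # sum bit of cell i
        carry = ((a ^ b) & carry) | (a & b)       # carry into cell i+1
    return result                                 # final carry is dropped


small = add_cells(0b00001111, 0b00000001)
overflowed = add_cells(200, 100)
```

The second call shows the register width at work: 300 does not fit in 8 bits, so the register holds 300 mod 256.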


Instruction Formats

Let's now move to applying RTN to begin to express
  Instructions and instruction set
Examine instruction set and sequence of microoperations
For illustration will start with MIPS instruction format
  Then follow with ARM core (32 bit) instruction format

[Figure: MIPS instruction formats]
  R – Register:                  OP-CODE | RS | RT | RD | shamt | funct
  I – Immediate:                 OP-CODE | RS | RT | immediate
  J – Jump:                      OP-CODE | address
  FR – Floating Point Register:  OP-CODE | fmt | FT | FS | FD | funct
  FI – Floating Point Immediate: OP-CODE | fmt | FT | immediate

[Figure: ARM instruction formats]
  R – Register:             OP-CODE | Rm | shamt | Rn | Rd
  I – Immediate:            OP-CODE | ALU_Imm12 | Rn | Rd
  D – Memory:               OP-CODE | DT Address | op | Rn | Rd
  B – Branch:               OP-CODE | Br_address
  CB – Conditional Branch:  OP-CODE | Conditional_Br_address | Rd
  IW – Mov Wide Immediate:  OP-CODE | MOV_immediate | Rd

Now see how representative ARM instructions expressed using RTL notation
We've learned different fields within register
  May have specific purposes
Ability to distinguish such fields
  Important in interpreting meaning of instruction
Typical examples include
  Register – R type
    op<10..0>        := IR<31..21> – opcode field
    rm<4..0>         := IR<20..16> – second source operand field
    shamt<5..0>      := IR<15..10> – shift amount field
    rn<4..0>         := IR<09..05> – first source operand field
    rd<4..0>         := IR<04..00> – destination field
  Immediate – I type
    op<09..0>        := IR<31..22> – opcode field
    ALU_immed<11..0> := IR<21..10> – immediate operand field
    rn<4..0>         := IR<09..05> – first source operand field
    rd<4..0>         := IR<04..00> – destination field
  Branch – B type
    op<05..0>        := IR<31..26> – opcode field
    BrAdx<25..0>     := IR<25..00> – branch address field
  Conditional Branch – CB type
    op<7..0>         := IR<31..24> – opcode field
    CondAdx<18..0>   := IR<23..05> – conditional branch address field
    rd<4..0>         := IR<04..00> – destination field
  Memory – D type
    op<10..0>        := IR<31..21> – opcode field
    DTAdx<8..0>      := IR<20..12> – constant
    op<1..0>         := IR<11..10>
    rn<4..0>         := IR<09..05> – source operand field
    rd<4..0>         := IR<04..00> – destination/target register
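
Field definitions of the form name<h..l> := IR<H..L> translate directly into shift and mask operations. The Python sketch below decodes the R type fields; the helper names bits and decode_r_type are hypothetical, and the opcode value 0x458 is simply an 11 bit number chosen for illustration.

```python
def bits(word, hi, lo):
    """Extract IR<hi..lo> from a 32-bit instruction word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_r_type(ir):
    """Split an R-type instruction word into its named fields."""
    return {
        "op": bits(ir, 31, 21),
        "rm": bits(ir, 20, 16),
        "shamt": bits(ir, 15, 10),
        "rn": bits(ir, 9, 5),
        "rd": bits(ir, 4, 0),
    }


# Assemble a word field by field, then decode it again
ir = (0x458 << 21) | (3 << 16) | (0 << 10) | (2 << 5) | 1
fields = decode_r_type(ir)
```

The other formats follow the same pattern with their own bit ranges.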


Micro Operations

Observed that register level micro operations
  Performed on data stored in registers
Let's examine several of these

Effective Address

To support instructions for
  Branching and jump
  Data movement
Must be able to
  Specify development of next address
    Based upon results of current operation
  Identify address in memory

Branch and Jump

Because PC identifies next instruction to be executed
  Can specify next instruction by controlling contents of PC
In abstract sense desired range of branch or jump can be
  Short
    In support of basic looping constructs
  Long
    In support of function calls
Displacement can thus be
  Absolute – go to an address
  Relative to PC – delta forward or backwards from PC
All conditional branches are PC relative
  Given in number of words
PC relative illustrated as
  (R[rs] == R[rt]) → (PC ← PC + 4*IR<23..05>)
    x → y interpreted as if x then y
    Offset algebraically added to PC, word aligned
Displacement illustrated as
  PC ← PC + 4*IR<25..0>
    Offset algebraically added to PC, word aligned
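
"Algebraically added" implies the offset field is signed, so it must be sign extended before it is scaled by 4 and added to the PC. The Python sketch below shows this for the 26 bit field IR<25..0>; treating the field as 2's complement is an assumption, and the function names are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low 'bits' bits of value as a 2's complement number."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def branch_target(pc, ir):
    """PC <- PC + 4 * IR<25..0>, with the field taken as a signed word count."""
    offset_words = sign_extend(ir & 0x3FFFFFF, 26)
    return pc + 4 * offset_words


forward = branch_target(0x1000, 3)            # three words ahead
backward = branch_target(0x1000, 0x3FFFFFF)   # field of all 1's is -1
```

A field of all 1's moves the PC back by one word, which is the backward delta needed for loops.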

Data Movement

Data movement to or from memory
  Not based upon PC contents
  Want large address range
    Utilize full expressive power of register

Load to register from memory
  R[Rt] = M[R[rn] + Dt Address]
  To indicate
    Memory access
    Access address identified by contents of designated register
Store to memory from register
  M[R[rn] + Dt Address] = R[Rt]
  To indicate
    Memory access
    Access address identified by contents of designated register

Instruction Execution

We now see how we might utilize effective address calculation

Load and Store

Load and store instructions moving data
  From or to memory
  To or from register
    LDUR X2, [X3, #14]     // R[x2] ← M[ R[x3] + off14 ]
    STUR X6, [X15, #14]    // M[ R[x15] + off14 ] ← R[x6]
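
The two transfers can be sketched in Python with dictionaries standing in for the register file and memory. The function names ldur and stur mirror the mnemonics but are hypothetical helpers, and the register and memory contents are made up for the example.

```python
def ldur(regs, mem, rt, rn, offset):
    """R[rt] <- M[R[rn] + offset]"""
    regs[rt] = mem[regs[rn] + offset]


def stur(regs, mem, rt, rn, offset):
    """M[R[rn] + offset] <- R[rt]"""
    mem[regs[rn] + offset] = regs[rt]


regs = {2: 0, 3: 100, 6: 42, 15: 200}
mem = {114: 7}

ldur(regs, mem, rt=2, rn=3, offset=14)    # LDUR X2, [X3, #14]
stur(regs, mem, rt=6, rn=15, offset=14)   # STUR X6, [X15, #14]
```

In both cases the effective address is the base register contents plus the offset; only the direction of the copy differs.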

Branch and Jump

Branch and jump instructions provide means for
  Alternate path of execution
    Unconditional
    Conditional based upon some condition
Condition codes
  Basic conditions upon which branching decisions made
    N – Negative
    Z – Zero
    V – Overflow
    C – Carry
  Additional conditions
    EQ (==) Equal
    NE (!=) Not Equal
    GE (>=) Greater Than or Equal
    LE (<=) Less Than or Equal
    GT (>)  Greater Than
    LT (<)  Less Than


Possible alternatives include
  CMP r0,r3
  B.EQ label     // (R[r0] == R[r3]) → (PC ← PC + off18)
                 // off18 is 4*IR<23..05>
  B.NE label     // (R[r0] != R[r3]) → (PC ← PC + off18)
                 // off18 is 4*IR<23..05>
  B addr28       // (PC ← PC + addr28)
                 // addr28 is 4*IR<25..0>
  BL addr28      // R[30] ← PC + 4, (PC ← PC + addr28)
                 // R[30] is return address
                 // addr28 is 4*IR<25..0>

Addressing Modes

Learned earlier many ways by which operands and results accessed
Identified number of common addressing modes
Can express each in RTN format
  Illustrated in following table

Addressing Mode     Assembler - ISA      Register Transfer - RTN
Register            MOV R1, R2           R1 ← R2
Register Indirect   MOV R1, @R2          R1 ← M[R2]
Immediate           MOV R1, #val         R1 ← val
Direct              MOV R1, anAdx        R1 ← M[anAdx]
Indirect            MOV R1, @anAdx       R1 ← M[M[anAdx]]
Indexed             MOV R1, aVal(R2)     R1 ← M[aVal + R2]
Relative            MOV PC, PC+disp      R1 ← M[PC + disp]
Autoincrement       MOV R1, (R2)+        R1 ← M[R2], then R2 ← R2 + 1
Autodecrement       MOV R1, (R2)-        R1 ← M[R2], then R2 ← R2 - 1


RTN Operations

Notation expressing operations
Typical commonly used
  • Arithmetic and logical
  • Data movement
  • Flow of control
Operations summarized in following table

Type             Instruction                    Assembler - ISA    Register Transfer - RTN
Data Transfer    move register                  MOV R1,R2          R1 ← R2
                 move from memory               MOV R1,memadx      R1 ← M[memadx]
                 move to memory                 MOV memadx,R1      M[memadx] ← R1
                 move immediate                 MVI R1,#DEAD       R1 ← #DEAD
Logic            complement accumulator         CMA                A ← ~A
                 AND register                   AND R1             A ← A ∧ R1
                 Bitwise AND                    AND R1,R2          R1 ← R1 ∧ R2
                 OR register                    OR R1              A ← A ∨ R1
                 Bitwise OR                     OR R1,R2           R1 ← R1 ∨ R2
Execution Flow   unconditional jump             JMP $1             PC ← $1
                 conditional jump               J<cond>            if <cond> == 1 PC ← $1
Arithmetic       ADD register with carry        ADD R1             A ← A + R1 + C
                 R1 + R0 transferred to R2                         R2 ← R1 + R0
                 clear carry                    CLC                C ← 0
                 increment register             Inc R0             R0 ← R0 + 1
Program Control  don't execute an instruction   NOP
                 stop executing instructions    HALT


Register View – The RTL Architecture

As a first illustration of putting registers to work
  We are now designing at register level
  Can express architecture of simple microprocessor
    From register point of view
  See this in following figure
We will
  Explore and analyze such an architecture
    In upcoming discussions
  Examine several different RTL models

Summary

Have introduced
  Registers
  Basic RTL expression language
Implemented several register transfer designs
Will develop these concepts more fully
  In upcoming discussions
    Datapath and datapath control

[Figure: register view of a simple microprocessor. Blocks: Input Subsystem, Output Subsystem, Program Counter, Arithmetic and Logical Unit, TR0, TR1, Flag Register, Accumulator, General Purpose Registers R0..Rn-1, Instruction Register, Instruction Decoder, Control, Time Base, and Memory Subsystem (Memory, Memory Address Register, Memory Data Register, Memory Interface), connected by the System Bus and Control Bus.]