unit i

CS 2022 Computer architecture UNIT I

UNIT I: INSTRUCTION SET ARCHITECTURE

Introduction to computer architecture – Review of digital design – Instructions and addressing – Procedures and data – Assembly language programs – Instruction set variations

INTRODUCTION TO COMPUTER ARCHITECTURE

A Computer is a machine which accepts input information in the

digitized form, processes the input according to a set of stored instructions

and produces the resulting output information.

PROGRAM AND DATA

The set of stored instructions written to solve a task is called a program, and the input and output information is called data.

The internal storage where programs are stored is called Memory.

Characteristics of computer

1. Speed: Computers perform various operations at a very high speed.

2. Accuracy: Computers are very accurate. They do not make mistakes in calculations.

3. Reliability: Computers give correct and consistent results even when they are used in adverse conditions. Errors are usually caused by human intervention, not by the computer. Computer output is reliable provided that the input data and the instructions (programs) are correct. Incorrect input data and unreliable programs give wrong results.

4. Storage capacity: A computer can store a large amount of data, which can be retrieved at any time in a fraction of a second. This data can be kept in permanent storage devices such as hard disks and CDs.

5. Versatility: Computers can do a variety of jobs based on the instructions given to them. They are used in almost every field, making tasks easier.

Limitations of a Computer:

1) Not intelligent

2) Inactive

Computer = Hardware + Software

of 51 PEC


Hardware:

• Hardware is the physical aspect of computers, telecommunications, and other devices.

• Hardware implies permanence and invariability.

• The components include the keyboard, floppy drive, hard disk, monitor, CPU, printer, wires, transistors, circuits, etc.

Software: Software is a set of programs used to perform certain tasks. A program is a set of instructions that carries out a particular task.

Hardware and Software

Hardware | Software
The physical components making up the system are termed hardware. | Software is a set of programs used to perform certain tasks (the logical component).
The components include the keyboard, floppy drive, hard disk, monitor, CPU, printer, wires, transistors, circuits, etc. | Software includes compilers, loaders, banking software, library software, payroll software, etc.
Hardware works based on instructions. | Software tells the hardware what to do.

TYPES OF COMPUTER

Computers are classified according to size, cost, power of the

processor, and type of usage.

Some of types of computer are

Personal computers(PC)

Widely used in homes, schools, and business offices.

Notebook computers



It is a compact version of the PC with all the components packed together into a single unit.

Workstations

They have high-resolution graphics input/output capabilities.

Desktop Computers

They have processing and storage units, a visual display, and audio output units.

Enterprise systems or mainframes and Servers

Mainframes are used for business data processing in medium

and large corporations that require more computing power and

storage capacity.

Review of digital design

Servers:

Servers contain sizable database storage units and are capable of handling large volumes of requests to access the data. The requests and responses are transported over Internet communication facilities.

Supercomputers

They are used for large – scale numerical calculations required in

applications such as weather forecasting and aircraft design and

simulation.

I. FUNCTIONAL UNITS

A computer consists of five functionally independent main parts. They are,

Input

Memory

Arithmetic and logic

Output

Control unit



Basic functional units of a computer

The operation of a computer can be summarized as follows:

The computer accepts programs and data through an input unit and stores them in the memory.

The stored data are processed by the arithmetic and logic unit under

program control.

The processed data is delivered through the output unit.

All above activities are directed by control unit.

The information is stored either in the computer’s memory for later use or

immediately used by ALU to perform the desired operations.

Instructions are explicit commands that

Manage the transfer of information within a computer as well as

between the computer and its I/O devices.

Specify the arithmetic and logic operations to be performed.

To execute a program, the processor fetches the instructions one after

another, and performs the desired operations.

The processor accepts only the machine language program.

To get the machine language program, a compiler is used.

[Figure: block diagram of the basic functional units: I/P unit, memory unit (main and secondary), CPU (containing the ALU), and O/P unit.]


Note: A compiler is software (a translator) that converts a high-level language program (source program) into a machine language program (object program).

Input unit

The computer accepts coded information through the input unit. The input can come from human operators, from electromechanical devices such as keyboards, or from other computers over communication lines.

Examples of input devices are

Keyboards, joysticks, trackballs, and mice, which are used as graphic input devices in conjunction with a display.

Microphones, which can be used to capture audio input that is then sampled and converted into digital code for storage and processing.

Keyboard

It is a common input device.

Whenever a key is pressed, the corresponding letter or digit is

automatically translated into its corresponding binary code and

transmitted over cable to the memory of the computer.

Memory unit

Memory unit is used to store programs as well as data.

Memory is classified into primary and secondary storage.

Primary storage

It is also called main memory.

It operates at high speed and it is expensive.

It is made up of large number of semiconductor storage cells, each

capable of storing one bit of information.

These cells are grouped into fixed-size units called words. This makes it possible to read or write the content of one word (n bits) in a single basic operation, instead of reading or writing one bit per operation.



Each word is associated with a distinct address that identifies word

location. A given word is accessed by specifying its address.

Word length

The number of bits in each word is called word length of the

computer.

Typical word lengths range from 16 to 64 bits.

Programs must reside in the primary memory during execution.

RAM

It stands for Random Access Memory. Memory in which any

location can be reached in a short and fixed amount of time by

specifying its address is called random-access memory.

Memory access time

Time required to access one word is called Memory access

time.

This time is fixed and independent of the word being

accessed.

It typically ranges from a few nanoseconds (ns) to about 100 ns.

Caches

They are small and fast RAM units.

They are tightly coupled with the processor.

They are often contained on the same integrated circuit (IC) chip as the processor to achieve high performance.

Secondary storage

It is slow in speed.

It is cheaper than primary memory.

Its capacity is high.

It is used to store information that is not accessed frequently.

Various secondary devices are magnetic tapes and disks, optical disks

(CD-ROMs), floppy etc.



Arithmetic and logic unit

The arithmetic and logic unit (ALU) and the control unit together form the processor. The actual execution of most computer operations takes place in the ALU of the processor.

Example: Suppose two numbers located in the memory are to be added.

They are brought into the processor, and the actual addition is carried out by

the ALU.

Registers:

Registers are high speed storage elements available in the processor.

Each register can store one word of data.

When operands are brought into the processor for any operation, they

are stored in the registers.

Accessing data in a register is faster than accessing it in memory.

Output unit

The function of the output unit is to present processed results to the outside world in human-understandable form.

Examples of output devices are graphical displays and printers such as inkjet, laser, and dot matrix printers. Of these printers, the laser printer works fastest.

Control unit

Control unit coordinates the operation of memory, arithmetic and logic

unit, input unit, and output unit in some proper way. Control unit sends

control signals to other units and senses their states.

Example:

Data transfers between the processor and the memory are

controlled by the control unit through timing signals.

Timing signals are the signals that determine when a given

action is to take place.

The control unit can be regarded as a well-defined, physically separate unit that interacts with other parts of the machine.

A set of control lines carries the signals used for timing and

synchronization of events in all units.

Differences between primary memory and secondary memory:

Primary Memory | Secondary Memory
Also called main memory. | Also called auxiliary memory.
Accessing the data is faster. | Accessing the data is slower.
The CPU can access it directly. | The CPU cannot access it directly.
Semiconductor memory. | Magnetic memory.
Data storage capacity is smaller. | Data storage capacity is larger (huge).
Expensive. | Not expensive.
It is internal memory. | It is external memory.
Examples: RAM, ROM. | Examples: hard disk, floppy disk, magnetic tape, etc.

Differences between RAM and ROM:

RAM | ROM
Random Access Memory. | Read Only Memory.
Volatile memory. The contents of the RAM are lost when power is turned off. | Non-volatile memory. The contents of the ROM are not lost when power is turned off.
Temporary storage medium. | Permanent storage medium.
The data can be read and written. | The data can only be read; it cannot be written.
Programs are brought into RAM just before execution. | BIOS and monitor programs are stored.


Combinational Digital Circuits

z′ = 1 – z (NOT converted to arithmetic form)

xy (AND is the same as multiplication; when doing the algebra, set z^k = z)

x ∨ y = x + y – xy (OR converted to arithmetic form)

x ⊕ y = x + y – 2xy (XOR converted to arithmetic form)

Example: Prove the identity xyz ∨ x′ ∨ y′ ∨ z′ ≡ 1

LHS = [xyz ∨ x′] ∨ [y′ ∨ z′]
= [xyz + 1 – x – (1 – x)xyz] ∨ [1 – y + 1 – z – (1 – y)(1 – z)]
= [xyz + 1 – x] ∨ [1 – yz]
= (xyz + 1 – x) + (1 – yz) – (xyz + 1 – x)(1 – yz)
= 1 + xy²z² – xyz
= 1 = RHS
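The arithmetic forms of NOT, AND, OR, and XOR can be checked by trying every combination of 0/1 inputs. The following is a minimal sketch in Python; the function names are chosen here for illustration and are not part of the notes.

```python
from itertools import product

def NOT(z):
    return 1 - z              # z' = 1 - z

def AND(x, y):
    return x * y              # AND is ordinary multiplication

def OR(x, y):
    return x + y - x * y      # x OR y = x + y - xy

def XOR(x, y):
    return x + y - 2 * x * y  # x XOR y = x + y - 2xy

def forms_match_bitwise():
    """Compare each arithmetic form with Python's bitwise operator on 0/1 inputs."""
    for x, y in product((0, 1), repeat=2):
        if NOT(x) != (x ^ 1):
            return False
        if AND(x, y) != (x & y) or OR(x, y) != (x | y) or XOR(x, y) != (x ^ y):
            return False
    return True

def identity_holds():
    """Check that xyz OR x' OR y' OR z' equals 1 for all eight input combinations."""
    for x, y, z in product((0, 1), repeat=3):
        lhs = OR(OR(AND(AND(x, y), z), NOT(x)), OR(NOT(y), NOT(z)))
        if lhs != 1:
            return False
    return True

print(forms_match_bitwise(), identity_holds())
```

Exhaustive checking works here because a function of n binary variables has only 2^n input combinations.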

Variations in Gate Symbols

[Figure: gate symbols for OR, NOR, NAND, AND, and XNOR.]

Gates with more than two inputs and/or with inverted signals at input or output.

Gates as Control Elements

[Figure: (a) AND gate for controlled transfer: data in x, enable/pass signal e, data out x or 0. (b) Tristate buffer: data in x, enable/pass signal e, data out x or “high impedance”. (c) Model for AND switch. (d) Model for tristate buffer.]

Wired OR and Bus Connections

[Figure: (a) Wired OR of product terms: data out x, y, z, or 0. (b) Wired OR of tristate outputs: data out x, y, z, or high impedance.]

Wired OR allows tying together of several controlled signals.

Boolean Functions and Expressions

Ways of specifying a logic function:

Truth table: 2^n rows, “don’t-care” in input or output

Logic expression: w′(x ∨ y ∨ z), product-of-sums, sum-of-products, equivalent expressions

Word statement: Alarm will sound if the door is opened while the security system is engaged, or when the smoke detector is triggered

Logic circuit diagram: Synthesis vs analysis
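To see how a word statement maps to a logic expression and a truth table, the alarm example can be written out as follows. This is a small sketch; the variable names door_open, armed, and smoke are assumptions introduced for illustration.

```python
from itertools import product

def alarm(door_open, armed, smoke):
    """Alarm sounds if the door is opened while the system is engaged, or on smoke."""
    return int((door_open and armed) or smoke)

def truth_table():
    """Return all 2^3 rows as (door_open, armed, smoke, alarm) tuples."""
    return [(d, a, s, alarm(d, a, s)) for d, a, s in product((0, 1), repeat=3)]

for row in truth_table():
    print(row)
```

The corresponding logic expression is (door_open AND armed) OR smoke, a sum-of-products form with two product terms.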

Manipulating Logic Expressions

Name of law | OR version | AND version
Identity | x ∨ 0 = x | x 1 = x
One/Zero | x ∨ 1 = 1 | x 0 = 0
Idempotent | x ∨ x = x | x x = x
Inverse | x ∨ x′ = 1 | x x′ = 0
Commutative | x ∨ y = y ∨ x | x y = y x
Associative | (x ∨ y) ∨ z = x ∨ (y ∨ z) | (x y) z = x (y z)
Distributive | x ∨ (y z) = (x ∨ y) (x ∨ z) | x (y ∨ z) = (x y) ∨ (x z)
DeMorgan’s | (x ∨ y)′ = x′ y′ | (x y)′ = x′ ∨ y′

Instructions and Addressing

Abstract View of Hardware

MiniMIPS (minimal MIPS) is a load/store instruction set, meaning that data elements must be copied, or loaded, into registers before they can be processed. Operation results also go into registers and must be explicitly copied back into memory through separate store operations.

The figure shows the MiniMIPS memory unit, with up to 2^30 words (2^32 bytes), an execution and integer unit (EIU), a floating-point unit (FPU), and a trap and memory unit (TMU).

The FPU and TMU are shown for completeness.

The EIU has 32 general-purpose registers, each of which is 32 bits wide and can thus hold the content of one memory location.

The arithmetic/logic unit (ALU) executes addition, subtraction, and logical instructions.

A separate arithmetic unit is dedicated to multiplication and division instructions, whose results are placed in two special registers, named “Hi” and “Lo”, from where they can be moved into general-purpose registers.

A view of MiniMIPS registers and data sizes is presented in the figure. All registers except register 0 ($0), which permanently holds the constant 0, are general purpose and can be used to store arbitrary data words.



REGISTERS AND DATA SIZES IN MINIMIPS

A 32-bit data element stored in a register or a memory location is referred to as a word. A word may hold an unsigned integer, a floating-point number, or a string of ASCII characters.

MiniMIPS words are stored in byte-addressable memory.



MiniMIPS uses the “big-endian” scheme, in which the most significant end of a word appears first, that is, the most significant byte is stored at the lowest byte address.

A doubleword occupies two consecutive registers or memory locations.
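The big-endian byte order can be illustrated with a short sketch. It uses Python's built-in int.to_bytes and int.from_bytes; the helper names are chosen here for illustration only.

```python
def word_to_bytes_big_endian(word):
    """Split a 32-bit word into 4 bytes, most significant byte first."""
    return list(word.to_bytes(4, byteorder="big"))

def bytes_to_word_big_endian(four_bytes):
    """Reassemble a 32-bit word from 4 bytes stored most significant byte first."""
    return int.from_bytes(bytes(four_bytes), byteorder="big")

# The word 0x2110003d stored at byte addresses a, a+1, a+2, a+3
layout = word_to_bytes_big_endian(0x2110003D)
print([hex(b) for b in layout])
```

With this layout the byte at the lowest address holds the most significant 8 bits of the word.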

Instruction Formats

A typical MiniMIPS machine instruction is add $t8,$s2,$s1, which causes the contents of registers $s2 and $s1 to be added, with the result stored in register $t8.

This might correspond to the compiled form of the high-level language statement a = b + c. It is in turn represented in a machine word using 0s and 1s to encode the operation and register specifications.

Add, Subtract, and Specification of Constants



A machine instruction for an arithmetic operation specifies an opcode, one or more source operands, and usually one destination operand.

Opcode is a binary code that defines an operation. The operands of an arithmetic or logical instruction may come from a variety of sources.

The method used to specify where the operands are to be found and where the results must go is referred to as the addressing mode.



MiniMIPS Instruction Formats

Register or R-type instructions operate on the two registers identified in the rs and rt fields and store the result in register rd.

The function (fn) field serves as an extension of the opcode, allowing more operations to be defined, and the shift amount (sh) field is used in instructions that specify a constant shift amount.

Immediate or I-type instructions are really of two different varieties. In immediate instructions, the 16-bit operand field in bits 0-15 holds an integer that plays the same role as rt in R-type instructions.

In load, store, or branch instructions, the 16-bit field is interpreted as an offset, or relative address, that is added to the base value in register rs (or to the program counter, for branches) to obtain a memory address for reading or writing.

For data access, the offset is interpreted as the number of bytes forward or backward relative to the base address.

For branch instructions, the offset is in words, given that instructions always occupy complete 32-bit memory words.

Jump or J-type instructions cause unconditional transfer of control to the instruction at the specified address. MiniMIPS addresses are 32 bits wide, whereas only 26 bits are available in the address field of a J-type instruction.

Two conventions are used. First, the 26-bit field is assumed to carry a word address as opposed to a byte address, so the hardware attaches two 0s to the right end of the 26-bit address field to derive a 28-bit address. Second, the 4 missing high-order bits are taken from the program counter.
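The three formats can be illustrated by packing fields into a 32-bit word. The sketch below is an illustration only: the register numbers ($s1 = 17, $s2 = 18, $s3 = 19, $t0 = 8, $t8 = 24) and the fn code 32 for add are assumptions taken from the standard MIPS conventions, and the function names are hypothetical.

```python
def encode_r(rs, rt, rd, sh, fn, op=0):
    """R format: op(6) | rs(5) | rt(5) | rd(5) | sh(5) | fn(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (sh << 6) | fn

def encode_i(op, rs, rt, imm):
    """I format: op(6) | rs(5) | rt(5) | 16-bit operand or offset."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

def encode_j(op, byte_address):
    """J format: op(6) | 26-bit word address (byte address divided by 4)."""
    return (op << 26) | ((byte_address >> 2) & 0x03FFFFFF)

# add $t8,$s2,$s1 -> rs = $s2 (18), rt = $s1 (17), rd = $t8 (24), fn = 32 (assumed)
add_word = encode_r(rs=18, rt=17, rd=24, sh=0, fn=32)
# lw $t0,8($s3) -> op = 35, base rs = $s3 (19), data rt = $t0 (8), offset 8
lw_word = encode_i(op=35, rs=19, rt=8, imm=8)
# j to byte address 0x00400020 -> op = 2
j_word = encode_j(op=2, byte_address=0x00400020)
print(f"{add_word:08x} {lw_word:08x} {j_word:08x}")
```

Note that the J format drops the two low-order zero bits of the target, which is why the field is described as a word address.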

[Figure: MiniMIPS instruction formats.]

R format: op (bits 31-26, 6 bits, opcode) | rs (bits 25-21, 5 bits, source register 1) | rt (bits 20-16, 5 bits, source register 2) | rd (bits 15-11, 5 bits, destination register) | sh (bits 10-6, 5 bits, shift amount) | fn (bits 5-0, 6 bits, opcode extension)

I format: op (bits 31-26, 6 bits, opcode) | rs (bits 25-21, 5 bits, source or base) | rt (bits 20-16, 5 bits, destination or data) | operand / offset (bits 15-0, 16 bits, immediate operand or address offset)

J format: op (bits 31-26, 6 bits, opcode) | jump target address (bits 25-0, 26 bits, memory word address, i.e., byte address divided by 4)


Simple Arithmetic/Logic Instructions

Add and subtract instructions work on registers containing whole 32-bit words, as in the add $t8,$s2,$s1 instruction discussed earlier.

Logical instructions operate on a pair of operands on a bit by bit basis.

Arithmetic/Logic with One Immediate Operand

[Figure: I-format encoding of lw (op = 35) and sw (op = 43): the rs field holds the base register, the rt field holds the data register, and the 16-bit field holds the offset relative to base. Element i of array A is located in memory at the address in the base register plus the offset 4i.]

Note on base and offset: The memory address is the sum of (rs) and an immediate value. Calling one of these the base and the other the offset is quite arbitrary. It would make perfect sense to interpret the address A($s3) as having the base A and the offset ($s3). However, a 16-bit base confines us to a small portion of memory space.


Load and Store Instructions

Basic load and store instructions transfer whole words (32 bits) between memory and registers. Each such instruction specifies a register and a memory address.

The register which is the data destination for load and data source for store is specified in the rt field of an I format instruction.
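The address computation used by lw and sw, base register plus a signed 16-bit offset, can be sketched as follows. The function names are hypothetical, and the array helper assumes that 4i fits in the 16-bit offset field.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def effective_address(base, offset16):
    """Memory address = (base register) + sign-extended 16-bit offset, modulo 2^32."""
    return (base + sign_extend_16(offset16)) & 0xFFFFFFFF

def array_element_address(base, i):
    """Byte address of word element A[i] when the base register holds the address of A[0]."""
    return effective_address(base, 4 * i)

print(hex(array_element_address(0x10000000, 3)))
```

The offset is multiplied by 4 for arrays of words because memory is byte addressable and each word occupies 4 bytes.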

lw, sw, and lui Instructions

Besides load and store instructions that deal with particular data types, there are load instructions that allow us to place an arbitrary constant in a desired register.

A small constant, one that is representable in 16 bits or less, can be loaded into a register through a single addi instruction whose other operand is register $zero.

A larger constant must be placed in a register in two steps. The upper half of the register is loaded through the load upper immediate (lui) instruction, and the lower 16 bits are then inserted through an “or immediate” (ori) instruction.



Initializing a Register

Example:

Show how each of these bit patterns can be loaded into $s0:

0010 0001 0001 0000 0000 0000 0011 1101
1111 1111 1111 1111 1111 1111 1111 1111

Solution

The first bit pattern has the hex representation 0x2110003d.

lui $s0,0x2110 # put the upper half in $s0
ori $s0,$s0,0x003d # put the lower half in $s0

The same can be done for the second bit pattern, with both immediate values changed to 0xffff. However, the following is simpler and faster:

nor $s0,$zero,$zero # because (0 ∨ 0)′ = 1
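The effect of the lui and ori pair, and of the nor shortcut, can be modeled on 32-bit values. This is a sketch with hypothetical helper functions that mimic the register results only; it is not a MiniMIPS simulator.

```python
MASK32 = 0xFFFFFFFF

def lui(imm16):
    """Load upper immediate: imm16 goes to the upper half, lower half is cleared."""
    return (imm16 & 0xFFFF) << 16

def ori(reg, imm16):
    """OR immediate: the 16-bit immediate is zero-extended and ORed into reg."""
    return (reg | (imm16 & 0xFFFF)) & MASK32

def nor(a, b):
    """Bitwise NOR of two 32-bit values."""
    return ~(a | b) & MASK32

def load_constant(value):
    """Build an arbitrary 32-bit constant in two steps, as in the example above."""
    return ori(lui(value >> 16), value & 0xFFFF)

print(hex(load_constant(0x2110003D)), hex(nor(0, 0)))
```

The ori step cannot disturb the upper half because its immediate is zero-extended, which is why it is used here rather than addi.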

Jump and Branch Instructions

Unconditional jump and jump through register instructions

j verify # go to mem loc named “verify”
jr $ra   # go to address that is in $ra;
         # $ra may hold a return address

The first instruction is a simple jump, which causes program execution to proceed from the location whose numeric or symbolic address is provided, instead of continuing with the next instruction in sequence.

Jump register, or jr, specifies a register as holding the jump target address. This register is often $ra, and the instruction jr $ra is used to effect a return from a procedure to the point from which the procedure was called.

For the j instruction, the 26-bit address field in the instruction is augmented with 00 to the right and 4 high-order bits of the program counter (PC) to the left to form a complete 32-bit address. This is called pseudo-direct addressing.
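The formation of the 32-bit jump target can be sketched as below. The function name is hypothetical, and the sketch simply takes the 4 high-order bits from whatever PC value is passed in, as the text describes.

```python
def jump_target(pc, field26):
    """Pseudo-direct addressing: 4 high-order PC bits | 26-bit field | 00."""
    upper4 = pc & 0xF0000000                 # keep the 4 high-order bits of the PC
    middle = (field26 & 0x03FFFFFF) << 2     # append two 0 bits at the right end
    return upper4 | middle

print(hex(jump_target(0x00400000, 0x00100008)))
```

Because the upper 4 bits come from the PC, a j instruction can only reach targets within the same 2^28-byte region as the jump itself.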

Conditional branch instructions allow us to transfer control to a given address when a condition of interest is met.

Conditional Branch Instructions

Conditional branches use PC-relative addressing

bltz $s1,L # branch on ($s1)< 0
beq $s1,$s2,L # branch on ($s1)=($s2)
bne $s1,$s2,L # branch on ($s1)≠($s2)

[Figure: encodings of j and jr. The j instruction uses the J format with op = 2; its effective 32-bit target address consists of 4 bits from the PC, the 26-bit jump target address field, and two 0 bits at the right end. The jr instruction uses the R format with op = 0 (ALU instruction), the source register in rs, fn = 8, and the rt, rd, and sh fields unused.]


Comparison Instructions for Conditional Branching

slt $s1,$s2,$s3 # if ($s2)<($s3), set $s1 to 1
                # else set $s1 to 0;
                # often followed by beq/bne
slti $s1,$s2,61 # if ($s2)<61, set $s1 to 1
                # else set $s1 to 0

[Figure: encodings of the branch and comparison instructions. bltz: I format, op = 1, source register in rs, rt = 0, relative branch distance in words. beq and bne: I format, op = 4 and op = 5, source registers in rs and rt, relative branch distance in words. slt: R format, op = 0 (ALU instruction), source registers in rs and rt, destination in rd, sh unused, fn = 42. slti: I format, op = 10, source in rs, destination in rt, immediate operand.]


Examples for Conditional Branching

Addressing Modes

Addressing mode is the method by which the location of an operand is specified within an instruction. MiniMIPS uses six addressing modes.

Implied addressing: The operand comes from, or the result goes to, a predefined place that is not explicitly specified in the instruction.

Immediate addressing: The operand is given in the instruction itself. Examples include the addi, andi, ori, and xori instructions, in which the second operand is supplied as part of the instruction.

Register addressing: The operand is taken from, or the result placed in, a specified register. R-type instructions in MiniMIPS specify up to three registers as locations of their operands. Registers are specified by their 5-bit indices.

Base addressing: The operand is in memory and its location is computed by adding an offset to the contents of a specified base register. This is the addressing mode of the lw and sw instructions.

PC-relative addressing: Same as base addressing, but with the register always being the program counter and the offset appended with two 0s at the right end (word addressing is always used).

Pseudo-direct addressing: MiniMIPS 32-bit instructions do not have enough room to carry full 32-bit addresses. The j instruction comes close to direct addressing because it contains 26 bits of the jump target address, which is padded with 00 at the right end and 4 bits from the program counter at the left to form a full 32-bit address; hence the name pseudo-direct.
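PC-relative addressing, as used by conditional branches, can be sketched in the same style. The function name is hypothetical, and the sketch assumes that the caller supplies the PC value that serves as the base (in MIPS-style machines this is normally the already-incremented PC).

```python
def branch_target(base_pc, offset16):
    """PC-relative addressing: base PC + (sign-extended 16-bit word offset, times 4)."""
    offset = offset16 & 0xFFFF
    if offset & 0x8000:          # negative offset: branch backward
        offset -= 0x10000
    return (base_pc + (offset << 2)) & 0xFFFFFFFF

print(hex(branch_target(0x00400004, 3)))       # forward by 3 words
print(hex(branch_target(0x00400010, 0xFFFF)))  # backward by 1 word
```

Shifting the offset left by 2 is the same as appending two 0s at its right end, so a 16-bit field covers a range of roughly ±2^15 instructions.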

Procedures and Data

Simple Procedure Calls

Using a procedure involves the following sequence of actions:



1. Put arguments in places known to procedure (reg’s $a0-$a3)
2. Transfer control to procedure, saving the return address (jal)
3. Acquire storage space, if required, for use by the procedure
4. Perform the desired task
5. Put results in places known to calling program (reg’s $v0-$v1)
6. Return control to calling point (jr)

MiniMIPS instructions for procedure call and return from procedure:

jal proc # jump to loc “proc” and link;
         # “link” means “save the return
         # address” (PC)+4 in $ra ($31)
jr rs    # go to loc addressed by rs

Illustrating a Procedure Call



A Simple MiniMIPS Procedure

Procedure to find the absolute value of an integer.

$v0 ← |($a0)|

Solution

The absolute value of x is –x if x < 0 and x otherwise.

abs:  sub $v0,$zero,$a0 # put -($a0) in $v0;
                        # in case ($a0) < 0
      bltz $a0,done     # if ($a0)<0 then done
      add $v0,$a0,$zero # else put ($a0) in $v0
done: jr $ra            # return to calling program

In practice, we seldom use such short procedures because of the overhead that they entail. In this example, we have 3-4 instructions of overhead for 3 instructions of useful computation.

Using the Stack for Data Storage

A common mechanism for saving things or making room for temporary data that a procedure needs is the use of a dynamic data structure known as a stack.

The figure shows a map of the MiniMIPS memory and the use of the three pointer registers $gp, $sp, and $fp.

The second half of the MiniMIPS memory is used for memory-mapped I/O and is thus not available for storing programs or data.



Overview of the memory address space in MiniMIPS.

The first half of memory, extending from address 0 to address 0x7fffffff, is divided into four segments as follows.

The first 1M words (4 MB) are reserved for system use.

The next 63M words (252 MB) hold the text of the program being executed.

Beginning at hex address 0x10000000, the program’s data is stored.

Beginning at hex address 0x7ffffffc and growing backward is the stack.

The program’s dynamic data and the stack can grow in size up to the maximum available memory. If we set the global pointer register ($gp) to hold the address 0x10008000, then the first 2^16 bytes of the program’s data become readily accessible through base addressing of the form imm($gp), where imm is a 16-bit signed integer.
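The choice of 0x10008000 for $gp can be checked with a little arithmetic: a signed 16-bit offset ranges from -2^15 to 2^15 - 1, so the reachable window is centered on $gp. A small sketch (the function name is hypothetical):

```python
GP = 0x10008000  # value of $gp suggested in the text

def gp_window(gp=GP):
    """Lowest and highest byte addresses reachable as imm($gp) with a signed 16-bit imm."""
    lowest = gp - 2**15
    highest = gp + 2**15 - 1
    return lowest, highest

low, high = gp_window()
print(hex(low), hex(high), high - low + 1)
```

The window starts exactly at the beginning of the data segment and spans 2^16 bytes.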

A stack is a dynamic data structure in which data can be placed and retrieved in last-in, first-out order. It can be likened to the stack of trays in a cafeteria. As trays are cleaned and become ready for use, they are placed on top of the stack of trays, and customers take their trays from the top of the stack.

Data elements are added to the stack by pushing them onto the stack and are retrieved by popping them. The stack push and pop operations are illustrated in fig.

Consider a stack that has data elements b and a as its top two elements. The stack pointer points to the top element of the stack, currently holding b.

This means that the instruction lw $t0,0($sp) causes the value b to be copied into $t0, and sw $t1,0($sp) causes b to be overwritten with the contents of $t1.

Thus a new element c, currently in register $t4, can be pushed onto the stack by means of the following two MiniMIPS instructions.

push: addi $sp,$sp,-4
      sw $t4,0($sp)

To pop the element b from the stack, an lw instruction is used to copy b into a desired register, and the stack pointer is incremented by 4 to point to the next stack element.

pop:  lw $t5,0($sp)
      addi $sp,$sp,4
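The push and pop sequences can be modeled with a stack pointer and a word-addressed dictionary standing in for memory. This is an illustrative sketch, not MiniMIPS code; the class name and the initial stack pointer value are assumptions.

```python
class StackModel:
    """Downward-growing stack: sp always points to the current top element."""

    def __init__(self, sp=0x7FFFFFFC):
        self.memory = {}   # maps byte address -> 32-bit word
        self.sp = sp

    def push(self, word):
        self.sp -= 4                  # addi $sp,$sp,-4
        self.memory[self.sp] = word   # sw   $t4,0($sp)

    def pop(self):
        word = self.memory[self.sp]   # lw   $t5,0($sp)
        self.sp += 4                  # addi $sp,$sp,4
        return word

s = StackModel()
s.push(10)   # element a
s.push(20)   # element b, now on top
print(s.pop(), s.pop(), hex(s.sp))
```

Elements come off in the reverse of the order in which they were pushed, and the stack pointer returns to its starting value once every push has been matched by a pop.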



Parameters and Results



Before a procedure call, the calling program pushes the contents of any registers that need to be saved onto the top of the stack and follows these with any additional arguments for the procedure. The procedure can access these arguments in the stack.

After the procedure terminates, the calling program expects to find the stack pointer undisturbed, thus allowing it to restore the saved registers to their original states and proceed with its own computation.

Thus a procedure that uses the stack by modifying the stack pointer must save the content of the stack pointer at the outset and, at the end, restore $sp to its original state. This is done by copying the stack pointer into the frame pointer register $fp.

The three parameters a, b, and c are passed to the procedure by placing them on top of the stack before the procedure is called. The procedure first pushes the contents of $fp onto the stack, copies the stack pointer into $fp, pushes the contents of registers that need to be saved onto the stack, uses the stack to hold those temporary local variables that cannot be held in registers, and so on.

Example of Using the Stack

Saving $fp, $ra, and $s0 onto the stack and restoring them at the end of the procedure

proc: sw $fp,-4($sp)   # save the old frame pointer
      addi $fp,$sp,0   # save ($sp) into $fp
      addi $sp,$sp,-12 # create 3 spaces on top of stack
      sw $ra,-8($fp)   # save ($ra) in 2nd stack element
      sw $s0,-12($fp)  # save ($s0) in top stack element
      . . .
      lw $s0,-12($fp)  # put top stack element in $s0
      lw $ra,-8($fp)   # put 2nd stack element in $ra
      addi $sp,$fp,0   # restore $sp to original state
      lw $fp,-4($sp)   # restore $fp to original state
      jr $ra           # return from procedure

Data Types



Data size (number of bits), data type (meaning assigned to bits)

Signed integer: byte, word
Unsigned integer: byte, word
Floating-point number: word, doubleword
Bit string: byte, word, doubleword

Converting from one size to another

Type 8-bit number Value 32-bit version of the number

Unsigned 0010 1011 43 0000 0000 0000 0000 0000 0000 0010 1011

Unsigned 1010 1011 171 0000 0000 0000 0000 0000 0000 1010 1011

Signed 0010 1011 +43 0000 0000 0000 0000 0000 0000 0010 1011

Signed 1010 1011 –85 1111 1111 1111 1111 1111 1111 1010 1011
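The conversions in the table follow two rules: unsigned values are zero-extended, and signed values are sign-extended (the sign bit is copied into the upper 24 bits). A minimal sketch, with function names chosen here for illustration:

```python
def zero_extend_8_to_32(byte):
    """Unsigned conversion: the upper 24 bits are filled with 0s."""
    return byte & 0xFF

def sign_extend_8_to_32(byte):
    """Signed conversion: the upper 24 bits are filled with copies of bit 7."""
    byte &= 0xFF
    if byte & 0x80:
        return byte | 0xFFFFFF00
    return byte

def as_signed_32(word):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    return word - (1 << 32) if word & 0x80000000 else word

pattern = 0b10101011
print(zero_extend_8_to_32(pattern), as_signed_32(sign_extend_8_to_32(pattern)))
```

The same 8-bit pattern 1010 1011 therefore yields 171 when treated as unsigned and -85 when treated as signed.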

[Figure: I-format encoding of lb (op = 32), lbu (op = 36), and sb (op = 40): base register in rs, data register in rt, and a 16-bit immediate/address offset.]


Loading and Storing Bytes

Bytes can be used to store ASCII characters or small integers. MiniMIPS addresses refer to bytes, but registers hold words.

lb $t0,8($s3)  # load rt with mem[8+($s3)]
               # sign-extend to fill reg
lbu $t0,8($s3) # load rt with mem[8+($s3)]
               # zero-extend to fill reg
sb $t0,A($s3)  # LSB of rt to mem[A+($s3)]



Meaning of a Word in Memory

A 32-bit word has no inherent meaning and can be interpreted in a number of equally valid ways in the absence of other cues (e.g., context) for the intended meaning.

Arrays and Pointers

Index: Use a register that holds the index i and increment the register in each step to effect moving from element i of the list to element i + 1.

Pointer: Use a register that points to (holds the address of) the list element being examined and update it in each step to point to the next element.
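The two ways of stepping through a list can be contrasted with a small model in which a dictionary stands in for byte-addressable memory holding 4-byte words. The function names and the sample addresses are assumptions made for this sketch.

```python
def sum_by_index(memory, base, n):
    """Index method: keep i in a 'register' and compute base + 4*i on each access."""
    total = 0
    i = 0
    while i < n:
        total += memory[base + 4 * i]
        i += 1                      # move from element i to element i + 1
    return total

def sum_by_pointer(memory, base, n):
    """Pointer method: keep the element's address and advance it by 4 each step."""
    total = 0
    ptr = base
    end = base + 4 * n
    while ptr != end:
        total += memory[ptr]
        ptr += 4                    # point to the next word
    return total

memory = {0x10000000 + 4 * k: v for k, v in enumerate([5, 7, 11])}
print(sum_by_index(memory, 0x10000000, 3), sum_by_pointer(memory, 0x10000000, 3))
```

The pointer version avoids recomputing base + 4i on every access, which is why it often needs fewer instructions per iteration.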

[Figure: R-format encodings of the shift instructions. sll (fn = 0) and srl (fn = 2): op = 0 (ALU instruction), rs unused, source register in rt, shift amount in sh. sllv (fn = 4) and srlv (fn = 6): op = 0 (ALU instruction), amount register in rs, source register in rt, sh unused.]


Additional Instructions

MiniMIPS instructions for multiplication and division:

mult $s0, $s1 # set Hi,Lo to ($s0)×($s1)
div $s0, $s1  # set Hi to ($s0)mod($s1)
              # and Lo to ($s0)/($s1)
mfhi $t0      # set $t0 to (Hi)
mflo $t0      # set $t0 to (Lo)

The multiply (mult) and divide (div) instructions of MiniMIPS.
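How a product or a quotient/remainder pair is split between Hi and Lo can be sketched for unsigned operands (the behavior described later for multu and divu). The function names mirror the instruction names but are only a Python model, restricted here to unsigned values to keep the arithmetic unambiguous.

```python
MASK32 = 0xFFFFFFFF

def multu(a, b):
    """Return (Hi, Lo): the upper and lower 32 bits of the 64-bit unsigned product."""
    product = (a & MASK32) * (b & MASK32)
    return (product >> 32) & MASK32, product & MASK32

def divu(a, b):
    """Return (Hi, Lo): Hi holds the remainder a mod b, Lo holds the quotient a / b."""
    a &= MASK32
    b &= MASK32
    return a % b, a // b

print(multu(0xFFFFFFFF, 2), divu(17, 5))
```

Two registers are needed for multiplication because the product of two 32-bit numbers can be up to 64 bits wide.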

MiniMIPS instructions for copying the contents of Hi and Lo registers into general registers.

Logical Shifts

MiniMIPS instructions for left and right shifting:

sll $t0,$s1,2    # $t0=($s1) left-shifted by 2
srl $t0,$s1,2    # $t0=($s1) right-shifted by 2
sllv $t0,$s1,$s0 # $t0=($s1) left-shifted by ($s0)
srlv $t0,$s1,$s0 # $t0=($s1) right-shifted by ($s0)

[Figure: R-format encodings of mult (fn = 24) and div (fn = 26): op = 0 (ALU instruction), source registers in rs and rt, rd and sh unused; and of mfhi (fn = 16) and mflo (fn = 18): op = 0 (ALU instruction), rs, rt, and sh unused.]


Unsigned Arithmetic and Miscellaneous Instructions

MiniMIPS instructions for unsigned arithmetic (no overflow exception):

addu $t0,$s0,$s1 # set $t0 to ($s0)+($s1)
subu $t0,$s0,$s1 # set $t0 to ($s0)–($s1)
multu $s0,$s1    # set Hi,Lo to ($s0)×($s1)
divu $s0,$s1     # set Hi to ($s0)mod($s1)
                 # and Lo to ($s0)/($s1)
addiu $t0,$s0,61 # set $t0 to ($s0)+61;
                 # the immediate operand is
                 # sign extended

To make MiniMIPS more powerful and complete, we introduce later:

sra $t0,$s1,2    # sh. right arith (Sec. 10.5)
srav $t0,$s1,$s0 # shift right arith variable
syscall          # system call (Sec. 7.6)



The 37 + 3 MiniMIPS Instructions Covered So Far

Assembly Language Programs



Machine and Assembly Languages

A computer program can be represented at different levels of abstraction. A program could be written in a machine-independent, high-level language such as Java or C++. A computer can execute programs only when they are represented in machine language specific to its architecture.

A machine language program for a given architecture is a collection of machine instructions represented in binary form.

Programs written at any level higher than the machine language must be translated to the binary representation before a computer can execute them. An assembly language program is a symbolic representation of the machine language program.

Machine language is pure binary code, whereas assembly language is a direct mapping of the binary code onto a symbolic form that is easier for humans to understand and manage.

Converting the symbolic representation into machine language is performed by a special program called the assembler.

An assembler is a program that accepts a symbolic language program (source) and produces its machine language equivalent (target).

In translating a program into binary code, the assembler will replace symbolic addresses by numeric addresses, replace symbolic operation codes by machine operation codes, reserve storage for instructions and data, and translate constants into machine representation.



A SIMPLE MACHINE

Machine language is the native language of a given processor. Since assembly language is the symbolic form of machine language, each different type of processor has its own unique assembly language.

To program in the assembly language of a given processor, we first need to understand the details of that processor: the memory size and organization, the processor registers, the instruction format, and the entire instruction set.

We now introduce a very simple hypothetical processor, which will be used to explain the different topics in assembly language.

Our simple machine is an accumulator-based processor, which has five 16-bit registers: Program Counter (PC), Instruction Register (IR), Address Register (AR), Accumulator (AC), and Data Register (DR). The PC contains the address of the next instruction to be executed. The IR contains the operation code portion of the instruction being executed. The AR contains the address portion (if any) of the instruction being executed. The AC serves as the implicit source and destination of data. The DR is used to hold data. The memory unit is made up of 4096 words of storage. The word size is 16 bits.


Symbol Table

An assembly language program, its machine language equivalent, and the resulting symbol table. Field boundaries are shown to facilitate understanding.

Location  Machine language program                 Assembly language program
          op     rs    rt    rd    sh    fn
 0        001000 00000 10000 00000 00000 001001          addi $s0,$zero,9
 4        000000 10000 10000 01000 00000 100010          sub $t0,$s0,$s0
 8        000000 00000 00000 01001 00000 100000          add $t1,$zero,$zero
12        000101 01000 10000 00000 00000 001100    test: bne $t0,$s0,done
16        001000 01000 01000 00000 00000 000001          addi $t0,$t0,1
20        000000 10000 00000 01001 00000 100000          add $t1,$s0,$zero
24        000010 00000 00000 00000 00000 000011          j test
28        101011 11100 01001 00000 00011 111000    done: sw $t1,result($gp)

Symbol table:

test    12
done    28
result  248 (determined from assembler directives not shown here)
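How the entries test = 12 and done = 28 arise can be sketched as the first pass of an assembler: walk through the program, advance a location counter by one word per instruction, and record the location of every label. This is a minimal sketch, not a full assembler; the function name and the 4-byte instruction size are assumptions, and the entry for result (248) is not produced because it comes from data directives not shown.

```python
def build_symbol_table(lines, start=0, word_size=4):
    """First pass: map each label to the location of the line it marks."""
    table = {}
    location = start
    for line in lines:
        line = line.split("#", 1)[0].strip()    # drop comments
        if not line:
            continue
        if ":" in line:                         # "label: instruction"
            label, line = line.split(":", 1)
            table[label.strip()] = location
            line = line.strip()
        if line:                                # an instruction occupies one word
            location += word_size
    return table

program = [
    "addi $s0,$zero,9",
    "sub $t0,$s0,$s0",
    "add $t1,$zero,$zero",
    "test: bne $t0,$s0,done",
    "addi $t0,$t0,1",
    "add $t1,$s0,$zero",
    "j test",
    "done: sw $t1,result($gp)",
]

symbols = build_symbol_table(program)
```

A second pass would then replace each symbolic name in the operands with the number recorded in the table.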

Assembler Directives

Assembler directives provide the assembler with info on how to translate the program but do not lead to the generation of machine instructions.

.macro # start macro (see Section 7.4)
.end_macro # end macro (see Section 7.4)
.text # start program's text segment
... # program text goes here
.data # start program's data segment
tiny: .byte 156,0x7a # name & initialize data byte(s)
max: .word 35000 # name & initialize data word(s)
small: .float 2E-3 # name short float (see Chapter 12)
big: .double 2E-3 # name long float (see Chapter 12)
.align 2 # align next item on word boundary
array: .space 600 # reserve 600 bytes = 150 words
str1: .ascii "a*b" # name & initialize ASCII string
str2: .asciiz "xyz" # null-terminated ASCII string
.global main # consider "main" a global name

Composing Simple Assembler Directives

Write assembler directives to achieve each of the following objectives:

a. Put the error message “Warning: The printer is out of paper!” in memory.
b. Set up a constant called “size” with the value 4.
c. Set up an integer variable called “width” and initialize it to 4.
d. Set up a constant called “mill” with the value 1,000,000 (one million).
e. Reserve space for an integer vector “vect” of length 250.

Solution:

a. noppr: .asciiz "Warning: The printer is out of paper!"
b. size: .byte 4 # small constant fits in one byte
c. width: .word 4 # byte could be enough, but ...
d. mill: .word 1000000 # constant too large for byte
e. vect: .space 1000 # 250 words = 1000 bytes
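As a rough check on the solution, the storage each directive reserves can be computed with a small Python sketch. The function name is hypothetical, and the sketch assumes 4-byte words, 4-byte short floats, 8-byte long floats, and one byte per ASCII character; alignment padding is ignored.

```python
WORD_BYTES = 4   # assumed 32-bit words, consistent with "250 words = 1000 bytes"

ELEMENT_BYTES = {".byte": 1, ".word": WORD_BYTES, ".float": 4, ".double": 8}

def directive_size(directive, operands):
    """Return the number of bytes a data directive reserves."""
    if directive in ELEMENT_BYTES:
        return ELEMENT_BYTES[directive] * len(operands)
    if directive == ".space":
        return int(operands[0])
    if directive == ".ascii":
        return len(operands[0])
    if directive == ".asciiz":
        return len(operands[0]) + 1          # one extra byte for the null terminator
    raise ValueError(f"unsupported directive: {directive}")

solution = [
    ("noppr", ".asciiz", ["Warning: The printer is out of paper!"]),
    ("size", ".byte", [4]),
    ("width", ".word", [4]),
    ("mill", ".word", [1000000]),
    ("vect", ".space", [250 * WORD_BYTES]),
]

sizes = {name: directive_size(d, ops) for name, d, ops in solution}
```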

Pseudoinstructions

Example of one-to-one pseudoinstruction: The following

not $s0 # complement ($s0)

is converted to the real instruction:

nor $s0,$s0,$zero # complement ($s0)

Example of one-to-several pseudoinstruction: The following

abs $t0,$s0 # put |($s0)| into $t0

is converted to the sequence of real instructions:

add $t0,$s0,$zero # copy x into $t0
slt $at,$t0,$zero # is x negative?
beq $at,$zero,+4 # if not, skip next instr
sub $t0,$zero,$s0 # the result is 0 – x
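The two conversions above can be sketched as a table-driven rewrite step inside an assembler. This is an illustrative sketch only: the function name is hypothetical, it handles just not and abs, and it expects an instruction without a trailing comment.

```python
def expand(instruction):
    """Rewrite a pseudoinstruction as a list of real instructions."""
    opcode, _, rest = instruction.partition(" ")
    args = [a.strip() for a in rest.split(",")] if rest else []
    if opcode == "not":                       # one-to-one
        (reg,) = args
        return [f"nor {reg},{reg},$zero"]
    if opcode == "abs":                       # one-to-several
        dest, src = args
        return [
            f"add {dest},{src},$zero",        # copy x into dest
            f"slt $at,{dest},$zero",          # is x negative?
            "beq $at,$zero,+4",               # if not, skip next instruction
            f"sub {dest},$zero,{src}",        # the result is 0 - x
        ]
    return [instruction]                      # real instructions pass through

print(expand("not $s0"))
print(expand("abs $t0,$s0"))
```

The use of $at in the abs expansion is why that register is reserved for the assembler.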



Macroinstructions

A macro is a mechanism to give a name to an often-used sequence of instructions (shorthand notation):

.macro name(args) # macro and arguments named
... # instr's defining the macro
.end_macro # macro terminator

How is a macro different from a pseudoinstruction?

Pseudos are predefined, fixed, and look like machine instructions.
Macros are user-defined and resemble procedures (have arguments).

How is a macro different from a procedure?

Control is transferred to and returns from a procedure.
After a macro has been replaced, no trace of it remains.

Macro to Find the Largest of Three Values

Write a macro to determine the largest of three values in registers and to put the result in a fourth register.

Solution:

.macro mx3r(m,a1,a2,a3) # macro and arguments named
move m,a1 # assume (a1) is largest; m = (a1)
bge m,a2,+4 # if (a2) is not larger, ignore it
move m,a2 # else set m = (a2)
bge m,a3,+4 # if (a3) is not larger, ignore it
move m,a3 # else set m = (a3)
.end_macro # macro terminator

If the macro is used as mx3r($t0,$s0,$s4,$s3), the assembler replaces the arguments m, a1, a2, a3 with $t0, $s0, $s4, $s3, respectively.
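Macro expansion is essentially textual substitution, which the following Python sketch illustrates for mx3r. The function name and the list-of-strings representation of the macro body are assumptions made for this example; substitution is done operand by operand so that the parameter m is not confused with the letter m in move.

```python
def expand_macro(params, body, args):
    """Replace each formal parameter in the macro body with its actual argument."""
    mapping = dict(zip(params, args))
    expanded = []
    for line in body:
        opcode, _, rest = line.partition(" ")
        operands = [mapping.get(o, o) for o in rest.split(",")]
        expanded.append(f"{opcode} {','.join(operands)}")
    return expanded

MX3R_PARAMS = ["m", "a1", "a2", "a3"]
MX3R_BODY = [
    "move m,a1",
    "bge m,a2,+4",
    "move m,a2",
    "bge m,a3,+4",
    "move m,a3",
]

expansion = expand_macro(MX3R_PARAMS, MX3R_BODY, ["$t0", "$s0", "$s4", "$s3"])
for line in expansion:
    print(line)
```

After this step the five instructions appear inline in the program; unlike a procedure call, nothing at run time indicates that a macro was used.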

Linking and Loading

The linker has the following responsibilities:

Ensuring correct interpretation (resolution) of labels in all modules
Determining the placement of text and data segments in memory
Evaluating all data addresses and instruction labels
Forming an executable program with no unresolved references

The loader is in charge of the following:

Determining the memory needs of the program from its header
Copying text and data from the executable program file into memory
Modifying (shifting) addresses, where needed, during copying
Placing program parameters onto the stack (as in a procedure call)
Initializing all machine registers, including the stack pointer
Jumping to a start-up routine that calls the program's main routine
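Two of the linker's responsibilities, placing modules in memory and resolving labels across modules, can be sketched in a few lines of Python. Everything here is a simplification assumed for illustration: the module format (a dictionary with a name, a size in bytes, label offsets, and referenced labels), the function name, and the policy of placing modules one after another from a base address.

```python
def link(modules, base=0):
    """Place modules consecutively and resolve every referenced label."""
    placement = {}
    symbols = {}
    address = base
    for module in modules:
        placement[module["name"]] = address
        for label, offset in module["labels"].items():
            if label in symbols:
                raise ValueError(f"duplicate label: {label}")
            symbols[label] = address + offset     # absolute address of the label
        address += module["size"]
    unresolved = sorted(
        {ref for module in modules for ref in module["refs"] if ref not in symbols}
    )
    if unresolved:
        raise ValueError(f"unresolved references: {unresolved}")
    return placement, symbols

modules = [
    {"name": "main.o", "size": 32, "labels": {"main": 0}, "refs": ["helper"]},
    {"name": "lib.o", "size": 16, "labels": {"helper": 4}, "refs": []},
]

placement, symbols = link(modules)
```

The loader's address shifting is the same idea applied once more: if the program is loaded at a different base, every absolute address is adjusted by the difference.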



Running Assembler Programs

Spim is a simulator that can run MiniMIPS programs. The name Spim comes from reversing MIPS.

Three versions of Spim are available for free downloading:

PCSpim for Windows machines
xspim for X-windows
spim for Unix systems

Input/Output Conventions for MiniMIPS



Instruction Set Variations

Review of Some Key Concepts

Complex Instructions

Machine, instruction: effect

Pentium, MOVS: Move one element in a string of bytes, words, or doublewords using addresses specified in two pointer registers; after the operation, increment or decrement the registers to point to the next element of the string.

PowerPC, cntlzd: Count the number of consecutive 0s in a specified source register beginning with bit position 0 and place the count in a destination register.

IBM 360-370, CS: Compare and swap: Compare the content of a register to that of a memory location; if unequal, load the memory word into the register, else store the content of a different register into the same memory location.

Digital VAX, POLYD: Polynomial evaluation with double flp arithmetic: Evaluate a polynomial in x, with very high precision in intermediate results, using a coefficient table whose location in memory is given within the instruction.
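To make the description of compare and swap concrete, here is a minimal Python sketch of its effect. The function name, the list used as memory, and the returned pair are assumptions for illustration; the real instruction performs these steps as one indivisible operation, which this sketch does not model.

```python
def compare_and_swap(memory, address, compare_reg, new_value_reg):
    """Return (updated compare register, whether the store happened)."""
    if memory[address] != compare_reg:
        # Unequal: load the memory word into the register.
        return memory[address], False
    # Equal: store the content of the other register into memory.
    memory[address] = new_value_reg
    return compare_reg, True

memory = [5]
first = compare_and_swap(memory, 0, 5, 9)    # values match, so 9 is stored
second = compare_and_swap(memory, 0, 5, 7)   # memory now holds 9, so no store
```

A single complex instruction thus replaces a load, a compare, a branch, and a store, which is exactly the kind of trade-off this section discusses.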

Some Details of Sample Complex Instructions



Benefits and Drawbacks of Complex Instructions

Alternative Addressing Modes



More Elaborate Addressing Modes



Usefulness of Some Elaborate Addressing Modes

Variations in Instruction Formats

Zero-Address Architecture: Stack Machine

Stack holds all the operands (replaces our register file)

Load/Store operations become push/pop

Arithmetic/logic operations need only an opcode: they pop operand(s) from the top of the stack and push the result onto the stack
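A zero-address machine can be sketched as a tiny interpreter. The mnemonics push, pop, add, sub, and mul, the function name, and the dictionary used as memory are hypothetical choices for this illustration; note that only push and pop carry an address, while the arithmetic instructions consist of an opcode alone.

```python
def run_stack_machine(program, memory):
    """Execute a zero-address program; memory maps variable names to values."""
    stack = []
    binary_ops = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mul": lambda x, y: x * y,
    }
    for instruction in program:
        opcode, *operand = instruction.split()
        if opcode == "push":                  # load becomes push
            stack.append(memory[operand[0]])
        elif opcode == "pop":                 # store becomes pop
            memory[operand[0]] = stack.pop()
        elif opcode in binary_ops:            # opcode only: operands come from the stack
            right = stack.pop()
            left = stack.pop()
            stack.append(binary_ops[opcode](left, right))
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory

# d = (a + b) * c
memory = {"a": 2, "b": 3, "c": 4, "d": 0}
run_stack_machine(["push a", "push b", "add", "push c", "mul", "pop d"], memory)
```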



One-Address Architecture: Accumulator Machine

The accumulator, a special register attached to the ALU, always holds operand 1 and the operation result

Only one operand needs to be specified by the instruction
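The one-address style can be sketched the same way. In this hypothetical interpreter (the mnemonics load, store, add, sub, and mul and the function name are assumptions), every instruction names a single memory operand, and the accumulator supplies the other operand and receives the result.

```python
def run_accumulator_machine(program, memory):
    """Execute a one-address program; the accumulator is the implicit operand."""
    acc = 0
    for instruction in program:
        opcode, name = instruction.split()
        if opcode == "load":
            acc = memory[name]
        elif opcode == "store":
            memory[name] = acc
        elif opcode == "add":
            acc = acc + memory[name]
        elif opcode == "sub":
            acc = acc - memory[name]
        elif opcode == "mul":
            acc = acc * memory[name]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory

# d = (a + b) * c
memory = {"a": 2, "b": 3, "c": 4, "d": 0}
run_accumulator_machine(["load a", "add b", "mul c", "store d"], memory)
```

The same expression takes six instructions on a stack machine but only four here, at the cost of an address field in every instruction.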

Two-Address Architectures

Two addresses may be used in different ways:

Operand 1/result and operand 2
Condition to be checked and branch target address



Instruction Set Design and Evolution

Desirable attributes of an instruction set:

Consistent, with uniform and generally applicable rules
Orthogonal, with independent features noninterfering
Transparent, with no visible side effect due to implementation details
Easy to learn/use (often a byproduct of the three attributes above)
Extensible, so as to allow the addition of future capabilities
Efficient, in terms of both memory needs and hardware realization

The RISC/CISC Dichotomy

The RISC (reduced instruction set computer) philosophy: Complex instruction sets are undesirable because inclusion of mechanisms to interpret all the possible combinations of opcodes and operands might slow down even very simple operations.

Ad hoc extension of instruction sets, while maintaining backward compatibility, leads to CISC; imagine modern English containing every English word that has been used through the ages.

Features of RISC architecture

1. Small set of inst's, each executable in roughly the same time
2. Load/store architecture (leading to more registers)
3. Limited addressing mode to simplify address calculations
4. Simple, uniform instruction formats (ease of decoding)
