unit i
TRANSCRIPT
CS 2022 Computer architecture UNIT I
UNIT I: INSTRUCTION SET ARCHITECTURE
Introduction to computer architecture – Review of digital design – Instructions and addressing – Procedures and data – Assembly language programs – Instruction set variations
INTRODUCTION TO COMPUTER ARCHITECTURE
A Computer is a machine which accepts input information in the
digitized form, processes the input according to a set of stored instructions
and produces the resulting output information.
PROGRAM AND DATA
The set of stored instructions written to solve a task on a computer is
called a program, and the input and output information is called data.
The internal storage where programs are stored is called Memory.
Characteristics of computer
1. Speed: Computers perform various operations at a very high speed.
2. Accuracy: Computers are very accurate; they do not make mistakes in calculations.
3. Reliability: Computers give correct and consistent results, even when they are used in adverse conditions. Errors are usually caused by human intervention, not by the computer. Computer output is reliable provided that the input data and the instructions (programs) are correct; incorrect input data and unreliable programs give wrong results.
4. Storage capacity: A computer can store a large amount of data, which can be retrieved at any time in a fraction of a second. This data can be stored on permanent storage devices such as hard disks and CDs.
5. Versatility: Computers can do a variety of jobs based on the instructions given to them. They are used in almost every field, making tasks easier.
Limitations of a computer:
1) Not intelligent
2) Inactive
Computer = Hardware + Software
Page 1 of 51 PEC
Hardware:
• Hardware is the physical aspect of computers, telecommunications, and other devices.
• Hardware implies permanence and invariability.
• The components include keyboard, floppy drive, hard disk, monitor, CPU, printer, wires, transistors, circuits etc.
Software: Software is a set of programs used to perform certain tasks. A program is a set of instructions to carry out a particular task.
Hardware and Software
Hardware | Software
The physical components making up the system are termed hardware. | Software is a set of programs used to perform certain tasks (the logical component).
The components include keyboard, floppy drive, hard disk, monitor, CPU, printer, wires, transistors, circuits etc. | Software includes compilers, loaders, banking s/w, library s/w, payroll s/w etc.
Hardware works based on instructions. | Software tells the hardware what to do.
TYPES OF COMPUTER
Computers are classified according to size, cost, power of the
processor, and type of usage.
Some types of computers are:
Personal computers (PC)
Widely used in homes, schools, and business offices.
Notebook computers
A notebook is a compact version of the PC with all the components packed together into a single unit.
Workstations
They have high-resolution graphics input/output capabilities.
Desktop computers
They have processing and storage units, visual display and audio output units.
Enterprise systems or mainframes and servers
Mainframes are used for business data processing in medium and large corporations that require more computing power and storage capacity.
Review of digital design
Servers:
Servers contain sizable database storage units and are capable of handling large volumes of requests to access the data. The requests and responses are transported over internet communication facilities.
Supercomputers
They are used for large-scale numerical calculations required in applications such as weather forecasting and aircraft design and simulation.
I. FUNCTIONAL UNITS
A computer consists of five functionally independent main parts. They are,
Input
Memory
Arithmetic and logic
Output
Control unit
Figure: Basic functional units of a computer
The operation of a computer can be summarized as follows:
The computer accepts programs and data through an input unit and stores
them in the memory.
The stored data are processed by the arithmetic and logic unit under
program control.
The processed data is delivered through the output unit.
All above activities are directed by control unit.
The information is stored either in the computer’s memory for later use or
immediately used by ALU to perform the desired operations.
Instructions are explicit commands that
Manage the transfer of information within a computer as well as
between the computer and its I/O devices.
Specify the arithmetic and logic operations to be performed.
To execute a program, the processor fetches the instructions one after
another, and performs the desired operations.
The processor accepts only machine language programs.
A compiler is used to obtain the machine language program.
Note: A compiler is software (a translator) that converts a high-level language program (source program) into a machine language program (object program).
[Figure: basic functional units of a computer: I/P unit, CPU (with ALU), memory unit (main and secondary), and O/P unit]
Input unit
The computer accepts coded information through the input unit. Input
can come from human operators, from electromechanical devices such as
keyboards, or from other computers over communication lines.
Examples of input devices are
Keyboard, joysticks, trackballs and mouse are used as graphic
input devices in conjunction with display.
Microphones can be used to capture audio input which is then
sampled and converted into digital code for storage and
processing.
Keyboard
It is a common input device.
Whenever a key is pressed, the corresponding letter or digit is
automatically translated into its corresponding binary code and
transmitted over cable to the memory of the computer.
Memory unit
Memory unit is used to store programs as well as data.
Memory is classified into primary and secondary storage.
Primary storage
It is also called main memory.
It operates at high speed and is expensive.
It is made up of a large number of semiconductor storage cells, each capable of storing one bit of information.
These cells are handled in groups of fixed size called words. This facilitates reading or writing the content of one word (n bits) in a single basic operation, instead of reading or writing one bit per operation.
Each word is associated with a distinct address that identifies word
location. A given word is accessed by specifying its address.
Word length
The number of bits in each word is called word length of the
computer.
Typical word lengths range from 16 to 64 bits.
Programs must reside in the primary memory during execution.
RAM
It stands for Random Access Memory. Memory in which any
location can be reached in a short and fixed amount of time by
specifying its address is called random-access memory.
Memory access time
Time required to access one word is called Memory access
time.
This time is fixed and independent of the word being
accessed.
It typically ranges from a few nanoseconds (ns) to about
100 ns.
Caches
They are small and fast RAM units.
They are tightly coupled with the processor.
They are often contained on the same integrated circuit (IC) chip
as the processor to achieve high performance.
Secondary storage
It is slow in speed.
It is cheaper than primary memory.
Its capacity is high.
It is used to store information that is not accessed frequently.
Various secondary devices are magnetic tapes and disks, optical disks
(CD-ROMs), floppy etc.
Arithmetic and logic unit
The arithmetic and logic unit (ALU) and the control unit together form the processor. Actual execution of most computer operations takes place in the arithmetic and logic unit of the processor.
Example: Suppose two numbers located in the memory are to be added.
They are brought into the processor, and the actual addition is carried out by
the ALU.
Registers:
Registers are high speed storage elements available in the processor.
Each register can store one word of data.
When operands are brought into the processor for any operation, they
are stored in the registers.
Accessing data in a register is faster than accessing data in memory.
Output unit
The function of the output unit is to present processed results to the
outside world in human-understandable form.
Examples of output devices are graphical displays and printers such as
inkjet, laser, and dot matrix printers. Among these, laser printers generally print the fastest.
Control unit
Control unit coordinates the operation of memory, arithmetic and logic
unit, input unit, and output unit in some proper way. Control unit sends
control signals to other units and senses their states.
Example:
Data transfers between the processor and the memory are
controlled by the control unit through timing signals.
Timing signals are the signals that determine when a given
action is to take place.
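Conceptually, the control unit is a well-defined, physically separate unit that
interacts with other parts of the machine; in practice, much of the control
circuitry is distributed throughout the machine.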
A set of control lines carries the signals used for timing and
synchronization of events in all units.
Differences between primary and secondary memory:
Primary Memory | Secondary Memory
Also called main memory. | Also called auxiliary memory.
Accessing the data is faster. | Accessing the data is slower.
CPU can access it directly. | CPU cannot access it directly.
Semiconductor memory. | Magnetic memory.
Data storage capacity is less. | Data storage capacity is more or huge.
Expensive. | Not expensive.
It is internal memory. | It is external memory.
Examples: RAM, ROM | Examples: hard disk, floppy disk, magnetic tape etc.
Differences between RAM and ROM:
RAM | ROM
Random Access Memory. | Read Only Memory.
Volatile memory. The contents of the RAM are lost when power is turned off. | Non-volatile memory. The contents of the ROM are not lost when power is turned off.
Temporary storage medium. | Permanent storage medium.
The data can be read and written. | The data can only be read; it cannot be written.
The programs are brought into RAM just before execution. | BIOS and monitor programs are stored.
Combinational Digital Circuits
Logic operations can be converted to arithmetic form (when doing the algebra, set z^k = z):
z′ = 1 - z          NOT converted to arithmetic form
xy                  AND, same as multiplication
x ∨ y = x + y - xy  OR converted to arithmetic form
x ⊕ y = x + y - 2xy XOR converted to arithmetic form
Example: Prove the identity xyz ∨ x′ ∨ y′ ∨ z′ ≡? 1
LHS = [xyz ∨ x′] ∨ [y′ ∨ z′]
    = [xyz + 1 - x - (1 - x)xyz] ∨ [1 - y + 1 - z - (1 - y)(1 - z)]
    = [xyz + 1 - x] ∨ [1 - yz]
    = (xyz + 1 - x) + (1 - yz) - (xyz + 1 - x)(1 - yz)
    = 1 + xy^2z^2 - xyz
    = 1 = RHS
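As a quick check of the arithmetic forms above, the following Python sketch evaluates them over all 0/1 inputs and confirms the identity by exhaustive enumeration rather than by algebra. The function names (NOT, AND, OR, XOR, lhs) are illustrative choices and do not come from the notes.

```python
from itertools import product

# Arithmetic forms of the logic operations, valid for inputs in {0, 1}
def NOT(z):
    return 1 - z

def AND(x, y):
    return x * y

def OR(x, y):
    return x + y - x * y

def XOR(x, y):
    return x + y - 2 * x * y

def lhs(x, y, z):
    """Left-hand side of the identity: xyz OR x' OR y' OR z'."""
    return OR(OR(AND(AND(x, y), z), NOT(x)), OR(NOT(y), NOT(z)))

# The arithmetic forms agree with Python's bitwise operators on 0/1 values
for x, y in product((0, 1), repeat=2):
    assert AND(x, y) == (x & y)
    assert OR(x, y) == (x | y)
    assert XOR(x, y) == (x ^ y)

# Exhaustive check of the identity over all 2^3 input combinations
identity_holds = all(lhs(x, y, z) == 1 for x, y, z in product((0, 1), repeat=3))
print(identity_holds)
```

With only three variables there are just eight cases, so enumeration is a cheap way to double-check an algebraic proof.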
Variations in Gate Symbols
Gates with more than two inputs and/or with inverted signals at input or output.
[Figure: gate symbols for OR, NOR, NAND, AND, and XNOR]
Gates as Control Elements
[Figure: (a) AND gate for controlled transfer: data in x, enable/pass signal e, data out x or 0. (b) Tristate buffer: data in x, enable/pass signal e, data out x or “high impedance”. (c) Model for AND switch. (d) Model for tristate buffer.]
Wired OR and Bus Connections
Wired OR allows tying together of several controlled signals.
[Figure: (a) Wired OR of product terms: data out (x, y, z, or 0). (b) Wired OR of tristate outputs: data out (x, y, z, or high impedance).]
Boolean Functions and Expressions
Ways of specifying a logic function:
Truth table: 2^n rows, “don’t-care” in input or output
Logic expression: w′ (x ∨ y ∨ z), product-of-sums, sum-of-products, equivalent expressions
Word statement: Alarm will sound if the door is opened while the security system is engaged, or when the smoke detector is triggered
Logic circuit diagram: Synthesis vs analysis
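To connect the word statement with the truth table, here is a small Python sketch that turns the alarm statement into a logic expression and lists its 2^n rows for n = 3. The variable names (door_open, system_engaged, smoke) and the helper truth_table are hypothetical names chosen for this illustration.

```python
from itertools import product

def alarm(door_open, system_engaged, smoke):
    """Alarm sounds if the door is opened while the system is engaged,
    or when the smoke detector is triggered."""
    return (door_open and system_engaged) or smoke

def truth_table(fn, n):
    """Return a list of (inputs, output) rows for an n-input logic function."""
    return [(bits, int(bool(fn(*bits)))) for bits in product((0, 1), repeat=n)]

table = truth_table(alarm, 3)
for bits, out in table:
    print(*bits, "->", out)
```

The same function could equally be written in sum-of-products form; the table is what stays fixed across equivalent expressions.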
Manipulating Logic Expressions
Name of law | OR version | AND version
Identity | x ∨ 0 = x | x 1 = x
One/Zero | x ∨ 1 = 1 | x 0 = 0
Idempotent | x ∨ x = x | x x = x
Inverse | x ∨ x′ = 1 | x x′ = 0
Commutative | x ∨ y = y ∨ x | x y = y x
Associative | (x ∨ y) ∨ z = x ∨ (y ∨ z) | (x y) z = x (y z)
Distributive | x ∨ (y z) = (x ∨ y) (x ∨ z) | x (y ∨ z) = (x y) ∨ (x z)
DeMorgan’s | (x ∨ y)′ = x′ y′ | (x y)′ = x′ ∨ y′
Instructions and Addressing
Abstract View of Hardware
MiniMIPS (minimal MIPS) is a load/store instruction set, meaning that data elements must be copied, or loaded, into registers before they can be processed; operation results also go into registers and must be explicitly copied back into memory through separate store operations.
The figure shows the MiniMIPS memory unit, with up to 2^30 words (2^32 bytes), an execution and integer unit (EIU), a floating-point unit (FPU), and a trap and memory unit (TMU).
The FPU and TMU are shown for completeness.
The EIU has 32 general-purpose registers, each of which is 32 bits wide and can thus hold the content of one memory location.
The arithmetic/logic unit (ALU) executes addition, subtraction, and logical instructions.
A separate arithmetic unit is dedicated to multiplication and division instructions, whose results are placed in two special registers, named “Hi” and “Lo”, from where they can be moved into general-purpose registers.
A view of MiniMIPS registers and data sizes is presented in the figure. All registers except register 0 ($0), which permanently holds the constant 0, are general purpose and can be used to store arbitrary data words.
REGISTERS AND DATA SIZES IN MINIMIPS
A 32-bit data element stored in a register or a memory location is referred to as a word. A word may hold, for example, an unsigned integer, a floating-point number, or a string of ASCII characters.
MiniMIPS words are stored in byte-addressable memory.
MiniMIPS uses the “big-endian” scheme, in which the most significant end of a word appears first, that is, at the lowest byte address.
A doubleword occupies two consecutive registers or memory locations.
Instruction Formats
A typical MiniMIPS machine instruction is add $t8,$s2,$s1, which causes the contents of registers $s2 and $s1 to be added, with the result stored in register $t8.
This might correspond to the compiled form of the high-level language statement a = b + c and is in turn represented in a machine word using 0s and 1s to encode the operation and register specifications.
Add, Subtract, and Specification of Constants
A machine instruction for an arithmetic operation specifies an opcode, one or more source operands and, usually, one destination operand.
Opcode is a binary code that defines an operation. The operands of an arithmetic or logical instruction may come from a variety of sources.
The method used to specify where the operands are to be found and where the results must go is referred to as the addressing mode.
MiniMIPS Instruction Formats
Register or R-type instructions operate on the two registers identified in the rs and rt fields and store the result in register rd.
The function (fn) field serves as an extension of the opcode to allow for more operations to be defined, and the shift amount (sh) field is used in instructions that specify a constant shift amount.
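A minimal sketch of how the six R-format fields could be packed into, and unpacked from, a 32-bit word, using add $t8,$s2,$s1 as the test case. The register numbers ($s2 = 18, $s1 = 17, $t8 = 24) and the function code fn = 32 for add follow the standard MIPS convention and are assumptions here, since the notes do not list them; the helper names encode_r and decode_r are hypothetical.

```python
def encode_r(op, rs, rt, rd, sh, fn):
    """Pack R-format fields: op(6) rs(5) rt(5) rd(5) sh(5) fn(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (sh << 6) | fn

def decode_r(word):
    """Split a 32-bit word back into (op, rs, rt, rd, sh, fn)."""
    return ((word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F,
            (word >> 11) & 0x1F, (word >> 6) & 0x1F, word & 0x3F)

# add $t8,$s2,$s1: op = 0 (ALU instruction), rs = $s2, rt = $s1, rd = $t8
# Register numbers and fn = 32 are assumed from the standard MIPS convention.
word = encode_r(op=0, rs=18, rt=17, rd=24, sh=0, fn=32)
print(f"{word:032b}")
print(f"{word:#010x}")
print(decode_r(word))
```

Decoding simply reverses the shifts and masks, which is essentially what the hardware does when it routes each field to the register file and the ALU.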
Immediate or I-type instructions are really of two different varieties. In immediate instructions, the 16-bit operand field in bits 0-15 holds an integer that plays the same role as rt in R-type instructions.
In load, store, or branch instructions, the 16-bit field is interpreted as an offset, or relative address, that is to be added to the base value in register rs (or in the program counter, for branches) to obtain a memory address for reading or writing.
For data access, the offset is interpreted as the number of bytes forward or backward relative to the base address.
For branch instructions, the offset is in words, given that instructions always occupy complete 32-bit memory words.
Jump or J-type instructions cause unconditional transfer of control to the instruction at the specified address. MiniMIPS addresses are 32 bits wide, whereas only 26 bits are available in the address field of a J-type instruction.
Two conventions are used. First, the 26-bit field is assumed to carry a word address as opposed to a byte address; the hardware attaches two 0s to the right end of the 26-bit address field to derive a 28-bit address. Second, the four missing high-order bits are taken from the program counter.
[Figure: MiniMIPS instruction formats]
R format: op (bits 31-26, 6 bits, opcode) | rs (bits 25-21, 5 bits, source register 1) | rt (bits 20-16, 5 bits, source register 2) | rd (bits 15-11, 5 bits, destination register) | sh (bits 10-6, 5 bits, shift amount) | fn (bits 5-0, 6 bits, opcode extension)
I format: op (bits 31-26, 6 bits, opcode) | rs (bits 25-21, 5 bits, source or base) | rt (bits 20-16, 5 bits, destination or data) | operand / offset (bits 15-0, 16 bits, immediate operand or address offset)
J format: op (bits 31-26, 6 bits, opcode) | jump target address (bits 25-0, 26 bits, memory word address, i.e., byte address divided by 4)
Simple Arithmetic/Logic Instructions
Add and subtract instructions work on registers containing whole 32-bit words. For example, add $t8,$s2,$s1 adds the contents of $s2 and $s1 and places the result in $t8.
Logical instructions operate on a pair of operands on a bit by bit basis.
Arithmetic/Logic with One Immediate Operand
[Figure: I-format encoding of lw (op = 35) and sw (op = 43): rs is the base register, rt is the data register, and the 16-bit field is the offset relative to the base. In memory, element i of array A lies at offset 4i from the address in the base register.]
Note on base and offset: The memory address is the sum of (rs) and an immediate value. Calling one of these the base and the other the offset is quite arbitrary. It would make perfect sense to interpret the address A($s3) as having the base A and the offset ($s3). However, a 16-bit base confines us to a small portion of memory space.
Load and Store Instructions
Basic load and store instructions transfer whole words (32 bits) between memory and registers. Each such instruction specifies a register and a memory address.
The register, which is the data destination for a load and the data source for a store, is specified in the rt field of an I-format instruction.
lw, sw, and lui Instructions
In addition to load and store instructions that deal with other data types, we need a way to place an arbitrary constant in a desired register.
A small constant, one that is representable in 16 bits or less, can be loaded into a register through a single addi instruction whose other operand is register $zero.
A larger constant must be placed in a register in two steps: the upper half of the register is loaded through the load upper immediate (lui) instruction, and the lower 16 bits are then inserted through an “or immediate” (ori) instruction.
Initializing a Register
Example:
Show how each of these bit patterns can be loaded into $s0:
0010 0001 0001 0000 0000 0000 0011 1101
1111 1111 1111 1111 1111 1111 1111 1111
Solution
The first bit pattern has the hex representation 0x2110003d.
lui $s0,0x2110       # put the upper half in $s0
ori $s0,$s0,0x003d   # put the lower half in $s0
The same can be done for the second bit pattern, with both immediate values changed to 0xffff. But the following is simpler and faster:
nor $s0,$zero,$zero  # because (0 ∨ 0)′ = 1
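The two-step constant load can be modeled in a few lines of Python. This is only a sketch of the register-level effect, assuming 32-bit registers; the helper names lui, ori, nor, and load_constant are illustrative stand-ins for the corresponding instructions, not an emulator.

```python
MASK32 = 0xFFFFFFFF

def lui(imm16):
    """Load upper immediate: imm16 goes to the upper half, lower half is 0."""
    return (imm16 << 16) & MASK32

def ori(reg, imm16):
    """Or immediate: the 16-bit immediate is zero-extended before the OR."""
    return reg | (imm16 & 0xFFFF)

def nor(a, b):
    """Bitwise NOR on 32-bit values."""
    return ~(a | b) & MASK32

def load_constant(value):
    """Build a 32-bit constant the way the lui/ori pair does."""
    upper, lower = value >> 16, value & 0xFFFF
    return ori(lui(upper), lower)

s0 = load_constant(0x2110003D)
all_ones = nor(0, 0)          # nor $s0,$zero,$zero
print(f"{s0:#010x} {all_ones:#010x}")
```

Because ori zero-extends its immediate, it cannot disturb the upper half that lui has just written, which is why the pair works for any 32-bit pattern.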
Jump and Branch Instructions
Unconditional jump and jump through register instructions
j verify   # go to mem loc named “verify”
jr $ra     # go to address that is in $ra;
           # $ra may hold a return address
The first instruction is a simple jump, which causes program execution to proceed from the location whose numeric or symbolic address is provided, instead of continuing with the next instruction in sequence.
Jump register, or jr, specifies a register as holding the jump target address. This register is often $ra, and the instruction jr $ra is used to effect a return from a procedure to the point from which the procedure was called.
For the j instruction, the 26-bit address field in the instruction is augmented with 00 to the right and the 4 high-order bits of the program counter (PC) to the left to form a complete 32-bit address. This is called pseudo-direct addressing.
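The address formation for the j instruction can be sketched as follows. The function name jump_target and the sample addresses are hypothetical; the sketch follows the description above (4 bits from the PC, the 26-bit field, then 00).

```python
def jump_target(pc, field26):
    """Pseudo-direct addressing: PC[31:28] | 26-bit field | 00."""
    upper4 = pc & 0xF0000000                   # 4 high-order bits of the PC
    return upper4 | ((field26 & 0x03FFFFFF) << 2)

# Hypothetical example: a jump located at 0x00400018 to byte address 0x00400100.
# The field holds the word address, i.e., the byte address divided by 4.
field = 0x00400100 >> 2
target = jump_target(0x00400018, field)
print(f"{field:#x} -> {target:#010x}")
```

One consequence is that a j instruction can only reach targets whose upper 4 address bits match those of the PC.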
Conditional branch instructions allow us to transfer control to a given address when a condition of interest is met.
Conditional Branch Instructions
Conditional branches use PC-relative addressing
bltz $s1,L      # branch on ($s1)< 0
beq $s1,$s2,L   # branch on ($s1)=($s2)
bne $s1,$s2,L   # branch on ($s1)≠($s2)
[Figure: J-format encoding of j (op = 2): the 26-bit jump target address is extended with 00 at the right and 4 bits from the PC at the left to form the 32-bit effective target address. R-format encoding of jr (op = 0, ALU instruction; fn = 8): rs is the source register; rt, rd, and sh are unused.]
Comparison Instructions for Conditional Branching
slt $s1,$s2,$s3   # if ($s2)<($s3), set $s1 to 1
                  # else set $s1 to 0;
                  # often followed by beq/bne
slti $s1,$s2,61   # if ($s2)<61, set $s1 to 1
                  # else set $s1 to 0
[Figure: I-format encodings of bltz (op = 1, with a source register and a zero field) and of beq (op = 4) and bne (op = 5, with two source registers); in each, the 16-bit field gives the relative branch distance in words. R-format encoding of slt (op = 0, ALU instruction; fn = 42) with two source registers and a destination register; I-format encoding of slti (op = 10) with a source register, a destination register, and an immediate operand.]
Examples for Conditional Branching
Addressing Modes
Addressing mode is the method by which the location of an operand is specified within an instruction. MiniMIPS uses six addressing modes.
Implied addressing: Operand comes from, or result goes to, a predefined place that is not explicitly specified in the instruction.
Immediate addressing: Operand is given in the instruction itself. Examples include the addi, andi, ori and xori instructions, in which the second operand is supplied as part of the instruction.
Register addressing: Operand is taken from, or result placed in, a specified register. R-type instructions in MiniMIPS specify up to three registers as locations of their operands. Registers are specified by their 5-bit indices.
Base addressing: Operand is in memory and its location is computed by adding an offset to the contents of a specified base register. This is the addressing mode of lw and sw instructions.
PC-relative addressing: Same as base addressing, but with the register always being the program counter and the offset appended with two 0s at the right end (word addressing is always used).
Pseudo-direct addressing: In MiniMIPS, 32-bit instructions do not have enough room to carry full 32-bit addresses. The j instruction comes close to direct addressing because it contains 26 bits of the jump target address, which is padded with 00 at the right end and 4 bits from the program counter at the left to form a full 32-bit address, hence the name pseudo-direct.
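The address arithmetic behind base and PC-relative addressing can be sketched in Python as below. The helper names are hypothetical. One assumption is labeled in the code: the branch offset is taken relative to the incremented program counter (PC + 4), as in standard MIPS; the notes themselves only say that the PC is the base.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(imm):
    """Interpret a 16-bit field as a signed (2's-complement) integer."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def base_address(base_reg, offset16):
    """Base addressing (lw/sw): register contents plus signed byte offset."""
    return (base_reg + sign_extend16(offset16)) & MASK32

def branch_target(pc, offset16):
    """PC-relative addressing: signed word offset, shifted left by 2.
    Assumption: the offset is relative to PC + 4, as in standard MIPS."""
    return (pc + 4 + (sign_extend16(offset16) << 2)) & MASK32

# Element A[3] of a word array whose base address is in a register
print(f"{base_address(0x10000000, 4 * 3):#010x}")
# A backward branch of 3 words from a branch located at 0x00400020
print(f"{branch_target(0x00400020, 0xFFFD):#010x}")
```

Shifting the branch offset left by 2 quadruples the reach of the 16-bit field, which is possible because instructions always sit on word boundaries.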
Procedures and Data
Simple Procedure Calls
Using a procedure involves the following sequence of actions:
1. Put arguments in places known to procedure (reg’s $a0-$a3)
2. Transfer control to procedure, saving the return address (jal)
3. Acquire storage space, if required, for use by the procedure
4. Perform the desired task
5. Put results in places known to calling program (reg’s $v0-$v1)
6. Return control to calling point (jr)
MiniMIPS instructions for procedure call and return from procedure:
jal proc   # jump to loc “proc” and link;
           # “link” means “save the return
           # address” (PC)+4 in $ra ($31)
jr rs      # go to loc addressed by rs
Illustrating a Procedure Call
A Simple MiniMIPS Procedure
Procedure to find the absolute value of an integer.
$v0 ← |($a0)|
Solution
The absolute value of x is –x if x < 0 and x otherwise.
abs:  sub $v0,$zero,$a0   # put -($a0) in $v0;
                          # in case ($a0) < 0
      bltz $a0,done       # if ($a0)<0 then done
      add $v0,$a0,$zero   # else put ($a0) in $v0
done: jr $ra              # return to calling program
In practice, we seldom use such short procedures because of the overhead that they entail. In this example, we have 3-4 instructions of overhead for 3 instructions of useful computation.
Using the Stack for Data Storage
A common mechanism for saving things or making room for temporary data that a procedure needs is the use of a dynamic data structure known as a stack.
The figure shows a map of the MiniMIPS memory and the use of the three pointer registers $gp, $sp, and $fp.
The second half of the MiniMIPS memory is used for memory-mapped I/O and is thus not available for storing programs or data.
Overview of the memory address space in MiniMIPS.
The first half of memory, extending from address 0 to address 0x7fffffff, is divided into four segments as follows.
The first 1M words (4 MB) are reserved for system use.
The next 63M words (252 MB) hold the text of the program being executed.
Beginning at hex address 0x10000000, the program’s data is stored.
Beginning at hex address 0x7ffffffc and growing backward is the stack.
The program’s dynamic data and the stack can grow in size up to the maximum available memory. If we set the global pointer register ($gp) to hold the address 0x10008000, then the first 2^16 bytes of the program’s data become readily accessible through base addressing of the form imm($gp), where imm is a 16-bit signed integer.
A stack is a dynamic data structure in which data can be placed and retrieved in last-in, first-out order. It can be likened to the stack of trays in a cafeteria. As trays are cleaned and become ready for use, they are placed on top of the stack of trays; trays are likewise taken from the top of the stack.
Data elements are added to the stack by pushing them onto the stack and are retrieved by popping them. The stack push and pop operations are illustrated in the figure.
Consider a stack that has data elements b and a as its top two elements. The stack pointer points to the top element of the stack, currently holding b.
This means that the instruction lw $t0,0($sp) causes the value b to be copied into $t0 and sw $t1,0($sp) causes b to be overwritten with the contents of $t1.
Thus a new element c, currently in register $t4, can be pushed onto the stack by means of the following two MiniMIPS instructions.
push: addi $sp,$sp,-4
      sw $t4,0($sp)
To pop the element b from the stack, an lw instruction is used for copying b into a desired register, and the stack pointer is incremented by 4 to point to the next stack element.
pop:  lw $t5,0($sp)
      addi $sp,$sp,4
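The push and pop sequences can be mimicked with a toy memory model in Python. This is a sketch only: the class name Machine and the use of a dictionary as memory are assumptions made for illustration, and the starting stack address 0x7FFFFFFC is taken from the memory map described earlier.

```python
class Machine:
    """Toy model: word-addressed memory as a dict, plus a stack pointer."""

    def __init__(self, sp=0x7FFFFFFC):
        self.mem = {}
        self.sp = sp

    def push(self, value):
        self.sp -= 4                 # addi $sp,$sp,-4 : make room on top
        self.mem[self.sp] = value    # sw   $t4,0($sp) : store the new element

    def pop(self):
        value = self.mem[self.sp]    # lw   $t5,0($sp) : copy the top element
        self.sp += 4                 # addi $sp,$sp,4  : point to next element
        return value

m = Machine()
for element in ("a", "b", "c"):
    m.push(element)
top = m.pop()
print(top, hex(m.sp))
```

Note that the stack grows toward lower addresses, so pushing decrements the pointer and popping increments it.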
Parameters and Results
Before a procedure call, the calling program pushes the contents of any registers that need to be saved onto the top of the stack and follows these with any additional arguments for the procedure. The procedure can access these arguments in the stack.
After the procedure terminates, the calling program expects to find the stack pointer undisturbed, thus allowing it to restore the saved registers to their original states and proceed with its own computation.
Thus a procedure that uses the stack by modifying the stack pointer must save the content of the stack pointer at the outset and, at the end, restore $sp to its original state. This is done by copying the stack pointer into the frame pointer register $fp.
The three parameters a, b, c are passed to the procedure by placing them on top of the stack before the procedure is called. The procedure first pushes the contents of $fp onto the stack, copies the stack pointer into $fp, pushes the contents of registers that need to be saved onto the stack, uses the stack to hold those temporary local variables that cannot be held in registers, and so on.
Example of Using the Stack
Saving $fp, $ra, and $s0 onto the stack and restoring them at the end of the procedure
proc: sw $fp,-4($sp)     # save the old frame pointer
      addi $fp,$sp,0     # save ($sp) into $fp
      addi $sp,$sp,-12   # create 3 spaces on top of stack
      sw $ra,-8($fp)     # save ($ra) in 2nd stack element
      sw $s0,-12($fp)    # save ($s0) in top stack element
      . . .
      lw $s0,-12($fp)    # put top stack element in $s0
      lw $ra,-8($fp)     # put 2nd stack element in $ra
      addi $sp,$fp,0     # restore $sp to original state
      lw $fp,-4($sp)     # restore $fp to original state
      jr $ra             # return from procedure
Data Types
Data size (number of bits), data type (meaning assigned to bits)
Signed integer: byte, word
Unsigned integer: byte, word
Floating-point number: word, doubleword
Bit string: byte, word, doubleword
Converting from one size to another
Type | 8-bit number | Value | 32-bit version of the number
Unsigned | 0010 1011 | 43 | 0000 0000 0000 0000 0000 0000 0010 1011
Unsigned | 1010 1011 | 171 | 0000 0000 0000 0000 0000 0000 1010 1011
Signed | 0010 1011 | +43 | 0000 0000 0000 0000 0000 0000 0010 1011
Signed | 1010 1011 | –85 | 1111 1111 1111 1111 1111 1111 1010 1011
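The size conversions in the table can be reproduced with a short Python sketch: unsigned values are zero-extended, signed values are sign-extended. The helper names zero_extend8, sign_extend8, and as_signed32 are hypothetical.

```python
def zero_extend8(byte):
    """Unsigned 8-bit value to 32 bits: the upper 24 bits become 0."""
    return byte & 0xFF

def sign_extend8(byte):
    """Signed 8-bit value to 32 bits: the upper 24 bits copy the sign bit."""
    byte &= 0xFF
    return (byte | 0xFFFFFF00) if byte & 0x80 else byte

def as_signed32(word):
    """Interpret a 32-bit pattern as a 2's-complement integer."""
    return word - (1 << 32) if word & 0x80000000 else word

for pattern in (0b00101011, 0b10101011):
    u, s = zero_extend8(pattern), sign_extend8(pattern)
    print(f"{pattern:08b}  unsigned {u:4d}  -> {u:032b}")
    print(f"{pattern:08b}  signed   {as_signed32(s):+4d}  -> {s:032b}")
```

The same 8-bit pattern 1010 1011 yields two different 32-bit words depending on whether it is treated as unsigned (171) or signed (-85).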
[Figure: I-format encoding of lb (op = 32), lbu (op = 36), and sb (op = 40): rs is the base register, rt is the data register, and the 16-bit immediate field is the address offset.]
Loading and Storing Bytes
Bytes can be used to store ASCII characters or small integers. MiniMIPS addresses refer to bytes, but registers hold words.
lb $t0,8($s3)    # load rt with mem[8+($s3)]
                 # sign-extend to fill reg
lbu $t0,8($s3)   # load rt with mem[8+($s3)]
                 # zero-extend to fill reg
sb $t0,A($s3)    # LSB of rt to mem[A+($s3)]
Meaning of a Word in Memory
A 32-bit word has no inherent meaning and can be interpreted in a number of equally valid ways in the absence of other cues (e.g., context) for the intended meaning.
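To make the point concrete, the Python sketch below takes one 32-bit pattern and reads it in several ways using the standard struct module. The chosen word 0x41424344 is an arbitrary example, and big-endian byte order (">") is used to match the MiniMIPS convention described in these notes.

```python
import struct

word = 0x41424344                      # an arbitrary 32-bit pattern
raw = struct.pack(">I", word)          # the four bytes, most significant first

as_unsigned = struct.unpack(">I", raw)[0]   # unsigned integer
as_signed = struct.unpack(">i", raw)[0]     # 2's-complement integer
as_float = struct.unpack(">f", raw)[0]      # IEEE single-precision number
as_text = raw.decode("ascii")               # four ASCII characters

print(as_unsigned, as_signed, as_float, as_text)
```

Nothing in the bits themselves says which reading is intended; the instruction that uses the word decides.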
Arrays and Pointers
Index: Use a register that holds the index i and increment the register in each step to effect moving from element i of the list to element i + 1.
Pointer: Use a register that points to (holds the address of) the list element being examined and update it in each step to point to the next element.
[Figure: R-format encodings of sll (fn = 0) and srl (fn = 2), with rt as the source register, rd as the destination register, sh as the shift amount, and rs unused; and of sllv (fn = 4) and srlv (fn = 6), with rs as the amount register, rt as the source register, rd as the destination register, and sh unused. In all four, op = 0 (ALU instruction).]
Additional Instructions
MiniMIPS instructions for multiplication and division:
mult $s0, $s1   # set Hi,Lo to ($s0)×($s1)
div $s0, $s1    # set Hi to ($s0)mod($s1)
                # and Lo to ($s0)/($s1)
mfhi $t0        # set $t0 to (Hi)
mflo $t0        # set $t0 to (Lo)
The multiply (mult) and divide (div) instructions of MiniMIPS.
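The way mult and div fill the Hi and Lo registers can be sketched in Python as follows. The function names and the (Hi, Lo) tuple return value are illustrative. Two assumptions are made and marked in the code: operands are treated as signed 2's-complement words, and division truncates toward zero as in standard MIPS; division by zero is not handled.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def to_signed(word):
    """Interpret a 32-bit pattern as a 2's-complement integer."""
    return word - (1 << 32) if word & 0x80000000 else word

def mult(a, b):
    """Return (Hi, Lo): upper and lower halves of the 64-bit signed product."""
    product = (to_signed(a) * to_signed(b)) & MASK64
    return product >> 32, product & MASK32

def div(a, b):
    """Return (Hi, Lo) = (remainder, quotient).
    Assumption: the quotient truncates toward zero, as in standard MIPS."""
    sa, sb = to_signed(a), to_signed(b)
    q = abs(sa) // abs(sb)
    if (sa < 0) != (sb < 0):
        q = -q
    r = sa - q * sb
    return r & MASK32, q & MASK32

hi, lo = mult(0x00010000, 0x00010000)   # 2^16 x 2^16 = 2^32
print(hi, lo)
rem, quo = div(17, 5)
print(rem, quo)
```

The first example shows why two registers are needed: the product 2^32 does not fit in 32 bits, so it appears as Hi = 1, Lo = 0.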
MiniMIPS instructions for copying the contents of Hi and Lo registers into general registers.
Logical Shifts
MiniMIPS instructions for left and right shifting:
sll $t0,$s1,2      # $t0=($s1) left-shifted by 2
srl $t0,$s1,2      # $t0=($s1) right-shifted by 2
sllv $t0,$s1,$s0   # $t0=($s1) left-shifted by ($s0)
srlv $t0,$s1,$s0   # $t0=($s1) right-shifted by ($s0)
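A small Python sketch of the logical shifts on 32-bit words. The function names mirror the instruction mnemonics but are only stand-ins. Using just the low 5 bits of the shift amount is an assumption taken from standard MIPS behavior for the variable shifts.

```python
MASK32 = 0xFFFFFFFF

def sll(value, amount):
    """Shift left logical: bits leaving the left end of the word are lost."""
    return (value << (amount & 31)) & MASK32   # low 5 bits of amount (assumed)

def srl(value, amount):
    """Shift right logical: 0s enter from the left end of the word."""
    return (value & MASK32) >> (amount & 31)

s1 = 0x80000001
print(f"{sll(s1, 2):#010x}")   # the leading 1 is shifted out
print(f"{srl(s1, 2):#010x}")   # the trailing 1 is shifted out
print(sll(5, 2))               # left shift by 2 multiplies by 4
```

For unsigned values, a left shift by k multiplies by 2^k (as long as no 1 bits are lost) and a logical right shift by k divides by 2^k, discarding the remainder.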
[Figure: R-format encodings of mult (fn = 24) and div (fn = 26), with rs and rt as source registers 1 and 2 and with rd and sh unused; and of mfhi (fn = 16) and mflo (fn = 18), with rd as the destination register and with rs, rt, and sh unused. In all four, op = 0 (ALU instruction).]
Unsigned Arithmetic and Miscellaneous Instructions
MiniMIPS instructions for unsigned arithmetic (no overflow exception):
addu $t0,$s0,$s1 # set $t0 to ($s0)+($s1)
subu $t0,$s0,$s1 # set $t0 to ($s0)-($s1)
multu $s0,$s1 # set Hi,Lo to ($s0)×($s1)
divu $s0,$s1 # set Hi to ($s0)mod($s1)
 # and Lo to ($s0)/($s1)
addiu $t0,$s0,61 # set $t0 to ($s0)+61;
 # the immediate operand is
 # sign extended
To make MiniMIPS more powerful and complete, we introduce later:
sra $t0,$s1,2 # sh. right arith (Sec. 10.5)
srav $t0,$s1,$s0 # shift right arith variable
syscall # system call (Sec. 7.6)
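The practical difference between the signed and unsigned forms can be sketched in Python. This is an illustrative model only: ArithmeticOverflow is a hypothetical stand-in for the overflow exception, and addiu takes its immediate as a raw 16-bit field.

```python
MASK32 = 0xFFFFFFFF

class ArithmeticOverflow(Exception):
    """Hypothetical stand-in for the overflow exception raised by add."""

def to_signed32(x):
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def add(a, b):
    """Signed add: raises when the result does not fit in 32 bits."""
    total = to_signed32(a) + to_signed32(b)
    if not -(1 << 31) <= total < (1 << 31):
        raise ArithmeticOverflow("signed result does not fit in 32 bits")
    return total & MASK32

def addu(a, b):
    """Unsigned add: same bit pattern, but the result silently wraps around."""
    return (a + b) & MASK32

def addiu(a, imm16):
    """Add immediate unsigned: the 16-bit immediate is still sign extended."""
    imm = imm16 & 0xFFFF
    if imm & 0x8000:
        imm -= 1 << 16
    return (a + imm) & MASK32

print(hex(addu(0x7FFFFFFF, 1)))
```

Note that "unsigned" here refers to the absence of the overflow exception; the immediate of addiu is sign extended all the same.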
The 37 + 3 MiniMIPS Instructions Covered So Far
Assembly Language Programs
Machine and Assembly Languages
A computer program can be represented at different levels of abstraction. A program could be written in a machine-independent, high-level language such as Java or C++. A computer can execute programs only when they are represented in machine language specific to its architecture.
A machine language program for a given architecture is a collection of machine instructions represented in binary form.
Programs written at any level higher than the machine language must be translated to the binary representation before a computer can execute them. An assembly language program is a symbolic representation of the machine language program.
Machine language is pure binary code, whereas assembly language is a direct mapping of the binary code onto a symbolic form that is easier for humans to understand and manage.
Converting the symbolic representation into machine language is performed by a special program called the assembler.
An assembler is a program that accepts a symbolic language program (source) and produces its machine language equivalent (target).
In translating a program into binary code, the assembler will replace symbolic addresses by numeric addresses, replace symbolic operation codes by machine operation codes, reserve storage for instructions and data, and translate constants into machine representation.
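The translation steps just listed can be sketched as a tiny two-pass assembler in Python. It is a minimal illustration under stated assumptions, not a full MiniMIPS assembler: it handles only add, sub, addi and j, knows only four registers, and the function name assemble and the sample program are invented for this example. Register numbers and operation codes follow the usual MIPS assignments.

```python
REGS = {"$zero": 0, "$t0": 8, "$t1": 9, "$s0": 16}   # subset of the register file
R_FN = {"add": 32, "sub": 34}                        # fn codes of two R-format ops

def assemble(lines):
    # Pass 1: assign an address to each instruction and record label addresses
    symbols, program, addr = {}, [], 0
    for line in lines:
        line = line.split("#")[0].strip()            # drop comments
        if ":" in line:
            label, line = line.split(":", 1)
            symbols[label.strip()] = addr
            line = line.strip()
        if line:
            program.append(line)
            addr += 4
    # Pass 2: replace symbolic opcodes, registers and labels by numbers
    words = []
    for line in program:
        op, rest = line.split(None, 1)
        args = [a.strip() for a in rest.split(",")]
        if op in R_FN:
            rd, rs, rt = (REGS[a] for a in args)
            word = (rs << 21) | (rt << 16) | (rd << 11) | R_FN[op]
        elif op == "addi":
            rt, rs, imm = REGS[args[0]], REGS[args[1]], int(args[2])
            word = (8 << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)
        elif op == "j":
            word = (2 << 26) | ((symbols[args[0]] >> 2) & 0x3FFFFFF)
        else:
            raise ValueError(f"unsupported instruction: {op}")
        words.append(word)
    return symbols, words

source = [
    "addi $s0,$zero,9",
    "sub $t0,$s0,$s0",
    "loop: addi $t0,$t0,1",
    "add $t1,$s0,$zero",
    "j loop",
]
symbols, words = assemble(source)
for w in words:
    print(f"{w:032b}")
```

Two passes are used because a label such as loop may be referenced before the assembler has seen its definition.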
A SIMPLE MACHINE
Machine language is the native language of a given processor. Since assembly language is the symbolic form of machine language, each different type of processor has its own unique assembly language.
Before we study the assembly language of a given processor, we first need to understand the details of that processor: the memory size and organization, the processor registers, the instruction format, and the entire instruction set.
In this section, we present a very simple hypothetical processor, which will be used in explaining the different topics in assembly language.
Our simple machine is an accumulator-based processor, which has five 16-bit registers: Program Counter (PC), Instruction Register (IR), Address Register (AR), Accumulator (AC), and Data Register (DR). The PC contains the address of the next instruction to be executed. The IR contains the operation code portion of the instruction being executed. The AR contains the address portion (if any) of the instruction being executed. The AC serves as the implicit source and destination of data. The DR is used to hold data. The memory unit is made up of 4096 words of storage. The word size is 16 bits.
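A fetch-execute loop for such a machine might look like the following Python sketch. Several details are assumptions, since the text above does not give them: instructions are taken to have a 4-bit operation code and a 12-bit address (enough to reach 4096 words), the four opcodes HALT, LOAD, ADD and STORE are hypothetical, and the PC is wrapped to 12 bits.

```python
# Hypothetical opcode assignments (not taken from the source)
HALT, LOAD, ADD, STORE = 0x0, 0x1, 0x2, 0x3

def run(memory, pc=0):
    """Run until HALT and return the final accumulator value."""
    ac = dr = 0
    while True:
        word = memory[pc]               # fetch the instruction
        pc = (pc + 1) & 0xFFF           # PC now addresses the next instruction
        ir = word >> 12                 # IR: operation code portion (assumed 4 bits)
        ar = word & 0xFFF               # AR: address portion (assumed 12 bits)
        if ir == HALT:
            return ac
        if ir == LOAD:
            dr = memory[ar]             # DR holds the data read from memory
            ac = dr
        elif ir == ADD:
            dr = memory[ar]
            ac = (ac + dr) & 0xFFFF     # AC is implicit source and destination
        elif ir == STORE:
            dr = ac
            memory[ar] = dr
        else:
            raise ValueError(f"unknown opcode {ir}")

memory = [0] * 4096                     # 4096 words of 16 bits each
memory[0] = (LOAD << 12) | 100          # AC <- mem[100]
memory[1] = (ADD << 12) | 101           # AC <- AC + mem[101]
memory[2] = (STORE << 12) | 102         # mem[102] <- AC
memory[3] = HALT << 12
memory[100], memory[101] = 12, 30
print(run(memory), memory[102])
```

Because the accumulator is implied, each instruction needs to name at most one memory address.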
[Figure: an assembly language program, its machine language version, and the symbol table created during assembly. In the original figure, field boundaries (op, rs, rt, rd, sh, fn) are shown to facilitate understanding.]
Location  Assembly language program        Machine language program
0               addi $s0,$zero,9           00100000000100000000000000001001
4               sub $t0,$s0,$s0            00000010000100000100000000100010
8               add $t1,$zero,$zero        00000001001000000000000000100000
12        test: bne $t0,$s0,done           00010101000100000000000000001100
16              addi $t0,$t0,1             00100001000010000000000000000001
20              add $t1,$s0,$zero          00000010000000000100100000100000
24              j test                     00001000000000000000000000000011
28        done: sw $t1,result($gp)         10101111100010010000000011111000
Symbol table: test = 12, done = 28, result = 248 (determined from assembler directives not shown here).
Symbol Table
Assembler Directives
Assembler directives provide the assembler with information on how to translate the program but do not lead to the generation of machine instructions.
.macro # start macro (see Section 7.4)
.end_macro # end macro (see Section 7.4)
.text # start program’s text segment
... # program text goes here
.data # start program’s data segment
tiny: .byte 156,0x7a # name & initialize data byte(s)
max: .word 35000 # name & initialize data word(s)
small: .float 2E-3 # name short float (see Chapter 12)
big: .double 2E-3 # name long float (see Chapter 12)
.align 2 # align next item on word boundary
array: .space 600 # reserve 600 bytes = 150 words
str1: .ascii "a*b" # name & initialize ASCII string
str2: .asciiz "xyz" # null-terminated ASCII string
.global main # consider “main” a global name
Composing Simple Assembler Directives
Write assembler directives to achieve each of the following objectives:
a. Put the error message “Warning: The printer is out of paper!” in memory.
b. Set up a constant called “size” with the value 4.
c. Set up an integer variable called “width” and initialize it to 4.
d. Set up a constant called “mill” with the value 1,000,000 (one million).
e. Reserve space for an integer vector “vect” of length 250.
Solution:
a. noppr: .asciiz "Warning: The printer is out of paper!"
b. size: .byte 4 # small constant fits in one byte
c. width: .word 4 # byte could be enough, but ...
d. mill: .word 1000000 # constant too large for byte
e. vect: .space 1000 # 250 words = 1000 bytes
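How much storage each directive in the solution reserves can be checked with a short Python sketch. The helper directive_size and the tuple layout are invented for this illustration, and alignment padding between items is deliberately ignored.

```python
def directive_size(directive, operand):
    """Bytes reserved by one data directive (alignment padding not modeled)."""
    if directive == ".byte":
        return len(operand)             # one byte per listed value
    if directive == ".word":
        return 4 * len(operand)         # four bytes per listed value
    if directive == ".space":
        return operand                  # operand is the byte count itself
    if directive == ".ascii":
        return len(operand)             # one byte per character
    if directive == ".asciiz":
        return len(operand) + 1         # plus the terminating null byte
    raise ValueError(f"unsupported directive: {directive}")

solution = [
    ("noppr", ".asciiz", "Warning: The printer is out of paper!"),
    ("size", ".byte", [4]),
    ("width", ".word", [4]),
    ("mill", ".word", [1000000]),
    ("vect", ".space", 1000),
]
sizes = {name: directive_size(d, operand) for name, d, operand in solution}
for name, nbytes in sizes.items():
    print(name, nbytes)
```

The value 1000000 does not fit in one byte (maximum 255), which is why mill must be declared with .word.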
Pseudoinstructions
Example of one-to-one pseudoinstruction: The following
not $s0 # complement ($s0)
is converted to the real instruction:
nor $s0,$s0,$zero # complement ($s0)
Example of one-to-several pseudoinstruction: The following
abs $t0,$s0 # put |($s0)| into $t0
is converted to the sequence of real instructions:
add $t0,$s0,$zero # copy x into $t0
slt $at,$t0,$zero # is x negative?
beq $at,$zero,+4 # if not, skip next instr
sub $t0,$zero,$s0 # the result is 0 - x
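The assembler's handling of these two pseudoinstructions can be sketched as a small rewriting function in Python. The function name expand is hypothetical, and only not and abs are covered; any other instruction passes through unchanged.

```python
def expand(instr):
    """Rewrite a pseudoinstruction as a list of real instructions."""
    op, _, rest = instr.partition(" ")
    args = [a.strip() for a in rest.split(",")]
    if op == "not":
        (reg,) = args
        return [f"nor {reg},{reg},$zero"]            # one-to-one
    if op == "abs":
        dst, src = args
        return [                                     # one-to-several
            f"add {dst},{src},$zero",                # copy x into dst
            f"slt $at,{dst},$zero",                  # is x negative?
            "beq $at,$zero,+4",                      # if not, skip next instr
            f"sub {dst},$zero,{src}",                # the result is 0 - x
        ]
    return [instr]                                   # already a real instruction

for line in expand("abs $t0,$s0"):
    print(line)
```

The expansion of abs uses $at as scratch space, which is why that register is reserved for the assembler.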
Macroinstructions
A macro is a mechanism to give a name to an often-used sequence of instructions (shorthand notation).
.macro name(args) # macro and arguments named
... # instr’s defining the macro
.end_macro # macro terminator
How is a macro different from a pseudoinstruction?
Pseudos are predefined, fixed, and look like machine instructions.
Macros are user-defined and resemble procedures (have arguments).
How is a macro different from a procedure?
Control is transferred to and returns from a procedure.
After a macro has been replaced, no trace of it remains.
Macro to Find the Largest of Three Values
Write a macro to determine the largest of three values in registers and to put the result in a fourth register.
Solution:
.macro mx3r(m,a1,a2,a3) # macro and arguments named
move m,a1 # assume (a1) is largest; m = (a1)
bge m,a2,+4 # if (a2) is not larger, ignore it
move m,a2 # else set m = (a2)
bge m,a3,+4 # if (a3) is not larger, ignore it
move m,a3 # else set m = (a3)
.end_macro # macro terminator
If the macro is used as mx3r($t0,$s0,$s4,$s3), the assembler replaces the arguments m, a1, a2, a3 with $t0, $s0, $s4, $s3, respectively.
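The substitution step can be sketched in Python as plain text replacement of formal arguments by actual ones. The names expand_macro, MX3R_PARAMS and MX3R_BODY are invented for this illustration; whole-word matching is used so that the m in move is left alone.

```python
import re

MX3R_PARAMS = ["m", "a1", "a2", "a3"]
MX3R_BODY = [
    "move m,a1",
    "bge m,a2,+4",
    "move m,a2",
    "bge m,a3,+4",
    "move m,a3",
]

def expand_macro(params, body, args):
    """Return the macro body with each formal parameter replaced by its argument."""
    binding = dict(zip(params, args))
    # Match parameter names only as whole words
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, params)) + r")\b")
    return [pattern.sub(lambda mo: binding[mo.group(1)], line) for line in body]

for line in expand_macro(MX3R_PARAMS, MX3R_BODY, ["$t0", "$s0", "$s4", "$s3"]):
    print(line)
```

The expanded lines are inserted directly at the point of use, so no call or return takes place at run time.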
Linking and Loading
The linker has the following responsibilities:
Ensuring correct interpretation (resolution) of labels in all modules
Determining the placement of text and data segments in memory
Evaluating all data addresses and instruction labels
Forming an executable program with no unresolved references
The loader is in charge of the following:
Determining the memory needs of the program from its header
Copying text and data from the executable program file into memory
Modifying (shifting) addresses, where needed, during copying
Placing program parameters onto the stack (as in a procedure call)
Initializing all machine registers, including the stack pointer
Jumping to a start-up routine that calls the program’s main routine
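The linker's first few responsibilities can be illustrated with a simplified Python sketch. The module description used here (a size, a table of defined labels, and a table of referenced labels, all relative to the module start) is a made-up format, as are the function name link and the base address.

```python
def link(modules, base=0):
    """Place modules one after another and resolve label references."""
    # Determine placement: each module starts where the previous one ends
    starts, addr = [], base
    for module in modules:
        starts.append(addr)
        addr += module["size"]
    # Evaluate every label definition to an absolute address
    table = {}
    for start, module in zip(starts, modules):
        for label, offset in module["defs"].items():
            if label in table:
                raise ValueError(f"label defined twice: {label}")
            table[label] = start + offset
    # Resolve references; anything left undefined is an error
    resolved = {}
    for start, module in zip(starts, modules):
        for offset, label in module["refs"].items():
            if label not in table:
                raise KeyError(f"unresolved reference: {label}")
            resolved[start + offset] = table[label]
    return table, resolved

main_module = {"size": 16, "defs": {"main": 0}, "refs": {8: "square"}}
lib_module = {"size": 12, "defs": {"square": 0}, "refs": {}}
table, resolved = link([main_module, lib_module], base=0x400000)
print({k: hex(v) for k, v in table.items()})
```

The reference at offset 8 of the first module can be filled in only after the second module has been placed, which is why linking happens after all modules are assembled.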
Running Assembler Programs
Spim is a simulator that can run MiniMIPS programs
The name Spim comes from reversing MIPS
Three versions of Spim are available for free downloading:
PCSpim for Windows machines
xspim for X-windows
spim for Unix systems
Input/Output Conventions for MiniMIPS
Instruction Set Variations
Review of Some Key Concepts
Complex Instructions
Machine        Instruction   Effect
Pentium        MOVS          Move one element in a string of bytes, words, or doublewords using addresses specified in two pointer registers; after the operation, increment or decrement the registers to point to the next element of the string
PowerPC        cntlzd        Count the number of consecutive 0s in a specified source register beginning with bit position 0 and place the count in a destination register
IBM 360-370    CS            Compare and swap: Compare the content of a register to that of a memory location; if unequal, load the memory word into the register, else store the content of a different register into the same memory location
Digital VAX    POLYD         Polynomial evaluation with double flp arithmetic: Evaluate a polynomial in x, with very high precision in intermediate results, using a coefficient table whose location in memory is given within the instruction
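To see how much work a single complex instruction can bundle, here is a Python sketch of the compare-and-swap effect as described above. The function name, the dictionary-based register file and memory, and the boolean return value are all conveniences of this illustration; the real instruction also executes as one indivisible step, which this sketch does not model.

```python
def compare_and_swap(regs, memory, r_cmp, r_new, addr):
    """Model of the CS effect: compare a register with a memory word, then load or store."""
    if regs[r_cmp] != memory[addr]:
        regs[r_cmp] = memory[addr]      # unequal: load the memory word into the register
        return False
    memory[addr] = regs[r_new]          # equal: store the other register into memory
    return True

regs = {"r1": 5, "r2": 9}
memory = {0: 5}
first = compare_and_swap(regs, memory, "r1", "r2", 0)    # values match, so memory is updated
second = compare_and_swap(regs, memory, "r1", "r2", 0)   # now they differ, so r1 is reloaded
print(first, second, regs, memory)
```

On a machine with only simple instructions, the same effect would take a load, a compare, a branch, and a store.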
Some Details of Sample Complex Instructions
Benefits and Drawbacks of Complex Instructions
Alternative Addressing Modes
More Elaborate Addressing Modes
Usefulness of Some Elaborate Addressing Modes
Variations in Instruction Formats
Zero-Address Architecture: Stack Machine
Stack holds all the operands (replaces our register file)
Load/Store operations become push/pop
Arithmetic/logic operations need only an opcode: they pop operand(s) from the top of the stack and push the result onto the stack
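A stack machine of this kind can be sketched in a few lines of Python. The instruction names (push, pop, add, sub, mul), the variable-name addressing, and the sample program for a = (b + c) * d are assumptions made for illustration.

```python
def run_stack_machine(program, memory):
    """Execute a list of zero-address instructions against a named memory."""
    stack = []
    for instr in program:
        op, *operand = instr.split()
        if op == "push":                    # load becomes push
            stack.append(memory[operand[0]])
        elif op == "pop":                   # store becomes pop
            memory[operand[0]] = stack.pop()
        elif op in ("add", "sub", "mul"):   # opcode only: operands come from the stack
            b, a = stack.pop(), stack.pop()
            stack.append({"add": a + b, "sub": a - b, "mul": a * b}[op])
        else:
            raise ValueError(f"unknown instruction: {instr}")
    return memory

memory = {"a": 0, "b": 2, "c": 3, "d": 4}
program = ["push b", "push c", "add", "push d", "mul", "pop a"]   # a = (b + c) * d
run_stack_machine(program, memory)
print(memory["a"])
```

Only push and pop carry an address; the arithmetic instructions carry none, hence the name zero-address architecture.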
One-Address Architecture: Accumulator Machine
The accumulator, a special register attached to the ALU, always holds operand 1 and the operation result
Only one operand needs to be specified by the instruction
Two-Address Architectures
Two addresses may be used in different ways:
Operand1/result and operand 2
Condition to be checked and branch target address
Instruction Set Design and Evolution
Desirable attributes of an instruction set:
Consistent, with uniform and generally applicable rules
Orthogonal, with independent features noninterfering
Transparent, with no visible side effect due to implementation details
Easy to learn/use (often a byproduct of the three attributes above)
Extensible, so as to allow the addition of future capabilities
Efficient, in terms of both memory needs and hardware realization
The RISC/CISC Dichotomy
The RISC (reduced instruction set computer) philosophy: Complex instruction sets are undesirable because inclusion of mechanisms to interpret all the possible combinations of opcodes and operands might slow down even very simple operations.
Ad hoc extension of instruction sets, while maintaining backward compatibility, leads to CISC; imagine modern English containing every English word that has been used through the ages.
Features of RISC architecture
1. Small set of inst’s, each executable in roughly the same time
2. Load/store architecture (leading to more registers)
3. Limited addressing mode to simplify address calculations
4. Simple, uniform instruction formats (ease of decoding)