CSE 2312 Computer Organization & Assembly Language Programming 1 Spring 2015 CSE 2312 Lecture 4 Quantifying Computer Components Junzhou Huang, Ph.D. Department of Computer Science and Engineering Computer Organization & Assembly Language Programming

Page 1: Computer Organization & Assembly Language Programming, ranger.uta.edu/~huang/teaching/CSE2312/CSE2312_Lecture4.pdf, Spring 2015

CSE 2312 Computer Organization & Assembly Language Programming, Spring 2015

CSE 2312

Lecture 4 Quantifying Computer Components

Junzhou Huang, Ph.D.

Department of Computer Science and Engineering

Computer Organization &

Assembly Language Programming


Quantifying Computer Components

• CPU Speed

– MHz or GHz clock speed, MIPS, MFLOPS, ...

– 1.33 GHz ... Intel Atom processor

• Bus Speed

– Front Side Bus (FSB) ... 533 MHz on the Intel Atom

– Number of channels, number of data paths

• Memory Capacity and Speed

– Gigabytes, MHz × data rate

– 166 MHz DDR memory, quad-pumped

• Disk Capacity and Bandwidth

– GB, TB, MB/sec

• Power Consumption

– Watts, milliwatts

– Battery lifetime (standby vs. active), watt-hours


Metric Units

The principal metric prefixes.


Intel Atom


More

• iPhone

– 620 MHz ARM chip

– SIMD, high performance integer CPU (8-stage pipeline, 675 Dhrystone, 2.1 MIPS)

– 16 K/16 K cache

– 0.45 mW/MHz power draw (with cache)

• Wii

– CPU: PowerPC-based "Broadway" processor, 729 MHz

– GPU: ATI "Hollywood" GPU, 243 MHz


More

• iPad

– 1 GHz Apple A4

– 16 GB to 64 GB flash storage

– Up to 10 hours of battery life

• IBM ThinkPad T42

– Pentium M Processor 735

– 1.7 GHz

– 512 MB RAM

– Intel® Core™ 2 Duo P8600

– 2.4 GHz / 1066 MHz FSB / 3 MB cache

– 4 GB memory

– 100 GB disk


What is Performance?

• Which airplane has the best performance?

[Figure: four bar charts comparing the Douglas DC-8-50, BAC/Sud Concorde, Boeing 747, and Boeing 777 by passenger capacity, cruising range (miles), cruising speed (mph), and passengers × mph]


Choosing a Metric

• Which of these are good metrics?

– Energy consumption

– Instructions per second

– CPU utilization

– Execution time

– Cycles per instruction

– Clock rate

• Why?


What is Time?

• Response time

– How long it takes to do a task

• Throughput

– Total work done per unit time,

– Such as tasks, transactions, … per hour

• How are they affected by

– Replacing the processor with a faster version?

– Adding more processors?


What is Execution Time?

• Elapsed time

– Total response time, including all aspects, such as processing, I/O, OS overhead, and idle time

– Determines system performance

• CPU time

– Time spent processing a given job

– Discounts I/O time and other jobs’ shares

– User CPU time + system CPU time

– Different programs are affected differently by CPU and system performance

(Slide callout: time spent executing the program’s instructions)
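The difference between elapsed time and CPU time can be observed directly. The sketch below is an illustration, not part of the original slides: the helper `measure` and the two sample tasks are hypothetical names. It relies on Python's standard `time.perf_counter` (wall-clock time) and `time.process_time` (user + system CPU time of the current process).

```python
import time


def measure(task):
    """Run task() once and return (elapsed_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # wall-clock (elapsed) time
    cpu_start = time.process_time()    # user + system CPU time of this process
    task()
    cpu_seconds = time.process_time() - cpu_start
    elapsed_seconds = time.perf_counter() - wall_start
    return elapsed_seconds, cpu_seconds


def io_like_task():
    # Sleeping stands in for waiting on I/O: it adds elapsed time but no CPU time.
    time.sleep(0.05)


def cpu_bound_task():
    # Pure computation: contributes to both elapsed time and CPU time.
    sum(i * i for i in range(200_000))


if __name__ == "__main__":
    for task in (io_like_task, cpu_bound_task):
        elapsed, cpu = measure(task)
        print(f"{task.__name__}: elapsed={elapsed:.4f}s cpu={cpu:.4f}s")
```

For the sleeping task the elapsed time should be much larger than the CPU time; for the compute-bound task the two should be close, although exact values depend on the machine.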


CPU Clock

• Every action is driven by a clock in the CPU

• Clock time = 1 / Frequency

– 1 MHz clock: one cycle takes $10^{-6}$ seconds

– 1 GHz clock: one cycle takes $10^{-9}$ seconds

• From the CPU speed, you know the time for 1 clock cycle


How Long Does An Instruction Take?

• Digital logic is controlled by a clock

• Clock period: duration of a clock cycle

– e.g., 250 ps = 0.25 ns = $250 \times 10^{-12}$ s

• Clock frequency (rate): cycles per second

– e.g., 4.0 GHz = 4000 MHz = $4.0 \times 10^{9}$ Hz

[Figure: timing diagram of clock cycles; within each clock period, data transfer and computation are followed by a state update]
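Clock period and clock rate are reciprocals of each other, so either can be computed from the other. A minimal sketch (the function names are illustrative, not from the slides):

```python
def clock_period_seconds(frequency_hz):
    """Duration of one clock cycle: period = 1 / frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / frequency_hz


def clock_rate_hz(period_seconds):
    """Cycles per second: rate = 1 / period."""
    if period_seconds <= 0:
        raise ValueError("period must be positive")
    return 1.0 / period_seconds


if __name__ == "__main__":
    # The slide's example: a 4.0 GHz clock has a period of about 250 ps.
    period = clock_period_seconds(4.0e9)
    print(f"{period * 1e12:.1f} ps")
```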


Predicting CPU Time

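• Ideal: Only need to know number of instructions

$$\text{CPU Time} = \text{Instructions} \times \text{Clock Cycle Time} = \frac{\text{Instructions}}{\text{Clock Rate}}$$

• Reality: Some instructions take longer than others

$$\text{CPU Time} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}}$$

– Instructions / Program: instruction count

– Clock cycles / Instruction: cycles per instruction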
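The three-factor form of the CPU time equation translates directly into code. The following is a sketch only; the function name and the sample numbers (10 billion instructions, CPI of 2.0, 4 GHz clock) are made up for illustration and do not come from the slides.

```python
def cpu_time_seconds(instruction_count, cpi, clock_rate_hz):
    """CPU time = instruction count x cycles per instruction / clock rate."""
    clock_cycles = instruction_count * cpi   # total cycles for the program
    return clock_cycles / clock_rate_hz      # each cycle lasts 1 / clock rate


if __name__ == "__main__":
    # Hypothetical program: 10e9 instructions, average CPI 2.0, 4 GHz clock.
    print(cpu_time_seconds(10e9, 2.0, 4.0e9), "seconds")
```

Dividing by the clock rate is the same as multiplying by the clock cycle time, which is why the slides show both forms.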


Instruction Count and Cycles Per Instruction

• IC determined by program, ISA, and compiler

• CPI determined by CPU and other factors

– Different instructions have different CPI

– Average CPI affected by instruction mix

$$\text{Clock Cycles} = \text{Instruction Count (IC)} \times \text{Cycles per Instruction (CPI)}$$

$$\text{CPU Time} = \text{IC} \times \text{CPI} \times \text{Clock Cycle Time} = \frac{\text{IC} \times \text{CPI}}{\text{Clock Rate}}$$


Improving CPU Time

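$$\text{CPU Time} = \text{CPU Clock Cycles} \times \text{Clock Cycle Time} = \frac{\text{CPU Clock Cycles}}{\text{Clock Rate}}$$

– Reducing clock cycles versus raising the clock rate: usually a tradeoff

$$\text{Clock Cycles} = \sum_{i=1}^{n} \left(\text{CPI}_i \times \text{Instruction Count}_i\right)$$

$$\text{CPI} = \frac{\text{Clock Cycles}}{\text{Instruction Count}} = \sum_{i=1}^{n} \left(\text{CPI}_i \times \frac{\text{Instruction Count}_i}{\text{Instruction Count}}\right)$$

– The ratio $\text{Instruction Count}_i / \text{Instruction Count}$ is the relative frequency of instruction class $i$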


Compiler Matters!

• Suppose the compiler has two choices:

– It can use 5 or 6 instructions, as described below

• Which is better?

Class                A   B   C
CPI for class        1   2   3
IC in sequence 1     2   1   2
IC in sequence 2     4   1   1

• Sequence 1: IC = 5

– Clock Cycles = 2×1 + 1×2 + 2×3 = 10

– Avg. CPI = 10/5 = 2.0

• Sequence 2: IC = 6

– Clock Cycles = 4×1 + 1×2 + 1×3 = 9

– Avg. CPI = 9/6 = 1.5

Sequence 2 needs fewer clock cycles (9 vs. 10), so it is better even though it executes more instructions. A lower average CPI alone would not settle this, because the two sequences have different instruction counts.
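The comparison can be checked mechanically by weighting each class's instruction count by its CPI. This is a sketch using the slide's numbers; the names `CPI_BY_CLASS`, `clock_cycles`, and `average_cpi` are illustrative. On the same machine and clock, the sequence with fewer total clock cycles takes less CPU time.

```python
# CPI of each instruction class, taken from the table on the slide.
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}


def clock_cycles(instruction_counts):
    """Total cycles = sum over classes of (CPI_i x instruction count_i)."""
    return sum(CPI_BY_CLASS[cls] * count for cls, count in instruction_counts.items())


def average_cpi(instruction_counts):
    """Average CPI = total clock cycles / total instruction count."""
    return clock_cycles(instruction_counts) / sum(instruction_counts.values())


sequence_1 = {"A": 2, "B": 1, "C": 2}   # IC = 5
sequence_2 = {"A": 4, "B": 1, "C": 1}   # IC = 6

if __name__ == "__main__":
    for name, seq in (("Sequence 1", sequence_1), ("Sequence 2", sequence_2)):
        print(name, "cycles:", clock_cycles(seq), "avg CPI:", average_cpi(seq))
```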


Comparing Performance

• Performance = 1 / Execution Time

• “X is n times faster than Y” means

$$n = \frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution Time}_Y}{\text{Execution Time}_X}$$

• Example: time taken to run a program

– 10 s on A, 15 s on B: how much faster is A?

– $\text{Execution Time}_B / \text{Execution Time}_A$ = 15 s / 10 s = 1.5

– So A is 1.5 times faster than B
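Because performance is the reciprocal of execution time, the ratio flips when it is written in terms of times. A small sketch (the function name `times_faster` is illustrative) makes the direction of the division explicit:

```python
def times_faster(exec_time_x, exec_time_y):
    """Return n such that X is n times faster than Y.

    n = Performance_X / Performance_Y = Execution Time_Y / Execution Time_X
    """
    if exec_time_x <= 0 or exec_time_y <= 0:
        raise ValueError("execution times must be positive")
    return exec_time_y / exec_time_x


if __name__ == "__main__":
    # The slide's example: 10 s on A, 15 s on B.
    print(times_faster(10.0, 15.0))
```

A result below 1 means X is actually the slower machine.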


CPI Example

• Computer A: Cycle Time = 250ps, CPI = 2.0

• Computer B: Cycle Time = 500ps, CPI = 1.2

• Same ISA

• Which is faster, and by how much?

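$$\text{CPU Time}_A = \text{Instruction Count} \times \text{CPI}_A \times \text{Cycle Time}_A = I \times 2.0 \times 250\,\text{ps} = I \times 500\,\text{ps}$$

$$\text{CPU Time}_B = \text{Instruction Count} \times \text{CPI}_B \times \text{Cycle Time}_B = I \times 1.2 \times 500\,\text{ps} = I \times 600\,\text{ps}$$

– A is faster...

$$\frac{\text{CPU Time}_B}{\text{CPU Time}_A} = \frac{I \times 600\,\text{ps}}{I \times 500\,\text{ps}} = 1.2$$

– ...by this much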


CPU Example

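• Computer A:

– 2 GHz clock, 10 s CPU time

• Let’s design Computer B

– Aim for 6 s CPU time

– Can do faster clock, but causes 1.2× clock cycles

• How fast must the new clock be?

$$\text{Clock Rate}_B = \frac{\text{Clock Cycles}_B}{\text{CPU Time}_B} = \frac{1.2 \times \text{Clock Cycles}_A}{6\,\text{s}}$$

$$\text{Clock Cycles}_A = \text{CPU Time}_A \times \text{Clock Rate}_A = 10\,\text{s} \times 2\,\text{GHz} = 20 \times 10^{9}$$

$$\text{Clock Rate}_B = \frac{1.2 \times 20 \times 10^{9}}{6\,\text{s}} = \frac{24 \times 10^{9}}{6\,\text{s}} = 4\,\text{GHz}$$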
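The same derivation can be written as a short calculation: recover the baseline cycle count from CPU time and clock rate, scale it by the cycle penalty, and divide by the target time. The function and parameter names below are illustrative, not from the slides.

```python
def required_clock_rate_hz(base_clock_rate_hz, base_cpu_time_s,
                           cycle_factor, target_cpu_time_s):
    """Clock rate the new design needs to hit the target CPU time.

    cycle_factor is how many times more clock cycles the new design
    needs compared with the baseline (1.2 in the slide's example).
    """
    base_cycles = base_cpu_time_s * base_clock_rate_hz   # cycles = time x rate
    new_cycles = cycle_factor * base_cycles
    return new_cycles / target_cpu_time_s                # rate = cycles / time


if __name__ == "__main__":
    # Computer A: 2 GHz, 10 s. Computer B: 1.2x the cycles, 6 s target.
    rate = required_clock_rate_hz(2.0e9, 10.0, 1.2, 6.0)
    print(f"{rate / 1e9:.2f} GHz")
```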


Time for a Program

• The CPU executes various instructions

• A program has several instructions. How many?

– Depends on the program and the compiler

• Each instruction can take several CPU cycles. How many?

– Depends on the Instruction Set Architecture (ISA)

– ISA: learned in this course

• Each cycle has a fixed time based on CPU and bus speed. What are the clock time, memory speed, etc.?

– Depends on the hardware and its organization

– Computer architecture: learned in this course


CPU Performance Equation


Performance Summary

• Performance depends on

– Algorithm: affects IC, possibly CPI

– Programming language: affects IC, CPI

– Compiler: affects IC, CPI

– Instruction set architecture: affects IC, CPI, $T_c$ (clock cycle time)

$$\text{CPU Time} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}}$$


How to Improve Performance?

We must lower execution time!

• Algorithm

– Determines number of operations executed

• Programming language, compiler, architecture

– Determine number of machine instructions executed per operation (IC)

• Processor and memory system

– Determine how fast instructions are executed (CPI)

• I/O system (including OS)

– Determines how fast I/O operations are executed


Amdahl’s Law

• Improving an aspect of a computer won’t give a proportional improvement in overall performance

• Especially true of multicore computers

• So make the common case fast!

$$T_{\text{improved}} = \frac{T_{\text{affected}}}{\text{improvement factor}} + T_{\text{unaffected}}$$
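A small numeric sketch shows why the improvement is not proportional. The numbers are hypothetical (a 100 s run of which 80 s is affected by the improvement), and the function names are illustrative, not from the slides.

```python
def improved_time(t_affected, t_unaffected, improvement_factor):
    """Amdahl's Law: T_improved = T_affected / factor + T_unaffected."""
    if improvement_factor <= 0:
        raise ValueError("improvement factor must be positive")
    return t_affected / improvement_factor + t_unaffected


def overall_speedup(t_affected, t_unaffected, improvement_factor):
    """Original total time divided by the improved total time."""
    original = t_affected + t_unaffected
    return original / improved_time(t_affected, t_unaffected, improvement_factor)


if __name__ == "__main__":
    # Hypothetical workload: 80 s affected, 20 s unaffected.
    for factor in (2, 4, 16, 1000):
        print(factor, round(overall_speedup(80.0, 20.0, factor), 3))
```

In this made-up workload, speeding up the affected part 4 times yields only a 2.5 times overall speedup, and no factor can push the total below the 20 s that is unaffected, so the overall speedup stays under 5.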


Exercise 1

• Problem

– There are 3 classes of instructions: A, B, and C. Suppose the compiler has two choices, Sequence 1 and Sequence 2, as described below:

Class                A   B   C
CPI for class        1   2   3
IC in sequence 1     2   1   2
IC in sequence 2     3   1   1

• Which one is better? Why?

• Sequence 1: IC = 5

– Clock Cycles = 2×1 + 1×2 + 2×3 = 10

– Avg. CPI = 10/5 = 2.0

• Sequence 2: IC = 5

– Clock Cycles = 3×1 + 1×2 + 1×3 = 8

– Avg. CPI = 8/5 = 1.6

Both sequences execute 5 instructions, so the one with the lower average CPI needs fewer clock cycles (8 vs. 10). Sequence 2 is better.


Exercise 2

• Problem:

– There are two computers: A and B.

– Computer A: Cycle Time = 250 ps, CPI = 2.0

– Computer B: Cycle Time = 400 ps, CPI = 1.5

– If they have the same ISA, which computer is faster?

– How many times faster is it than the other?

• Answer:

– We know that CPU time = IC × CPI × cycle time

– Therefore, CPU time(A) = IC × 2.0 × 250 ps = 500 ps × IC

– CPU time(B) = IC × 1.5 × 400 ps = 600 ps × IC

– So, A is 600/500 = 1.2 times faster than B.


Exercise 3

• Problem:

– Computer A has a 2 GHz clock. It takes 10 s of CPU time to finish a given task.

– We want to design Computer B to finish the same task within 5 s of CPU time.

– Computer B needs 2 times as many clock cycles as Computer A.

– What clock rate should Computer B be designed for?

• Answer:

$$\text{Clock Rate}_B = \frac{\text{Clock Cycles}_B}{\text{CPU Time}_B} = \frac{2 \times \text{Clock Cycles}_A}{5\,\text{s}}$$

$$\text{Clock Cycles}_A = \text{CPU Time}_A \times \text{Clock Rate}_A = 10\,\text{s} \times 2\,\text{GHz} = 20 \times 10^{9}$$

$$\text{Clock Rate}_B = \frac{2 \times 20 \times 10^{9}}{5\,\text{s}} = \frac{40 \times 10^{9}}{5\,\text{s}} = 8\,\text{GHz}$$


Homework 1

• See webpage

– Chapter 1 in Tanenbaum’s textbook

• Due in class