cse 502 graduate computer architecture lec 1-2 - introduction

Download CSE 502 Graduate Computer Architecture  Lec 1-2 - Introduction

Post on 08-Jan-2016

30 views

Category:

Documents

1 download

Embed Size (px)

DESCRIPTION

CSE 502 Graduate Computer Architecture Lec 1-2 - Introduction. Larry Wittie Computer Science, StonyBrook University http://www.cs.sunysb.edu/~cse502 and ~lw Slides adapted from David Patterson, UC-Berkeley cs252-s06. Outline. Computer Science at a Crossroads - PowerPoint PPT Presentation

TRANSCRIPT

  • CSE 502 Graduate Computer Architecture

    Lec 1-2 - Introduction Larry WittieComputer Science, StonyBrook University

    http://www.cs.sunysb.edu/~cse502 and ~lw

    Slides adapted from David Patterson, UC-Berkeley cs252-s06

    2/1,3/2011*CSE502-S10, Lec 01-2 - intro

    CSE502-S10, Lec 01-2 - intro

  • 2/1,3/2011CSE502-S10, Lec 01-2 - intro*OutlineComputer Science at a CrossroadsComputer Architecture v. Instruction Set Arch.How would you like your CSE502?What Computer Architecture brings to tableQuantitative Principles of DesignTechnology Performance Trends Careful, Quantitative Comparisons

    CSE502-S10, Lec 01-2 - intro

  • 2/1,3/2011CSE502-S10, Lec 01-2 - intro*Old Conventional Wisdom: Power is free, Transistors expensiveNew Conventional Wisdom: Power wall Power expensive, Xtors free (Can put more on chip than can afford to turn on)Old CW: Can increase Instruction Level Parallelism more via compilers, innovation (Out-of-order, speculation, VLIW, )New CW: ILP wall law of diminishing returns on more HW for ILP Old CW: Multiplies are slow, Memory access is fastNew CW: Memory wall Memory slow, multiplies fast (200 clock cycles to DRAM memory, 4 clocks for multiply)Old CW: Uniprocessor performance 2X / 1.5 yrsNew CW: Power Wall + ILP Wall + Memory Wall = Brick WallUniprocessor performance now 2X / 5(?) yrs Sea change in chip design: multiple cores (2X processors per chip / ~ 2 years)Increase on-chip number of simple processors that are power efficientSimple processor cores use less power per useful calculation doneCrossroads: Conventional Wisdom in Comp. Arch

    CSE502-S10, Lec 01-2 - intro

  • 2/1,3/2011CSE502-S10, Lec 01-2 - intro*Crossroads: Uniprocessor Performance VAX : 25%/year 1978 to 1986 RISC + x86: 52%/year 1986 to 2002 RISC + x86: ??%/year 2002 to 2006From Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 4th edition, October, 2006

    CSE502-S10, Lec 01-2 - intro

    Chart3

    1

    1.5

    5

    9

    13

    18

    24

    51

    80

    117

    183

    280

    481

    648.9682539683

    992.5396825397

    1267.3968253968

    1778.9365079365

    2584.4206349206

    4195.3888888889

    5363.5317460318

    5764.3650793651

    6504.9523809524

    25%/year

    52%/year

    ??%/year

    Performance (vs. VAX-11/780)

    Sheet1

    Data points (not linked)yearPerf TargetCAGR

    YearRel. Perf.Computer1978125%

    19781VAX-11/780198656.052%VAX-11/780

    19841.5VAX-11/785200242004,059.520%VAX-11/785

    19865VAX 8700200565007,257.6VAX 8700

    19879Sun-4/260Sun-4/260

    198813MIPS M/120MIPS M/120

    198918MIPS M2000MIPS M2000

    199024IBM RS6000/540IBM RS6000/540

    199151HP 9000/750HP PA-RISC, 0.05 GHz

    199280Digital 3000 AXP/500Alpha 21064, 0.2 GHz

    1993117IBM POWERstation 100PowerPC 604, 0.1 GHz

    1994183Digital Alphastation 4/266Alpha 21064A, 0.3 GHz

    1995280Digital Alphastation 5/300Alpha 21164, 0.3 GHz

    1996481Digital Alphastation 5/500Alpha 21164, 0.5 GHz

    1997649AlphaServer 4000 5/600, 600 MHz 21164(was 1140 for 600 MHz 21264)Alpha 21164, 0.6 GHz

    1998993Digital AlphaServer 8400 6/575, 575 MHz 21264Alpha 21264, 0.6 GHz

    19991,267Professional Workstation XP1000, 667 MHz 21264AAlpha 21264A, 0.7 GHz

    20001,779Intel VC820 motherboard, 1.0 GHz Pentium III processorIntel Pentium III, 1.0 GHz

    20012,584AMD Epox 8KHA+ Motherboard, AMD Athlon (TM) XP 1900+, 1.6 GHzAMD Athlon, 1.6 GHz

    20024,195Intel D850EMVR motherboard (3.06 GHz, Pentium 4 processor with Hyper-Threading Technology)Intel Pentium 4, 3.0 GHz

    20035,364AMD ASUS SK8N Motherboard, AMD Opteron (TM) 148, 2.2 GHzAMD Opteron, 2.2 GHz

    20045,764HP ProLiant BL20p G3 (3.6GHz, Intel Xeon)Intel Xeon, 3.6 GHz

    20056,505Fujitsu Siemens Computers, PRIMERGY BX620 S2, 64-bit Intel Xeon 3.60 GHz64-bit Intel Xeon, 3.6 GHz

    Sheet1

    1

    1.5

    5

    9

    13

    18

    24

    51

    80

    117

    183

    280

    481

    648.9682539683

    992.5396825397

    1267.3968253968

    1778.9365079365

    2584.4206349206

    4195.3888888889

    5363.5317460318

    5764.3650793651

    6504.9523809524

    25%/year

    52%/year

    20%/year

    Performance (vs. VAX-11/780)

    Sheet2

    Sheet3

  • 2/1,3/2011CSE502-S10, Lec 01-2 - intro*Sea Change in Chip DesignIntel 4004 (1971): 4-bit processor, 2312 transistors, 0.4 MHz, 10 micron PMOS, 11 mm2 chip Processor is the new transistor? RISC II (1983): 32-bit, 5 stage pipeline, 40,760 transistors, 3 MHz, 3 micron NMOS, 60 mm2 chip (6 x 10 mm)Today (2006) 125 mm2 chip, 0.065 micron CMOS = 2312 RISC II+FPU+Icache+Dcache (~108 Xistors)RISC II shrinks to ~ 0.02 mm2 at 65 nmCaches via DRAM or 1 transistor SRAM (www.t-ram.com) ?Proximity Communication via capacitive coupling at > 1 TB/s ? (Ivan Sutherland @ Sun / Berkeley)

    CSE502-S10, Lec 01-2 - intro

  • 2/1,3/2011CSE502-S10, Lec 01-2 - intro*Dj vu all over again?Multiprocessors imminent in 1970s, 80s, 90s, todays processors are nearing an impasse as technologies approach the speed of light..David Mitchell, The Transputer: The Time Is Now (1989)Transputer was premature Custom multiprocessors strove to lead uniprocessors Procrastination rewarded: 2X seq. perf. / 1.5 years

    We are dedicating all of our future product development to multicore designs. This is a sea change in computingPaul Otellini, President, Intel (2004) Difference is all microprocessor companies switch to multiprocessors (AMD, Intel, IBM, Sun; all new Apples 2 CPUs) Procrastination penalized: 2X sequential perf. / 5 yrs Biggest programming challenge: going from 1 to 2 CPUs

    CSE502-S10, Lec 01-2 - intro

  • 2/1,3/2011CSE502-S10, Lec 01-2 - intro*Problems with Sea Change Algorithms, Programming Languages, Compilers, Operating Systems, Architectures, Libraries, not ready to supply Thread Level Parallelism or Data Level Parallelism for 1000 CPUs / chip, Architectures not ready for 1000 CPUs / chipUnlike Instruction Level Parallelism, cannot be solved by just by computer architects and compiler writers alone, but also cannot be solved without participation of computer architectsThis edition of CSE 502 (and 4th Edition of textbook Computer Architecture: A Quantitative Approach) explores shift from Instruction Level Parallelism to Thread Level Parallelism / Data Level Parallelism

    CSE502-S10, Lec 01-2 - intro

  • 2/1,3/2011CSE502-S10, Lec 01-2 - intro*OutlineComputer Science at a CrossroadsComputer Architecture v. Instruction Set Arch.How would you like your CSE502?What Computer Architecture brings to tableQuantitative Principles of DesignTechnology Performance TrendsCareful, Quantitative Comparisons

    CSE502-S10, Lec 01-2 - intro

  • 2/1,3/2011CSE502-S10, Lec 01-2 - intro*Instruction Set Architecture: Critical InterfaceThe computing system as seen by programmersinstruction setsoftwarehardwareProperties of a good abstractionLasts through many generations (portability)Used in many different ways (generality)Provides convenient functionality to higher levelsPermits an efficient implementation at lower levelsWhat matters today is performance of complete computer systems

    CSE502-S10, Lec 01-2 - intro

  • 2/1,3/2011CSE502-S10, Lec 01-2 - intro*Example: MIPS0r0r1r31PClohiProgrammable storage2^32 x bytes31 x 32-bit GPRs (R0=0)32 x 32-bit FP regs (paired DP)HI, LO, PCData types ?Format ?Addressing Modes?Arithmetic logical Add, AddU, Sub, SubU, And, Or, Xor, Nor, SLT, SLTU, AddI, AddIU, SLTI, SLTIU, AndI, OrI, XorI, LUISLL, SRL, SRA, SLLV, SRLV, SRAVMemory AccessLB, LBU, LH, LHU, LW, LWL,LWRSB, SH, SW, SWL, SWRControlJ, JAL, JR, JALRBEQ, BNE, BLEZ,BGTZ,BLTZ,BGEZ,BLTZAL,BGEZAL32-bit instructions on word boundary

    CSE502-S10, Lec 01-2 - intro

  • 2/1,3/2011CSE502-S10, Lec 01-2 - intro*OutlineComputer Science at a CrossroadsComputer Architecture v. Instruction Set Arch.How would you like your CSE502?What Computer Architecture brings to tableQuantitative Principles of DesignTechnology Performance TrendsCareful, Quantitative Comparisons

    CSE502-S10, Lec 01-2 - intro

  • 2/1,3/2011CSE502-S10, Lec 01-2 - intro*CSE502: AdministriviaInstructor: Prof Larry WittieOffice/Lab: 1308 CompSci, lw AT icDOTsunysbDOTeduOffice Hrs: TuTh, 3:45 - 5:15 pm, if door open, or appt.T. A.: To Be Determined (TBD)Class:TuTh, 2:20 - 3:40 pm 2120 Comp SciText:Computer Architecture: A Quantitative Approach, 4th Ed. (Oct, 2006), ISBN 0123704901 or 978-0123704900, $65/($50?) Amazon; $77used SBU, S11 Web page: http://www.cs.sunysb.edu/~cse502/First reading assignment: Chapter 1 for today, TuesdayAppendix A (at back of CAQA4 text) for next week(New edition this fall 2011 so little 4ed trade-back value)

    CSE502-S10, Lec 01-2 - intro

  • 2/1,3/2011CSE502-S10, Lec 01-2 - intro*Coping with CSE 502Undergrads from SBU have taken CSE320Grad Students with too varied background?You will have a difficult time if you have not had an undergrad course using a Hennessy & Patterson text.Grads without CSE320 equivalent may have to work hard; Review: CSE502 text Appendix A, B, C; the CSE320 home page; and maybe CSE320 text Computer Organization and Design (COD) 4/e Read chapters 1 to 8 of COD if you never took the prerequisiteIf took a class, be sure COD Chapters 2, 6, 7 are very familiarWe will spend 2 week-long lectures on review of Pipelining (App. A) and Memory Hierarchy (App. C), before an in-class quiz to check if everyone is OK.

    CSE502-S10, Lec 01-2 - intro

  • 2/1,3/2011CSE502-S10, Lec 01-2 - intro*Grading18% Homeworks (practice for the exams)Not doing HW yourself will doom you on exams (& penalties)74% Exams: {4% Quiz, 20% Midterm, 50% Final Exam}8% (Optional) Rese

Recommended

View more >