multicycle operations.ppt

Upload: i2loveu3235

Post on 02-Jun-2018


TRANSCRIPT

  • 8/10/2019 MULTIcycle OPERATIONS.ppt


    COMP 206:

    Computer Architecture and Implementation

    Montek Singh

    Wed, Sep 28, 2005

    Topic: Pipelining -- Intermediate Concepts

    (Multicycle Operations; Exceptions)


    Outline

    Multi-cycle operations

    Floating-point operations

    Structural and data hazards

    Interrupts, Faults and Exceptions

    Precise exceptions

    Complications in pipelines

    READING: Appendix A


    Pipelining Multicycle Operations

    Assume five-stage pipeline

    Third stage (execution) has two functional units E1 and E2

    Instruction goes through either E1 or E2, but not both

    E1 and E2 are not pipelined

    Stage delay of E1 = 2 cycles

    Stage delay of E2 = 4 cycles

    No buffering on inputs of E1 and E2

    Stage delay of other stages = 1 cycle

    Consider an instruction sequence of five instructions

    Instructions 1, 3, 5 need E1

    Instructions 2, 4 need E2


    Space-Time Diagram: Multicycle Operations

    Delay  Unit  1  2  3  4  5  6  7  8  9 10 11 12 13
    1      IF    1  2  3  4  5  5  5
    1      ID       1  2  3  4  4  4  5
    2      E1          1  1  3  3        5  5
    4      E2             2  2  2  2  4  4  4  4
    1      MEM               1     3  2        5  4
    1      WB                   1     3  2        5  4

    Out-of-order completion

    3 finishes before 2, and 5 finishes before 4

    Instructions may be delayed after entering the pipeline because of structural hazards

    Instructions 2 and 4 both want to use E2 unit at same time

    Instruction 4 stalls in ID unit

    This causes instruction 5 to stall in IF unit
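    The timing in the space-time diagram can be reproduced with a small calculation. The sketch below is only an illustration under stated assumptions: the function name `schedule` and its data layout are hypothetical, an instruction is assumed to wait in ID until its unit is free and in IF until ID is free (no input buffering), and contention for MEM and WB is not modeled.

```python
def schedule(units, delay):
    """Compute the cycle in which each instruction enters each stage.

    units: functional unit needed by each instruction, in program order.
    delay: number of cycles each (non-pipelined) unit is occupied.
    Assumption: MEM and WB never cause stalls in this sketch.
    """
    unit_free = {u: 1 for u in delay}   # first cycle in which each unit is free
    prev_id = prev_ex = None            # ID/EX entry cycles of the previous instruction
    rows = []
    for unit in units:
        # IF becomes free when the previous instruction moves into ID.
        if_start = 1 if prev_id is None else prev_id
        # ID becomes free when the previous instruction moves into its unit.
        id_start = if_start + 1 if prev_ex is None else max(if_start + 1, prev_ex)
        # Wait in ID until the required unit is free.
        ex_start = max(id_start + 1, unit_free[unit])
        ex_end = ex_start + delay[unit] - 1
        unit_free[unit] = ex_end + 1
        rows.append({"unit": unit, "IF": if_start, "ID": id_start,
                     "EX": ex_start, "MEM": ex_end + 1, "WB": ex_end + 2})
        prev_id, prev_ex = id_start, ex_start
    return rows


rows = schedule(["E1", "E2", "E1", "E2", "E1"], {"E1": 2, "E2": 4})
wb = [r["WB"] for r in rows]
completion_order = sorted(range(1, len(rows) + 1), key=lambda i: wb[i - 1])
print(wb)                # write-back cycle of instructions 1..5
print(completion_order)  # instruction numbers in order of completion
```

    Under these assumptions the completion order comes out as 1, 3, 2, 5, 4, which is the out-of-order completion noted on the slide.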


    Floating-Point Operations in MIPS

    [Figure: pipeline diagram. IF and ID feed four parallel execution paths (EX; A1 A2 A3 A4; M1 M2 M3 M4 M5 M6 M7; DIV (25)), which rejoin at MEM and WB.]

    Structural hazard: not fully pipelined

    Structural hazard: instructions have varying running times

    WAW hazards possible; WAR hazards not possible

    Longer operation latency implies more frequent stalls for RAW hazards

    Out-of-order completion; has ramifications for exceptions


    Structural Hazard on WB Unit

    Instruction                1   2   3   4   5   6   7   8   9   10  11
    DIV.D (issued at t = -16)  D   D   D   D   D   D   D   D   D   MEM WB
    MUL.D F0, F4, F6           IF  ID  M1  M2  M3  M4  M5  M6  M7  MEM WB
    integer instruction            IF  ID  EX  MEM WB
    integer instruction                IF  ID  EX  MEM WB
    ADD.D F2, F4, F6                       IF  ID  A1  A2  A3  A4  MEM WB
    integer instruction                        IF  ID  EX  MEM WB
    integer instruction                            IF  ID  EX  MEM WB
    L.D F2, 0(R2)                                      IF  ID  EX  MEM WB

    This is worst-case scenario: max steady-state number of write ports is 1

    Don't replicate resources; detect and serialize access as needed

    Early resolution

    Track use of WB in ID stage (using a shift register as a reservation register), and stall instructions there

    Simplifies pipeline control; all stalls occur in ID

    Adds shift register and write-conflict logic

    Late resolution

    Stall instructions at entry to MEM or WB stage

    Complicates pipeline control (two stall locations)
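    The early-resolution scheme can be sketched as a bit mask that shifts once per cycle. This is a minimal illustration, not the actual hardware design: the class and function names are hypothetical, bit k is assumed to mean "the WB port is reserved k cycles from now", and the ID-to-WB distances (3 for integer and load instructions, 6 for FP add, 9 for FP multiply) are read off the stage sequences in the table above.

```python
class WritebackReservation:
    """Shift register tracking future use of the single WB port (sketch)."""

    def __init__(self, reserved=0):
        self.bits = reserved        # bit k set => WB is taken k cycles from now

    def tick(self):
        self.bits >>= 1             # one cycle passes; every reservation moves closer

    def try_reserve(self, distance):
        mask = 1 << distance
        if self.bits & mask:        # write-conflict logic: slot already taken
            return False
        self.bits |= mask
        return True


def issue_all(instructions, reserved=0):
    """instructions: (name, earliest cycle in ID, cycles from ID to WB)."""
    reg = WritebackReservation(reserved)
    now = 1
    wb_cycle = {}
    for name, arrival, distance in instructions:
        while now < arrival:                    # ID idle until the instruction arrives
            reg.tick()
            now += 1
        while not reg.try_reserve(distance):    # stall in ID for one cycle and retry
            reg.tick()
            now += 1
        wb_cycle[name] = now + distance
        reg.tick()                              # next instruction can use ID next cycle
        now += 1
    return wb_cycle


# DIV.D already holds the WB slot of cycle 11, i.e. 10 cycles after cycle 1.
program = [("MUL.D", 2, 9), ("int1", 3, 3), ("int2", 4, 3),
           ("ADD.D", 5, 6), ("L.D", 6, 3)]
print(issue_all(program, reserved=1 << 10))
```

    In this model every stall happens in ID, and no two instructions are ever granted the same WB cycle.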

    WAW Hazards

    Cycles: 1 2 3 4 5 6 7 8 9 10 11 12 13

    DIV.D (issued at t = -16): D D D D D D D D D MEM WB

    MULT.D F0, F4, F6: IF ID s M1 M2 M3 M4 M5 M6 M7 MEM WB

    integer instruction: IF s ID EX MEM WB

    integer instruction: IF ID EX MEM WB

    ADD.D F2, F4, F6: IF ID s A1 A2 A3 A4 MEM WB

    L.D F2, 0(R2): IF ID EX MEM WB

    WAW hazard arises only when no instruction between ADD.D and L.D uses result computed by ADD.D

    Adding an instruction like ADD.D F8,F2,F4 before L.D would stall pipeline enough for RAW hazard to avoid WAW hazard

    Can happen through a branch/trap (example in HP3, Section A.9)

    Rare situation, but must still handle correctly

    Hazard resolution

    Delay the issue of L.D until ADD.D enters MEM

    Cancel write of ADD.D

    RAW Hazards

    L: L.D F4, 0(R2)
    M: MUL.D F0, F4, F6
    A: ADD.D F2, F0, F8
    S: S.D 0(R2), F2
    D: DIV.D F12, F4, F8

           1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
    IF     L  M  A  A  S  S  S  S  S  S  S  D
    ID        L  M  M  A  A  A  A  A  A  A  S  D
    EX           L                             S  S  S  S
    Mult               M  M  M  M  M  M  M
    Add                                     A  A  A  A
    Div                                           D  D  D  D  D  D
    MEM             L                       M           A  S
    WB                 L                       M           A  S

    Longer delays of FP operations increase the number of stalls in response to RAW hazards

    Two methods for reducing stalls

    Compiler could have moved instruction D between instructions M and A, which would allow D to complete earlier; or hardware could detect this possibility and issue instruction D out of order

    ID stage is a bottleneck because instructions wait there for their operands to be available; could add buffers (reservation stations) to functional units and let instructions await their operands there


    MIPS R4000 Floating-Point Pipeline

    Stage  Functional unit  Description
    A      FP adder         Mantissa ADD stage
    D      FP divider       Divide pipeline stage
    E      FP multiplier    Exception test stage
    M      FP multiplier    First stage of multiplier
    N      FP multiplier    Second stage of multiplier
    R      FP adder         Rounding stage
    S      FP adder         Operand shift stage
    U                       Unpack FP numbers

    Add / Subtract

        1  2  3  4
    A      x  x
    D
    E
    M
    N
    R         x  x
    S      x     x
    U   x

    Multiply

        1  2  3  4  5  6  7  8
    A                     x
    D
    E      x
    M      x  x  x  x
    N                  x  x
    R                        x
    S
    U   x

    Divide

        1  2  3  4 ... 30 31 32 33 34 35 36
    A      x               x     x     x
    D            x      x  x  x  x  x
    E
    M
    N
    R         x               x     x     x
    S
    U   x


    Instruction Mixes in FP Pipeline: Adds Only

    Add / Subtract

        1  2  3  4
    A      x  x
    D
    E
    M
    N
    R         x  x
    S      x     x
    U   x

    Can't initiate another add on cycle 2: conflict

    Can't initiate another add on cycle 3: conflict

        1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
    A      x  x     y  y     x  x     y  y     x  x     y  y
    D
    E
    M
    N
    R         x  x     y  y     x  x     y  y     x  x     y  y
    S      x     x  y     y  x     x  y     y  x     x  y     y
    U   x        y        x        y        x        y

    Forbidden latencies: 1 and 2

    Steady-state utilization (cycles 4 through 18)

    = (5*7)/(8*15) = 35/120 = 29.17%

    Total utilization (cycles 1 through 19)

    = (5+5*7+2)/(8*19) = 42/152 = 27.63%
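    The forbidden latencies and the utilization figures can be checked mechanically from a reservation table. The sketch below assumes the add occupies U in cycle 1, S and A in cycle 2, A and R in cycle 3, and R and S in cycle 4; that placement of the x marks is an assumption (it is consistent with the forbidden latencies 1 and 2 stated above), and the function names are hypothetical.

```python
# Reservation table for FP add: unit -> cycles in which the unit is busy.
# The exact cycle placement is an assumption of this sketch.
ADD = {"U": {1}, "S": {2, 4}, "A": {2, 3}, "R": {3, 4}}


def forbidden_latencies(table):
    """A latency is forbidden if some unit is used at two cycles that far apart."""
    return sorted({b - a
                   for cycles in table.values()
                   for a in cycles for b in cycles if b > a})


def utilization(table, n_ops, n_units, n_cycles):
    """Fraction of (unit, cycle) cells kept busy by n_ops operations."""
    cells_per_op = sum(len(cycles) for cycles in table.values())
    return n_ops * cells_per_op / (n_units * n_cycles)


print(forbidden_latencies(ADD))
print(f"{100 * utilization(ADD, 5, 8, 15):.2f}%")   # steady state, cycles 4-18
print(f"{100 * utilization(ADD, 6, 8, 19):.2f}%")   # whole run, cycles 1-19
```

    Each add fills 7 cells of the table, which is where the factors 5*7 and 6*7 = 42 in the utilization formulas come from.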


    FP Pipeline: Multiplies Only

    Multiply

        1  2  3  4  5  6  7  8
    A                     x
    D
    E      x
    M      x  x  x  x
    N                  x  x
    R                        x
    S
    U   x

    Collision vector: 1 1 1 1 0 0 0 0

    1 indicates forbidden latency

    0 indicates allowed latency

        1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
    A                     x           y           z           x           y           z
    D
    E      x           y           z           x           y           z
    M      x  x  x  x  y  y  y  y  z  z  z  z  x  x  x  x  y  y  y  y  z  z  z  z
    N                  x  x        y  y        z  z        x  x        y  y        z  z
    R                        x           y           z           x           y           z
    S
    U   x           y           z           x           y           z

    Steady-state utilization (cycles 5-24)

    = (5*10)/(8*20) = 50/160 = 31.25%

    Total utilization (cycles 1-28)

    = (5+5*10+5)/(8*28) = 60/224 = 26.79%
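    A collision vector can be derived from the multiply reservation table in the same way. The sketch below makes two assumptions: the multiply uses U, E+M, M, M, M, N, N+A, R in cycles 1 to 8, and position k of the vector (counting from 1) corresponds to starting a second multiply in cycle k, so the first entry is always 1. The function names are hypothetical.

```python
# Reservation table for FP multiply: unit -> busy cycles (placement assumed).
MUL = {"U": {1}, "E": {2}, "M": {2, 3, 4, 5}, "N": {6, 7}, "A": {7}, "R": {8}}


def forbidden_latencies(table):
    """Latencies at which two operations would need the same unit at once."""
    return sorted({b - a
                   for cycles in table.values()
                   for a in cycles for b in cycles if b > a})


def collision_vector(table, width):
    """Entry k (0-based) is 1 when a new operation cannot start k cycles later."""
    forbidden = set(forbidden_latencies(table))
    return [1 if k == 0 or k in forbidden else 0 for k in range(width)]


def min_constant_interval(forbidden):
    """Smallest fixed initiation interval none of whose multiples is forbidden."""
    interval = 1
    while any(latency % interval == 0 for latency in forbidden):
        interval += 1
    return interval


print(forbidden_latencies(MUL))
print(collision_vector(MUL, 8))
print(min_constant_interval(forbidden_latencies(MUL)))
```

    With one multiply started every 4 cycles, 5 multiplies of 10 cells each fit in a 20-cycle steady-state window, matching the 50/160 figure above.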


    FP Pipeline: Adds and Multiplies

    Add / Subtract

        1  2  3  4
    A      x  x
    D
    E
    M
    N
    R         x  x
    S      x     x
    U   x

    Multiply

        1  2  3  4  5  6  7  8
    A                     x
    D
    E      x
    M      x  x  x  x
    N                  x  x
    R                        x
    S
    U   x

        1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
    A            a  a     m  b  b     n  a  a     m  b  b     n  a  a     m  b  b     n
    D
    E      m           n           m           n           m           n
    M      m  m  m  m  n  n  n  n  m  m  m  m  n  n  n  n  m  m  m  m  n  n  n  n
    N                  m  m        n  n        m  m        n  n        m  m        n  n
    R               a  a     m  b  b     n  a  a     m  b  b     n  a  a     m  b  b     n
    S            a     a     b     b     a     a     b     b     a     a     b     b
    U   m     a     n     b     m     a     n     b     m     a     n     b

    Note out-of-order completion

    Steady-state utilization (cycles 6-21)

    = (4*17)/(8*16) = 68/128 = 53.13%

    Total utilization

    = (12+4*17+22)/(8*28) = 102/224 = 45.54%
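    The cell counts for the mixed schedule can be checked by laying the two reservation tables over a 28-cycle grid. This sketch rests on assumptions: the reservation-table cycle placements are the same ones assumed for the add-only and multiply-only cases, and the start cycles (multiplies at 1, 5, 9, ... and adds at 3, 7, 11, ...) are inferred from the diagram rather than stated on the slide. The names are hypothetical.

```python
from collections import Counter

# Reservation tables: unit -> busy cycles relative to the start (assumed placement).
ADD = {"U": {1}, "S": {2, 4}, "A": {2, 3}, "R": {3, 4}}
MUL = {"U": {1}, "E": {2}, "M": {2, 3, 4, 5}, "N": {6, 7}, "A": {7}, "R": {8}}

# Assumed start cycles: m, a, n, b repeated three times, one start every 2 cycles.
OPS = [(start, MUL if i % 2 == 0 else ADD)
       for i, start in enumerate(range(1, 24, 2))]


def occupancy(ops):
    """Count how many operations use each (unit, cycle) cell."""
    busy = Counter()
    for start, table in ops:
        for unit, cycles in table.items():
            for c in cycles:
                busy[(unit, start + c - 1)] += 1
    return busy


busy = occupancy(OPS)
total_cells = len(busy)
steady_cells = sum(1 for (_, cycle) in busy if 6 <= cycle <= 21)
last_cycle = max(cycle for (_, cycle) in busy)

print(max(busy.values()))                       # 1 means no structural conflict
print(steady_cells, total_cells, last_cycle)
print(f"{100 * steady_cells / (8 * 16):.2f}%")  # steady state, cycles 6-21
print(f"{100 * total_cells / (8 * 28):.2f}%")   # whole run, cycles 1-28
```

    Under these assumptions the grid holds 12 busy cells in cycles 1-5, 68 in cycles 6-21, and 22 in cycles 22-28, for a total of 102.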


    Interrupts, Faults, or Exceptions

    Synchronous, coerced interrupts that occur within instructions and after which execution must resume are the hardest to implement

    See Figure A.27 in HP3

    Exception type  Sync/Async  User request/Coerced  Within/Between instr.  Resume/Terminate
    I/O request     Async       Coerced               Between instr.         Resume
    OS call         Sync        User request          Between instr.         Resume
    Breakpoint      Sync        User request          Between instr.         Resume
    Power fail      Async       Coerced               Within instr.          Terminate


    Problems on Sequential Processors

    Instruction modifies state early, then causes an interrupt

        State change must be undone

        Example: First operand of VAX instruction uses autodecrement addressing mode, which writes a register. Trying to access second operand causes a page fault. Since instruction execution cannot be completed, we must restore the register written by autodecrement to its original value

    Long-running instructions

        Not enough to be able to restore state, must make progress from interrupt to interrupt

        Example: MVC on IBM 360 copies 256 bytes

            No virtual memory, so interrupts not allowed to stop MVC

        Example: MVC on IBM 370 copies 256 bytes

            Has virtual memory, so first access all pages involved; after that, no interrupts allowed

        Example: MVCL on IBM 370 copies up to 2^24 bytes

            Has VM; two addresses and length are in registers

            Registers saved and restored on interrupts (making progress)


    Interrupts in MIPS Pipeline

    Pipeline stage  Problem exceptions
    IF              Page fault on instruction fetch; misaligned memory access; memory-protection violation
    ID              Undefined or illegal opcode
    EX              Arithmetic exception
    MEM             Page fault on data fetch; misaligned memory access; memory-protection violation
    WB              None

    How do we stop and restart execution on an interrupt to keep it precise?

    What problems do delayed branches cause?

    What happens if multiple exceptions occur in the pipeline?

    Can exceptions occur out-of-order?

    What problems do multi-cycle instructions cause?


    Complications with Delayed Branches

                    1  2  3  4  5  6  7  8  9
    1 branch        F  D  X  M  W
    2 delay slot       F  D  X  M  W
    u BTA                 F  D  X  M  W
    u+1                      F  D  X  M  W
    u+2                         F  D  X  M  W

    Suppose instruction 2 causes an exception (e.g., a page fault) after the taken branch completes (determining that the branch outcome is true)

    Instruction 2 cannot complete

    Neither can instruction u

    On restart, we do not have sequential execution

    We must remember two PC values: 2 and u


    Complications with Multicycle Operations

                          1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
    DIVF F0, F2, F4       F  D  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  M  W
    ADDF F10, F10, F8        F  D  X  X  X  X  M  W
    SUBF F12, F12, F14          F  D  X  X  X  X  M  W

    Instructions are independent (no hazards) and therefore issue immediately

    Differences in running times cause out-of-order termination

    DIVF throws arithmetic exception late in its execution

    At that point, ADDF and SUBF have both completed execution and destroyed one of their operands

    Can we maintain precise interrupts under these conditions?


    FP Pipeline Exceptions: Solns. 1 and 2

    Settle for imprecise interrupts (CRAY, with checkpointing)

    Done on Alpha 21064 and 21164, IBM Power-1 and Power-2, MIPS R8000 by supporting a fast imprecise mode and a slow precise mode

    Not an option if you have to support virtual memory or IEEE floating point standard

    Software finishes certain instructions (SPARC)

    Keep enough state around for trap handler to create a precise sequence for exception and finish work for some instruction stages

    Only FP instructions cause this problem

    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19

    F D X X X X X X X X X X X X X X X M W

    F D X X X X X X X X M W

    F D X X X X X X X X M W

    F D X X X X M W


    FP Pipeline Exceptions: Solns. 3 and 4

    Stalling (MIPS R2000/3000, MIPS R4000, Pentium)

    An instruction is allowed to issue only if it is certain that all the instructions before the issuing instruction will complete without causing an exception

    To prevent excessive stalling, FP units must decide on possibility of exceptions early in pipeline

    General methods (PowerPC 620, MIPS R10000)

    Reorder buffer, history file, future file

    An instruction is allowed to finalize its writes only when all previously issued instructions are complete

    More naturally used in connection with ILP (Chapter 4)

    Significant complexity (to be discussed later)
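    The reorder-buffer idea (finalize writes only in program order) can be illustrated with a very small model. This is a sketch, not a description of any of the processors named above: the function name and return shape are hypothetical, and the finish cycles 26, 7 and 8 are read off the DIVF/ADDF/SUBF timing diagram earlier in the deck as the last X cycle of each instruction.

```python
from collections import deque


def commit_in_order(program, finish_cycle, faulting=frozenset()):
    """Commit results strictly in program order.

    program: instruction names in program order.
    finish_cycle: name -> cycle in which execution finishes.
    faulting: names of instructions that raise an exception.
    Returns (committed, faulting_instruction, squashed).
    """
    rob = deque(program)                 # oldest instruction at the left
    committed = []
    for cycle in range(1, max(finish_cycle.values()) + 1):
        # Only the oldest instruction may finalize its write.
        while rob and finish_cycle[rob[0]] <= cycle:
            head = rob.popleft()
            if head in faulting:
                # Younger instructions have not written any state; discard them.
                return committed, head, list(rob)
            committed.append((head, cycle))
    return committed, None, []


program = ["DIVF", "ADDF", "SUBF"]
finish = {"DIVF": 26, "ADDF": 7, "SUBF": 8}

print(commit_in_order(program, finish))
print(commit_in_order(program, finish, faulting={"DIVF"}))
```

    In the faulting case ADDF and SUBF finished executing long before DIVF, but because their writes were held back, nothing has to be undone to present a precise state to the handler.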