![Page 1: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/1.jpg)
Computer Architecture
Processing of control transfer instructions, part II
Ola Flygt
Växjö University
http://w3.msi.vxu.se/users/ofl/
[email protected]
+46 470 70 86 49
![Page 2: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/2.jpg)
Outline
8.4.4 Extent of speculativeness
Recovery from misprediction
8.4.5 Branch penalty
8.5 Multiway branching
8.6 Guarded Execution
![Page 3: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/3.jpg)
Extent of speculative processing
![Page 4: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/4.jpg)
Extent of speculative processing
![Page 5: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/5.jpg)
Recovery from a misprediction: Basic Tasks
![Page 6: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/6.jpg)
Necessary activities to allow or to shorten recovery from a misprediction
![Page 7: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/7.jpg)
Frequently employed schemes for shortening recovery from a misprediction
![Page 8: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/8.jpg)
Shortening recovery from a misprediction: needs
![Page 9: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/9.jpg)
Using two instruction buffers in the SuperSPARC to shorten recovery from a misprediction
![Page 10: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/10.jpg)
Using three instruction buffers in the Nx586 to shorten recovery from a misprediction
![Page 11: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/11.jpg)
8.4.5 Branch penalty for taken guesses depends on
branch target accessing schemes
![Page 12: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/12.jpg)
Compute/fetch scheme for accessing branch targets {IFAR vs. PC}
![Page 13: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/13.jpg)
BTAC scheme for accessing branch targets {associative search for BA; if found, get BTA} {0-cycle branch: BA = BA-4}
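The BTAC lookup named on this slide can be sketched as a toy model. This is only an illustration under stated assumptions: the class name `BTAC`, the helper `next_fetch_address`, the capacity, the FIFO replacement and the 4-byte instruction size are all hypothetical and not taken from the slides.

```python
class BTAC:
    """Toy branch target address cache: maps a branch address (BA) to its target (BTA)."""

    def __init__(self, capacity=4):
        self.capacity = capacity  # number of entries (assumed value)
        self.entries = {}         # BA -> BTA; a dict keeps insertion order

    def lookup(self, fetch_address):
        """Associative search for BA: return the BTA on a hit, None on a miss."""
        return self.entries.get(fetch_address)

    def update(self, branch_address, target_address):
        """Record a taken branch; evict the oldest entry when full (FIFO is an assumption)."""
        if branch_address not in self.entries and len(self.entries) >= self.capacity:
            oldest = next(iter(self.entries))
            del self.entries[oldest]
        self.entries[branch_address] = target_address


def next_fetch_address(btac, fetch_address, instruction_size=4):
    """Use the BTA on a BTAC hit, otherwise continue with the sequential address."""
    target = btac.lookup(fetch_address)
    return target if target is not None else fetch_address + instruction_size
```

One possible reading of the 0-cycle variant (BA = BA-4) is that the entry is recorded under `branch_address - 4`, so that the target is already known while the instruction before the branch is being fetched. That reading is an assumption, not something the slide text spells out.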
![Page 14: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/14.jpg)
BTIC scheme: store next BTA
![Page 15: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/15.jpg)
BTIC scheme: calculate next BTA
![Page 16: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/16.jpg)
Successor index in the I-cache scheme to access the branch target path {index: next I, or target I}
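A minimal sketch of the successor-index idea, assuming each I-cache line carries one index that names the next line to fetch (either the sequential line or the line holding the branch target). The names `CacheLine`, `successor` and `fetch_trace`, and the four-line cache contents, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CacheLine:
    instructions: list  # illustrative mnemonics held in this line
    successor: int      # index of the next line to fetch: sequential or branch target


def fetch_trace(icache, start, max_lines):
    """Follow successor indices from `start` and return the line indices fetched."""
    trace = []
    index = start
    for _ in range(max_lines):
        trace.append(index)
        index = icache[index].successor
    return trace


# Line 1 ends in a branch predicted taken to line 3, so its successor is 3, not 2.
icache = [
    CacheLine(["add", "sub"], successor=1),
    CacheLine(["cmp", "beq"], successor=3),
    CacheLine(["mul", "nop"], successor=3),
    CacheLine(["ld", "st"], successor=0),
]
```

Fetching four lines starting at line 0 with `fetch_trace(icache, 0, 4)` visits lines 0, 1, 3, 0: line 2 is skipped because the successor index of line 1 already points at the predicted target.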
![Page 17: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/17.jpg)
Successor index in the I-cache scheme: e.g. the microarchitecture of the UltraSPARC
![Page 18: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/18.jpg)
Predecode unit: detects branches and their BTAs, makes predictions (based on the compiler's hint bit), and sets up the I-cache next-address field
![Page 19: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/19.jpg)
Branch target accessing trends
![Page 20: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/20.jpg)
8.5 Multiway branching: {two IFAs or PCs}
![Page 21: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/21.jpg)
Threefold multiway branching: only one correct path!
![Page 22: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/22.jpg)
8.6 Guarded Execution: a means to eliminate branches by conditional operate instructions
IF the condition associated with the instruction is met, THEN perform the specified operation, ELSE do not perform the operation
e.g. original:
    beq r1, label      // if (r1) = 0, branch to label
    move r2, r3        // move (r2) into r3
    label: …
e.g. guarded:
    cmovne r1, r2, r3  // if (r1) != 0, move (r2) into r3
    …
Convert control dependencies into data dependencies
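The equivalence of the branch version and the guarded version can be checked with a small register-level model. This is a sketch only: the function names and the dictionary used as a register file are hypothetical, and only the three instructions from the example are modelled.

```python
def run_branch_version(regs):
    """Model of: beq r1, label ; move r2, r3 ; label: ..."""
    regs = dict(regs)        # work on a copy of the register file
    if regs["r1"] == 0:      # beq r1, label: the move is skipped when (r1) == 0
        return regs
    regs["r3"] = regs["r2"]  # move r2, r3
    return regs


def run_guarded_version(regs):
    """Model of: cmovne r1, r2, r3 (no branch; the move is guarded by (r1) != 0)."""
    regs = dict(regs)
    # The new value of r3 depends on r1 as data, not on a change of control flow.
    regs["r3"] = regs["r2"] if regs["r1"] != 0 else regs["r3"]
    return regs
```

For any starting values of r1, r2 and r3 the two functions return the same register contents; the guarded form simply has no branch left to predict.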
![Page 23: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/23.jpg)
Eliminated branches by full and restricted guarding {full: all instructions guarded; restricted: ALU instructions guarded}
![Page 24: Computer Architecture Computer Architecture Processing of control transfer instructions, part II Ola Flygt Växjö University](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d005503460f949d30fc/html5/thumbnails/24.jpg)
Guarded Execution: Disadvantages
Guarding transforms instructions from both the taken and the not-taken paths into guarded instructions
This increases the number of instructions:
    by 33% for full guarding
    by 8% for restricted guarding
    {more instructions: more time and space}
Guarding requires additional hardware resources if an increase in processing time is to be avoided
    VLIW