OOO vs. EPIC
Yingmin Li
Ting Yan
Qi Zhao
Outline
“Advantages” of EPIC
Critique
Conclusion
EPIC: Main Idea
“Smart compiler, dumb machine”
Finding parallelism
– Shifted from processor to compiler
– Software/hardware synergy
Processor design
– Avoid complexity and difficulty
ILP, SMT & CMP
EPIC: Predication
In OOO: dynamic branch prediction.
Larger basic blocks.
Control dependences become data dependences.
Eliminates mispredictions and their penalties.
EPIC: Speculation
OOO: dynamic hardware
Data speculation & control speculation
Bigger window
Reduce impact of memory latencies
EPIC: Large Register Set
OOO: register renaming.
Easier to design than register renaming.
“Real” registers benefit some apps.
– Encryption algorithms, numerical algorithms.
Avoids loss of invisible registers.
– Interruptions in OOO.
EPIC: Unique Features
Register Stack Engine (RSE).
– To deal with call/return costs.
– Appears to be an unlimited stack of physical registers.
Rotating register file.
– Software pipelining.
• Multiple loops at the same time.
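A minimal sketch of how a rotating register file can support software pipelining, assuming a toy model rather than IA-64's actual register layout. The class `RotatingRegisters`, the function `pipelined_increment`, and the four-register file size are hypothetical, chosen only for illustration. Each pass of the loop runs stage 1 of iteration i (load) next to stage 2 of iteration i - 1 (add and store); rotating the base makes the value written as register 0 reappear as register 1 on the next pass, so no copy instructions or loop unrolling are needed.

```python
class RotatingRegisters:
    """Toy rotating register file (hypothetical model, not the IA-64 layout)."""

    def __init__(self, size):
        self.phys = [None] * size
        self.rrb = 0  # rotating register base

    def _index(self, logical):
        # Logical register numbers are offset by the current base.
        return (logical + self.rrb) % len(self.phys)

    def write(self, logical, value):
        self.phys[self._index(logical)] = value

    def read(self, logical):
        return self.phys[self._index(logical)]

    def rotate(self):
        # Performed by the loop branch: old register r is now visible as r + 1.
        self.rrb = (self.rrb - 1) % len(self.phys)


def pipelined_increment(data):
    """Compute [v + 1 for v in data] with two overlapped pipeline stages."""
    regs = RotatingRegisters(4)
    out = []
    n = len(data)
    for i in range(n + 1):          # one extra pass drains the pipeline
        if i >= 1:                  # stage 2: finish iteration i - 1
            out.append(regs.read(1) + 1)
        if i < n:                   # stage 1: start iteration i
            regs.write(0, data[i])
        regs.rotate()
    return out
```

With this model, `pipelined_increment([1, 2, 3])` returns `[2, 3, 4]`. The two `if` guards play the role that stage predicates would play in hardware: they switch stages on during the prologue and off during the epilogue.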
Function Call
Register saving/restoring
– Processor?
– Compiler?
Register file
– Expensive
– Always idle
Predication
Computation of the branch condition is on the critical path
Increases ICache footprint
Only half of the functional units do useful work if both “then” and “else” are scheduled
Hard to implement out-of-order execution with full predication
Predication
To compute if (a) x = t+1:
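The slide's original figure did not survive extraction. As a rough stand-in, the sketch below contrasts a branching and an if-converted form of this statement in a toy Python model. The function names are hypothetical, and the model only mimics the data flow: a compare produces a predicate, the add is always issued, and its result is committed only when the predicate is true.

```python
def branch_version(a, x, t):
    """Conventional code: a branch skips the add when a is false."""
    if a:
        x = t + 1
    return x


def predicated_version(a, x, t):
    """If-converted code: no branch, the add is always issued."""
    p = bool(a)               # compare: sets the predicate
    result = t + 1            # (p) add: executes regardless of p
    x = result if p else x    # write-back is nullified when p is false
    return x
```

Both functions return the same value for every input. In the predicated form the add occupies a functional unit even when `a` is false, and its commit has to wait for the compare that produces `p`.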
Control Speculation
Why not just use prefetching, which cannot cause unexpected exceptions?
Techniques that exploit control speculation, such as superblock formation, increase code length
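To make the exception and code-length concerns concrete, here is a minimal sketch of a speculative load with a deferred exception, assuming a toy model in which memory is a Python dict and a missing key stands for a faulting address. `NAT`, `speculative_load`, `check`, and `guarded_read` are hypothetical names loosely modeled on IA-64's NaT token and its speculative-load and check instructions.

```python
NAT = object()  # stand-in for the deferred-exception token


def speculative_load(memory, addr):
    """Toy speculative load: return NAT instead of faulting on a bad address."""
    return memory.get(addr, NAT)


def check(value, recover):
    """Toy speculation check: run recovery code only if the load was deferred."""
    return recover() if value is NAT else value


def guarded_read(memory, ptr):
    """Equivalent to: return 0 if ptr is None else memory[ptr]."""
    # The load is hoisted above the test of ptr (control speculation).
    v = speculative_load(memory, ptr)
    if ptr is None:
        return 0  # the speculative result is simply ignored
    # Recovery code repeats the load non-speculatively, so a real fault
    # is raised here, on the path where the program actually needs the value.
    return check(v, lambda: memory[ptr])
```

The recovery lambda is extra code that exists only for the misspeculated case, which is one source of the code growth noted above.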
Control prediction
Data Speculation
Moving a load above a possibly conflicting store
– An advanced load and a checking load (IA64)
– A run-time predictor
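The advanced-load and checking-load pair can be sketched as follows, assuming a toy address table that stores record into and check loads consult. The class `ALAT` and its method names are hypothetical simplifications inspired by IA-64's Advanced Load Address Table, with memory modeled as a Python dict.

```python
class ALAT:
    """Toy table of addresses touched by advanced loads (hypothetical model)."""

    def __init__(self):
        self.entries = {}  # register name -> address of its advanced load

    def advanced_load(self, memory, reg, addr):
        """Load early, above later stores, and remember the address."""
        self.entries[reg] = addr
        return memory[addr]

    def store(self, memory, addr, value):
        """A store removes every entry whose address it overwrites."""
        memory[addr] = value
        self.entries = {r: a for r, a in self.entries.items() if a != addr}

    def check_load(self, memory, reg, addr, value):
        """Keep the early value if the entry survived, otherwise reload."""
        if self.entries.get(reg) == addr:
            return value
        return memory[addr]
```

For example, with `memory = {0: 1, 4: 2}`, an advanced load of address 4 followed by a store to address 0 lets the check keep the early value 2. A store to address 4 instead removes the entry, and the check reloads the new value.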
Data speculation
Software Pipelining
For high-performance technical computing
– High trip-count loops
For commercial applications
– Low trip-count loops
EPIC: at least not a breakthrough
Design objective of EPIC:
– Moving hardware complexity into the compiler
EPIC: at least not a breakthrough
The failure of EPIC:
– The compilation techniques used for EPIC apply almost equally well to OOO
– The hardware simplicity is not obviously enough to offset EPIC’s overhead
– Without dynamic information, the compiler essentially cannot do some things well enough
The tragedy of cycle time
Why is there no obvious improvement in cycle time?
– Mechanisms like the RSE increase die complexity
– Compare and dependent branch in one cycle
– Predicated execution depends on the existence of many functional units
Dynamic path length: hey, IA64, you wasted too much here
Speculation
Half of the predicated instructions discarded
Restricted bundling
One base register
No sign-extended loads
No integer multiply or divide in the general registers
CPI
No dynamic prediction
Longer code (more GRs, predicate registers, template bits, restricted bundling, recovery code) burdens instruction fetch
Recovery code may cause ICache pollution or even a page fault