Unit 1 QAns



    DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

    SUBJECT NAME: ADVANCED COMPUTER ARCHITECTURE CODE: CS2354

    PART-A

1. Give a few essential features of RISC architecture.

RISC-based machines focused the attention of designers on two critical performance techniques: the exploitation of instruction-level parallelism (initially through pipelining and later through multiple instruction issue) and the use of caches (initially in simple forms and later using more sophisticated organizations and optimizations). RISC-based computers also raised the performance bar, forcing prior architectures to keep up or disappear. (Either point, or both, may be given.)

RISC architectures are characterized by a few key properties, which dramatically simplify their implementation:

All operations on data apply to data in registers and typically change the entire register (32 or 64 bits per register).

The only operations that affect memory are load and store operations, which move data from memory to a register or to memory from a register, respectively.

Load and store operations that load or store less than a full register (e.g., a byte, 16 bits, or 32 bits) are often available.

The instruction formats are few in number, with all instructions typically being one size.

These simple properties lead to dramatic simplifications in the implementation of pipelining, which is why these instruction sets were designed this way.

(Ref.: textbook, 4th edn., Appendix A, page A-4)

2. Power-sensitive designs will avoid fixed field decoding. Why?

In RISC architectures the register specifiers sit at fixed locations in the instruction, so decoding can be done in parallel with reading the registers. This technique is known as 'fixed field decoding'. With this method we may read a register that the instruction does not actually use. This does not help performance, but it does not hurt it either; in a power-sensitive design, however, it wastes energy on an unnecessary register read.

(Ref.: textbook, 4th edn., Appendix A, page A-6)
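As a rough illustration only (a minimal C sketch assuming a MIPS-like 32-bit encoding; the field positions and instruction word are assumptions, not taken from the answer above), the register specifiers sit at the same bit positions in every instruction, so they can be extracted and the register file read before the opcode has even been examined:

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical MIPS-like fields: rs in bits 25-21 and rt in bits 20-16,
     * at the same positions in every instruction format.                   */
    static void decode_register_specifiers(uint32_t instr,
                                           unsigned *rs, unsigned *rt)
    {
        *rs = (instr >> 21) & 0x1Fu;   /* first source register specifier  */
        *rt = (instr >> 16) & 0x1Fu;   /* second source register specifier */
    }

    int main(void)
    {
        uint32_t instr = 0x012A4020u;  /* an arbitrary 32-bit instruction word */
        unsigned rs, rt;

        /* The specifiers are pulled out, and the register file is read,
         * in parallel with the rest of decode.  If the instruction does not
         * actually use one of these registers, the extra read does not hurt
         * performance, but it wastes energy, which is why power-sensitive
         * designs avoid fixed field decoding.                               */
        decode_register_specifiers(instr, &rs, &rt);
        printf("rs = %u, rt = %u\n", rs, rt);
        return 0;
    }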

3. Give the causes of structural hazards.

Structural hazards arise from resource conflicts: the hardware cannot support all possible combinations of instructions in simultaneous overlapped execution. Typical causes are a functional unit that is not fully pipelined, and a resource that is not duplicated enough, such as a single memory port shared between instruction fetch and data access in the same cycle.


4. Give an example of the result forwarding technique to minimize data hazard stalls. Is forwarding a software technique?

    No, it is a hardware technique.

    Example:
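The original answer's figure is not reproduced here. As a hedged substitute (a minimal C sketch of the idea, not a hardware description and not the textbook's figure), consider the classic dependent pair ADD R1,R2,R3 followed by SUB R4,R1,R5: the ADD result waiting in the EX/MEM pipeline latch is fed straight to the ALU input of the SUB instead of stalling until write-back.

    #include <stdio.h>

    int regfile[32];       /* architectural register file                    */
    int ex_mem_value;      /* result sitting in the EX/MEM pipeline latch    */
    int ex_mem_dest = -1;  /* destination register of that pending result    */

    /* Read an ALU operand, taking the bypassed value when it applies. */
    static int read_operand(int reg)
    {
        if (reg == ex_mem_dest)   /* producer finished EX but not write-back */
            return ex_mem_value;  /* forward it: no stall needed             */
        return regfile[reg];      /* otherwise read the register file        */
    }

    int main(void)
    {
        regfile[2] = 10; regfile[3] = 20; regfile[5] = 5;

        /* ADD R1, R2, R3 : result reaches the EX/MEM latch this cycle */
        ex_mem_value = regfile[2] + regfile[3];
        ex_mem_dest  = 1;

        /* SUB R4, R1, R5 : needs R1 in the very next cycle; forwarding
         * supplies it from the latch instead of stalling the pipeline  */
        regfile[4] = read_operand(1) - read_operand(5);

        printf("R4 = %d\n", regfile[4]);   /* prints R4 = 25 */
        return 0;
    }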


    5. Give a sequence of code that has true dependence, anti-dependence and control dependence in it.

True dependence: instructions 1, 2 (R0)

Anti-dependence: instructions 3, 4 (R1)

Output dependence: instructions 2, 3 (F4); 4, 5 (R1)
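The instruction sequence these register numbers refer to appears as a figure in the original answer and is not reproduced here. As a hedged illustration only (a hypothetical C fragment, not the original sequence), the code below contains each kind of dependence:

    /* Each statement is labelled as an "instruction" I1..I5. */
    void dependence_example(int *a, int n)
    {
        int x, y;

        x = a[0] + 1;    /* I1: writes x                                    */
        y = x * 2;       /* I2: reads x  -> true (RAW) dependence on I1     */
        x = a[1];        /* I3: writes x -> anti (WAR) dependence on I2 and
                                output (WAW) dependence on I1               */
        if (n > 0)       /* I4: conditional branch                          */
            y = x + n;   /* I5: control dependent on I4                     */

        a[2] = y;        /* keep the results observable                     */
    }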

6. What is the flaw in the 1-bit branch prediction scheme?

Even for a branch that is almost always taken, a 1-bit predictor mispredicts twice rather than once each time the branch changes direction: once when the loop exits (the bit flips to "not taken") and again when the loop is next entered (the stale bit still says "not taken"). A loop branch taken 90% of the time is therefore predicted correctly only about 80% of the time.
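A minimal simulation sketch (an illustration under assumed numbers, not from the text): a loop branch taken 9 times out of 10, with the loop entered three times, shows the double misprediction.

    #include <stdio.h>

    int main(void)
    {
        int predict_taken = 1;   /* the single prediction bit               */
        int mispredicts   = 0;

        for (int run = 0; run < 3; run++) {          /* loop entered 3 times */
            for (int iter = 0; iter < 10; iter++) {
                int taken = (iter < 9);              /* taken 9x, exits once */
                if (predict_taken != taken)
                    mispredicts++;
                predict_taken = taken;               /* 1-bit: always retrain */
            }
        }
        /* Prints 5: one miss on the first run (at loop exit), then two per
         * later run (on re-entry and at exit), even though the branch is
         * taken 90% of the time.                                            */
        printf("mispredictions = %d\n", mispredicts);
        return 0;
    }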


7. What is the key idea behind the implementation of hardware speculation?

The key idea is to allow instructions to execute out of order but to force them to commit in order, and to prevent any irrevocable action (such as updating architectural state or taking an exception) until an instruction commits. This is implemented with a reorder buffer (ROB), which holds speculative results and passes them to later instructions until the predicted branches are resolved and the instruction commits.

8. What is trace scheduling? Which type of processors use this technique?

Trace scheduling is useful for processors with a large number of issues per clock, where conditional or predicated execution is inappropriate or unsupported, and where simple loop unrolling may not be sufficient by itself to uncover enough ILP to keep the processor busy. Trace scheduling is a way to organize the global code motion process so as to simplify code scheduling, by incurring the costs of possible code motion on the less frequent paths.

There are two steps to trace scheduling. The first step, called trace selection, tries to find a likely sequence of basic blocks whose operations will be put together into a smaller number of instructions; this sequence is called a trace. Loop unrolling is used to generate long traces, since loop branches are taken with high probability.

Once a trace is selected, the second step, called trace compaction, tries to squeeze the trace into a small number of wide instructions. Trace compaction is code scheduling; hence, it attempts to move operations as early as it can in a sequence (trace), packing the operations into as few wide instructions (or issue packets) as possible.

    Trace scheduling is used in VLIW processors to exploit ILP.
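As a rough sketch of the kind of code trace scheduling targets (a hypothetical C fragment, assuming profiling shows the 'then' side is taken nearly every iteration; it is not an example from the text):

    /* Trace selection would string the loop body and the frequent 'then'
     * side into one trace (loop unrolling lengthens the trace); trace
     * compaction would then pack those operations into wide instructions,
     * inserting compensation code on the rarely taken 'else' path for any
     * operation moved above or below the branch.                          */
    void update(float *a, const float *b, const float *c, int n)
    {
        for (int i = 0; i < n; i++) {
            if (b[i] != 0.0f)               /* assumed almost always true  */
                a[i] = a[i] + b[i] * c[i];  /* on the selected trace       */
            else
                a[i] = c[i];                /* off-trace path              */
        }
    }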

9. List some of the advanced techniques for instruction delivery and speculation.

Techniques include multiple issue (use of a multiple-issue processor), register renaming, the reorder buffer (ROB), speculation techniques, value prediction, etc.


10. Mention a few limits on Instruction-Level Parallelism.

1. Limitations on the Window Size and Maximum Issue Count
2. Realistic Branch and Jump Prediction
3. The Effects of Finite Registers
4. The Effects of Imperfect Alias Analysis

    PART-B

Explain how Scheduling and Structuring Code for Parallelism is done in VLIW / EPIC processors. (8 marks)


1. Discuss the static and dynamic branch prediction techniques with suitable examples and diagrams. (16 marks)

Section 2.3 in the book. Should explain the following: Static Branch Prediction, Dynamic Branch Prediction (2-bit prediction) and Branch-Prediction Buffers, Correlating Branch Predictors, and Tournament Predictors.

    Or

Explain Dynamic Scheduling Using Tomasulo's Approach. (16 marks)

Section 2.4 in the text book. Should explain the following: the basic structure of a MIPS floating-point unit using Tomasulo's algorithm, with an example.

2. Discuss the essential features of the Intel IA-64 Architecture and the Itanium Processor. (16 marks)

Section G.6 of Appendix G. Should discuss the following: The Intel IA-64 Instruction Set Architecture (the IA-64 Register Model, Instruction Format and Support for Explicit Parallelism, Instruction Set Basics, Predication and Speculation Support) and the Itanium 2 Processor (Functional Units and Instruction Issue).

    Or

    Write short notes on

a. Hardware versus Software Speculation (Section 3.4, pages 169-171 in the text book) (6 marks)

To speculate extensively, we must be able to disambiguate memory references. This capability is difficult to achieve at compile time for integer programs that contain pointers. In a hardware-based scheme, dynamic run-time disambiguation of memory addresses is done using the techniques we saw earlier for Tomasulo's algorithm. This disambiguation allows us to move loads past stores at run time. Support for speculative memory references can help overcome the conservatism of the compiler, but unless such approaches are used carefully, the overhead of the recovery mechanisms may swamp the advantages.


Hardware-based speculation works better when control flow is unpredictable and when hardware-based branch prediction is superior to software-based branch prediction done at compile time. These properties hold for many integer programs. For example, a good static predictor has a misprediction rate of about 16% for four major integer SPEC92 programs, while a hardware predictor has a misprediction rate of under 10%. Because speculated instructions may slow down the computation when the prediction is incorrect, this difference is significant. One result of this difference is that even statically scheduled processors normally include dynamic branch predictors.

Hardware-based speculation maintains a completely precise exception model even for speculated instructions. Recent software-based approaches have added special support to allow this as well.

Hardware-based speculation does not require compensation or bookkeeping code, which is needed by ambitious software speculation mechanisms.

Compiler-based approaches may benefit from the ability to see further in the code sequence, resulting in better code scheduling than a purely hardware-driven approach.

Hardware-based speculation with dynamic scheduling does not require different code sequences to achieve good performance for different implementations of an architecture. Although this advantage is the hardest to quantify, it may be the most important in the long run. Interestingly, this was one of the motivations for the IBM 360/91. On the other hand, more recent explicitly parallel architectures, such as IA-64, have added flexibility that reduces the hardware dependence inherent in a code sequence.

The major disadvantage of supporting speculation in hardware is the complexity and additional hardware resources required. This hardware cost must be evaluated against both the complexity of a compiler for a software-based approach and the amount and usefulness of the simplifications in a processor that relies on such a compiler.

b. ILP Support to Exploit Thread-Level Parallelism (Section 3.5, pages 172-179 in the text book) (10 marks) (out of syllabus!)