[ieee test symposium (ewdts) - st. petersburg, russia (2010.09.17-2010.09.20)] 2010 east-west design...


[IEEE Test Symposium (EWDTS) - St. Petersburg, Russia (2010.09.17-2010.09.20)] 2010 East-West Design & Test Symposium (EWDTS) - FREP: A soft error resilient pipelined RISC architecture

FREP: A Soft Error Resilient Pipelined RISC Architecture

Viney Kumar *
Centre for Electronic Design and Technology
Indian Institute of Science
Bangalore, India
Email: [email protected]

Rahul Raj Choudhary
Electronics and Instrumentation Dept.
Government Engineering College
Bikaner, India
Email: [email protected]

Virendra Singh
Supercomputer Education and Research Centre
Indian Institute of Science
Bangalore, India
Email: [email protected]

Abstract: Soft errors have become a major area of attention with device scaling and large-scale integration. Many variants of superscalar architectures have been proposed, with a focus on program re-execution, thread re-execution, and instruction re-execution. In this paper we propose a fault-tolerant micro-architecture for a pipelined RISC processor. The proposed architecture, Floating Resources Extended Pipeline (FREP), re-executes instructions using extended pipeline stages. Instructions are re-executed by a hybrid architecture with a suitable combination of space and time redundancy.

Index Terms: Soft error mitigation, Instruction Re-execution, Fault-tolerance, Pipelined RISC.

I. INTRODUCTION

Industrial automation and many low-power portable computing devices are results of the success of semiconductor technology, especially nanotechnology. Trends in device scaling pose new challenges for reliability and testing. Soft errors are expected to become a major reliability concern as we move towards nanometer technologies. The ever-decreasing supply voltages and nodal capacitances (required to constrain power and make circuit transitions faster) reduce the critical charge (Qcrit) required to upset a node in a digital circuit. The problem is more acute for aircraft and space electronics, where high-energy neutrons at higher altitudes and heavy ions in space are more abundant. Under these circumstances it becomes necessary to make chips fault tolerant, especially against errors that are short-lived in nature.

Many authors have proposed hardware-redundant and time-redundant architectures to detect transient errors before a final system crash. In a hardware-redundant architecture, hardware is duplicated to re-execute the work. Triple Modular Redundancy (TMR), which uses 2-out-of-3 voting logic, provides fault tolerance but has a 200 percent hardware overhead. Time-redundant micro-architectures such as ReStore [2], REESE [7], and AR-SMT [1] are fault tolerant. The AR-SMT [1] micro-architecture targets superscalar architectures that support simultaneous multithreading. ReStore [2] is also an augmentation of a modern high-performance processor. These architectures are suitable only for superscalar machines, where enough resources are available for recomputing the work. The architecture proposed by Sohi et al. [6] handles only functional unit errors. Metra et al. [3] proposed a hidden-code-based technique to deal with soft errors in the controller; however, this technique is limited to the controller circuitry. Soft errors can also be managed at the software level by re-executing the program by means of the operating system. Re-execution can take place at different levels: (1) program re-execution, (2) instruction re-execution, and (3) thread re-execution.

In this work, we propose a soft error resilient pipelined RISC architecture (FREP) for small microcontroller/embedded processor cores used in industrial applications in harsh environments and for small embedded controllers used in space applications (where a mix of arithmetic and logic operations is delivered to the core). The proposed work assumes that soft errors are short-lived. The remainder of this paper is organized as follows. Section II describes the proposed FREP concept. Section III explains the implemented architecture of the FREP pipelined RISC controller. Section IV discusses the results of RTL simulation on Xilinx ISE and ModelSim-XE. Section V concludes the paper.

* The author is currently with nVIDIA, Bangalore.

978-1-4244-9556-6/10/$26.00 ©2010 IEEE


II. FREP - PROPOSED CONCEPT

The fact that not every instruction uses all the resources of a processor can be exploited to achieve time redundancy during execution. Data transfer instructions and the NOP instruction do not use the arithmetic and logic unit (ALU) of the processor. Even instructions that use the ALU do not use all of its hardware resources. This allows the ALU to be divided into smaller functional units, such as an arithmetic unit, a logical unit, and a multiplier unit, which enables re-execution of instructions by using these functional units simultaneously. In a typical execution sequence where two consecutive instructions do not use the same functional unit, time redundancy is easily achieved by executing the instruction twice. The result of an executed instruction is held until it is compared with the result of the same instruction in its second execution. If consecutive instructions demand the same functional unit, the controller detects that no resource is available for the second execution of the instruction and stalls the pipeline. This ensures that instructions are re-executed before their results are committed to the architectural state. Adding extra pipeline stages between the execution and commitment stages gives an instruction more clock cycles for re-execution without stalling the pipeline.

In this paper, we propose a floating resource and extended pipeline (FREP) architecture. In the FREP architecture, additional pipeline stages are inserted and the execution units are made available to all extended pipeline stages. The execution units thus become floating, which means they can be allocated to any pipeline stage based on job requirements. As shown in Figure 1, pipeline stages are inserted in the FREP architecture on the assumption that the instructions delivered to the core demand all functional units, on average, within a given time frame. To execute all instructions in the extended pipeline stages, resources keep floating so that minimum hardware is required for the execution and re-execution of all instructions. Tasks keep moving through the pipeline stages, and the corresponding resources are allocated to them whenever the scheduler confirms that resources are available. The scheduler ensures that all computation in a task is complete before the task leaves the pipeline stages. This concept may also suit other domains, such as hardware accelerators, where stalls are tolerable in the worst case.
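The stall condition described in this section can be illustrated with a minimal Python sketch. It models only the case without extended pipeline stages, in which an instruction must be re-executed in the cycle immediately after its first execution. The function name, the unit labels, and this one-cycle timing model are assumptions made for illustration and are not taken from the FREP implementation.

```python
def count_stalls_without_extension(units):
    """Count stalls when re-execution must directly follow execution.

    units[i] is the functional unit used by instruction i (for example
    "arith" or "logic"), or None for data transfer and NOP instructions
    that need no functional unit. (Hypothetical encoding.)
    """
    stalls = 0
    for current, following in zip(units, units[1:]):
        # The current instruction re-executes on its unit while the
        # following instruction executes. If both need the same unit,
        # the controller stalls the pipeline for one cycle.
        if current is not None and current == following:
            stalls += 1
    return stalls


program = ["arith", "logic", "arith", None, "arith", "arith"]
print(count_stalls_without_extension(program))
```

In this toy model only the final pair of consecutive arithmetic instructions conflicts, which is the situation that the extended pipeline stages are meant to relieve.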

III. ARCHITECTURAL IMPLEMENTATION OF THE PROPOSED CONCEPT

The FREP concept is demonstrated with an implementation of a 10-stage pipelined MIPS architecture, shown in Figure 3, in which four redundant pipeline stages (pipeline stages 5-8) for re-execution and one stage (pipeline stage 2) for reducing the critical path are introduced. The arithmetic logic unit has been split into two logical parts: (1) an adder-subtractor unit and (2) a logical operation unit. Instructions are fetched in the first pipeline stage and decoded by two redundant decoders in the second pipeline stage. Because the pipelined MIPS architecture issues one instruction per clock, a space-redundant technique is used to capture soft errors in the decoding of an instruction. The decoded signals are compared in the third pipeline stage. The data forwarding and re-execution FSMs are also kept in the third pipeline stage. After the execution pipeline stage, four extended pipeline stages are inserted to provide sufficient clock cycles to re-execute all instructions before they are committed to the architectural state. During these clock cycles the same functional unit is invoked to re-execute the instructions. In this implementation, the execution pipeline stage is fixed in order to keep the design of the hardware scheduler simple. Each instruction has to finish execution before entering the re-execution pipeline stages.

Fig. 1. FREP-Floating Resource Extended Pipeline Concept

The hardware scheduler (re-execution FSM) allocates unused functional units to the extended re-execution pipeline stages. Figure 2 explains the algorithm used to schedule the functional units for re-execution of instructions. When hardware resources are available for two instructions in the re-execution pipeline stages that have not yet been re-executed, the instruction nearer to the architectural state is issued for re-execution. In the worst case, if an instruction reaches the last extended re-execution pipeline stage without having been re-executed, the hardware scheduler stalls the normal execution flow and dedicates the required functional unit to the re-execution of that instruction. Once instructions are re-executed, they flow through the extended pipeline stages in the normal fashion. Memory write and write back are the last two pipeline stages.

Fig. 2. FREP-Re-execution FSM

In other words, the functional units have become floating resources, which can be moved to any pipeline stage. To avoid data hazards, a forwarding unit is provided. Two re-execution pipes carry the instruction, operands, and control word for re-execution, which enables maximum utilization of the functional units. These re-execution pipes serve both purposes: data forwarding and carrying instructions for re-execution.

After re-execution, the results of the executed and re-executed instructions are compared. If a soft error is detected, the instruction is not committed to the architectural state and the pipeline is flushed.
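The scheduling policy described above can be sketched as a small cycle-level model in Python. This is a simplified illustration under stated assumptions, not the re-execution FSM of Figure 2: each functional unit performs one operation per cycle, the extended stages advance every cycle, and a stall inserts a bubble behind them. The function name `simulate_frep` and the unit labels are hypothetical.

```python
def simulate_frep(units, extended_stages=4):
    """Return the number of stall cycles for an instruction stream.

    units[i] is the functional unit needed by instruction i ("add",
    "logic"), or None if the instruction needs no functional unit.
    """
    next_idx = 0
    # stages[0] follows the execute stage; stages[-1] is nearest to commit.
    stages = [None] * extended_stages
    stalls = 0
    while next_idx < len(units) or any(s is not None for s in stages):
        has_incoming = next_idx < len(units)
        incoming = units[next_idx] if has_incoming else None
        last = stages[-1]
        # Worst case: the instruction about to commit has not been
        # re-executed and the executing instruction wants the same unit.
        stall = (has_incoming and last is not None
                 and not last["done"] and last["unit"] == incoming)
        busy = set()
        if has_incoming and not stall and incoming is not None:
            busy.add(incoming)  # unit taken by normal execution
        # Give free units to pending instructions, nearest to commit first.
        for entry in reversed(stages):
            if entry is not None and not entry["done"] \
                    and entry["unit"] not in busy:
                entry["done"] = True
                busy.add(entry["unit"])
        if stall:
            stalls += 1
            entered = None  # bubble: normal execution is held this cycle
        elif has_incoming:
            # Instructions without a unit need no re-execution.
            entered = {"unit": incoming, "done": incoming is None}
            next_idx += 1
        else:
            entered = None
        stages = [entered] + stages[:-1]  # stages[-1] commits
    return stalls


print(simulate_frep(["add", "logic"] * 4))  # alternating units
print(simulate_frep(["add"] * 8))           # addition only
```

In this model an alternating stream of add and logic instructions incurs no stalls, because each instruction is re-executed on its unit while the other unit is executing. A stream of eight additions incurs four stall cycles, since the adder is never free until an instruction reaches the last extended stage.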

In the proposed architecture, instructions are executed in order, whereas re-execution of instructions is done out of order. Both time and space redundancy are used to cover all blocks of the MIPS core for soft error detection. It is assumed that memories and buses are covered by error detection and correction (ECC) codes. The data forwarding FSM is protected against soft errors by virtue of its design: any soft error in the FSM delivers wrong data to the functional units, which probabilistically produces a mismatch in the results.
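The compare-before-commit step can be illustrated with the following Python sketch. It models a soft error as a single bit flip in one of the two results, in line with the assumption that soft errors are short-lived. The function names, the 32-bit result width, and the set of operations are assumptions made for illustration only.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit datapath


def execute(op, a, b, flip_bit=None):
    """Compute one result, optionally corrupted by a transient bit flip."""
    results = {
        "add": a + b,
        "sub": a - b,
        "and": a & b,
        "or": a | b,
    }
    result = results[op] & MASK32
    if flip_bit is not None:
        result ^= 1 << flip_bit  # injected soft error
    return result


def commit_or_flush(op, a, b, fault_in_first=None, fault_in_second=None):
    """Execute twice and commit only if both results agree."""
    first = execute(op, a, b, fault_in_first)
    second = execute(op, a, b, fault_in_second)
    if first == second:
        return ("commit", first)
    # Mismatch: the result is not committed and the pipeline is flushed.
    return ("flush", None)


print(commit_or_flush("add", 2, 3))
print(commit_or_flush("add", 2, 3, fault_in_first=4))
```

A fault that corrupts both executions in the same way would pass this check, which is why the scheme relies on errors being transient.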

IV. EXPERIMENTAL RESULTS

The FREP architecture has been validated for soft error detection using RTL simulation in ModelSim. Soft errors were injected in the functional units (FUs), the decoders, and the data forwarding FSM. A hardware overhead of 15.2% was observed with two functional units and 10 pipeline stages. The performance of the proposed architecture was analysed by running "for" loops populated with addition, right shift, left shift, and data transfer instructions. A few preliminary observations are presented here:

(1) In all loops with a small number of instructions, no pipeline stalls are needed, as all instructions get enough time for re-execution during branch prediction stalls.

(2) In computational algorithms where addition instructions dominate, about 50% stalls are encountered.

(3) Various analysis results predict that the compiler inserts 10-15% "NOP" instructions [9]. With a mix of arithmetic and logical operations and 20% "NOP" instructions, the time overhead of re-executing all instructions for soft error detection is very low. We carried out simulations using small embedded system programs, and the time overhead of re-executing all instructions for soft error detection was less than 10%.

(4) Ten pipeline stages force the compiler to insert 5 "NOP" instructions after a "load" instruction, which causes a performance penalty in terms of time. One possible solution is to execute "load" instructions out of order to reduce the time penalty.

V. CONCLUSION

Soft errors are becoming increasingly important for designs implemented in nanometer technologies. Various techniques have been proposed in the literature for superscalar architectures. This paper presented a technique suited to simple pipelined architectures, which are the architectures most commonly used in embedded systems. This paper has shown that the technique can guarantee correct output at the cost of a low performance penalty. Our future work consists of evaluating the architecture with benchmark programs.

Fig. 3. FREP-Floating Resource Extended Pipeline MIPS architecture

REFERENCES

[1] E. Rotenberg, "AR-SMT: A Microarchitectural Approach to Fault Tolerance in Microprocessors," Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing, 1999, pp. 84-91.

[2] N. J. Wang and S. J. Patel, "ReStore: Symptom Based Soft Error Detection in Microprocessors," IEEE Transactions on Dependable and Secure Computing, vol. 3, no. 3, pp. 188-201, July-Sept. 2006.

[3] C. Metra, D. Rossi, M. Omana, A. Jas, and R. Galivanche, "Function-Inherent Code Checking: A New Low Cost On-Line Testing Approach for High Performance Microprocessor Control Logic," 13th European Test Symposium, May 2008.

[4] R. Naseer, R. Z. Bhatti, and J. Draper, "Analysis of Soft Error Mitigation Techniques for Register Files in IBM Cu-08 90nm Technology," Circuits and Systems, 2006. MWSCAS '06.

[5] M. K. Qureshi, O. Mutlu, and Y. N. Patt, "Microarchitecture-Based Introspection: A Technique for Transient-Fault Tolerance in Microprocessors," International Conference on Dependable Systems and Networks (DSN 2005), 28 June-1 July 2005, pp. 434-443.

[6] G. Sohi, M. Franklin, and K. K. Saluja, "A Study of Time-Redundant Fault Tolerance Techniques for High-Performance Pipelined Computers," Nineteenth International Symposium on Fault-Tolerant Computing (FTCS-19), June 1989, pp. 436-443.

[7] J. B. Nickel and A. K. Somani, "REESE: A Method of Soft Error Detection in Microprocessors," Proc. of Int. Conf. on Dependable Systems and Networks, 2001.

[8] S. Kim and A. K. Somani, "SSD: An Affordable Fault Tolerant Architecture for Superscalar Processors," in Proc. of Int. Symp. on Dependable Computing, pp. 27-34, 2001.

[9] S. Shamshiri, H. Esmaeilzadeh, and Z. Navabi, "Instruction-Level Test Methodology for CPU Core Self-Testing," ACM Trans. Des. Autom. Electron. Syst., vol. 10, no. 4, pp. 673-689, Oct. 2005.