fault avoidance and fault tolerance
Post on 13-Feb-2017
Fault Avoidance and Fault Tolerance
Jabez Winston C15MU01
1st year M.E. Embedded and Real-Time Systems
PSG College of Technology
Wafer
A wafer is a thin slice of semiconductor material, such as crystalline silicon, used in electronics for the fabrication of integrated circuits.
2-inch (51 mm), 4-inch (100 mm), 6-inch (150 mm), and 8-inch (200 mm) wafers
Die
A die, in the context of integrated circuits, is a small block of semiconducting material on which a given functional circuit is fabricated.
Die
Intel Xeon processor E7440 die containing 1.9 billion transistors.
The die measures approximately 22×23 mm (503 mm²).
Inside IC packaging
Die
VLSI circuits such as microprocessors contain billions of transistors, with feature sizes on the order of nanometers.
With so many very small devices on a single die, the chance that at least one of them is defective is high.
Even if a digital system is designed and manufactured without faults, it can still develop faults later, during operation.
Faults, Errors and Failures
• Fault: A physical defect within a circuit or a system
  – May or may not cause a system failure
• Error: Manifestation of a fault that results in incorrect circuit (system) outputs or states
  – Caused by faults
• Failure: Deviation of a circuit or system from its specified behavior
  – Fails to do what it should do
  – Caused by an error
• Fault ---> Error ---> Failure
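The fault/error distinction above can be illustrated with a small sketch. It assumes a hypothetical 2-input AND gate whose input `a` is stuck at 0 (a common way to model a physical defect); the function names are illustrative and not from the slides. The fault is always present, but it shows up as an error only for the input pattern that exposes it.

```python
def and_gate(a: int, b: int, stuck_at_0_on_a: bool = False) -> int:
    """2-input AND gate; optionally models input `a` stuck at 0 (the fault)."""
    if stuck_at_0_on_a:
        a = 0  # the defective line always reads 0, whatever is driven onto it
    return a & b


def classify(a: int, b: int) -> str:
    """Compare the fault-free gate with the faulty one for a single input pattern."""
    good = and_gate(a, b)
    faulty = and_gate(a, b, stuck_at_0_on_a=True)
    # An error exists only when the fault changes the observable output.
    return "error" if good != faulty else "masked"


# Exhaustively apply all four input patterns to the faulty gate.
results = {(a, b): classify(a, b) for a in (0, 1) for b in (0, 1)}

for pattern, outcome in results.items():
    print(pattern, outcome)
```

In this sketch only the pattern `(1, 1)` produces an error. If the surrounding system never applies that pattern, or does not depend on that output, the fault never becomes a failure.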
Some Real Defects in Chips
• Processing Faults
• Material Defects
• Time-Dependent Failures
• Packaging Failures
Fault avoidance
Fault avoidance (a process-oriented concept) seeks to prevent faults from being introduced into the system.
Faults can be avoided by carefully designing and manufacturing the system.
Fault tolerance
Fault tolerance (a product-oriented concept) accepts faults in a limited capacity and masks their manifestation.
A fault-tolerant design enables a system to continue its intended operation, possibly at a reduced level, rather than failing completely, when some part of the system fails.
Methods for avoiding faults and building fault-tolerant systems
Redundancy
It can be implemented in the following ways:
Providing multiple identical instances of the same system or subsystem and directing tasks or requests to all of them in parallel.
Providing multiple identical instances of the same system and switching to one of the remaining instances in case of a failure.
Providing multiple different implementations of the same specification.
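The three forms of redundancy listed above can be sketched in software. This is a minimal illustration, not a hardware design: each "instance" is modeled as a plain Python function, and every name here (`majority_vote`, `failover`, the `square_*` replicas) is hypothetical. The parallel case assumes a majority voter over an odd number of replicas, which is one common way to combine parallel outputs.

```python
from collections import Counter
from typing import Callable, Sequence

Replica = Callable[[int], int]


def majority_vote(replicas: Sequence[Replica], x: int) -> int:
    """Parallel redundancy: run every replica and return the majority output."""
    outputs = [replica(x) for replica in replicas]
    value, count = Counter(outputs).most_common(1)[0]
    if count <= len(outputs) // 2:
        raise RuntimeError("no majority among replica outputs")
    return value


def failover(replicas: Sequence[Replica], x: int) -> int:
    """Standby redundancy: use the first replica that responds, skip failed ones."""
    for replica in replicas:
        try:
            return replica(x)
        except Exception:
            continue  # this instance failed; switch to the next one
    raise RuntimeError("all replicas failed")


# Different implementations of the same specification: square(x) for x >= 0.
def square_mul(x: int) -> int:
    return x * x


def square_pow(x: int) -> int:
    return x ** 2


def square_sum(x: int) -> int:
    return sum(x for _ in range(x))  # repeated addition, non-negative x only


def faulty_square(x: int) -> int:
    return x * x + 1  # models a replica that produces a wrong value


def crashed_square(x: int) -> int:
    raise RuntimeError("replica down")  # models a replica that has stopped working


masked = majority_vote([square_mul, faulty_square, square_pow], 4)
switched = failover([crashed_square, square_mul], 4)
diverse = majority_vote([square_mul, square_pow, square_sum], 5)
print(masked, switched, diverse)
```

The voter masks a replica that returns a wrong value, while the failover loop only handles replicas that fail outright. In this sketch a wrong but plausible answer from the active instance would pass through `failover` undetected, which is one reason the two schemes are often combined.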
Thank you!