Design and Simulation of an EM-Fault-Tolerant Processor with Micro-Rollback, Control-Flow Checking and ECC

Franco Trovo, Shantanu Dutt & Hasan Arslan

Univ. of Illinois at Chicago


Post on 19-Jan-2018



Page 2:

Outline

Goals
Solution Adopted
• Control Flow Checking
• Hamming encoding on the buses
• Instruction Micro rollback
Motorola 68040 and VHDL description
Simulation results
Conclusion

Page 3:

Assumptions/Scenarios of Past FD/FT Work

Past Work on general fault detection:
• Random single (sometimes double) faults
• Deterministic faults
• Types of faults: permanent, transient, intermittent; intermittent type not generally tackled

Past Work on EM-induced faults:
• No how/why/what analysis and classification of computer failure due to EM interference

Page 4:

Broad Goals of Our Work

Will determine and classify the following types of computer system behavioral errors (i.e., program errors) due to different patterns, extent, duration and location of faults under EM-type faults:
• Control flow errors -- incorrect sequence of instruction execution. Causes: address gen. error, memory faults, bus faults
• Data errors. Causes: computation errors, memory & bus faults
• Termination errors (hung processor & crashes). Causes: C.U. transition to dead-end states, invalid instruction, out-of-bound address, divide-by-zero, spurious interrupts

Note: Error types are NOT mutually exclusive

Provide recipes for FT and reliable operation

Page 5:

In This Work

Will detect
• Control flow errors -- incorrect sequence of instruction execution. Causes: address gen. error, memory faults, bus faults
• Raw bus errors using ECC

Provide a FT mechanism using these detections for reliable operation


Page 7:

FD/FT Solutions

Fault Detection:
• Control flow checking (CFC) by concurrent error detection using a watchdog (WD) processor
• Hamming ECC (2-error detecting) on data & address buses

Fault Tolerance:
• Instruction micro rollback triggered by
  - Hamming ECC
  - WD-monitored CFC
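To illustrate what "2-error detecting" means for a Hamming code, here is a minimal sketch using the textbook Hamming(7,4) code in detect-only mode. The slides do not state the code length or bus width used in the design, so the 4-bit data word and the function names `hamming74_encode`, `syndrome` and `detects_all_errors` are assumptions made only for this example.

```python
from itertools import combinations

def hamming74_encode(nibble):
    """Encode 4 data bits into a 7-bit Hamming codeword (positions 1..7)."""
    d = [(nibble >> i) & 1 for i in range(4)]   # d[0] is the least significant bit
    code = [0] * 8                              # index 0 unused; parity bits at 1, 2, 4
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    return code[1:]

def syndrome(word):
    """XOR of the 1-based positions of all set bits; 0 means no error detected."""
    s = 0
    for pos, bit in enumerate(word, start=1):
        if bit:
            s ^= pos
    return s

def detects_all_errors(k):
    """True if every pattern of exactly k flipped bits yields a nonzero syndrome."""
    for nibble in range(16):
        clean = hamming74_encode(nibble)
        for flips in combinations(range(7), k):
            word = list(clean)
            for i in flips:
                word[i] ^= 1
            if syndrome(word) == 0:
                return False
    return True
```

Because the code has minimum distance 3, every single and double bit error is flagged when the syndrome is used only for detection, while some triple errors map onto another valid codeword and go unnoticed.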

Page 8:

General Structure of a System with a Watchdog

[Figure: block diagram with MAIN PROCESSOR, MAIN MEMORY, DATA BUS, ADD. BUS and WATCHDOG PROCESSOR. The watchdog processor performs various checks (CFC, address, etc.)]

Page 9:

General Structure of a WD-Monitored System with On-Chip Cache

[Figure: block diagram with CPU, Cache, MM, WD, ADD. BUS and DATA BUS]

Page 10:

Control Flow Checking [Mahmood, et al., IEEE TC’88]

Hybrid solution for detecting wrong block sequence execution
Starting from a program it extracts a Control Flow Graph
• Each node is associated to a block of branch free instructions + branch at end
• Each edge is associated w/ a possible branch between two blocks

Block A
If cond1 then
  Block B
  if cond2 then Block D
  else Block E
Else
  Block C
End if
Block F

[Figure: control flow graph with nodes A; B, C; D, E; F]
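The control flow graph for the example program can be sketched as an adjacency map, with a helper that finds the first illegal block-to-block transition in an executed trace. The edge set is read off the pseudocode above, and the names `CFG` and `first_illegal_transition` are hypothetical, introduced only for this illustration.

```python
# Legal successors of each block, derived from the example program:
# A branches to B or C; B branches to D or E; C, D and E all continue to F.
CFG = {
    "A": {"B", "C"},
    "B": {"D", "E"},
    "C": {"F"},
    "D": {"F"},
    "E": {"F"},
    "F": set(),
}

def first_illegal_transition(trace, cfg=CFG):
    """Return the first (src, dst) pair in the trace that is not an edge, or None."""
    for src, dst in zip(trace, trace[1:]):
        if dst not in cfg.get(src, set()):
            return (src, dst)
    return None
```

For example, the trace A, B, D, F is legal, while A, C, E is flagged at the transition from C to E.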

Page 11:

Control Flow Checking

Block: branch free set of instructions
Signature: information added to the block in order to distinguish a block from another

[Figure: block augmentation & sign. insertion. Labels: control flow graph nodes A, B, C, D, E, F; original block (jump/branch free set of instructions followed by a branch); augmented block with BLOCK sign, Sign of 1st bra, Sign of 2nd bra and Branch]

Page 12:

CFC Implemented State Diagram

[Figure: watchdog state diagram. States: Reset, Begin Block, Header, Middle Block, Signature 1, Signature 2, Branch, GET1S, GET2S. Decisions (Y/N): "Computed Sign. Eq. Header Sign?" and "Header Sign Eq. Bra Signatures?". Error states: Wrong Bra; Wrong Jump or Faulted Signature; Wrong Computed Signature; Signature Expected. Inset: control flow graph A, B, C, D, E, F and the augmented block layout (BLOCK sign, Sign of 1st bra, Sign of 2nd bra, Branch; No Branch signs)]
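The two decisions in the state diagram ("Computed Sign. Eq. Header Sign?" and "Header Sign Eq. Bra Signatures?") can be sketched as a simple trace checker. The slides do not say how a signature is computed, so XOR-folding the instruction words is an assumption here, and `Block`, `compute_sign` and `check_trace` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from functools import reduce

@dataclass
class Block:
    header_sign: int       # BLOCK sign stored in the block header
    instructions: list     # branch free instruction words (as integers)
    branch_signs: tuple    # header signatures of the legal successor blocks

def compute_sign(instructions):
    # Assumption: the signature is the XOR of all instruction words.
    return reduce(lambda a, b: a ^ b, instructions, 0)

def check_trace(blocks):
    """Check a sequence of executed blocks the way a watchdog might."""
    allowed = None                      # branch signatures of the previous block
    for blk in blocks:
        # Header Sign Eq. Bra Signatures? (skipped for the first block)
        if allowed is not None and blk.header_sign not in allowed:
            return "wrong branch or faulted signature"
        # Computed Sign. Eq. Header Sign?
        if compute_sign(blk.instructions) != blk.header_sign:
            return "wrong computed signature"
        allowed = blk.branch_signs
    return "ok"
```

A corrupted instruction changes the computed signature of its block, while a jump to a block that is not a legal successor fails the comparison against the previous block's branch signatures.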

Page 13:

Micro Rollback [Tamir, et al., IEEE TC‘90]

Individual State Registers (RAM based)
Register File, Caches, Main Mem (DWB based)

[Figure: two diagrams. First: Current Register and Backup Registers with valid (v) bits and a Priority circuit, to/from processor. Second: Register File with DWB: CAM with valid (v) bits, PRIORITY CIRCUIT, DECODER, Register Addresses, FIFO, Bus 1, Bus 2, Write]

Page 14:

Support for Micro Rollback for Register File - example

MOVE 0000, D0
ADD 000F, D0
MOVE 0001, A3 (f)
SUB 0002, D0
…

[Figure: Register File with DWB: CAM, PRIORITY CIRCUIT, DECODER, Register Addresses, FIFO, Bus 1, Bus 2, Write]

Page 15:

Support for Micro Rollback for Register File - example

Instructions: MOVE 0000, D0; ADD 000F, D0; MOVE 0001, A3 (f); SUB 0002, D0; …

Micro rollback: 2 levels

[Figure: DWB contents. Valid bits: 100000. Register addresses: D0 XX XX XX XX XX. Data: 0000 XXXX XXXX XXXX XXXX XXXX]

Page 16:

Support for Micro Rollback for Register File - example

Instructions: MOVE 0000, D0; ADD 000F, D0; MOVE 0001, A3 (f); SUB 0002, D0; …

Micro rollback: 2 levels

[Figure: DWB contents. Valid bits: 110000. Register addresses: D0 XX XX XX XX D0. Data: 000F XXXX XXXX XXXX XXXX 0000]

Page 17:

Support for Micro Rollback for Register File - example

Instructions: MOVE 0000, D0; ADD 000F, D0; MOVE 0001, A3 (f); SUB 0002, D0; …

Micro rollback: 2 levels

[Figure: DWB contents. Valid bits: 111000. Register addresses: A3 XX XX XX D0 D0. Data: 0101 XXXX XXXX XXXX 0000 000F]

Page 18:

Support for Micro Rollback for Register File - example

Instructions: MOVE 0000, D0; ADD 000F, D0; MOVE 0001, A3 (f); SUB 0002, D0; …

Micro rollback: 2 levels

[Figure: DWB contents. Valid bits: 0 0 | 1 1 1 1. Register addresses: XX XX | D0 D0 A3 D0. Data: XXXX XXXX | 0000 000D 0101 000F]

Page 19:

Support for Micro Rollback for Register File - example

Instructions: MOVE 0000, D0; ADD 000F, D0; MOVE 0001, A3 (f); SUB 0002, D0; …

Micro rollback: 2 levels

[Figure: DWB contents after the rollback. Valid bits: 0 0 | 1 1 0 0. Register addresses: XX XX | D0 D0 A3 D0. Data: XXXX XXXX | 0000 000D 0101 000F]

Page 20:

Support for Micro Rollback for Register File - example

Instructions: MOVE 0000, D0; ADD 000F, D0; MOVE 0001, A3 (f); SUB 0002, D0; …

[Figure: DWB contents during re-execution. Valid bits: 0 0 | 1 1 | 1 0. Register addresses: XX XX | D0 D0 | D0 A3. Data: XXXX XXXX | 0000 | 000D 0001 000F]
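The register-file example can be replayed with a small model of a DWB: writes are held in a FIFO, reads return the newest matching pending entry (the role played by the CAM and priority circuit), and a rollback of k levels discards the k newest pending writes. This is a simplified sketch that assumes one register write per instruction, reads DWB as a delayed write buffer, and takes a depth of six entries from the six valid bits shown in the figures. The class name `DelayedWriteBuffer` and its methods are hypothetical.

```python
from collections import deque

class DelayedWriteBuffer:
    def __init__(self, depth=6):
        self.depth = depth
        self.regs = {}           # committed register file contents
        self.pending = deque()   # (register, value); oldest left, newest right

    def write(self, reg, value):
        if len(self.pending) == self.depth:
            # The oldest write leaves the FIFO and can no longer be undone.
            old_reg, old_val = self.pending.popleft()
            self.regs[old_reg] = old_val
        self.pending.append((reg, value))

    def read(self, reg):
        # Newest matching pending entry wins, else the committed value.
        for r, v in reversed(self.pending):
            if r == reg:
                return v
        return self.regs.get(reg, 0)

    def rollback(self, levels):
        if levels > len(self.pending):
            raise ValueError("cannot roll back past the buffer depth")
        for _ in range(levels):
            self.pending.pop()   # invalidate the newest write
```

Replaying the slide sequence: after `MOVE 0000, D0` and `ADD 000F, D0`, a faulted `MOVE` writes 0101 to A3 and `SUB 0002, D0` writes 000D. A 2-level rollback discards both writes, so D0 reads 000F again and the two instructions can be re-executed, this time writing 0001 to A3.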

Page 21:

CFC with Micro Rollback - Priority

Two concurrent fault detection techniques can request a micro rollback from the processor
They generally request different numbers of levels of rollback
Which technique should have the priority in case of simultaneous detection by both HC and WD?
• We assign the priority to the Hamming code
  - Reason: shorter jump backs
  - Although a rationale exists for WD priority

[Figure: HC and WD both send requests to the MRB Unit, e.g. uRB=1 from HC and uRB=3 from WD]

Page 22:

CFC with Instruction Micro Rollback – State Diagram

[Figure: watchdog state diagram with micro rollback. States: Reset, Begin Block, Header, Middle Block, Signature 1, Signature 2, Branch, GET1S, GET2S. Decisions (Y/N): "Computed Sign. Eq. Header Sign?" and "Header Sign Eq. Jump Signatures?". Error states: Wrong Branch; Wrong Computed Signature; Wrong Branch or Faulted Signatures (multiple points of micro rollback, selected by t < t1, t1 <= t < t2, t >= t2). Rollback distance labels: urb_d = 1, urb_d = 2, urb_d = 3, urb_d = bsize. Inset: control flow graph A, B, C, D, E, F and the augmented block layout]

t = number of times the same error state is encountered.
• t < t1: urb to BEGIN_BLOCK (1 instr) -- read header sign. again
• t1 <= t < t2: urb to "Branch" (2 instr) -- re-exec prev. blk's branch
• t >= t2: urb to MIDDLE BLOCK (3 instr) -- re-read 2 branch signs. of prev blk

Hamming Code: urb_d = 1 (re-execute previous branch)
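The escalation rule for the "Wrong Branch or Faulted Signatures" state, combined with Hamming-code priority on simultaneous detection, can be sketched as below. The thresholds t1 and t2 are not given in the slides, so they are left as parameters, and the function names are hypothetical. Other error states in the diagram use different distances (for example urb_d = bsize) and are not modelled here.

```python
HC_ROLLBACK_DEPTH = 1   # Hamming code always asks for a 1-instruction rollback

def wd_rollback_depth(t, t1, t2):
    """Rollback distance requested by the watchdog after t repeats of the same error."""
    if t < t1:
        return 1        # back to BEGIN_BLOCK: read the header signature again
    if t < t2:
        return 2        # back to "Branch": re-execute the previous block's branch
    return 3            # back to MIDDLE BLOCK: re-read both branch signatures

def select_rollback(hc_error, wd_error, t, t1, t2):
    """Pick the rollback request to honor; the Hamming code wins on a tie."""
    if hc_error:
        return ("HC", HC_ROLLBACK_DEPTH)
    if wd_error:
        return ("WD", wd_rollback_depth(t, t1, t2))
    return None
```

With hypothetical thresholds t1 = 2 and t2 = 4, a watchdog error seen for the fifth time requests a 3-instruction rollback, but if the Hamming check fires in the same cycle the 1-instruction rollback is taken instead.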


Page 24:

Improved VHDL Model of 68040 + Watchdog connections

[Figure: block diagram with CPU, BC, Instr Cache, Data Cache and WD. Hamming Encoder, Decoder and Enc\Dec blocks sit on the buses IABUS1, IABUS2, OABUS1, OABUS2, IDBUS, ODBUS, Address Bus and Data Bus. Control signals: enable, rw, ready. Inputs labelled at the WD: Hamming code error detect. bits, control lines, data buses]


Page 26:

Simulation Environment

• The Total Fault Injection Time is the total duration of the intermittent fault on the bus or buses considered.
• The Delay Time is the time that the FG waits before starting the fault injection.
• The Period Time is the period of the intermittent fault.
• The Fault Time is the duration of the injection of a single fault.

[Figure: timing diagram of the Fault Enable signal, marking Start Fault Injection, Delay Time, First Fault Injected, Second Fault Injected, Period Time, Fault Time and Total Fault Injection Time]
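The four timing parameters can be turned into a simple predicate for the Fault Enable signal. This sketch assumes time is counted in clock cycles, that the total injection window starts when the delay ends, and that each period begins with the fault active. The slides do not spell out these details, and the function names are hypothetical.

```python
def fault_enabled(t, delay, period, fault_time, total):
    """True if the fault generator drives a fault at clock cycle t."""
    if t < delay or t >= delay + total:
        return False                       # before the delay ends or after the window
    return (t - delay) % period < fault_time

def duty_cycle(period, fault_time):
    """Fraction of each period during which the fault is active."""
    return fault_time / period
```

With `delay=3`, `period=4`, `fault_time=1` and `total=12`, the fault is active at cycles 3, 7 and 11, which corresponds to a 25% duty cycle.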

Page 27:

Fault Parameters Values

Simulations run on the model:
• Faults injected on all cache buses
• Fault types
  - Random Double, Triple, Quadruple Faults
  - Clustered: 1 cluster 2 bits, 1 cluster 4 bits, 2 clusters 2 bits
• Three values of repeat frequency
  - Low (100 clock cycles = 100KHz)
  - Medium (10 clock cycles = 1MHz)
  - High (1 clock cycle = 10MHz)
• Three values of duty cycle
  - 25%: all the simulations
  - 50%: all except high freq and 4 faults
  - 75%: all 2 faults and 3 faults middle frequencies
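The random and clustered fault types can be illustrated as bit masks applied to a bus word. This is a sketch under assumptions: a fault is modelled as a bit flip (XOR with the mask), the bus width is a free parameter, and clusters are runs of adjacent bits separated by at least one unaffected bit. None of these details are stated in the slides, and the function names are hypothetical.

```python
import random

def random_fault_mask(width, n_faults, rng):
    """Mask with n_faults distinct bit positions chosen uniformly at random."""
    mask = 0
    for bit in rng.sample(range(width), n_faults):
        mask |= 1 << bit
    return mask

def clustered_fault_mask(width, cluster_size, n_clusters, rng):
    """Mask with n_clusters separate runs of cluster_size adjacent bits."""
    while True:
        starts = sorted(rng.sample(range(width - cluster_size + 1), n_clusters))
        # Require a gap between runs so two clusters never merge into one.
        if all(b - a > cluster_size for a, b in zip(starts, starts[1:])):
            break
    mask = 0
    for start in starts:
        mask |= ((1 << cluster_size) - 1) << start
    return mask

def inject(bus_value, mask):
    """Flip the masked bits of a bus word."""
    return bus_value ^ mask
```

For example, `clustered_fault_mask(32, 2, 2, random.Random(0))` yields a mask with two separate pairs of adjacent bits, matching the "2 clusters 2 bits" case, while `random_fault_mask(32, 3, rng)` matches the random triple fault case.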

Page 28:

Simulation Results (contd.)

[Figure: bar chart "Overall correctness of execution - sorted", vertical axis 0 to 100. Series: Correct without WD, Correct with WD, Fail safe with WD, Incorrect runs with WD. Data labels in extraction order: 45, 55, 35, 30; 76, 72, 65, 64; 11, 18, 21, 18; 11, 8, 13, 16]

Page 29:

Simulation Results (contd.)

[Figure: bar chart "Average execution time (completed runs) vs kind of fault injection", vertical axis 0 to 1200000. Categories: No Faults, 2 Random Faults, 1 Cluster 2bits, 3 Random Faults, 1 Cluster 4bits, 2 Clusters 2bits, 4 Random Faults. Series: 100KHz, 1MHz, 10MHz]

NOTE:
• HC has better error coverage for cluster faults
• Block sign check (part of CFC) has better err cov for rand faults

Page 30:

Simulation Results (contd.)

[Figure: bar chart "Average execution time - low frequency [1 cluster 4 bits]", vertical axis 0 to 450000. Categories: only correct run, only not correct run, finished run, not finished run, no fault injection. Series: 25% dc, 50% dc, 75% dc]

Page 31:

Conclusions

Micro-rollback coupled with FD for the first time
Micro-rollable WD state diagram for the first time
More extensive fault patterns than previous work
Good reliability for our FD/FT solutions (correct or fail-safe execution)
• 3 faults: 94% low freq, 90% mid freq & 90% high freq
• 4 faults: 86% low freq, 80% mid freq & 80% high freq
Average execution time linear with duty cycle and almost quadratic with the fault injection frequency
• time ovhd 3 faults: 11% low, 12% med, 64% high freq
• time ovhd 4 faults: 16% low, 32% med, 182% high freq
Data buses less tolerant to faults than address buses (faults on the latter cause more CFC errors and so are detected more easily)

Page 32:

Future Work

Introduction of other fault detection techniques as triggers for micro rollback
• Lower level fault detection like the micro instruction control flow checking -- can detect internal processor faults
• Higher level fault detection like algorithm based fault tolerance (ABFT) for checking data errors -- can detect external & internal faults affecting data