3d-dresd ft
POLITECNICO DI MILANO
Vincenzo Rana
Fault tolerance in FPGA-based systems
Outline
• Techniques:
  • Triple module redundancy
    • Throughput logic
    • State-machine logic
    • I/O logic
    • BRAM
  • Error detection and error correction
  • Partial reconfiguration
• Real approaches
  • SEU mitigation through dynamic partial reconfiguration
  • Run-time fault reconfiguration
• Conclusions
Triple module redundancy
Triple module redundancy (voter)
The voter can be implemented:
• with Look-Up Tables (LUTs)
• with 3-state buffers (BUFTs)
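As a minimal sketch of what the voter computes, the following Python snippet models a bitwise 2-out-of-3 majority function. It is a software model only, not a LUT or BUFT netlist, and the name `majority_vote` is an illustrative assumption rather than anything from the slides.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority of three redundant words."""
    return (a & b) | (b & c) | (a & c)


# One replica suffers a single bit flip; the other two outvote it.
golden = 0b1011
faulty = golden ^ 0b0100
assert majority_vote(golden, faulty, golden) == golden
```

Because each output bit depends only on the three corresponding input bits, the function maps naturally onto one small look-up table per bit.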
Throughput logic
The system will include 3 copies of:
• the module itself
• the input signals
• the output signals
No voter is needed
No single-point-of-failure
State-machine logic
• State machines strictly depend on their state
  • The voter has to be implemented internally
• A voter has to be inserted in the system for:
  • each state register
  • each feedback path
• This approach keeps each state machine in the correct state at all times
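The effect of voting on the feedback path can be sketched in Python as follows. The 2-bit counter used as the state machine, and the names `next_state` and `tmr_step`, are hypothetical choices made for illustration; the slides do not specify a particular state machine.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority of three redundant words."""
    return (a & b) | (b & c) | (a & c)


def next_state(state: int) -> int:
    """Hypothetical state machine: a 2-bit counter that wraps around."""
    return (state + 1) % 4


def tmr_step(regs: list) -> list:
    """Clock the three state registers once, voting on each feedback path."""
    # One vote per replica: every copy computes its next state from the voted state.
    voted = [majority_vote(*regs) for _ in range(3)]
    return [next_state(v) for v in voted]


regs = [1, 1, 1]
regs[2] ^= 0b10        # an upset flips one bit in a single state register
regs = tmr_step(regs)  # the voted feedback resynchronizes all three copies
assert regs == [2, 2, 2]
```

Without the vote on the feedback path, the corrupted copy would keep computing from its own wrong state and never rejoin the other two.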
I/O logic (Input)
• Input pins have to be replicated in order to avoid single points of failure
• If the number of required input pins exceeds the number of input pins available on the reconfigurable device:
  • Just a subset of input pins can be replicated
  • The system can be split across more than one FPGA
I/O logic (Output)
In order to avoid a single-point-of-failure on output pins it is necessary to implement the following circuit
BRAM
BRAMs are large blocks of static memory (4K bits each) that are true dual-port and fully synchronous
Techniques:
• Simple redundancy
  • Replication of BRAMs
• Redundancy and refresh
  • Replication of BRAMs
  • Refresh with voter
• Data encoding
  • Error Correction Control (ECC)
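The "redundancy and refresh" technique can be sketched as a pass that reads every address of the three replicas, votes, and writes the voted word back. This is a software model under stated assumptions: the memories are plain Python lists and the name `refresh` is hypothetical.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority of three redundant words."""
    return (a & b) | (b & c) | (a & c)


def refresh(copies: list) -> None:
    """Rewrite every word of the three replicas with its voted value."""
    for addr in range(len(copies[0])):
        voted = majority_vote(copies[0][addr], copies[1][addr], copies[2][addr])
        for mem in copies:
            mem[addr] = voted


copies = [[0xA5, 0x3C], [0xA5, 0x3C], [0xA5, 0x3C]]
copies[1][0] ^= 0x01   # upset in one replica
refresh(copies)
assert copies[1] == [0xA5, 0x3C]
```

The refresh matters because, with replication alone, a later upset in a second replica at the same bit position would defeat the vote.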
Error detection and error correction
• It is more performance- and cost-effective to correct an error than to retransmit the data
• Parity bits are added to the data bits (64+8 or 32+7)
• No memory replication is needed
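The 64+8 and 32+7 figures are consistent with a Hamming code extended by one overall parity bit (single error correction, double error detection). The slides do not name the code, so treat that as an assumption; the sketch below only counts the check bits such a code would need, and `secded_check_bits` is a hypothetical name.

```python
def secded_check_bits(data_bits: int) -> int:
    """Check bits for a Hamming code extended with one overall parity bit."""
    r = 0
    # Smallest r such that r check bits can point at any single-bit error
    # among data_bits + r positions, or signal "no error".
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r + 1  # the extra parity bit adds double-error detection


assert secded_check_bits(64) == 8
assert secded_check_bits(32) == 7
```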
Partial reconfiguration
Access to the configuration memory:
• Readback
  • Post-configuration read operation
• Partial reconfiguration
  • Post-configuration write operation
Techniques:
• SEU scrubbing
  • Partial reconfiguration
• SEU detection
  • Readback
    • Bit-for-bit comparison
    • CRC comparison
• SEU correction
  • Readback
  • Partial reconfiguration
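The detection and correction flow above can be sketched as follows, assuming the configuration memory is modelled as a list of byte-string frames and a golden copy is available. The name `scrub` and the frame model are hypothetical; no real device API is used, and `zlib.crc32` simply stands in for whatever CRC the device provides.

```python
import zlib


def scrub(config_memory: list, golden_frames: list) -> list:
    """Detect upset frames by CRC comparison and rewrite them from golden data.

    Returns the indices of the frames that were corrected.
    """
    golden_crcs = [zlib.crc32(frame) for frame in golden_frames]
    corrected = []
    for i, frame in enumerate(config_memory):      # readback
        if zlib.crc32(frame) != golden_crcs[i]:    # SEU detection
            config_memory[i] = golden_frames[i]    # partial reconfiguration
            corrected.append(i)
    return corrected


golden = [bytes([i] * 8) for i in range(4)]
memory = list(golden)
upset = bytearray(memory[2])
upset[5] ^= 0x10                                   # single-bit upset in frame 2
memory[2] = bytes(upset)
assert scrub(memory, golden) == [2]
assert memory == golden
```

A bit-for-bit comparison would replace the CRC check with `frame != golden_frames[i]`, at the cost of keeping the full golden data on the comparison path.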
Dynamic partial reconfiguration
• Dynamic partial reconfiguration can be used to reconfigure only the affected portion of the architecture:
  • while the rest of the system is still working
  • without the need to perform a complete reconfiguration
• It is most useful when only the smallest portion of the FPGA containing the fault is reconfigured (a good partitioning phase is needed)
• Solution space exploration has to be performed
Dynamic partial reconfiguration (DWC)
• Fault detection and characterization
  • Identification of a mismatch
• Fault localization
  • Identification of the portion of the device where the fault is located
• Several solutions are possible when applying duplication with comparison (DWC)
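The mismatch identification step can be sketched in Python as two copies of a module fed with the same inputs and a comparator on their outputs. The modules `adder` and `faulty_adder` and the function `dwc` are hypothetical stand-ins for illustration.

```python
def dwc(module_a, module_b, inputs):
    """Run two copies of a module and flag any output mismatch."""
    out_a, out_b = module_a(inputs), module_b(inputs)
    return out_a, out_a != out_b


def adder(xs):
    """Fault-free copy: 8-bit sum of the inputs."""
    return sum(xs) & 0xFF


def faulty_adder(xs):
    """Copy with a flipped least significant output bit."""
    return (sum(xs) & 0xFF) ^ 0x01


_, mismatch = dwc(adder, adder, [1, 2, 3])
assert not mismatch
_, mismatch = dwc(adder, faulty_adder, [1, 2, 3])
assert mismatch
```

A mismatch alone does not say which copy is wrong, which is why a separate fault localization step follows detection.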
Dynamic partial reconfiguration (ro-index)
ro-index: the ratio between the occupied area and its minimal placement constraint, both computed in slices
Occupied area in Slices: So
Placement constraint in Slices: Sc
ro-index = So / Sc
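As a small worked example of the formula, with slice counts that are purely hypothetical:

```python
def ro_index(occupied_slices: int, constraint_slices: int) -> float:
    """Ratio between occupied area (So) and placement constraint (Sc), in slices."""
    if constraint_slices <= 0:
        raise ValueError("placement constraint must contain at least one slice")
    return occupied_slices / constraint_slices


# A module occupying 150 slices inside a 200-slice placement constraint.
assert ro_index(150, 200) == 0.75
```

A value close to 1 means the module fills almost all of its placement constraint.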
Run-time fault reconfiguration
• Recovery from permanent logic and interconnect faults
• Fine-grained physical design partitioning
• Faults are localized to small partitioned blocks that have fixed interfaces to the surrounding portion of the device
• Affected blocks are reconfigured with previously generated, functionally equivalent block instances that do not use the faulty resources
Run-time fault reconfiguration
• Assumptions
  • Detection of a fault
  • Localization of a fault
  • Diagnosis of a fault (just helpful, not necessary)
• Action
  • An alternate configuration of the design that does not utilize the faulty resources can be loaded
• Advantages
  • extremely low area overhead
  • very low timing overhead
  • run-time management of faults
  • high flexibility
• Disadvantages
  • very complex design phase (and run-time management)
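The action step can be sketched as a lookup over previously generated block instances, each annotated with the resources it uses. The resource names, the instance names and the function `select_configuration` are all hypothetical; a real flow would work on placed-and-routed partial bitstreams.

```python
def select_configuration(alternatives: dict, faulty: set):
    """Return the name of the first block instance that avoids all faulty resources."""
    for name, used_resources in alternatives.items():
        if used_resources.isdisjoint(faulty):
            return name
    return None  # no pre-generated instance avoids the fault


# Functionally equivalent instances of one block, with the slices each one uses.
alternatives = {
    "block_v0": {"slice_0", "slice_1"},
    "block_v1": {"slice_1", "slice_2"},
    "block_v2": {"slice_2", "slice_3"},
}
assert select_configuration(alternatives, {"slice_0"}) == "block_v1"
assert select_configuration(alternatives, {"slice_1"}) == "block_v2"
assert select_configuration(alternatives, {"slice_1", "slice_2"}) is None
```

The last case shows the design-phase cost listed among the disadvantages: enough alternative instances have to be generated in advance to cover the faults that should be survivable.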
Conclusions
• Reliable systems can be effectively implemented on FPGA devices
• The techniques presented can be combined in order to improve the overall reliability of the whole design
• TMR combined with SEU correction through partial reconfiguration is a powerful and effective SEU mitigation strategy
• 3-state buffers can be used to implement fault tolerance methodologies without wasting LUTs (keeping the area overhead low)
The end
• Thank you for your attention
• Do you have any questions?