increased reliability through failure predictive scheduling with temperature sensor feedback wesley...
DESCRIPTION
Project Goals: Use feedback from sensor networks to predict which components are most reliable. Increase reliability of system as seen by tasks through failure predictive scheduling.
TRANSCRIPT
Increased Reliability Through Failure Predictive Scheduling with Temperature Sensor Feedback
Wesley Emeneker
CSE 534
Dr. Sandeep Gupta
Background
High temperatures reduce computer reliability
Thermal scheduling helps, but it does not consider the failure rates of components
Estimating failure rates allows tasks to be scheduled on the most reliable components
Project Goals
Use feedback from sensor networks to predict which components are most reliable
Increase reliability of system as seen by tasks through failure predictive scheduling
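As a rough illustration of the second goal, the sketch below places a task on the nodes with the lowest predicted failure probability. The function name `pick_most_reliable`, the node ids, and the probability values are all hypothetical; the slides do not show the scheduler's actual code, and how the probabilities are derived from sensor readings is left outside this sketch.

```python
import heapq


def pick_most_reliable(failure_probs, k):
    """Return the ids of the k nodes with the lowest predicted failure probability.

    failure_probs maps a node id to its predicted failure probability
    (assumed to come from some sensor-driven estimate; not specified here).
    The result is ordered from most reliable to least reliable.
    """
    return heapq.nsmallest(k, failure_probs, key=failure_probs.get)


# Hypothetical predictions for three nodes.
predicted = {"node0": 0.020, "node1": 0.005, "node2": 0.010}
chosen = pick_most_reliable(predicted, 2)
```

With the values above, a two-node task would be placed on `node1` and `node2`, the two nodes with the smallest predicted failure probability.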
Methodology
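MTBF half-life between 5-10 degrees C
MTBF calculation:
$$\mathrm{MTBF} = \mathrm{MTBF}_0 \cdot 2^{-(T_c - T)/\mathit{halflife}}$$
[Reconstructed from garbled text: $T_c$ is taken to be the component temperature and $T$ the reference temperature at which $\mathrm{MTBF} = \mathrm{MTBF}_0$.]
Temperature floats to max based on equation modeled after measured values:
[Equation not recoverable from the extracted text; surviving symbols: $T_c$, $T_s$, $e$, and the constants 10, 50, and 1.]
Combined failure probability for distributed tasks:
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
Failure prediction is a random variable:
$$1 - e^{-1/\mathrm{MTBF}}$$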
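The methodology can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the author's simulator: it assumes the MTBF halves for every `half_life_c` degrees above a reference temperature, that the failure probability over one time unit follows an exponential model, and that failures on different nodes are independent, so that P(A and B) = P(A)P(B). All function and parameter names are hypothetical.

```python
import math


def mtbf_at(temp_c, mtbf_ref, ref_temp_c, half_life_c=10.0):
    """MTBF at temp_c, assuming it halves every half_life_c degrees above ref_temp_c."""
    return mtbf_ref * 2.0 ** (-(temp_c - ref_temp_c) / half_life_c)


def failure_probability(mtbf, interval=1.0):
    """Probability of at least one failure within `interval` (exponential model)."""
    return 1.0 - math.exp(-interval / mtbf)


def combined_failure_probability(p_a, p_b):
    """P(A or B) = P(A) + P(B) - P(A and B), assuming A and B are independent."""
    return p_a + p_b - p_a * p_b


# Hypothetical numbers: a node rated at 1000 time units of MTBF at 40 C, now at 50 C.
hot_mtbf = mtbf_at(50.0, mtbf_ref=1000.0, ref_temp_c=40.0, half_life_c=10.0)
p_hot = failure_probability(hot_mtbf)
p_cool = failure_probability(1000.0)
p_task = combined_failure_probability(p_hot, p_cool)
```

Here a 10 degree rise with a 10 degree half-life cuts the MTBF from 1000 to 500 time units, and a task spread across the hot and the cool node fails if either node fails.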
Results
Tasks on lightly loaded systems are more reliable with failure predictive scheduling
[Chart: 3 Processors. Y-axis: Simulation Time (0 to 80). Categories: Avg Finish, Avg StdDev, Avg Max Finish. Series: Optimal, Non-optimal.]
Results
Tasks on heavily loaded systems do not benefit from predictive scheduling
[Chart: 10 Processors. Y-axis: Simulation Time (0 to 350). Categories: Avg Finish, Avg StdDev, Avg Max Finish. Series: Optimal, Non-optimal.]
Conclusions
Reliability scheduling with respect to thermal management can make a significant difference for lightly loaded systems
Heavily loaded systems do not see a benefit from reliability scheduling