Fault tolerance schemes for mission critical applications
Abstract
In the absence of atmospheric protection, electronics in space systems are exposed to radiation effects, and various techniques are used to keep such devices functioning properly in space. A prominent radiation effect, the Single Event Upset (SEU), can change the state of stored data or of any component within the system. If left uncorrected, even a single error may lead to system failure.
Further, a space system must be flexible enough to allow easy modification and upgrade. This justifies developing a fault-tolerant system on an FPGA based on the triple modular redundancy (TMR) scheme. TMR, in which the same program executes on three redundant processors and majority voters produce the output, is used for SEU mitigation. Moreover, using soft-core processors on an FPGA gives the space system wide flexibility for reconfiguration.
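As an illustration of the majority voting step described above, the following is a minimal software sketch in C, assuming each of the three redundant processors produces a 32-bit result word. The function names `tmr_vote` and `tmr_mismatch` are hypothetical and are not taken from the dissertation, whose voters may well be implemented in FPGA logic rather than software.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bitwise 2-of-3 majority vote (hypothetical helper, not from the thesis).
 * Each output bit takes the value held by at least two of the three inputs,
 * so a bit flip confined to one copy is masked. */
uint32_t tmr_vote(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & b) | (b & c) | (a & c);
}

/* Returns true when at least one copy differs from the voted word,
 * which signals that an upset was masked and recovery may be needed. */
bool tmr_mismatch(uint32_t a, uint32_t b, uint32_t c)
{
    uint32_t voted = tmr_vote(a, b, c);
    return a != voted || b != voted || c != voted;
}
```

Voting bit by bit rather than word by word means the output stays correct even if two copies are corrupted, provided the flipped bits are in different positions. The mismatch flag is what a recovery mechanism could use to decide that a faulty processor needs its state restored.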
This study discusses the various radiation effects on electronics in the space environment and the techniques used to mitigate them for fault tolerance. It proposes a new context recovery algorithm for TMR-based systems to detect and correct an SEU. Further, a hardware implementation of a simple context recovery strategy (a checkpointing and rollback scheme) is realized and tested for a MicroBlaze soft-core processor system on the Spartan-6 FPGA SP601 Evaluation Kit.
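The general idea of checkpointing and rollback can be sketched in C as follows. This is a generic illustration under stated assumptions, not the dissertation's implementation and not its proposed new algorithm: the `context_t` layout, the `NUM_REGS` value, and the functions `checkpoint` and `rollback` are all hypothetical, and the saved copy is assumed to reside in memory that is itself protected against upsets.

```c
#include <stdint.h>

#define NUM_REGS 4 /* hypothetical size; a real processor context holds more state */

/* Hypothetical processor context: a few registers and a program counter. */
typedef struct {
    uint32_t regs[NUM_REGS];
    uint32_t pc;
} context_t;

/* Last known-good context; assumed to live in upset-protected storage. */
static context_t saved_context;

/* Checkpoint: record the current context as the known-good state. */
void checkpoint(const context_t *ctx)
{
    saved_context = *ctx;
}

/* Rollback: discard the current (possibly corrupted) context and
 * restore the state recorded at the last checkpoint. */
void rollback(context_t *ctx)
{
    *ctx = saved_context;
}
```

In such a scheme, work done since the last checkpoint is lost on rollback and must be repeated, so the checkpoint interval trades recovery time against the overhead of saving state.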
Collections
- M Tech Dissertations