Publication | Closed Access
Workload redistribution for fault-tolerance in a hard real-time distributed computing system
Citations: 28
References: 8
Year: 2003
Unknown Venue
Cluster Computing, Availability, Engineering, Computer Architecture, Fault Tolerance, Fault-tolerant Messaging, Formal Verification, Optimal System Design, Fault-tolerance Capability, Reliability Engineering, Systems Engineering, Parallel Computing, Computer Engineering, Scheduling (Computing), Distributed Systems, Computer Science, Workload Redistribution, Real-time Computing, Scheduling Analysis, System Workload, High Availability Software, Fault-tolerant Network, Scheduling (Operating Systems), Real-time Systems, Hard Real-time, Scheduling (Project Management)
In a hard real-time distributed computing system (HRTDCS), all the tasks are required to meet their associated deadlines; a task not meeting its deadline leads to a catastrophic failure of the system. The authors consider an HRTDCS that executes both periodic and aperiodic tasks associated with timing, precedence, and resource constraints. The fault-tolerance capability in such a system is achieved through the use of time redundancy. The problem of workload redistribution for fault tolerance in an HRTDCS is studied. A graph model to represent the system workload is developed. Three performance measures for the analysis of an HRTDCS are defined. A nonpreemptive scheduling algorithm is proposed to distribute the workload of the operational nodes of the HRTDCS in the presence of both hardware and task failures. This task allocation strategy is applied to a practical system, namely, the HRTDCS onboard a spacecraft. The performance measures obtained for a typical system workload indicate that the algorithm is quite suitable for an HRTDCS with regard to uniform workload distribution.