Unreliable failure detectors for reliable distributed systems

TLDR

A companion paper shows that one of the introduced failure detectors is the weakest known for solving Consensus. The study introduces unreliable failure detectors and investigates their use for solving Consensus in asynchronous crash‑prone systems. Unreliable failure detectors are characterised by completeness and accuracy properties. Consensus can be solved even with detectors that make infinitely many mistakes, and detectors are classified by whether they work with any number of crashes or require a majority; moreover, Consensus and Atomic Broadcast are reducible to each other in asynchronous crash‑prone systems. Citation: Chandra et al., 1992.

Abstract

We introduce the concept of unreliable failure detectors and study how they can be used to solve Consensus in asynchronous systems with crash failures. We characterise unreliable failure detectors in terms of two properties—completeness and accuracy. We show that Consensus can be solved even with unreliable failure detectors that make an infinite number of mistakes, and determine which ones can be used to solve Consensus despite any number of crashes, and which ones require a majority of correct processes. We prove that Consensus and Atomic Broadcast are reducible to each other in asynchronous systems with crash failures; thus, the above results also apply to Atomic Broadcast. A companion paper shows that one of the failure detectors introduced here is the weakest failure detector for solving Consensus [Chandra et al. 1992].

References

Page 1

	Year	Citations

Page 1