Detecting Termination by Weight-Throwing in a Faulty Distributed System

Yu-Chee Tseng*

*Corresponding author for this work

Research output: Contribution to journal / Article / peer-review

28 Scopus citations

Abstract

This paper presents a fault-tolerant termination detection algorithm for a distributed system in which processes tend to fail. Allowing an arbitrary number of processes to have fail-stop behavior, the algorithm can detect termination efficiently with O(M + kn + n) control messages and O(k + 1) detection delays, where M is the number of basic messages issued, n is the number of processes, and k is the actual number of processes that fail. This algorithm has fewer detection delays than existing algorithms in the literature and comparable performance in terms of message complexity. In particular, when no fault occurs, the algorithm has constant detection delay and it uses, in the worst case, an optimal number of messages.
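The "weight-throwing" of the title refers to a general family of termination detection schemes. The sketch below illustrates only the basic failure-free idea, not the fault-tolerant algorithm of this paper, whose details the abstract does not give. It assumes a single controller that starts with weight 1, halves its weight onto every message it sends, and declares termination once all weight has returned to it. The class and method names (`WeightThrowingSim`, `start`, `send`, `deliver`, `go_idle`) are hypothetical and chosen for illustration; message delivery is simulated with a simple in-memory queue.

```python
from fractions import Fraction


class WeightThrowingSim:
    """Toy, failure-free simulation of weight-throwing termination detection.

    Assumption: this is the textbook scheme only; it does not model
    process failures or the recovery steps of the paper's algorithm.
    """

    def __init__(self, n):
        self.weight = [Fraction(0)] * n      # weight held by each process
        self.active = [False] * n            # whether each process is busy
        self.controller_weight = Fraction(1) # controller starts with all weight
        self.in_flight = []                  # queued basic messages: (dest, weight)
        self.control_messages = 0            # weight-return messages sent so far

    def start(self, pid):
        """Controller activates `pid`, attaching half of its own weight."""
        half = self.controller_weight / 2
        self.controller_weight -= half
        self.in_flight.append((pid, half))

    def send(self, src, dst):
        """Active process `src` sends a basic message carrying half its weight."""
        if not self.active[src]:
            raise ValueError("only an active process may send")
        half = self.weight[src] / 2
        self.weight[src] -= half
        self.in_flight.append((dst, half))

    def deliver(self):
        """Deliver the oldest in-flight message; the receiver becomes active."""
        dst, w = self.in_flight.pop(0)
        self.weight[dst] += w
        self.active[dst] = True

    def go_idle(self, pid):
        """Process `pid` finishes and returns its weight in one control message."""
        self.controller_weight += self.weight[pid]
        self.weight[pid] = Fraction(0)
        self.active[pid] = False
        self.control_messages += 1

    def terminated(self):
        """All weight is back at the controller, so no work or message remains."""
        return self.controller_weight == 1


def run_demo():
    sim = WeightThrowingSim(3)
    sim.start(0)
    sim.deliver()        # process 0 becomes active with weight 1/2
    sim.send(0, 1)       # message carries 1/4
    sim.send(0, 2)       # message carries 1/8
    sim.go_idle(0)       # returns 1/8; controller now holds 5/8
    early = sim.terminated()
    sim.deliver()
    sim.deliver()
    sim.go_idle(1)
    sim.go_idle(2)
    return early, sim.terminated(), sim.control_messages
```

Exact rational arithmetic (`Fraction`) is used so that the total weight in the system stays exactly 1; with floating point, repeated halving could make the final equality test unreliable. In this sketch weight returns are applied instantly, whereas a real system would send them as control messages over the network.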

Original language: English
Pages (from-to): 7-15
Number of pages: 9
Journal: Journal of Parallel and Distributed Computing
Volume: 25
Issue number: 1
State: Published - 1 Jan 1995
