Impacts of Task Re-Execution Policy on MapReduce Jobs

Jia-Chun Lin, Fang-Yie Leu, Ying-ping Chen*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review



MapReduce is a popular distributed programming framework for large-scale data processing. To prevent MapReduce jobs from being interrupted by node failures, which occur frequently in a MapReduce cluster consisting of a set of commodity machines/nodes, the most well-known MapReduce implementation, i.e., Hadoop, adopts a task re-execution policy (TR policy). When a map/reduce task of a job crashes, the TR policy assigns another node to re-execute the task. However, the impact of the TR policy on MapReduce jobs in terms of reliability, job turnaround time (JTT), and energy consumption is not clear, particularly when jobs have different features, e.g., different filtering percentages, different input-data sizes, and different numbers of reduce tasks. In this paper, we formally analyze the job completion reliability (JCR) of a job based on Poisson distributions, and then derive the expected JTT and job energy consumption (JEC) based on the universal generating function. Extensive analyses are further conducted to explore the impact of the TR policy on the JCR, JTT, and JEC of jobs with different features. The results show that employing the TR policy can dramatically improve the JCR of a large MapReduce job. Moreover, when the TR policy greatly improves a job's JCR, the expected JTT and JEC are not significantly prolonged or increased, respectively.
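To make the reliability argument concrete, here is a minimal sketch of how a Poisson failure model interacts with task re-execution. All function names and parameter values below are illustrative assumptions, not the paper's exact derivation: it assumes each node fails according to a Poisson process, so a single task attempt succeeds with probability e^(−λt), and the TR policy allows a bounded number of re-executions per task.

```python
import math

def task_reliability(fail_rate, runtime):
    # Under a Poisson failure process with rate `fail_rate` (failures per hour),
    # a task attempt of length `runtime` hours finishes iff no failure occurs
    # during it, which happens with probability e^{-lambda * t}.
    return math.exp(-fail_rate * runtime)

def job_completion_reliability(fail_rate, map_time, reduce_time,
                               n_maps, n_reduces, retries=0):
    # Illustrative model (assumed, not the paper's exact formulation):
    # with the TR policy, a crashed task is re-executed on another node.
    # Allowing `retries` re-executions, a task fails only if every attempt fails.
    def with_retries(p):
        return 1.0 - (1.0 - p) ** (retries + 1)

    p_map = with_retries(task_reliability(fail_rate, map_time))
    p_red = with_retries(task_reliability(fail_rate, reduce_time))
    # The job completes only if all map tasks and all reduce tasks complete.
    return (p_map ** n_maps) * (p_red ** n_reduces)

# Example: a large job with 1000 map tasks and 10 reduce tasks.
without_tr = job_completion_reliability(0.01, 0.1, 0.5, 1000, 10, retries=0)
with_tr = job_completion_reliability(0.01, 0.1, 0.5, 1000, 10, retries=1)
```

Even a single allowed re-execution per task raises `with_tr` well above `without_tr` in this toy setting, mirroring the paper's observation that the TR policy is most beneficial for large jobs, where many independent tasks must all succeed.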

Original language: English
Pages (from-to): 701-714
Number of pages: 14
Journal: Computer Journal
Issue number: 5
State: Published - 9 May 2016


Keywords:

  • MapReduce
  • Poisson distribution
  • job completion reliability
  • job energy consumption
  • job turnaround time
  • universal generating function

