Impact of MapReduce task re-execution policy on job completion reliability and job completion time

Jia Chun Lin, Fang Yie Leu, Ying-ping Chen, Waqaas Munawar

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

2 Scopus citations

Abstract

MapReduce is a widely accepted framework for data-intensive applications. To prevent MapReduce jobs from being interrupted by node failures, which occur frequently in large-scale MapReduce clusters, current MapReduce implementations such as Hadoop employ a task re-execution policy (TR policy for short): when a map or reduce task of a job fails because of a node failure, the policy re-executes the task on another node. However, the impact of the TR policy on job completion reliability and job completion time has not been studied from a theoretical viewpoint, especially for jobs with different characteristics, e.g., different input data sizes, different numbers of reduce tasks, and different intermediate data sizes. In this study, we derive the job completion reliability (JCR for short) of a MapReduce job based on Poisson distributions and analyze the expected job completion time (JCT for short) based on the universal generation function. We use nine settings of the task re-execution factor (TR factor for short) to explore the impact of the TR policy on the JCR and JCT of jobs. The results show that the TR policy can effectively improve JCR without significantly prolonging JCT, but there is no single TR factor with which all jobs can achieve a high JCR.
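To make the idea of a Poisson-based JCR concrete, the following is a minimal sketch under simplifying assumptions that are not taken from the paper: node failures arrive as a Poisson process with a constant rate, every attempt of a task runs for the same duration, attempts fail independently, and a job completes only if every task eventually succeeds. The function names and the `max_attempts` parameter (a hypothetical stand-in for the TR factor) are illustrative only, and the paper's actual derivation may differ.

```python
import math


def task_success_probability(failure_rate, task_duration, max_attempts):
    """Probability that a task succeeds within max_attempts attempts.

    Assumption (not from the paper): failures follow a Poisson process with
    rate failure_rate, so one attempt of length task_duration sees no
    failure with probability exp(-failure_rate * task_duration).
    """
    p_attempt = math.exp(-failure_rate * task_duration)
    # The task fails only if every one of its independent attempts fails.
    return 1.0 - (1.0 - p_attempt) ** max_attempts


def job_completion_reliability(failure_rate, task_durations, max_attempts):
    """Probability that all tasks of a job succeed (tasks assumed independent)."""
    jcr = 1.0
    for duration in task_durations:
        jcr *= task_success_probability(failure_rate, duration, max_attempts)
    return jcr


if __name__ == "__main__":
    # Hypothetical numbers: 0.01 failures per minute, four tasks of varying length.
    durations = [10.0, 10.0, 25.0, 40.0]
    for attempts in (1, 2, 3):
        print(attempts, job_completion_reliability(0.01, durations, attempts))
```

In this toy model, allowing more attempts raises the reliability of every task, and longer tasks need more attempts to reach the same reliability, which is in the spirit of the observation that no single TR factor suits all jobs.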

Original language: English
Title of host publication: Proceedings - 2014 IEEE 28th International Conference on Advanced Information Networking and Applications, IEEE AINA 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 712-718
Number of pages: 7
ISBN (Print): 9781479936298
State: Published - 1 Jan 2014
Event: 28th IEEE International Conference on Advanced Information Networking and Applications, IEEE AINA 2014 - Victoria, BC, Canada
Duration: 13 May 2014 to 16 May 2014

Publication series

Name: Proceedings - International Conference on Advanced Information Networking and Applications, AINA
ISSN (Print): 1550-445X

Conference

Conference: 28th IEEE International Conference on Advanced Information Networking and Applications, IEEE AINA 2014
Country: Canada
City: Victoria, BC
Period: 13/05/14 to 16/05/14

Keywords

  • MapReduce
  • Poisson distribution
  • job completion reliability
  • job completion time
  • universal generation function
