Big Data Cluster Processing Through Optimized Speculative Execution

Abstract

A large parallel processing job can be delayed substantially if even one of its many tasks is assigned to an unreliable or congested machine. To tackle this so-called straggler problem, most parallel processing frameworks, such as MapReduce, have adopted strategies under which the system may speculatively launch additional copies of a task when its progress is abnormally slow and idle resources are available. In this paper, we focus on the design of speculative execution schemes for parallel processing clusters from an optimization perspective under different loading conditions. For the lightly loaded case, we analyze and propose a cloning scheme, the Smart Cloning Algorithm (SCA), which is based on maximizing the overall system utility. We also derive the workload threshold under which SCA should be used for speculative execution. For the heavily loaded case, we propose the Enhanced Speculative Execution (ESE) algorithm, an extension of the Microsoft Mantri scheme, to mitigate stragglers. Our simulation results show that SCA reduces the total job flowtime, i.e., the job delay or response time, by nearly 6% compared to the speculative execution strategy of Microsoft Mantri. In addition, we show that the ESE algorithm outperforms the Mantri baseline scheme by 71% in terms of job flowtime while consuming the same amount of computation resources.
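The general idea of speculative execution described in the abstract can be illustrated with a minimal sketch. This is not the SCA, ESE, or Mantri algorithm; it is a generic straggler heuristic written for illustration only. The names `Task`, `estimated_remaining`, `pick_speculative_copies`, and the `slowdown` threshold of 1.5 are all assumptions and do not come from the paper.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Task:
    task_id: str
    progress: float  # fraction of work completed, in [0, 1]
    elapsed: float   # seconds since the task started


def estimated_remaining(task: Task) -> float:
    """Estimate remaining time, assuming the task keeps its average rate."""
    if task.progress <= 0.0 or task.elapsed <= 0.0:
        return float("inf")  # no measurable progress: treat as arbitrarily slow
    rate = task.progress / task.elapsed
    return (1.0 - task.progress) / rate


def pick_speculative_copies(tasks, idle_slots, slowdown=1.5):
    """Return ids of tasks to duplicate, slowest first, limited by idle slots.

    A task counts as a straggler when its estimated remaining time exceeds
    `slowdown` times the median estimate (a hypothetical threshold).
    """
    running = [t for t in tasks if t.progress < 1.0]
    if not running or idle_slots <= 0:
        return []
    remaining = {t.task_id: estimated_remaining(t) for t in running}
    cutoff = slowdown * median(remaining.values())
    stragglers = [tid for tid, r in remaining.items() if r > cutoff]
    stragglers.sort(key=lambda tid: remaining[tid], reverse=True)
    return stragglers[:idle_slots]
```

In this sketch, extra copies are launched only when idle slots exist, which mirrors the abstract's condition that speculation uses otherwise idle resources. How many copies to launch, and under what load, is exactly the question the paper treats as an optimization problem.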

Authors and Affiliations

D. Sasi Redkha

How To Cite

D. Sasi Redkha (2017). Big Data Cluster Processing Through Optimized Speculative Execution. International Journal of Emerging Trends in Science and Technology, 4(9), 5891-5897. https://europub.co.uk/articles/-A-245626