Selection of a Checkpoint Interval in Coordinated Checkpointing Protocol for Fault Tolerant Open MPI

Journal Title: International Journal on Computer Science and Engineering - Year 2010, Vol 2, Issue 6

Abstract

The goal of this paper is to select an efficient checkpoint interval that reduces the total overhead cost of checkpointing and restarting applications in a distributed system environment. A coordinated checkpointing rollback recovery protocol is used to make application programs fault tolerant on a stand-alone system under no-load conditions, using BLCR and Open MPI at the system level. We present an experimental study in which we determine the optimum checkpoint interval and use it to compare the performance of the coordinated checkpointing protocol under two types of checkpoint interval: fixed and incremental. We measured the checkpoint cost, the rollback cost, and the total overhead cost caused by these two methods. Failures are simulated using a Poisson distribution with one failure per hour, so the inter-arrival times between failures follow an exponential distribution. The results show that the rollback overhead and the total overhead cost of checkpointing the application are much higher with the incremental checkpoint interval method than with the fixed checkpoint interval method. Hence, we conclude that the fixed checkpoint interval method is more efficient, as it considerably reduces both the rollback overhead and the total overhead cost.
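The failure model and the two interval policies described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation (their experiments used BLCR and Open MPI); it is a minimal model under stated assumptions: failures arrive with exponentially distributed inter-arrival times (mean one hour), a failure discards all work since the last completed checkpoint, and the function names, checkpoint cost, restart cost, and the particular definition of the "incremental" policy are hypothetical choices made only for illustration.

```python
import random


def simulate(total_work, interval_for, ckpt_cost, restart_cost, mean_tbf, seed=0):
    """Return (checkpoint overhead, rollback overhead, total overhead) in seconds.

    interval_for(k) gives the length of the k-th work segment (hypothetical policy hook).
    mean_tbf is the mean time between failures; inter-arrival times are exponential.
    """
    rng = random.Random(seed)
    clock = 0.0              # wall-clock time
    saved = 0.0              # useful work protected by the last checkpoint
    k = 0                    # index of the segment currently being attempted
    ckpt_overhead = 0.0
    rollback_overhead = 0.0
    next_failure = rng.expovariate(1.0 / mean_tbf)

    while saved < total_work:
        segment = min(interval_for(k), total_work - saved)
        needs_ckpt = saved + segment < total_work   # no checkpoint after the final segment
        span = segment + (ckpt_cost if needs_ckpt else 0.0)

        if clock + span <= next_failure:
            # Segment (and its checkpoint, if any) completes before the next failure.
            clock += span
            saved += segment
            if needs_ckpt:
                ckpt_overhead += ckpt_cost
            k += 1
        else:
            # Failure: everything done since the last checkpoint is lost and redone.
            rollback_overhead += (next_failure - clock) + restart_cost
            clock = next_failure + restart_cost
            next_failure = clock + rng.expovariate(1.0 / mean_tbf)

    return ckpt_overhead, rollback_overhead, ckpt_overhead + rollback_overhead


# Assumed policies: a constant interval, and one that grows by a fixed step each time.
def fixed(k):
    return 600.0


def incremental(k):
    return 300.0 * (k + 1)


if __name__ == "__main__":
    for name, policy in (("fixed", fixed), ("incremental", incremental)):
        ckpt, rollback, total = simulate(4 * 3600.0, policy, 10.0, 20.0, 3600.0, seed=1)
        print(f"{name}: checkpoint={ckpt:.0f}s rollback={rollback:.0f}s total={total:.0f}s")
```

A single run with one seed says little; averaging over many seeds would be needed before comparing the two policies, and the outcome depends on the assumed costs and interval definitions.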

Authors and Affiliations

Mallikarjuna Shastry P. M., K. Venkatesh

How To Cite

Mallikarjuna Shastry P. M., K. Venkatesh (2010). Selection of a Checkpoint Interval in Coordinated Checkpointing Protocol for Fault Tolerant Open MPI. International Journal on Computer Science and Engineering, 2(6), 2064-2070. https://europub.co.uk/articles/-A-113546