Fault Detection and Tolerance in Cluster of Workstations using Message Passing Interface

Abstract

A Cluster of Workstations (COW) is a network-based multicomputer system intended to replace supercomputers. A COW operates on Divisible Load Theory (DLT), under which a job is divided into n subtasks that are delegated to n workstations in the cluster. The job is complete only when every subtask is complete, so satisfactory job completion requires all workstations to be functional. A single faulty node can therefore stall the entire job unless fault avoidance and correction measures are taken. This paper presents a fault detection and fault tolerance algorithm that uses the Message Passing Interface (MPI) to identify faulty workstations and transfer the subtasks they were performing to a normally working workstation. A workstation that receives a transferred subtask continues its original subtask alongside the transferred one on a time-sharing basis.
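The abstract does not give the algorithm itself, so the following is only a minimal sketch of the detect-and-reassign idea it describes, not the paper's method. It is a plain Python simulation with no MPI calls: the set `responded` stands in for the workstations that answered a status probe, which a real cluster would gather through MPI messages. The function names, the workstation labels, and the policy of picking the least-loaded healthy workstation are all assumptions made for illustration.

```python
def detect_faulty(workstations, responded):
    """Return the workstations that did not answer the latest status probe.

    `responded` is a stand-in (assumption) for replies that a real system
    would collect over MPI.
    """
    return [w for w in workstations if w not in responded]


def reassign(assignments, faulty):
    """Move the subtasks of faulty workstations onto healthy ones.

    assignments: dict mapping workstation -> list of subtask ids.
    Returns a new dict that contains only healthy workstations.
    The least-loaded target policy is an illustrative assumption.
    """
    faulty = set(faulty)
    healthy = [w for w in assignments if w not in faulty]
    if not healthy:
        raise RuntimeError("no healthy workstation left to take over")

    # Healthy workstations keep their original subtasks.
    plan = {w: list(assignments[w]) for w in healthy}

    for w, tasks in assignments.items():
        if w in faulty:
            for task in tasks:
                # The target keeps its own subtask and adds this one,
                # which it would then run on a time-sharing basis.
                target = min(healthy, key=lambda h: len(plan[h]))
                plan[target].append(task)
    return plan


# One job divided into n = 4 subtasks, one per workstation.
workstations = ["ws0", "ws1", "ws2", "ws3"]
assignments = {w: [i] for i, w in enumerate(workstations)}

# Hypothetical probe result: ws2 did not reply.
responded = {"ws0", "ws1", "ws3"}

faulty = detect_faulty(workstations, responded)
plan = reassign(assignments, faulty)
print("faulty:", faulty)
print("new plan:", plan)
```

In this run `ws2` is reported as faulty and its subtask moves to `ws0`, which then holds two subtasks while `ws1` and `ws3` are unchanged. Every subtask still has an owner, which is the condition the abstract gives for the job to complete.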

Authors and Affiliations

Syed Misbahuddin

How To Cite

Syed Misbahuddin (2011). Fault Detection and Tolerance in Cluster of Workstations using Message Passing Interface. Sir Syed University Research Journal of Engineering and Technology, 1(1), 1-4. https://europub.co.uk/articles/-A-431531