Comparison of EM and Two-Step Cluster Method for Mixed Data: An Application

Abstract

More than 50 clustering algorithms have been developed to extract meaningful information from large datasets and to group individuals according to their characteristics. In practice, data often contain variables of several types, so it is important to choose a clustering algorithm suited to the data types at hand. This study first describes the EM (Expectation Maximization) and Two-Step clustering methods, which were developed in recent years and are among the best methods for datasets containing mixed types of variables. The second aim is to compare the two methods on a dataset drawn from the health field. These algorithms are generally recommended for large datasets, but they are also used on medium-sized datasets, which are more common in practice. Therefore, fifty controls and fifty patients with polycystic ovary syndrome were included in the study. Nineteen variables were measured on these subjects: thirteen quantitative and six qualitative. Clusters were obtained with the EM and Two-Step methods, and the agreement between the resulting clusters and the known patient and control groups was evaluated with the Kappa coefficient. The EM algorithm showed higher agreement than Two-Step clustering (Kappa = 0.740; p < 0.001) and was better at identifying both patients and controls. We conclude that researchers may obtain successful results in classifying diseases when appropriate clustering methods are used.
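
The agreement step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `cohen_kappa` and `best_aligned_kappa` are hypothetical, the ten-subject example is synthetic toy data rather than the study data, and the relabelling step is an assumption added because cluster numbers are arbitrary and must be matched to the known groups before Kappa is computed. The significance test (p-value) is not covered.

```python
from collections import Counter
from itertools import permutations


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length label sequences."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(a)
    # Observed proportion of subjects on which the two labelings agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from the marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b.get(k, 0) for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both labelings use one and the same label throughout
    return (observed - expected) / (1.0 - expected)


def best_aligned_kappa(true_groups, cluster_ids):
    """Map cluster ids onto group names in every possible way and
    return the highest kappa (assumes no more clusters than groups)."""
    groups = sorted(set(true_groups))
    clusters = sorted(set(cluster_ids))
    if len(clusters) > len(groups):
        raise ValueError("more clusters than known groups")
    best = -1.0
    for perm in permutations(groups, len(clusters)):
        mapping = dict(zip(clusters, perm))
        relabelled = [mapping[c] for c in cluster_ids]
        best = max(best, cohen_kappa(true_groups, relabelled))
    return best


# Synthetic toy example (NOT the study data): 5 patients, 5 controls,
# and a two-cluster solution that misplaces one patient.
true_groups = ["patient"] * 5 + ["control"] * 5
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
kappa = best_aligned_kappa(true_groups, cluster_ids)
print(round(kappa, 3))
```

In this toy case nine of ten subjects agree after relabelling and chance agreement is 0.5, so Kappa is 0.8. Trying every relabelling is practical only for a small number of clusters, as in the two-group setting described here.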

Authors and Affiliations

Özge Pasin, Handan Ankaralı

  • EP ID EP313214

How To Cite

Özge Pasin, Handan Ankaralı (2017). Comparison of EM and Two-Step Cluster Method for Mixed Data: An Application. International Journal Of Medical Science And Clinical Invention, 4(3), -. https://europub.co.uk/articles/-A-313214