A REVIEW ON CLUSTERING-BASED FEATURE SUBSET SELECTION ALGORITHM FOR HIGH DIMENSIONAL DATA

Abstract

 In high-dimensional (HD) datasets, feature selection involves identifying a subset of good features, here by means of a clustering approach. Feature selection removes irrelevant and redundant features, which is an essential data preprocessing activity for effective data mining. A clustering-based approach to selecting good features is evaluated from both the efficiency and the effectiveness points of view. Efficiency concerns the time required to find a subset of good features, while effectiveness concerns the quality of that subset. A feature selection algorithm for high-dimensional data should produce results comparable to those obtained with the original, entire feature set. Such algorithms can be characterized by their search strategies, evaluation criteria, and data mining tasks; this characterization reveals unattempted combinations and provides guidelines for choosing a feature selection algorithm. The FAST algorithm for feature subset selection works in two steps. The first step divides the features into clusters using graph-theoretic clustering methods. The second step selects the most useful features, those strongly related to the target classes, to form the subset of good features. To increase the efficiency of FAST, we adopt an efficient minimum spanning tree (MST) clustering method. Based on some of these criteria, a clustering-based feature selection algorithm for HD data is proposed and experimentally evaluated in this paper.
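
The two-step procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: absolute Pearson correlation stands in for whatever relevance and redundancy measure the paper uses, the MST is split with a fixed hypothetical threshold `cut`, and one feature per cluster is kept. The names `abs_corr`, `minimum_spanning_tree`, `select_features`, `cut`, and `min_relevance` are all hypothetical.

```python
import math

def abs_corr(x, y):
    """Absolute Pearson correlation (an assumed stand-in measure)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no information
    return abs(sxy) / math.sqrt(sxx * syy)

def minimum_spanning_tree(n, weight):
    """Prim's algorithm on a complete graph; returns (u, v, w) edges."""
    best = {v: (weight(0, v), 0) for v in range(1, n)}
    edges = []
    while best:
        v = min(best, key=lambda k: best[k][0])  # cheapest node to attach
        w, u = best.pop(v)
        edges.append((u, v, w))
        for t in best:
            wt = weight(v, t)
            if wt < best[t][0]:
                best[t] = (wt, v)
    return edges

def select_features(columns, target, cut=0.5, min_relevance=0.0):
    """Return sorted indices of the selected feature columns."""
    # Drop irrelevant features: relevance to the target must exceed the bound.
    keep = [i for i in range(len(columns))
            if abs_corr(columns[i], target) > min_relevance]
    if not keep:
        return []
    # Step 1: MST over the kept features, distance = 1 - |correlation|.
    dist = lambda a, b: 1.0 - abs_corr(columns[keep[a]], columns[keep[b]])
    mst = minimum_spanning_tree(len(keep), dist)
    # Keep only short MST edges; the remaining components are the clusters.
    parent = list(range(len(keep)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for u, v, w in mst:
        if w <= cut:
            parent[find(u)] = find(v)
    clusters = {}
    for a in range(len(keep)):
        clusters.setdefault(find(a), []).append(keep[a])
    # Step 2: from each cluster, keep the feature most related to the target.
    return sorted(max(members, key=lambda i: abs_corr(columns[i], target))
                  for members in clusters.values())
```

Calling `select_features(columns, target)` with `columns` as a list of equal-length numeric feature columns returns the indices of the retained features. Near-duplicate features fall into one cluster and contribute a single representative, which is how redundancy is reduced in this sketch.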

How To Cite

(2015).  A REVIEW ON CLUSTERING-BASED FEATURE SUBSET SELECTION ALGORITHM FOR HIGH DIMENSIONAL DATA. International Journal of Engineering Sciences & Research Technology, 4(1), 199-202. https://europub.co.uk/articles/-A-127098