Relevant Feature Selection from High-Dimensional Data Using MST Based Clustering

Abstract

Feature selection is the process of identifying a subset of the most useful features that produces results comparable to those of the original, full set of features. Features provide the information about the data set. In a high-dimensional data representation, each sample is described by many features. Because such data sets are typically not task-specific, many features are irrelevant or redundant and should be pruned or filtered out before classifying target objects. Given a set of features, the feature selection problem is to find a subset of features that “maximizes the learner’s ability to classify patterns”. A graph-theoretic clustering algorithm based on Borůvka’s minimum spanning tree (MST) algorithm is implemented and experimentally evaluated in this paper. The proposed algorithm works in two steps. In the first step, features are divided into clusters using graph-theoretic clustering methods. In the second step, the most representative feature, that is, the one most strongly related to the target classes, is selected from each cluster. The representative features from all clusters together form the final feature subset. Once the feature subset is found, the accuracy of a classifier, the time required for classification, and the proportion of features selected can be calculated.
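
The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, and several choices are assumptions made only for illustration: absolute Pearson correlation is used as the relatedness measure, the edge weight between two features is 1 - |correlation|, and clusters are formed by cutting MST edges whose weight exceeds a hypothetical `max_dissimilarity` threshold. The function names (`pearson`, `boruvka_mst`, `select_features`) are likewise hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def find(parent, a):
    """Union-find root lookup with path halving."""
    while parent[a] != a:
        parent[a] = parent[parent[a]]
        a = parent[a]
    return a


def boruvka_mst(n, edges):
    """Boruvka's algorithm. edges: list of (weight, u, v) over nodes 0..n-1.

    Ties are broken by comparing the whole (weight, u, v) tuple, so the
    cheapest edge of every component is well defined.
    """
    parent = list(range(n))
    mst = []
    components = n
    while components > 1:
        cheapest = {}  # component root -> cheapest edge leaving that component
        for edge in edges:
            _, u, v = edge
            ru, rv = find(parent, u), find(parent, v)
            if ru == rv:
                continue
            for root in (ru, rv):
                if root not in cheapest or edge < cheapest[root]:
                    cheapest[root] = edge
        if not cheapest:
            break  # graph is disconnected; return a spanning forest
        for edge in sorted(set(cheapest.values())):
            _, u, v = edge
            ru, rv = find(parent, u), find(parent, v)
            if ru != rv:
                parent[ru] = rv
                mst.append(edge)
                components -= 1
    return mst


def select_features(features, target, max_dissimilarity=0.5):
    """Step 1: cluster features via the MST. Step 2: pick one feature per cluster.

    features: list of columns (one sequence of values per feature).
    target:   numeric class labels, one per sample.
    """
    n = len(features)
    # Complete graph over features; weight = 1 - |correlation| (an assumption).
    edges = [(1.0 - abs(pearson(features[i], features[j])), i, j)
             for i in range(n) for j in range(i + 1, n)]
    mst = boruvka_mst(n, edges)

    # Keep only MST edges joining sufficiently similar features;
    # the connected components that remain are the feature clusters.
    parent = list(range(n))
    for weight, u, v in mst:
        if weight <= max_dissimilarity:
            parent[find(parent, u)] = find(parent, v)
    clusters = {}
    for i in range(n):
        clusters.setdefault(find(parent, i), []).append(i)

    # Representative of each cluster = feature most correlated with the target.
    return sorted(
        max(members, key=lambda i: abs(pearson(features[i], target)))
        for members in clusters.values()
    )


features = [
    [1, 2, 3, 4, 5, 6],
    [1.1, 2.0, 2.9, 4.2, 5.1, 5.9],   # nearly a copy of feature 0
    [1, -1, 1, -1, 1, -1],
    [-1, 1, -1, 1, -1, 2],            # roughly the negation of feature 2
]
target = [0, 0, 0, 1, 1, 1]
selected = select_features(features, target, max_dissimilarity=0.3)
```

Here `selected` holds one feature index per cluster. Other relatedness measures and other rules for cutting MST edges fit the same two-step structure; only the edge-weight and edge-removal lines would change.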

Authors and Affiliations

Yaswanth Kumar Alapati

How To Cite

Yaswanth Kumar Alapati (2015). Relevant Feature Selection from High-Dimensional Data Using MST Based Clustering. International Journal of Emerging Trends in Science and Technology, 2(3), 1997-2001. https://europub.co.uk/articles/-A-221601