A FAST Algorithm for High Dimensional Data using Clustering-Based Feature Subset Selection

Abstract

Feature subset clustering is a powerful technique for reducing the dimensionality of feature vectors in text classification. It involves identifying a subset of the most useful features that produces results comparable to those of the original, entire set of features. A novel approach, called the supervised attribute clustering algorithm, is proposed to improve accuracy and check the probability of the patterns. The FAST algorithm works in two steps. In the first step, features are divided into clusters using graph-theoretic clustering methods. In the second step, the most representative feature, the one most strongly related to the target classes, is selected from each cluster to form a subset of features. A feature selection algorithm may be evaluated from both the efficiency and effectiveness points of view: efficiency concerns the time required to find a subset of features, while effectiveness concerns the quality of that subset. Because features in different clusters are relatively independent, the clustering-based strategy of FAST has a high probability of producing a subset of useful and independent features. To ensure the efficiency of FAST, we adopt the efficient minimum-spanning-tree clustering method.
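
The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions that the abstract does not state: symmetric uncertainty is used as the correlation measure, the graph edge weight is taken as one minus that measure, clusters are formed by removing minimum-spanning-tree edges longer than a hypothetical `cut` threshold, and the names `select_features`, `columns`, and `labels` are illustrative only. Features are assumed to be discrete.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(x, y) = 2 * I(x; y) / (H(x) + H(y)); an assumed correlation measure."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def minimum_spanning_tree(n, dist):
    """Prim's algorithm on a complete graph; returns edges as (i, j, weight)."""
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        i, j = min(
            ((a, b) for a in in_tree for b in range(n) if b not in in_tree),
            key=lambda e: dist[e[0]][e[1]],
        )
        edges.append((i, j, dist[i][j]))
        in_tree.add(j)
    return edges


def select_features(columns, labels, cut=0.5):
    """Return indices of one representative feature per cluster.

    `cut` is a hypothetical threshold: MST edges longer than it are removed.
    """
    n = len(columns)
    # Pairwise feature distance: 1 - SU, so correlated features are close.
    dist = [[1.0 - symmetric_uncertainty(columns[i], columns[j])
             for j in range(n)] for i in range(n)]

    # Step 1: build the MST, drop long edges, and treat the remaining
    # connected components as feature clusters (tracked with union-find).
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j, weight in minimum_spanning_tree(n, dist):
        if weight <= cut:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)

    # Step 2: from each cluster keep the feature most related to the target.
    relevance = [symmetric_uncertainty(col, labels) for col in columns]
    return sorted(max(members, key=lambda i: relevance[i])
                  for members in clusters.values())


# Toy data: feature 0 is a noisy copy of the labels, feature 1 is an exact
# relabelling of them, and feature 2 is unrelated to both.
labels = [0, 0, 1, 1, 0, 0, 1, 1]
columns = [
    [0, 0, 1, 1, 0, 0, 1, 0],
    ["a", "a", "b", "b", "a", "a", "b", "b"],
    [0, 1, 0, 1, 0, 1, 0, 1],
]
selected = select_features(columns, labels)
print(selected)
```

On this toy data, features 0 and 1 fall into one cluster and feature 2 forms its own, so one representative is kept from each. Note that the sketch keeps a representative even from a cluster of irrelevant features; a fuller pipeline would likely filter such features out first.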

Authors and Affiliations

Puppala Priyanka, M Swapna

How To Cite

Puppala Priyanka, M Swapna (2014). A FAST Algorithm for High Dimensional Data using Clustering-Based Feature Subset Selection. International Journal for Research in Applied Science and Engineering Technology (IJRASET), 2(11), -. https://europub.co.uk/articles/-A-19027