FEATURE SELECTION METHODS AND ALGORITHMS
Journal Title: International Journal on Computer Science and Engineering - Year 2011, Vol 3, Issue 5
Abstract
Feature selection is an important topic in data mining, especially for high-dimensional datasets. Feature selection (also known as subset selection) is a process commonly used in machine learning, in which a subset of the features available in the data is selected before a learning algorithm is applied. The best subset contains the smallest number of dimensions that contribute most to accuracy; the remaining, unimportant dimensions are discarded. This is an important preprocessing stage and one of two ways of avoiding the curse of dimensionality (the other is feature extraction). Two approaches to feature selection are forward selection and backward selection. Feature selection has been an active research area in the pattern recognition, statistics, and data mining communities. Its main idea is to choose a subset of input variables by eliminating features with little or no predictive information. Feature selection methods fall into three broad classes: filter methods, wrapper methods, and embedded methods. This paper presents an empirical comparison of feature selection methods and their algorithms. Given the substantial number of existing feature selection algorithms, criteria are needed for deciding which algorithm to use in a given situation. This work reviews several fundamental algorithms found in the literature and assesses their performance in a controlled scenario.
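As an illustration of the forward selection approach named in the abstract, the following is a minimal sketch in Python and not the authors' implementation. It starts from an empty subset and greedily adds the feature that most improves a score, stopping when no addition helps. The names `forward_selection`, `toy_score`, and `GAINS`, as well as the per-feature penalty, are hypothetical and chosen only for this example; in a wrapper method the score would typically be the validation accuracy of a learning algorithm trained on the candidate subset.

```python
from typing import Callable, FrozenSet, Iterable, List


def forward_selection(
    features: Iterable[str],
    score: Callable[[FrozenSet[str]], float],
) -> List[str]:
    """Greedy forward selection: grow the subset one feature at a time."""
    remaining = list(features)
    selected: List[str] = []
    best = score(frozenset())  # score of the empty subset is the baseline

    while remaining:
        # Score every subset formed by adding one remaining feature.
        scored = [(score(frozenset(selected + [f])), f) for f in remaining]
        cand_score, cand = max(scored, key=lambda pair: pair[0])
        if cand_score <= best:
            break  # no single addition improves the score, so stop
        selected.append(cand)
        remaining.remove(cand)
        best = cand_score

    return selected


# Hypothetical per-feature usefulness, for illustration only.
GAINS = {"a": 0.30, "b": 0.20, "c": 0.0}


def toy_score(subset: FrozenSet[str]) -> float:
    """Toy stand-in for model accuracy: summed gains minus a size penalty."""
    return sum(GAINS[f] for f in subset) - 0.05 * len(subset)


chosen = forward_selection(["a", "b", "c"], toy_score)
print(chosen)
```

Backward selection would follow the same pattern in reverse, starting from the full feature set and removing the feature whose removal most improves the score.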
Authors and Affiliations
L. Ladha, T. Deepa