A Rank Aggregation Algorithm for Ensemble of Multiple Feature Selection Techniques in Credit Risk Evaluation
Journal Title: International Journal of Advanced Research in Artificial Intelligence (IJARAI) - Year 2016, Vol 5, Issue 9
Abstract
In credit risk evaluation, the accuracy of a classifier is critical for correctly identifying high-risk loan applicants. Feature selection is one way of improving that accuracy, because it supplies the classifier with only the important and relevant features for model development. This study applies an ensemble of multiple feature ranking techniques to feature selection on credit data. The ensemble consists of five individual rank-based feature selection methods, and the study proposes a novel rank aggregation algorithm for combining their rankings. For each feature selection method, the algorithm uses both the rank order and the rank score of the features in that method's ranked list. The ensemble then selects the relevant features by taking the top 80%, 60%, 40% and 20% of the aggregated ranked list, and these feature subsets are used to build C4.5, MLP, C4.5-based Bagging and MLP-based Bagging models. Models built with the ensemble of feature selection techniques performed better than models built with any of the five individual rank-based feature selection methods. The average performance across all models was best for the ensemble at the 60% threshold, and the bagging-based models outperformed the individual models most significantly at that threshold. This gain is all the more notable because the highest-performing models were built with 40% fewer features, which reduces the data dimensionality and hence the overall data size substantially for model building. Using the ensemble of feature selection techniques with the novel aggregation algorithm produced more accurate models that are also simpler, faster and easier to interpret.
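The abstract does not give the aggregation formula, so the following is only a minimal sketch of how rank order and rank score could be combined and then thresholded. The equal weighting of the two terms, the min-max normalization, the function names (`aggregate_ranks`, `select_top`), and the method and feature names are all assumptions made for illustration, not the authors' algorithm.

```python
# Illustrative sketch only: one possible way to aggregate ranked lists using
# both rank order and rank score. The weighting scheme is an assumption.

def aggregate_ranks(score_lists):
    """Combine per-method feature scores into one aggregated ranking.

    score_lists maps a method name to {feature: score}, where a higher
    score means a more relevant feature. Returns features, best first.
    """
    totals = {}
    for scores in score_lists.values():
        ordered = sorted(scores, key=scores.get, reverse=True)
        n = len(ordered)
        low, high = min(scores.values()), max(scores.values())
        for position, feature in enumerate(ordered):
            # Rank-order term: 1.0 for the top feature, 1/n for the last.
            order_term = (n - position) / n
            # Rank-score term: min-max normalized score in [0, 1].
            if high > low:
                score_term = (scores[feature] - low) / (high - low)
            else:
                score_term = 1.0
            # Assumed combination: plain average of the two terms.
            totals[feature] = totals.get(feature, 0.0) + 0.5 * (order_term + score_term)
    # Sort by aggregated value, breaking ties by name for determinism.
    return sorted(totals, key=lambda f: (-totals[f], f))


def select_top(ranked, threshold):
    """Keep the top fraction of an aggregated ranking (for example 0.6)."""
    keep = max(1, round(threshold * len(ranked)))
    return ranked[:keep]


# Hypothetical scores from three unnamed ranking methods on five features.
example_scores = {
    "method_a": {"income": 0.9, "age": 0.5, "debt": 0.7, "tenure": 0.1, "zip": 0.05},
    "method_b": {"income": 30.0, "age": 12.0, "debt": 25.0, "tenure": 8.0, "zip": 1.0},
    "method_c": {"income": 0.4, "age": 0.35, "debt": 0.2, "tenure": 0.1, "zip": 0.0},
}

ranked = aggregate_ranks(example_scores)
for threshold in (0.8, 0.6, 0.4, 0.2):
    print(threshold, select_top(ranked, threshold))
```

Normalizing each method's scores before combining keeps a method with a large score range from dominating the aggregate. The selected subset at each threshold would then be passed to the classifiers named in the abstract.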
Authors and Affiliations
Shashi Dahiya, S. S. Handa, N. P. Singh