Extracting Credit Rules from Imbalanced Data: The Case of an Iranian Export Development Bank
Journal Title: Journal of Information Systems and Telecommunication - Year 2015, Vol 3, Issue 1
Abstract
Credit scoring is an important topic, and banks collect a variety of data from their loan applicants in order to make sound lending decisions. Rule bases receive particular attention in credit decision making because they distinguish explicitly between good and bad applicants. Credit scoring datasets are usually imbalanced, mainly because the number of good applicants in a loan portfolio is usually much higher than the number of loans that default. This paper uses rule-based classifiers previously applied in credit scoring, namely RIPPER, OneR, Decision Table, PART and C4.5, to study their reliability and the effect of sampling on their results. A real database from an Iranian export development bank is used, and imbalanced data issues are investigated by randomly oversampling the minority class of defaulters and by undersampling the majority class of non-defaulters three times. The performance criteria chosen to measure the reliability of the rule extractors are the area under the receiver operating characteristic curve (AUC), accuracy and the number of rules. Friedman’s statistic is used to test for significant differences between techniques and datasets. The results of the study show that PART performs better than the other techniques and that its results are less affected by the balance of good and bad samples in the data.
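The two resampling strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `ratio` parameter and the toy data are assumptions made for illustration, and the abstract does not state the class ratios actually used.

```python
import random


def split_by_class(rows, label_of, minority):
    """Separate rows into (minority_rows, majority_rows)."""
    minority_rows = [r for r in rows if label_of(r) == minority]
    majority_rows = [r for r in rows if label_of(r) != minority]
    if not minority_rows:
        raise ValueError("no rows found for the minority class")
    return minority_rows, majority_rows


def random_oversample(rows, label_of, minority, seed=0):
    """Duplicate randomly chosen minority rows until both classes are equal in size."""
    rng = random.Random(seed)
    minority_rows, majority_rows = split_by_class(rows, label_of, minority)
    missing = max(0, len(majority_rows) - len(minority_rows))
    extra = [rng.choice(minority_rows) for _ in range(missing)]
    return majority_rows + minority_rows + extra


def random_undersample(rows, label_of, minority, ratio=1.0, seed=0):
    """Keep all minority rows and a random subset of majority rows.

    `ratio` is the assumed number of majority rows kept per minority row.
    """
    rng = random.Random(seed)
    minority_rows, majority_rows = split_by_class(rows, label_of, minority)
    keep = min(len(majority_rows), int(ratio * len(minority_rows)))
    return minority_rows + rng.sample(majority_rows, keep)


# Toy portfolio (hypothetical): 90 non-defaulters (label 0), 10 defaulters (label 1).
portfolio = [("good", 0)] * 90 + [("bad", 1)] * 10
label = lambda row: row[1]

oversampled = random_oversample(portfolio, label, minority=1)
undersampled = random_undersample(portfolio, label, minority=1, ratio=1.0)
```

Drawing several undersampled sets with different `seed` values is one plausible way to repeat the undersampling step; whether the study proceeded in this way is not stated in the abstract.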
Authors and Affiliations
Seyed Mahdi Sadatrasoul, Mohammad Reza Gholamian, Kamran Shahanaghi
Identification of a Nonlinear System by Determining of Fuzzy Rules
In this article the hybrid optimization algorithm of differential evolution and particle swarm is introduced for designing the fuzzy rule base of a fuzzy controller. For a specific number of rules, a hybrid algorithm for...
A New Architecture for Intrusion-Tolerant Web Services Based on Design Diversity Techniques
Web services are the realization of service-oriented architecture (SOA). Security is an important challenge of SOAP-based Web services. So far, several security techniques and standards based on traditional security mech...
A New Approach to Overcome the Count to Infinity Problem in DVR Protocol Based on HMM Modelling
Due to its low complexity and its power and bandwidth savings, Distance Vector Routing has been introduced as one of the most popular dynamic routing protocols. However, this protocol has a serious drawback in practice called Count To...
A Novel Resource Allocation Algorithm for Heterogeneous Cooperative Cognitive Radio Networks
In cognitive radio networks (CRN), resources available for use are usually very limited. This is generally because of the tight constraints by which the CRN operate. Of all the constraints, the most critical one is the l...
A New Node Density Based k-edge Connected Topology Control Method: A Desirable QoS Tolerance Approach
This research is an ongoing work for achieving consistency between topology control and QoS guarantee in MANET. Desirable topology and Quality of Service (QoS) control are two important challenges in wireless communicati...