A Strategy for Training Set Selection in Text Classification Problems
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2013, Vol 4, Issue 6
Abstract
An issue in text classification problems involves the choice of good samples on which to train the classifier. Training sets that properly represent the characteristics of each class have a better chance of establishing a successful predictor. Moreover, sometimes data are redundant or take large amounts of computing time for the learning process. To overcome this issue, data selection techniques have been proposed, including instance selection. Some data mining techniques are based on nearest neighbors, ordered removals, random sampling, particle swarms or evolutionary methods. The weaknesses of these methods usually involve a lack of accuracy, lack of robustness when the amount of data increases, overfitting and a high complexity. This work proposes a new immune-inspired suppressive mechanism that involves selection. As a result, data that are not relevant for a classifier’s final model are eliminated from the training process. Experiments show the effectiveness of this method, and the results are compared to other techniques; these results show that the proposed method has the advantage of being accurate and robust for large data sets, with less complexity in the algorithm.
Authors and Affiliations
Maria Passini, Katiusca Estébanez, Grazziela Figueredo, Nelson Ebecken