Classifying Arabic Text Using KNN Classifier
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2016, Vol 7, Issue 6
Abstract
With the tremendous number of electronic documents available, there is a great need to classify documents automatically. Classification is the task of assigning objects (images, text documents, etc.) to one of several predefined categories. Because the selection of important terms is vital to classifier performance, feature set reduction techniques such as stop word removal, stemming, and term thresholding were used in this paper. Three term-selection techniques are applied to a corpus of 1000 documents that fall into five categories. A comparison study is performed to find the effect of using the full-word, stem, and root term indexing methods. A K-nearest-neighbors (KNN) classifier is used in this study. Recall, precision, fallout, and error rate were averaged over all folds. The results of the experiments carried out on the dataset show the importance of using k-fold testing, since it reveals the variation in recall, precision, fallout, and error rate for each category across the 10 folds.
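The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the similarity measure, the value of k, or the exact metric formulas, so the sketch assumes cosine similarity over raw term counts, majority voting, and the usual text-categorization definitions (fallout = FP / (FP + TN), error rate = (FP + FN) / N). All function names are hypothetical, and stemming or root extraction for Arabic is left out.

```python
import math
from collections import Counter


def tokenize(text, stop_words=frozenset()):
    """Split on whitespace and drop stop words (stemming is omitted here)."""
    return [t for t in text.split() if t not in stop_words]


def cosine(a, b):
    """Cosine similarity between two term-count vectors (assumed measure)."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train, doc_vec, k=3):
    """Return the majority label among the k most similar training documents.

    train is a list of (term-count vector, label) pairs.
    """
    ranked = sorted(train, key=lambda pair: cosine(pair[0], doc_vec), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def k_fold_splits(items, n_folds=10):
    """Yield (train, test) pairs; fold i takes every n_folds-th item from i."""
    folds = [items[i::n_folds] for i in range(n_folds)]
    for i, test in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


def category_metrics(true_labels, predicted, category):
    """Per-category recall, precision, fallout, and error rate (assumed formulas)."""
    tp = fp = fn = tn = 0
    for t, p in zip(true_labels, predicted):
        if p == category and t == category:
            tp += 1
        elif p == category:
            fp += 1
        elif t == category:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    return {
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "fallout": fp / (fp + tn) if fp + tn else 0.0,
        "error_rate": (fp + fn) / total if total else 0.0,
    }
```

In a 10-fold run, each fold's test documents would be classified against the remaining nine folds, `category_metrics` would be computed per category for that fold, and the per-fold values would then be averaged. Keeping the per-fold values alongside the averages is what exposes the fold-to-fold variation the abstract refers to.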
Authors and Affiliations
Amer Al-Badarenah, Emad Al-Shawakfa, Khaleel Al-Rababah, Safwan Shatnawi, Basel Bani-Ismail