Classifying Arabic Text Using KNN Classifier

Abstract

With the tremendous number of electronic documents now available, there is a great need to classify documents automatically. Classification is the task of assigning objects (images, text documents, etc.) to one of several predefined categories. Because the selection of important terms is vital to classifier performance, feature set reduction techniques such as stop word removal, stemming, and term thresholding were used in this paper. Three term-selection techniques are applied to a corpus of 1000 documents that fall into five categories. A comparison study is performed to find the effect of using the full-word, stem, and root term indexing methods. A k-nearest-neighbors (KNN) classifier is used in this study. Recall, precision, fallout, and error rate were averaged over all folds. The results of the experiments carried out on the dataset show the importance of k-fold testing, since it reveals how the averages of recall, precision, fallout, and error rate vary for each category across the 10 folds.
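The abstract does not define its four evaluation measures, so the following is only a minimal sketch of how they could be computed per category and averaged over folds. It assumes the conventional text-categorization definitions (recall = TP/(TP+FN), precision = TP/(TP+FP), fallout = FP/(FP+TN), error rate = (FP+FN)/N). The function names `category_metrics` and `average_over_folds` and the labels `cat_a` and `cat_b` are hypothetical and do not come from the paper.

```python
def category_metrics(true_labels, predicted_labels, category):
    """Per-category recall, precision, fallout, and error rate.

    Assumes the conventional definitions; the paper may define them differently.
    """
    tp = fp = fn = tn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if guess == category and truth == category:
            tp += 1  # correctly assigned to the category
        elif guess == category:
            fp += 1  # wrongly assigned to the category
        elif truth == category:
            fn += 1  # belongs to the category but was missed
        else:
            tn += 1  # correctly left out of the category

    def ratio(numerator, denominator):
        # Return 0.0 for an empty denominator instead of raising an error.
        return numerator / denominator if denominator else 0.0

    return {
        "recall": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
        "fallout": ratio(fp, fp + tn),
        "error_rate": ratio(fp + fn, tp + fp + fn + tn),
    }


def average_over_folds(per_fold_metrics):
    """Average each measure over a list of per-fold metric dictionaries."""
    fold_count = len(per_fold_metrics)
    return {
        name: sum(fold[name] for fold in per_fold_metrics) / fold_count
        for name in per_fold_metrics[0]
    }


# Hypothetical labels for one small fold (illustrative only, not paper data).
truth = ["cat_a", "cat_a", "cat_b", "cat_b", "cat_b"]
guess = ["cat_a", "cat_b", "cat_b", "cat_b", "cat_a"]
fold_result = category_metrics(truth, guess, "cat_a")
```

In a 10-fold setup, `category_metrics` would be called once per fold and per category, and `average_over_folds` would then summarize the ten results for that category.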

Authors and Affiliations

Amer Al-Badarenah, Emad Al-Shawakfa, Khaleel Al-Rababah, Safwan Shatnawi, Basel Bani-Ismail

Download PDF file
  • EP ID EP154286
  • DOI 10.14569/IJACSA.2016.070633

How To Cite

Amer Al-Badarenah, Emad Al-Shawakfa, Khaleel Al-Rababah, Safwan Shatnawi, Basel Bani-Ismail (2016). Classifying Arabic Text Using KNN Classifier. International Journal of Advanced Computer Science & Applications, 7(6), 259-268. https://europub.co.uk/articles/-A-154286