Urdu Text Classification using Majority Voting
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2016, Vol 7, Issue 8
Abstract
Text classification assigns predefined categories to text documents using supervised machine learning algorithms. It has various practical applications, such as spam detection, sentiment detection, and natural language detection. Based on this idea, we applied five well-known classification techniques to an Urdu-language corpus and assigned a class to each document using majority voting. The corpus contains 21769 news documents in seven categories (including Business, Entertainment, Culture, Health, Sports, and Weird). The algorithms could not work directly on the raw data, so we applied preprocessing techniques: tokenization, stop-word removal, and a rule-based stemmer. After preprocessing, 93400 features were extracted from the data for use by the machine learning algorithms. Using majority voting, we achieved up to 94% precision and recall.
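The majority-voting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `majority_vote`, the example labels, and the tie-breaking rule (the earliest classifier's label wins a tie) are assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by the most classifiers for one document.

    `predictions` holds one predicted label per classifier.
    Assumption: ties go to the tied label that appears first in the list.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    best = max(counts.values())
    # Walk the votes in classifier order so ties resolve deterministically.
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical votes from five classifiers for a single news document.
votes = ["Sports", "Sports", "Business", "Sports", "Health"]
print(majority_vote(votes))
```

With five voters and several categories, ties such as 2-2-1 are possible, so any real system needs some explicit tie-breaking rule; the one used here is only one reasonable choice.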
Authors and Affiliations
Muhammad Usman, Zunaira Shafique, Saba Ayub, Kamran Malik