Comparing Neural Network Approach with N-Gram Approach for Text Categorization

Journal Title: International Journal on Computer Science and Engineering - Year 2010, Vol 2, Issue 1

Abstract

This paper compares a neural network approach with an n-gram approach to text categorization and shows that the neural network approach achieves results similar to those of the n-gram approach while requiring much less judging (classification) time. Both methods are applied here to language identification. The presence of particular characters and words, together with statistical information about word lengths, is used as the feature vector. In an identification experiment with Asian languages, the neural network approach achieved a 98% correct classification rate on 500 bytes of text and was five times faster than the n-gram-based approach.
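
The abstract does not give implementation details, so the following Python sketch is only an illustration of the two ingredients it names. It assumes a rank-order character n-gram profile with an out-of-place distance for the n-gram side, which is one well-known formulation and not necessarily the one used in the paper. It also assumes a simple layout for the feature vector: binary flags for marker characters and marker words, followed by a normalized word-length histogram. All function names, parameters, and the toy English and Spanish training strings are hypothetical, and the neural network classifier itself is omitted.

```python
from collections import Counter


def ngram_profile(text, n_max=3, top_k=300):
    """Rank-ordered list of the most frequent character n-grams (n = 1..n_max)."""
    counts = Counter()
    padded = " " + text.lower() + " "
    for n in range(1, n_max + 1):
        for i in range(len(padded) - n + 1):
            counts[padded[i:i + n]] += 1
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences; n-grams absent from lang_profile get a fixed penalty."""
    rank = {gram: r for r, gram in enumerate(lang_profile)}
    penalty = len(lang_profile)
    return sum(
        abs(r - rank[gram]) if gram in rank else penalty
        for r, gram in enumerate(doc_profile)
    )


def identify(text, lang_profiles):
    """Return the language whose profile is closest to the profile of text."""
    doc = ngram_profile(text)
    return min(lang_profiles, key=lambda lang: out_of_place(doc, lang_profiles[lang]))


def feature_vector(text, marker_chars, marker_words, max_len=10):
    """Assumed layout: character flags + word flags + word-length histogram."""
    words = text.lower().split()
    word_set = set(words)
    char_feats = [1.0 if c in text else 0.0 for c in marker_chars]
    word_feats = [1.0 if w in word_set else 0.0 for w in marker_words]
    # Word lengths above max_len share the last histogram bin.
    hist = [0.0] * max_len
    for w in words:
        hist[min(len(w), max_len) - 1] += 1
    total = len(words) or 1
    return char_feats + word_feats + [h / total for h in hist]


# Toy training data (hypothetical, far too small for real use).
train = {
    "en": "the quick brown fox jumps over the lazy dog and the cat",
    "es": "el rápido zorro marrón salta sobre el perro perezoso y el gato",
}
profiles = {lang: ngram_profile(sample) for lang, sample in train.items()}
```

In this sketch, `identify(text, profiles)` plays the role of the n-gram classifier, while the output of `feature_vector` is the kind of fixed-length input a small neural network could be trained on. The fixed length is one plausible reason for a lower judging time, since classification then needs a single forward pass instead of a profile comparison per language.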

Authors and Affiliations

A. Suresh Babu, P. N. V. S. Pavan Kumar

How To Cite

A. Suresh Babu, P. N. V. S. Pavan Kumar (2010). Comparing Neural Network Approach with N-Gram Approach for Text Categorization. International Journal on Computer Science and Engineering, 2(1), 80-83. https://europub.co.uk/articles/-A-113235