News Web Portal based on Natural Language Processing

Journal Title: Romanian Journal of Human - Computer Interaction - Year 2008, Vol 1, Issue 3

Abstract

The paper presents an autonomous text classification module for a Romanian-language news web portal. Statistical natural language processing techniques are combined so that the portal operates fully autonomously. News items are collected automatically from a large number of news sources using web syndication. Machine-learning techniques are then used to classify the news stream automatically. First, the items are clustered using an agglomerative algorithm, and the resulting groups correspond to the main news topics. In this way, more information about each main topic is gathered from various news sources. Second, text classification algorithms are applied to assign each cluster of news items a label from a predetermined set of classes. More than a thousand news items were used for both training and evaluating the classifiers. The paper presents a complete comparison of the results obtained for each method.
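The clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which text representation, similarity measure, linkage, or stopping rule the authors used, so everything below is an assumption made for illustration: plain word counts, cosine similarity, average linkage, and a hypothetical `threshold` parameter. The function names are likewise invented here and are not taken from the paper.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def agglomerative_clusters(texts, threshold=0.3):
    """Repeatedly merge the two most similar clusters.

    Stops when no pair of clusters reaches `threshold`. Average linkage
    and the threshold value are assumptions, not the paper's settings.
    Returns a list of clusters, each a sorted list of indices into `texts`.
    """
    vectors = [bag_of_words(t) for t in texts]
    clusters = [[i] for i in range(len(texts))]  # start with one item per cluster

    def linkage(c1, c2):
        # Average pairwise similarity between the members of two clusters.
        sims = [cosine(vectors[i], vectors[j]) for i in c1 for j in c2]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        best_sim, best_pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sim = linkage(clusters[x], clusters[y])
                if sim > best_sim:
                    best_sim, best_pair = sim, (x, y)
        if best_sim < threshold:
            break  # remaining clusters are too dissimilar to merge
        x, y = best_pair
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]
    return clusters
```

Calling `agglomerative_clusters` on a list of headlines returns groups of indices, where each group stands for one news topic reported by several sources. In a portal like the one described, each such group would then be passed to a classifier that assigns it a category; that step is not sketched here.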

Authors and Affiliations

Traian Rebedea, Costin-Gabriel Chiru, Ştefan Trăuşan-Matu

How To Cite

Traian Rebedea, Costin-Gabriel Chiru, Ştefan Trăuşan-Matu (2008). News Web Portal based on Natural Language Processing. Romanian Journal of Human - Computer Interaction, 1(3), -. https://europub.co.uk/articles/-A-28767