An Intra- and Inter-Topic Evaluation and Cleansing Method
Journal Title: Romanian Journal of Human-Computer Interaction - Year 2010, Vol 3, Issue 2
Abstract
Topic modeling is a growing research field, and novel ways of interpreting and evaluating its results are needed. We propose a method, relying on WordNet data, for evaluating and improving the performance of algorithms that generate topic models. We first propose a measure of a topic model’s fitness that factors in its broadness and redundancy. Then, for each individual topic, we determine the amount of relevant information it provides, along with its most important words and related concepts, by defining a cohesion function based on the topic’s projection onto WordNet concepts. The model as a whole is improved by eliminating each topic’s outliers with respect to the ontology projection. We define an inter-topic, ontology-based distance and use it to investigate the impact of removing redundant topics from a model with regard to the overlap between the topics’ ontological projections. Clustering similar topics into conceptually cohesive groups is explored as an alternative to pruning less relevant topics. Results show that evaluating and improving statistical models with WordNet is a promising research track that leads to more coherent topic models.
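The abstract does not give the formulas, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a tiny hand-made concept table (`TOY_CONCEPTS`, hypothetical) in place of WordNet, treats a word as an outlier when it shares no concept with any other word in its topic, and substitutes a plain Jaccard distance over concept sets for the paper's inter-topic distance. All function names are invented for the example.

```python
# Hypothetical stand-in for WordNet: each word maps to the set of concepts
# (itself plus assumed hypernyms) that it projects onto.
TOY_CONCEPTS = {
    "dog": {"dog", "canine", "animal"},
    "cat": {"cat", "feline", "animal"},
    "wolf": {"wolf", "canine", "animal"},
    "car": {"car", "vehicle"},
    "bus": {"bus", "vehicle"},
}


def project(topic):
    """Project a topic (list of words) onto concepts, counting supporting words."""
    counts = {}
    for word in topic:
        for concept in TOY_CONCEPTS.get(word, set()):
            counts[concept] = counts.get(concept, 0) + 1
    return counts


def remove_outliers(topic):
    """Keep only words that share at least one concept with another topic word.

    Assumption: words missing from the concept table are treated as outliers.
    """
    counts = project(topic)
    return [
        word
        for word in topic
        if any(counts[c] > 1 for c in TOY_CONCEPTS.get(word, set()))
    ]


def distance(topic_a, topic_b):
    """Jaccard distance between the concept sets of two topics (assumed metric)."""
    a, b = set(project(topic_a)), set(project(topic_b))
    union = a | b
    if not union:
        return 1.0  # no concepts at all: treat the topics as unrelated
    return 1.0 - len(a & b) / len(union)


cleaned = remove_outliers(["dog", "cat", "wolf", "car"])
overlap_distance = distance(["dog", "cat"], ["dog", "wolf"])
```

In this toy setting, "car" is removed from the animal topic because none of its concepts is supported by another word, and two topics with a small distance would be candidates for pruning or for merging into one cluster. A real implementation would query WordNet itself and would need the cohesion and distance definitions from the paper.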
Authors and Affiliations
Claudiu Muşat, Marian-Andrei Rizoiu, Ştefan Trauşan-Matu
User interface standardization
The UsiXML ITEA 2 project developed an innovative model-driven engineering method to improve the user interface design for the benefit of both industrial and academic end-users in terms of productivity and reusability. T...
Java Solutions for Real Time Vocal Signal Transmission using the UDP Protocol
Java technologies cover at this moment all aspects concerning software developing. This paper proposes a software solution for real time vocal signal transmission using Java technologies. The proposed solution can be an...
Language Resources for a Question-Answering System for Romanian
We describe here several language resources (a lexicon, a paradigmatic morphology, two linguistic thesauri – the Romanian wordnet and Eurovoc – and a parallel multilingual corpus) from the perspective of their utility es...
Usability evaluation of a learning scenario for Biology implemented onto an augmented reality platform
The combination between real and virtual in the augmented reality systems requires suitable interaction techniques that need to be tested with users in order to avoid usability problems. Formative evaluation aims at find...
Using Context for Improving Human-Computer Interaction in Intelligent Spaces
In this paper we included some concepts and trends about diminishing the interaction complexity of an user with his equipments throughout an intelligent space. The intelligent space includes different equipments and so...