WikiDetect: Automatic Vandalism Detection On Wikipedia
Journal Title: Romanian Journal of Human-Computer Interaction - Year 2011, Vol 4, Issue 2
Abstract
Article vandalism has long been one of Wikipedia’s most serious security issues, yet few automatic (non-human) solutions to this problem have been developed so far. Volunteers spend large amounts of time correcting vandalized edits instead of adding quality content to Wikipedia articles. This paper introduces a new vandalism detection system, based on a machine learning technique trained on a corpus of real data, and tests its performance. The application operates in a realistic environment: it analyzes expert-annotated wikitext extracted from the encyclopedia’s database, which is used to evaluate different vandalism detection algorithms. The paper presents a critical analysis of the results obtained, compares them with existing solutions, and suggests different statistical classification methods that improve the system.
Authors and Affiliations
Dan Cioiu, Traian Eugen Rebedea
A User Centered Approach in Designing a Career Decision Making Assistant
The present paper analyses the results of a user need research during the development of an intelligent assistant to support decisions in choosing the specialty in higher education. Our research was composed of multiple...
Recognizing named entities, quotes and events in news and social media items in Romanian
At the border of natural language processing and information retrieval, named entity recognition has represented one of the most important research problems of the two domains, that has not been solved perfectly yet even...
A Microformats-based Tool for Personal Data Management
We found ourselves in a time when we experience a real blooming of the social side of the Web, by having more and more (apparently) separated entities communicate together. In order to be able to communicate easily and t...
Mood and Sentiment Assessment Using Latent Semantic Analysis
The analysis of written communication can reveal subtle information, such as speaker’s emotional state, attitude and intentions. However, these cannot always be extracted accurately, at a level comparable to humans’ abil...
Automatic Language Recognition with application in Differentiated Speech Synthesis
This paper briefly presents several aspects concerning automatic language recognition and continues with particularities for algorithms used in language-differentiated speech synthesis. Several algorithm optimization met...