Using Big Data Technique for Building Edit Alert System for Wikipedia Infoboxes Based on MapReduce Method
Journal Title: International Journal of Innovative Research in Computer Science and Technology - Year 2018, Vol 6, Issue 4
Abstract
Wikipedia is an online encyclopedia that has become a vital information resource for users as well as for the many knowledge bases derived from it. Many articles carry an infobox on the right-hand side, which summarizes the key facts of the article as attribute-value pairs. All of Wikipedia's content is updated and maintained manually by contributors, so its information is not always updated regularly or completely. In this paper, we present a novel system that uses time series modeling to predict the data items most likely to need an update, based on the category of the page, the record key, the time of the last update, etc., and alerts Wikipedia editors about the data items that might need an update soon. A bipartite graph is used to perform user-based collaborative filtering to find similar editors who might be interested in editing the infobox. The update alert is sent to the editors found using the bipartite graph, along with the past editors of the particular infobox. A technique for dealing with vandalism and erroneous edits is also discussed and analyzed. We also present various tasks that can be carried out on infoboxes.
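The editor-recommendation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `similar_editors`, the Jaccard similarity measure, and the `min_overlap` threshold are all assumptions chosen for illustration. The edit history is treated as the edge list of a bipartite graph between editors and infoboxes, and editors whose edit history overlaps with that of a past editor of the target infobox are suggested as additional alert recipients.

```python
from collections import defaultdict


def similar_editors(edits, target_infobox, min_overlap=0.2):
    """Return (past_editors, recommended_editors) for one infobox.

    edits: iterable of (editor, infobox) pairs, i.e. the edges of a
    bipartite editor-infobox graph. The Jaccard measure and the
    min_overlap threshold are illustrative assumptions.
    """
    by_editor = defaultdict(set)   # editor  -> infoboxes they edited
    by_infobox = defaultdict(set)  # infobox -> editors who edited it
    for editor, infobox in edits:
        by_editor[editor].add(infobox)
        by_infobox[infobox].add(editor)

    past = set(by_infobox.get(target_infobox, set()))
    scores = {}
    for past_editor in past:
        # Walk two hops in the bipartite graph: editor -> infobox -> editor.
        for infobox in by_editor[past_editor]:
            for candidate in by_infobox[infobox]:
                if candidate in past:
                    continue
                a, b = by_editor[past_editor], by_editor[candidate]
                jaccard = len(a & b) / len(a | b)
                scores[candidate] = max(scores.get(candidate, 0.0), jaccard)

    recommended = {e for e, s in scores.items() if s >= min_overlap}
    return past, recommended
```

Under these assumptions, an alert for an infobox would be sent to the union of the two returned sets. A production system would likely weight edits by recency or count rather than treating every edge equally.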
Authors and Affiliations
Khushboo Bhatia, Arnab Halder, Yashi Yadav, Ankush Sarsewar, Priyanka Singh, Khushboo Khurana
Significance of Patient Satisfaction in the Healthcare Industry: Part 1
The healthcare industry is recognized as one of the largest and fastest growing industries in the service sector. The healthcare industry, however, has been challenged to find alternative ways to sustain compatibility among t...
Review on Retrieving Biological Sequence Alignment using Smith-Waterman Algorithm
Bioinformatics is an interdisciplinary research area. In various genome projects, huge biological sequences are available. Biological sequence analysis is a fundamental operation in bioinformatics and the goal is t...
Evolving Constraints in Military Applications using Wireless Sensor Networks
WSNs consist of a large number of small sensor nodes. These nodes are very cheap in terms of cost. In military operations, there is always a threat of being attacked by enemies. So, the use of these cheap sensor nodes wi...
A Review on Wireless Sensor Networks and its applications
Around the globe, wireless sensor nodes are utilized in a variety of real-time applications. Significant research in the field of wireless sensor networks (WSNs) has been done in the last decade due to an increase in dem...
A Review on the Detection and Classification of Glaucoma Disease Based on Transfer Learning
An eye infection is a condition affecting the eyes that can be caused by a bacterium, virus, or fungus. Numerous eye infections exist, such as glaucoma, cellulitis, keratitis, and conjunctivitis. A few of the symptoms ma...