Feature-based Model for Extraction and Classification of High Quality Questions in Online Forum
Journal Title: Journal of Advances in Mathematics and Computer Science - Year 2017, Vol 22, Issue 1
Abstract
Aims: To design and implement a classification-based model that uses specific features to identify and extract high-quality questions in a thread.
Study Design: The study is divided into three modules: preprocessing, configuration, and question classification.
Place and Duration of Study: Department of Computer Science, Federal University of Technology, Akure, between June 2016 and December 2016.
Methodology: This research proposes a way of identifying, extracting and classifying questions in order to improve the quality of answers in an online forum. A major issue in question extraction and classification in forums is that only a restricted set of categories is usually considered, such as Who, What, Where, Which, Why and How, which is not sufficient to capture all possible questions. In this work, a number of parameters were proposed and aggregated using fuzzy logic for context-based spam detection and removal, in order to enhance question identification and classification. Part-of-speech (POS) tagging was applied to analyse the structure of each extracted sentence based on the presence and position of predefined question tags; this addresses issues such as case sensitivity, grammatical construction and synonyms. Questions are classified with Naïve Bayes, and semantic relationships between extracted questions are identified with a cosine similarity model. Experiments were performed on a dataset constructed from the ResearchGate website.
Results: Questions extracted from the ResearchGate website were presented to the system. The output consists of the corresponding POS tags and the category into which each question is classified. The number of questions extracted depends on the number of questions available in a forum. The system correctly extracted and classified 3015 questions at 80% POS tag occurrence.
Conclusion: Our approach to question identification and classification was effective and covers more question categories. This can be applied to any question answering system.
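The abstract names two standard techniques, Naïve Bayes for question classification and cosine similarity for relating extracted questions, without giving implementation details. The following is a minimal sketch of both, not the authors' implementation. It assumes a plain bag-of-words representation, add-one smoothing, and hypothetical category labels; the names `tokenize`, `cosine_similarity` and `NaiveBayes` are illustrative only, and the fuzzy-logic spam filter and POS tagging steps are not modelled.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(a, b):
    """Cosine of the angle between the term-count vectors of two texts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    # Empty texts have no direction, so treat their similarity as zero.
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for w in tokenize(text):
                if w in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data; the labels are assumptions, not the paper's categories.
TRAIN = [
    ("How do I install numpy on Windows?", "how"),
    ("How can I tune this model?", "how"),
    ("Why does my loss diverge?", "why"),
    ("Why is the gradient zero?", "why"),
]
model = NaiveBayes().fit([t for t, _ in TRAIN], [l for _, l in TRAIN])
predicted = model.predict("Why does training fail?")
similarity = cosine_similarity("Why does my loss diverge?", "Why does the loss diverge?")
```

Working in log space avoids numeric underflow when many word probabilities are multiplied. A practical system would likely use richer features (for example the POS tags the abstract mentions) instead of raw word counts.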
Authors and Affiliations
Bolanle Ojokoh, Tobore Igbe, Ayobami Araoye
On Skew Circulant Type Matrices Involving any Continuous Pell Numbers
In this paper, the invertibility of skew circulant type matrices is discussed, and their determinants and inverse matrices are given. We present the four kinds of norms and bounds for the spread of these speci...
Mathematical Model of Drinking Epidemic
A non-linear SHTR mathematical model was used to study the dynamics of a drinking epidemic. We discussed the existence and stability of the drinking-free and endemic equilibria. The drinking-free equilibrium was locally as...
A Note on the Fuglede and Fuglede-Putnam’s Theorems
In this paper, we investigate the extension of Fuglede and Fuglede-Putnam’s Theorems to two bounded linear operators.
The Impulse Interactive Cuts of Entropy Functional Measure on Trajectories of Markov Diffusion Process, Integrating in Information Path Functional, Encoding and Applications
The introduced entropy integral measure on random trajectories (EF) is defined by the process additive functional with functions drift and diffusion reducing this functional on trajectories to a regular integral function...