A Novel Information Retrieval Approach using Query Expansion and Spectral-based
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2016, Vol 7, Issue 9
Abstract
Most information retrieval (IR) models rank documents by computing a score that uses only the lexicographical query terms or the frequency of the query terms in the document. These models are limited because they do not consider term proximity in the document, term mismatch, or both. Term proximity is an important factor in determining how closely a document relates to the query. The ranking functions of the Spectral-Based Information Retrieval Model (SBIRM) consider both the frequency and the proximity of query terms in the document by comparing the signals of the query terms in the spectral domain, rather than the spatial domain, using the Discrete Wavelet Transform (DWT). Query expansion (QE) approaches overcome the term-mismatch problem by adding to the query terms whose meaning is related to it. QE approaches are divided into statistical approaches, such as Kullback-Leibler divergence (KLD), and semantic approaches, such as P-WNET, which uses WordNet. These approaches enhance retrieval performance. Based on these considerations, the objective of this research is to build an efficient QESBIRM that combines QE with the proximity-aware SBIRM by implementing the SBIRM using the DWT together with KLD or P-WNET. Experiments were conducted to test and evaluate the QESBIRM on a Text Retrieval Conference (TREC) dataset. The results show that the SBIRM with KLD or P-WNET outperforms the plain SBIRM in precision (P@), R-precision, Geometric Mean Average Precision (GMAP), and Mean Average Precision (MAP).
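To make the statistical QE step more concrete, the following is a minimal sketch of KLD-style term scoring under pseudo-relevance feedback. The abstract does not give the paper's formula, so this assumes the commonly used form score(t) = p_fb(t) * log(p_fb(t) / p_coll(t)), where p_fb is the term's relative frequency in the top-ranked (feedback) documents and p_coll its relative frequency in the whole collection. The function names `kld_term_scores` and `expand_query`, the parameter `k`, and the toy documents are hypothetical and only for illustration.

```python
import math
from collections import Counter


def kld_term_scores(feedback_docs, collection_docs):
    """Score each feedback term by its contribution to the KL divergence.

    Assumed form: p_fb(t) * log(p_fb(t) / p_coll(t)). Documents are lists
    of already-tokenized terms. This is an illustrative sketch, not the
    paper's implementation.
    """
    fb_counts = Counter(t for doc in feedback_docs for t in doc)
    coll_counts = Counter(t for doc in collection_docs for t in doc)
    fb_total = sum(fb_counts.values())
    coll_total = sum(coll_counts.values())

    scores = {}
    for term, count in fb_counts.items():
        if coll_counts[term] == 0:
            # Term missing from the collection statistics: skip it
            # rather than divide by zero.
            continue
        p_fb = count / fb_total
        p_coll = coll_counts[term] / coll_total
        scores[term] = p_fb * math.log(p_fb / p_coll)
    return scores


def expand_query(query_terms, feedback_docs, collection_docs, k=2):
    """Append the k highest-scoring new terms to the query."""
    scores = kld_term_scores(feedback_docs, collection_docs)
    candidates = sorted(
        (t for t in scores if t not in query_terms),
        key=lambda t: (-scores[t], t),  # ties broken alphabetically
    )
    return list(query_terms) + candidates[:k]


# Toy collection (hypothetical data); the first and third documents play
# the role of the top-ranked feedback documents.
collection = [
    ["wavelet", "transform", "signal"],
    ["query", "expansion", "wordnet"],
    ["signal", "processing", "noise"],
    ["cooking", "recipe", "pasta"],
]
feedback = [collection[0], collection[2]]
expanded = expand_query(["signal"], feedback, collection, k=2)
```

In a combined system of the kind the abstract describes, the expanded query would then be passed to the spectral ranking stage, so that both term mismatch and term proximity are addressed.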
Authors and Affiliations
Sara Alnofaie, Mohammed Dahab, Mahmoud Kamal
Analysis of Heart Rate Variability by Applying Nonlinear Methods with Different Approaches for Graphical Representation of Results
There is an open discussion over nonlinear properties of the Heart Rate Variability (HRV) which takes place in most scientific studies nowadays. The HRV analysis is a non-invasive and effective tool that manages to refle...
A Method of Automatic Domain Extraction of Text to Facilitate Retrieval of Arabic Documents
Arabic content on the web has increased because of the growth in the number of Arabic speakers who use the internet worldwide. Accordingly, this study introduces an automatic approach of domain extrac...
Faculty’s Social Media usage in Higher Education Embrace Change or Left Behind
This paper addresses faculty members’ (academic staff) viewpoints on benefits, barriers and concerns of utilizing social media and also investigates differences with respect to their social media experience in teaching,...
Applications of Multi-criteria Decision Making in Software Engineering
Every complex problem nowadays requires multi-criteria decision making to reach the desired solution. Numerous multi-criteria decision making (MCDM) approaches have evolved over recent time to accommodate various applicat...
A Proposed Adaptive Scheme for Arabic Part-of Speech Tagging
This paper presents an Arabic-compliant part-of-speech (POS) tagging scheme based on using atomic tag markers that are grouped together using brackets. This scheme promotes the speedy production of annotations while pres...