Investigate the use of Anchor-Text and of Query-Document Similarity Scores to Predict the Performance of Search Engine

Abstract

Query difficulty prediction aims to estimate, in advance, whether the answers returned by a search engine in response to a query are likely to be useful. This paper proposes new predictors based on the similarity between the query and the answer documents, as calculated by three different models. It also examines anchor-text-based document surrogates and how their similarity to queries can be used to estimate query difficulty. The predictors are evaluated on the WT10g collection of web data in terms of the correlation between retrieval effectiveness on the full-text results, measured by average precision (AP) and precision at 10 (P@10), and the similarity scores computed over anchor text and over full text. The experiments show that five of the proposed predictors perform reliably and consistently across a variety of retrieval models.
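
The evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the predictor shown (the mean similarity score of the top-ranked documents), the choice of Pearson correlation, and all function names are assumptions made for illustration, since the abstract does not specify them. The sketch computes AP and P@10 for a ranked list and correlates a per-query predictor value with per-query AP.

```python
from math import sqrt


def average_precision(ranked_relevance, num_relevant):
    """AP for one query; ranked_relevance holds 0/1 flags in rank order."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at each relevant rank
    return total / num_relevant if num_relevant else 0.0


def precision_at_k(ranked_relevance, k=10):
    """Fraction of the top-k results that are relevant (P@10 by default)."""
    return sum(ranked_relevance[:k]) / k


def mean_top_k_similarity(scores, k=10):
    """Hypothetical predictor: mean query-document similarity of the top k."""
    top = sorted(scores, reverse=True)[:k]
    return sum(top) / len(top) if top else 0.0


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)
```

Under these assumptions, a predictor would be judged by calling `pearson(predictor_values, ap_values)` over all test queries, once with similarity scores taken from anchor-text surrogates and once with scores taken from full text.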

Authors and Affiliations

Abdulmohsen Almalawi, Rayed AlGhamdi, Adel Fahad

  • EP ID EP240767
  • DOI 10.14569/IJACSA.2017.081140

How To Cite

Abdulmohsen Almalawi, Rayed AlGhamdi, Adel Fahad (2017). Investigate the use of Anchor-Text and of Query-Document Similarity Scores to Predict the Performance of Search Engine. International Journal of Advanced Computer Science & Applications, 8(11), 320-332. https://europub.co.uk/articles/-A-240767