Query Expansion in Information Retrieval using Frequent Pattern (FP) Growth Algorithm for Frequent Itemset Search and Association Rules Mining

Abstract

The number of documents on the Internet has grown exponentially, making it difficult for users to find the documents or information they need. Special techniques are needed to retrieve documents that are relevant to user queries, and Information Retrieval (IR) is one of them. IR is the process of finding data (generally text documents) that match an information need within a collection of documents stored on a computer. A common problem in IR is poorly formulated user queries, which stem from users' limited ability to express their needs in a query. Researchers have proposed various solutions to overcome this limitation, one of which is Query Expansion (QE). Methods that have been applied to QE include Ontology, Latent Semantic Indexing (LSI), Local Co-Occurrence, Relevance Feedback, Concept Based, and WordNet / Synonym Mapping. However, these methods still have limitations, one of which is that they do not show how the occurrences of words or phrases are related across the document collection. To overcome this limitation, this study proposes an approach to QE that uses the FP-Growth algorithm for frequent itemset search and Association Rules (AR) mining. AR are applied to QE to capture how the occurrence of one word or term relates to that of another in the document collection, and the resulting terms are used to expand the user's query. The main contribution of this study is the use of association rules with FP-Growth on the document collection to find relationships between word occurrences, which are then used to expand the user's original query in IR. QE performance is evaluated using recall, precision, and F-measure. The results indicate that applying AR to QE can improve the relevance of the retrieved documents: the average recall, precision, and F-measure were 94.44%, 89.98%, and 92.07%, respectively. Compared with IR without QE, IR with QE increased recall by 25.65%, precision by 1.93%, and F-measure by 15.78%.
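The idea of expanding a query with terms linked by association rules can be illustrated with a minimal sketch. It is not the authors' implementation: for brevity it counts term pairs by brute force instead of building an FP-tree as FP-Growth does, it only mines single-term rules, and the function names, thresholds, and toy documents are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_pair_rules(docs, min_support=0.5, min_confidence=0.7):
    """Mine single-term rules x -> y from term co-occurrence in documents.

    Brute-force pair counting stands in for FP-Growth here (an assumption
    made to keep the sketch short). Each document is one transaction.
    """
    n = len(docs)
    transactions = [set(d.lower().split()) for d in docs]
    term_count = Counter(t for tx in transactions for t in tx)
    pair_count = Counter(
        pair for tx in transactions for pair in combinations(sorted(tx), 2)
    )
    rules = {}
    for (a, b), count in pair_count.items():
        if count / n < min_support:  # keep only frequent pairs
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / term_count[x]  # estimate of P(y | x)
            if confidence >= min_confidence:
                rules.setdefault(x, []).append((y, confidence))
    return rules


def expand_query(query, rules):
    """Append the consequents of rules whose antecedent is a query term."""
    terms = query.lower().split()
    expanded = list(terms)
    for t in terms:
        ranked = sorted(rules.get(t, []), key=lambda r: (-r[1], r[0]))
        for consequent, _ in ranked:
            if consequent not in expanded:
                expanded.append(consequent)
    return " ".join(expanded)


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (balanced F-measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy collection, not data from the study.
docs = [
    "query expansion retrieval",
    "query expansion association",
    "retrieval ranking",
    "query expansion ranking",
]
rules = mine_pair_rules(docs)
print(expand_query("query", rules))  # prints "query expansion"
```

In this toy collection only the pair (expansion, query) reaches the support threshold, so a query containing one of those terms gains the other, while a query such as "ranking" is left unchanged.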

Authors and Affiliations

Lasmedi Afuan, Ahmad Ashari, Yohanes Suyanto

  • EP ID EP468349
  • DOI 10.14569/IJACSA.2019.0100235

How To Cite

Lasmedi Afuan, Ahmad Ashari, Yohanes Suyanto (2019). Query Expansion in Information Retrieval using Frequent Pattern (FP) Growth Algorithm for Frequent Itemset Search and Association Rules Mining. International Journal of Advanced Computer Science & Applications, 10(2), 263-267. https://europub.co.uk/articles/-A-468349