A Web Search Engine based approach to Measure the Semantic Similarity between Words using Page Count and Snippets Method (PCSM)

Abstract

Semantic similarity measures play an important role in Information Retrieval, Natural Language Processing, and Web Mining applications such as community mining, relation detection, entity disambiguation, and document clustering. This paper proposes the Page Count and Snippets Method (PCSM) to estimate the semantic similarity between any two words (or entities) based on page counts and text snippets retrieved from a web search engine. It defines five page-count-based co-occurrence measures and integrates them with lexical patterns extracted from text snippets. A lexical pattern extraction algorithm is proposed to identify the semantic relations that exist between the words of a query pair. The similarity scores of both methods are integrated using a Support Vector Machine (SVM) to get optimal results. The proposed method is evaluated on the Miller and Charles (MC) benchmark data set, and its performance is measured using the Pearson correlation coefficient. The correlation value of the proposed method is 0.8960, which is higher than that of existing methods. PCSM also evaluates semantic relations between named entities to improve precision, recall, and F-score.
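As an illustration of the two numeric ingredients mentioned above, the following minimal Python sketch computes co-occurrence scores from page counts and a Pearson correlation against human ratings. The abstract does not name the five measures, so the Jaccard-style and Dice-style formulas here are assumptions chosen as common examples of page-count measures, not necessarily the ones the paper defines. The function names, the `threshold` parameter, and all numbers used with them are hypothetical.

```python
import math


def web_jaccard(count_p, count_q, count_pq, threshold=5):
    """Jaccard-style co-occurrence score from page counts (assumed measure).

    count_p, count_q: page counts for each word queried alone.
    count_pq: page count for the conjunctive query "P AND Q".
    threshold: hypothetical cutoff that treats rare co-occurrences as noise.
    """
    if count_pq <= threshold:
        return 0.0
    return count_pq / (count_p + count_q - count_pq)


def web_dice(count_p, count_q, count_pq, threshold=5):
    """Dice-style co-occurrence score from page counts (assumed measure)."""
    if count_pq <= threshold:
        return 0.0
    return 2 * count_pq / (count_p + count_q)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

In an evaluation of the kind the abstract describes, `pearson` would be called with the method's similarity scores for the benchmark word pairs and the corresponding human ratings. How PCSM combines the page-count scores with the snippet patterns through the SVM is not specified here, so that step is left out of the sketch.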

Authors and Affiliations

Ms. Vaishali Nirgude, Dr. Rekha Sharma, Dr. R. R. Sedamkar

How To Cite

Ms. Vaishali Nirgude, Dr. Rekha Sharma, Dr. R. R. Sedamkar (2013). A Web Search Engine based approach to Measure the Semantic Similarity between Words using Page Count and Snippets Method (PCSM). International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), 2(7), 2252-2257. https://europub.co.uk/articles/-A-115190