Empirical Analysis of Document Similarity Using Statistical Model

Abstract

Information retrieval is the technology behind web search services. This paper presents a statistical method for content-based information retrieval. Three main paradigms of models are used to retrieve information: Boolean, probabilistic, and vector space. The paper also presents empirical studies of document similarity and discusses issues in information retrieval systems that use a statistical model. The vector space model is the classical and most widely used retrieval model. Retrieval is performed by computing the cosine similarity between the query vector and the set of document vectors. Finally, we conclude by reporting the results together with human scores for various document types, such as sports, politics, and short stories.
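
As an illustration of the retrieval step described in the abstract, the following is a minimal sketch of cosine-similarity ranking in a vector space model. It assumes raw term-frequency weights and whitespace tokenization, which the abstract does not specify; the function names (term_vector, cosine_similarity, rank_documents) and the sample texts are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a raw term-frequency vector (assumed weighting, not from the paper)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_documents(query, documents):
    """Return (score, document index) pairs, highest score first."""
    q = term_vector(query)
    scored = [(cosine_similarity(q, term_vector(d)), i)
              for i, d in enumerate(documents)]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    # Hypothetical sample documents for illustration only.
    docs = [
        "the team won the football match",
        "parliament passed the new budget",
    ]
    for score, index in rank_documents("football match", docs):
        print(f"{score:.3f}  {docs[index]}")
```

A real system would more likely use TF-IDF weights and proper tokenization; raw counts are used here only to keep the sketch short.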

Authors and Affiliations

Jyoti Phogat, Atul Kumar

  • EP ID EP391519
  • DOI 10.9790/9622-0706074650.

How To Cite

Jyoti Phogat, Atul Kumar (2017). Empirical Analysis of Document Similarity Using Statistical Model. International Journal of Engineering Research and Applications, 7(6), 46-50. https://europub.co.uk/articles/-A-391519