Empirical Analysis of Document Similarity Using Statistical Model
Journal Title: International Journal of Engineering Research and Applications - Year 2017, Vol 7, Issue 6
Abstract
Information retrieval is the technology behind web search services. This paper presents a statistical method for content-based information retrieval. Three main paradigms of models are used to retrieve information: Boolean, probabilistic, and vector space. The paper also presents empirical studies of document similarity and discusses issues in information retrieval systems that use a statistical model. The vector space model is the classical and most widely used retrieval model. Retrieval is performed by computing the cosine similarity between the query vector and each vector in the set of document vectors. Finally, we report the results against human scores for various types of documents, such as sports, politics, and short stories.
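The cosine-similarity retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes raw term-frequency weights and whitespace tokenization, neither of which is specified in the abstract; the function names `term_vector`, `cosine_similarity`, and `rank_documents` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a sparse term-frequency vector (assumed weighting: raw counts)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_documents(query, documents):
    """Return (score, document) pairs sorted by decreasing similarity."""
    q = term_vector(query)
    scored = [(cosine_similarity(q, term_vector(d)), d) for d in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Illustrative documents only; these are not the paper's data.
    docs = [
        "the team won the football match",
        "parliament passed the election bill",
        "a short story about a quiet village",
    ]
    for score, doc in rank_documents("football match", docs):
        print(f"{score:.3f}  {doc}")
```

A practical system would typically replace raw counts with a weighting such as TF-IDF and add stop-word removal and stemming, but the ranking step stays the same: score each document by its cosine with the query and sort.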
Authors and Affiliations
Jyoti Phogat, Atul Kumar