An Efficient Approach towards Duplicate Detection System

Abstract

The volume of information on the web is enormous, and the work of search engines has grown increasingly complex because a single entity on the web may have two or more representations in databases. Duplicate detection is the process of identifying multiple representations of the same real-world entity. Because duplicate detection methods must process large datasets, identifying duplicate documents in a large database is a significant problem with widespread applications. This paper presents a review of various approaches to duplicate detection. The proposed system compares two duplicate detection methods. The first is based on two novel progressive duplicate detection algorithms that significantly increase the efficiency of finding duplicates when execution time is limited. The second is based on the Secure Hash Algorithm, which detects and deletes duplicate data; it performs the data de-duplication task in order to address time constraints and to reduce hash collisions.
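The two families of methods named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the SHA variant or the progressive algorithms, so SHA-256, the sorted-neighbourhood ordering, the string-similarity measure, and the names `dedupe_by_hash`, `progressive_pairs`, `max_window`, and `threshold` are all assumptions made for illustration.

```python
import hashlib
from difflib import SequenceMatcher


def dedupe_by_hash(records):
    """Exact de-duplication: keep the first record for each SHA-256 digest.

    Assumption: SHA-256 stands in for the unspecified "Secure Hash Algorithm".
    """
    seen = set()
    unique = []
    for record in records:
        digest = hashlib.sha256(record.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(record)
    return unique


def progressive_pairs(records, max_window=3, threshold=0.8):
    """Yield likely duplicate pairs, nearest sort-neighbours first.

    Assumption: a progressive sorted-neighbourhood style pass. Records are
    sorted once, then pairs at rank distance 1, 2, ... are compared, so the
    most promising comparisons run first if execution is cut short.
    """
    ordered = sorted(records, key=str.lower)
    for distance in range(1, max_window):
        for i in range(len(ordered) - distance):
            a, b = ordered[i], ordered[i + distance]
            similarity = SequenceMatcher(None, a.lower(), b.lower()).ratio()
            if similarity >= threshold:
                yield a, b
```

Because `progressive_pairs` is a generator, a caller can stop consuming it when a time budget runs out and still keep the pairs found so far. The hash-based function only catches byte-identical records, which is the trade-off between the two approaches.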

Authors and Affiliations

Miss. Ruchira Dhananjay Deshpande, Sonali Bodkhe

How To Cite

Miss. Ruchira Dhananjay Deshpande, Sonali Bodkhe (2017). An Efficient Approach towards Duplicate Detection System. International Journal for Research in Applied Science and Engineering Technology (IJRASET), 5(1), -. https://europub.co.uk/articles/-A-23038