An Efficient Approach towards Duplicate Detection System

Abstract

The volume of information on the web is enormous, and the work of search engines has become increasingly complex because a single entity on the web may have two or more representations in databases. Duplicate detection is the process of identifying multiple representations of the same real-world entity. Because duplicate detection methods must process large datasets, identifying duplicate documents in a large database is a significant problem with widespread applications. This paper presents a review of various approaches to duplicate detection. The proposed system will compare two duplicate detection methods. The first is based on two novel progressive duplicate detection algorithms that significantly increase the efficiency of finding duplicates when execution time is limited. The second is based on the Secure Hash Algorithm, which will detect and delete duplicate data; it performs the data de-duplication task in order to address the time constraints and to reduce hash collisions.
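The two families of methods named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which Secure Hash Algorithm variant or which progressive algorithms are used, so SHA-256, the whitespace and case normalization, and the function names `sha256_digest`, `dedupe_by_hash`, and `progressive_pairs` are all assumptions made for illustration. The progressive part shows only one common idea, comparing records that sit closest together in a sorted order first, so that the most likely duplicates are examined before the time budget runs out.

```python
import hashlib


def sha256_digest(text: str) -> str:
    """Return the SHA-256 hex digest of a normalized string.

    Normalization (lowercasing, collapsing whitespace) is an assumption
    of this sketch, not something specified in the abstract.
    """
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe_by_hash(records):
    """Keep the first record seen for each digest.

    Returns (unique, duplicates), where duplicates is a list of
    (duplicate_record, first_seen_record) pairs.
    """
    seen = {}
    unique, duplicates = [], []
    for record in records:
        digest = sha256_digest(record)
        if digest in seen:
            duplicates.append((record, seen[digest]))
        else:
            seen[digest] = record
            unique.append(record)
    return unique, duplicates


def progressive_pairs(records, key=str.lower, max_distance=3):
    """Yield candidate pairs from a sorted order, nearest neighbours first.

    Pairs at rank distance 1 are yielded before pairs at distance 2, and
    so on, so a caller that stops early has already seen the pairs that
    are closest in the sort order.
    """
    ordered = sorted(records, key=key)
    for distance in range(1, max_distance + 1):
        for i in range(len(ordered) - distance):
            yield ordered[i], ordered[i + distance]


if __name__ == "__main__":
    sample = ["John Smith", "john  smith", "Jon Smith", "Alice Wong"]
    unique, duplicates = dedupe_by_hash(sample)
    print(unique)
    print(duplicates)
    for left, right in progressive_pairs(unique, max_distance=1):
        print(left, "<->", right)
```

Hashing catches only records that are identical after normalization, while the progressive pair ordering is meant to feed a separate similarity comparison that can stop whenever the available time is used up.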

Authors and Affiliations

Miss. Ruchira Dhananjay Deshpande, Sonali Bodkhe


  • EP ID EP23038

How To Cite

Miss. Ruchira Dhananjay Deshpande, Sonali Bodkhe (2017). An Efficient Approach towards Duplicate Detection System. International Journal for Research in Applied Science and Engineering Technology (IJRASET), 5(1), -. https://europub.co.uk/articles/-A-23038