Novel and Efficient Approach for Duplicate Record Detection
(Vol-3, Issue-11, November 2016) | Open Access
Author(s): |
Mrs. D. V. Lalita Parameswari, K. Mounika
Keywords: |
Data cleaning, Duplicate detection, Entity Resolution, Progressiveness. |
Abstract: |
Checking whether different records refer to the same real-world entity, known as data replica (duplicate record) detection, is an essential task today. For large datasets, execution time is a critical factor in duplicate detection, yet reducing it must not affect the quality of the result. This work introduces two duplicate detection algorithms that report more duplicates within a limited execution time than conventional techniques: the progressive sorted neighbourhood method (PSNM), which performs best on small and almost clean datasets, and progressive blocking (PB), which, together with a parallel sorted neighbourhood method, performs best on large and very dirty datasets. Both algorithms enhance the efficiency of duplicate detection even on very large datasets.
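To make the idea concrete, the following is a minimal sketch of the progressive sorted-neighbourhood idea underlying PSNM: records are sorted by a key, and candidate pairs are compared in order of increasing rank distance, so the most promising (closest) comparisons are made first. The `key_fn` and `is_duplicate` callables and the sample data are illustrative assumptions, not part of the published algorithm, which additionally partitions the data to fit memory and adapts the comparison order at run time.

```python
# Minimal sketch of the progressive sorted-neighbourhood idea (not the full
# PSNM algorithm): sort records by a key, then compare neighbours at
# increasing rank distance so likely duplicates are reported early.
from difflib import SequenceMatcher
from typing import Any, Callable, Iterable, List, Tuple


def progressive_sorted_neighbourhood(
    records: List[Any],
    key_fn: Callable[[Any], Any],
    is_duplicate: Callable[[Any, Any], bool],
    max_window: int = 10,
) -> Iterable[Tuple[Any, Any]]:
    """Yield duplicate pairs, closest sorting-key neighbours first."""
    ordered = sorted(records, key=key_fn)      # sort by the chosen sorting key
    for distance in range(1, max_window):      # grow the neighbourhood progressively
        for i in range(len(ordered) - distance):
            a, b = ordered[i], ordered[i + distance]
            if is_duplicate(a, b):             # expensive similarity check
                yield (a, b)


if __name__ == "__main__":
    # Hypothetical example data and similarity measure.
    people = [
        {"id": 1, "name": "john smith"},
        {"id": 2, "name": "jon smith"},
        {"id": 3, "name": "alice brown"},
    ]
    dupes = progressive_sorted_neighbourhood(
        people,
        key_fn=lambda r: r["name"],
        is_duplicate=lambda a, b: SequenceMatcher(None, a["name"], b["name"]).ratio() > 0.8,
    )
    for a, b in dupes:
        print(a["id"], "~", b["id"])   # prints: 1 ~ 2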