New Techniques to Enhance Data Deduplication using Content based-TTTD Chunking Algorithm

Abstract

Driven by the rapid, unchecked growth of digital data, data reduction has attracted increasing attention and become a popular approach in large-scale storage systems. One of the most effective data reduction techniques is data deduplication, in which redundant data at the file or sub-file level is detected and identified using a hash algorithm. Data deduplication has been shown to be much more efficient than conventional compression in large-scale storage systems in terms of space reduction. The Two Threshold Two Divisor (TTTD) chunking algorithm is one of the popular chunking algorithms used in deduplication, but it needs considerable time and system resources to compute its chunk boundaries. This paper presents new techniques to enhance the TTTD chunking algorithm: a new fingerprint function, a multi-level hashing and matching technique, and a new indexing technique to store the metadata. These techniques use four hashing algorithms to solve the collision problem and add a new chunking condition to the TTTD conditions in order to increase the number of small chunks, which in turn increases the deduplication ratio. The enhancement improves the deduplication ratio produced by the TTTD algorithm and reduces the system resources it needs. The proposed algorithm is evaluated in terms of deduplication ratio, execution time, and metadata size.
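
For orientation, the baseline TTTD idea that the abstract builds on can be sketched as follows. This is a minimal illustration of the standard two-threshold, two-divisor scheme only, not the enhanced algorithm proposed in the paper: the polynomial rolling hash, the window size, the parameter values, the function names (tttd_chunks, dedup_ratio), and the ratio definition (total bytes divided by unique bytes stored) are all assumptions chosen for the example.

```python
import hashlib

BASE = 257                # assumed rolling-hash base
MOD = (1 << 31) - 1       # assumed rolling-hash modulus


def _find_cut(data, start, t_min, t_max, d_main, d_backup, window, top):
    """Return the exclusive end index of the chunk that begins at `start`."""
    h = 0
    backup = -1                                # last backup breakpoint seen
    limit = min(len(data), start + t_max)      # never exceed the max threshold
    for i in range(start, limit):
        if i - start >= window:                # drop the byte leaving the window
            h = (h - data[i - window] * top) % MOD
        h = (h * BASE + data[i]) % MOD
        if i - start + 1 < t_min:              # too short: ignore breakpoints
            continue
        if h % d_backup == d_backup - 1:       # second (backup) divisor matched
            backup = i + 1
        if h % d_main == d_main - 1:           # main divisor matched: cut here
            return i + 1
    # No main breakpoint found: at t_max fall back to the backup breakpoint.
    if limit - start >= t_max and backup != -1:
        return backup
    return limit


def tttd_chunks(data, t_min=460, t_max=2800, d_main=540, d_backup=270, window=48):
    """Split `data` into content-defined chunks (illustrative parameters)."""
    top = pow(BASE, window - 1, MOD)           # weight of the oldest window byte
    chunks, start = [], 0
    while start < len(data):
        end = _find_cut(data, start, t_min, t_max, d_main, d_backup, window, top)
        chunks.append(data[start:end])
        start = end
    return chunks


def dedup_ratio(chunks):
    """Total bytes divided by bytes kept after dropping duplicate chunks."""
    unique = {}
    for c in chunks:
        # A single SHA-256 fingerprint stands in for the paper's multi-level hashing.
        unique.setdefault(hashlib.sha256(c).digest(), len(c))
    total = sum(len(c) for c in chunks)
    stored = sum(unique.values())
    return total / stored if stored else 1.0
```

In this sketch every chunk except possibly the last has a length between t_min and t_max, and joining the chunks reproduces the input. Lowering the thresholds or divisors yields more, smaller chunks, which is the lever the abstract refers to when it links small chunks to a higher deduplication ratio.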

Authors and Affiliations

Hala AbdulSalam Jasim, Assmaa A. Fahad

  • EP ID EP315574
  • DOI 10.14569/IJACSA.2018.090515

How To Cite

Hala AbdulSalam Jasim, Assmaa A. Fahad (2018). New Techniques to Enhance Data Deduplication using Content based-TTTD Chunking Algorithm. International Journal of Advanced Computer Science & Applications, 9(5), 116-121. https://europub.co.uk/articles/-A-315574