Published April 25, 2022 | Version v1
Journal article Open

Data Duplication Removal using File Checksum

Description

The project lets a user check for duplicates in the database by comparing the hash value of an uploaded file against the hashes of files already stored. If a file with the same hash already exists in the database, the upload is not stored; otherwise, the file is saved. The goal is to develop software that uses file checksums to prevent data duplication. Specifically, the project aims to reduce the number of duplicates in the database (particularly the key-value store), to improve process performance so that the backup window is not impacted, and to design for horizontal scaling so that it can compete on a cloud platform.
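The upload check described above can be illustrated with a minimal sketch. It is not the article's implementation: the choice of SHA-256, the in-memory dictionary standing in for the key-value store, and the names `file_checksum` and `DedupStore` are all assumptions made for illustration.

```python
import hashlib
from typing import Dict, Tuple


def file_checksum(data: bytes) -> str:
    """Return the hex SHA-256 digest of the file contents.

    SHA-256 is an assumption; the description only says "hash value".
    """
    return hashlib.sha256(data).hexdigest()


class DedupStore:
    """Hypothetical stand-in for the key-value store, keyed by checksum."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def upload(self, data: bytes) -> Tuple[str, bool]:
        """Store the file unless one with the same checksum already exists.

        Returns (checksum, stored); stored is False for a duplicate.
        """
        digest = file_checksum(data)
        if digest in self._blobs:
            # Same checksum already present: skip the write.
            return digest, False
        self._blobs[digest] = data
        return digest, True

    def __len__(self) -> int:
        # Number of distinct files currently held.
        return len(self._blobs)
```

Uploading the same bytes twice through `DedupStore.upload` would return `stored=True` the first time and `stored=False` the second, leaving a single copy in the store. Keying the store by checksum keeps the duplicate check to one lookup per upload, which is one plausible way to keep the backup window short.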

Files

IJCT-V9I2P41.pdf

Files (975.6 kB)

md5:0b2da7f26dc329e936116a688430e22b