Published April 25, 2022
| Version v1
Journal article
Open
Data Duplication Removal using File Checksum
Authors/Creators
Description
The project lets a user check for duplicates in the database by comparing the hash value of each uploaded file against those already stored. If the file already exists in the database, it is not stored again; otherwise, it is saved. The goal is to develop software that uses file checksums to prevent data duplication. Specifically, the project aims to reduce the number of duplicates in the database, particularly in the key-value store; to improve process performance so that the backup window is not impacted; and to support horizontal scaling so that it can compete on a cloud platform.
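The check described above can be illustrated with a minimal sketch. It is not the authors' implementation: the description does not name a hash algorithm or a storage backend, so SHA-256 and an in-memory dictionary stand in for them here, and the names `file_checksum`, `DedupStore`, and `upload` are hypothetical.

```python
import hashlib
import io
from typing import BinaryIO, Dict

CHUNK_SIZE = 64 * 1024  # read in chunks so large files are never fully buffered by the hasher


def file_checksum(stream: BinaryIO) -> str:
    """Return the hex SHA-256 digest of a binary stream (algorithm is an assumption)."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(CHUNK_SIZE), b""):
        digest.update(chunk)
    return digest.hexdigest()


class DedupStore:
    """Hypothetical in-memory stand-in for the key-value store, keyed by checksum."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def upload(self, data: bytes) -> bool:
        """Store data if its checksum is new; return True if stored, False if duplicate."""
        key = file_checksum(io.BytesIO(data))
        if key in self._blobs:
            return False  # same checksum already present: skip the write
        self._blobs[key] = data
        return True

    def __len__(self) -> int:
        return len(self._blobs)


if __name__ == "__main__":
    store = DedupStore()
    print(store.upload(b"nightly backup"))   # first upload is stored
    print(store.upload(b"nightly backup"))   # identical content is rejected
    print(store.upload(b"another file"))     # different content is stored
    print(len(store))
```

Keying the store by the checksum means the duplicate check is a single lookup rather than a comparison against every stored file, which is one plausible way to keep the check cheap as the store grows.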
Files
| Name | Size |
|---|---|
| IJCT-V9I2P41.pdf (md5:0b2da7f26dc329e936116a688430e22b) | 975.6 kB |