Automatic detection of duplicate records in institutional repositories
Creators
- Open University, United Kingdom
Description
Multiple copies of the same article in repositories make it difficult to maintain the integrity and clarity of the research graph. Processing errors and a lack of communication between co-authors, among other issues, lead to duplicate and near-duplicate records. The CORE Dashboard Versions and Duplicates module was developed to address this by identifying versions and duplicates within a repository. It supports side-by-side comparison of records and the labelling of versions and of exact duplicates for removal.
This presentation will report on experience with the module and on the collective feedback from repository managers, and will give an update on efforts to integrate duplicate and near-duplicate matching into the deposit workflow.
Files
Name | Size |
---|---|
OR2024_dedup.pdf (md5:5e46f05f86544484752d80cc8e5029cc) | 2.0 MB |