Published June 14, 2022 | Version 1
Presentation Open

Synchronic Curation for Assessing Reuse and Integration Fitness of Multiple Data Collections

  • 1. University of Texas at Austin

Description

Data-driven applications often require reusing data integrated from different collections. Each collection may be very large, evolve over time, and contain gaps. Across collections, the data may overlap, conflict, or complement each other. Thus, curators need to continuously evaluate whether data from different collections are fit to be integrated for reuse. To assess multiple large data collections at the same time, we propose a framework called Synchronic Curation (SC). SC involves a process and technical infrastructure to map different collections into a unifying data model. The framework includes collection assessment and comparison components that allow curators and researchers to track data growth and updates, and to evaluate data quality by identifying gaps, changes, differences, and irregularities, both within large, frequently updated collections and across multiple collections simultaneously. The unifying data model and the assessment results can be reviewed through interactive graphs. In this presentation we describe the SC framework as a multi-dataset diagnosis tool. We demonstrate its implementation with ASTRIAGraph, a space sustainability knowledge system.
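As a rough illustration of the idea of mapping collections into a unifying model and then comparing them, the following minimal Python sketch may help. It is not the SC implementation or the ASTRIAGraph schema: the unified fields (`name`, `launch_year`), the source field names, the helper functions `to_unified` and `assess`, and the sample records are all hypothetical and chosen only for illustration.

```python
# Hypothetical unified fields; the actual SC data model is not described here.
UNIFIED_FIELDS = ("name", "launch_year")


def to_unified(rows, key_field, field_map):
    """Map one collection's rows into {object_id: {unified_field: value}}."""
    unified = {}
    for row in rows:
        record = {}
        for field in UNIFIED_FIELDS:
            source_field = field_map.get(field)
            # A field the source does not provide becomes None (a gap).
            record[field] = row.get(source_field) if source_field else None
        unified[str(row[key_field])] = record
    return unified


def assess(a, b):
    """Compare two unified collections object by object, field by field."""
    report = {
        "only_in_a": sorted(a.keys() - b.keys()),
        "only_in_b": sorted(b.keys() - a.keys()),
        "gaps": [],         # value missing in both collections
        "complements": [],  # value present in exactly one collection
        "conflicts": [],    # both present, but different
    }
    for oid in sorted(a.keys() & b.keys()):
        for field in UNIFIED_FIELDS:
            va, vb = a[oid][field], b[oid][field]
            if va is None and vb is None:
                report["gaps"].append((oid, field))
            elif va is None or vb is None:
                report["complements"].append((oid, field))
            elif va != vb:
                report["conflicts"].append((oid, field, va, vb))
    return report


# Two made-up collections with different schemas (illustrative data only).
catalog_a = [
    {"id": 1, "title": "SAT-A", "year": 1998},
    {"id": 2, "title": "SAT-B", "year": None},
    {"id": 3, "title": "SAT-C", "year": 2010},
    {"id": 5, "title": "SAT-E", "year": None},
]
catalog_b = [
    {"obj": "1", "label": "SAT-A", "launched": 1999},
    {"obj": "2", "label": "SAT-B", "launched": 2005},
    {"obj": "4", "label": "SAT-D", "launched": 2015},
    {"obj": "5", "label": "SAT-E"},
]

unified_a = to_unified(catalog_a, "id", {"name": "title", "launch_year": "year"})
unified_b = to_unified(catalog_b, "obj", {"name": "label", "launch_year": "launched"})
report = assess(unified_a, unified_b)

for category, items in report.items():
    print(category, items)
```

Running the same comparison each time a collection is updated would give a simple way to track how gaps and conflicts change over time, though a real system would need a far richer model and identity resolution than the shared key assumed here.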
