Presentation Open Access

Documentation to Foster Sharing and Use of Open Earth Science Data: Quality Information

Downs, Robert R.; Peng, Ge; Moroni, David F.; Ramapriyan, Hampapuram K.; Wei, Yaxing

Providing capabilities to reuse open Earth science data makes it possible to leverage observations from previously conducted research in new research. By supporting data producers in their efforts to share the data that they have collected, scientific data repositories enable broader audiences to reuse these data. However, simply providing access to data is not enough to facilitate understanding of the data by diverse users who did not collect the data and are not trained in the discipline that the data represent. Nor is access alone enough to foster reuse by transdisciplinary users, practitioners, learners, and members of the general public who could benefit from reusing research data. If plans for the dissemination of data include reuse across disciplinary boundaries, including non-scientific reuse, then the data must be packaged, described, and made discoverable in a way that allows potential users to understand the data in terms of their own objectives. Effectively packaging, documenting, and indexing data helps both broader audiences and those already familiar with the data to understand and reuse data products and services. Such data curation activities can also reduce the potential for misuse of the data.

What is often missing is the documentation of data quality that is needed to support the effective use of open data. In addition to ensuring that data are designated as open, the quality of datasets must also be described to foster more informed and proper usage. Information about the quality of open data should be clearly documented and discoverable so that it is available to all potential users of the data. When searching for open data to complete a project, potential users need this information to determine whether the data quality is sufficient for their needs. Providing data quality information reduces the need to make assumptions when selecting among candidate data products and services for potential reuse. Similarly, data users require information on data quality when deciding on the applicability of data and on the methods to be used for analyzing and interpreting the data. Information about data quality should be described clearly enough to be easily understood by potential users from cross-disciplinary fields, as well as by practitioners such as operational forecasters, planners, and decision makers.

Reuse scenarios demonstrate the need for providing understandable data quality information along with the data. For example, when identifying open data products for possible integration to create new data products or services, interoperable data integration workflows are developed to ensure that the quality of each candidate dataset is described. Data quality descriptions enable candidate data products to be assessed in terms of their compatibility with each other and their applicability for the purposes of the proposed data integration project. In these workflows, the quality of the resulting integrated dataset must be described in appropriately curated discovery metadata and in the data documentation to foster traceable, targeted, and efficient decision making about the potential of each data product and to support the data integration effort. The importance of documenting data quality is described and the value of effectively documenting and packaging data quality information is demonstrated with real-world research and operational use cases.

Presented by Dr. Robert R. Downs at the 16th International Digital Curation Conference, 19 April 2021, Edinburgh, Scotland, and virtual.
Files (327.4 kB)
Downs-IDCC21DocShareUseESDataQualityInfoSlides20210419.pdf (238.1 kB), md5:17e127f3af8c669491a48053da437c2b
IDCC21DownsDocShareUseESDataQualityInfoAbstract20210419.pdf (89.3 kB), md5:de5a5b38c7ecd065f2580550ce66f2ce