Published June 22, 2022 | Version v1
Presentation Open

Bringing Clarity to Solid Earth Chemistry Data: A Standardized Database Derived From Cyberinfrastructures

Creators

Description

The rapid development of statistical methods and machine learning algorithms applied to data from cyberinfrastructures offers genuine opportunities to reveal the secular evolution and chemistry of the solid Earth. However, in their present state, cyberinfrastructures are composed of raw data with missing categories, a non-negligible proportion of errors (including in age information), and misplaced chemical compositions that may be inconsistent with the original publications. These unintended errors are caused mainly by the lack of standards for publishing hard rock geochemistry data and by the limitations of the optical character recognition techniques used to convert tabular data into readable documents. Furthermore, cleaning and manually entering data are tedious and time-consuming. In this study, we present a standardized database that results from systematic data cleaning and data entry in cyberinfrastructures through the joint efforts of geochemists. The high degree of reliability and comprehensiveness obtained from this work directly benefits the research community. To facilitate access, the standardized database will be freely available and published as supplements to geochemistry papers. In addition, we will upload the standardized data to GEOROC and EarthChem to improve the data quality of the cyberinfrastructures.

Files

Files (5.7 MB)

md5:ce63f92cb2576b6561fb59fa8f70401e

Additional details