Presentation Open Access
Improvements in data citation in the climate sciences are presented using CMIP6 (Coupled Model Intercomparison Project Phase 6) as a use case, together with the associated challenges and future directions. Special emphasis is placed on the barriers to data citation and on approaches to lowering them.
Within CMIP5 (Coupled Model Intercomparison Project Phase 5), the data could not be cited prior to its long-term archival in the IPCC Data Distribution Centre (DDC). The Reference Data Archive for AR5 (Assessment Report 5) was built up after the submission deadline for part 1 of the AR5, which was too late for many scientific articles. Yet even the AR5 data in the IPCC DDC is rarely cited in the literature, in spite of annual download volumes of between one and three petabytes. On the other hand, the request for a way to cite the evolving CMIP6 data prior to long-term archival came from the CMIP6 data providers themselves. The additional provision of data citations for the project input4MIPs (input data for CMIP6) could raise scientists’ awareness of the discrepancy between their readiness to cite data and their desire to be cited and receive credit.
The CMIP6 Citation Service is a pragmatic approach built on existing services and services under development: ESGF (Earth System Grid Federation) as the data infrastructure component, DataCite as the DOI registration agency, and Scholix services for tracking data usage information.
Other principles followed to overcome barriers to data citation are:
The CMIP6 Citation Service implements only the credit part of the RDA WGDC recommendation for the citation of dynamic data. The second part, the identification of the data subset underlying an article, is planned for CMIP7 as a data cart approach comprising multiple pre-defined CMIP6 DataCite DOIs. Additional policies on long-term data availability are required.