Reuse of governmental data in research - A guide to open and FAIR data provision
Description
Governmental organizations collect and manage many types of data at different levels in order to fulfil their official tasks. These include geographical, environmental, meteorological, demographic, health, traffic, transport, financial and economic data. Access to this data has traditionally been severely restricted. Over the past ten years, however, there has been a global trend towards more open data policies, promoted by directives such as GeoIDG, the PSI Directive and INSPIRE. In Germany, the federal states and their authorities have also introduced open data policies and make some of this data available to the public via platforms such as Destatis or GDI-DE (Open Government Data). This data is used for a variety of purposes, including determining location, analysing environmental trends, transport planning and health planning. Although it is increasingly used for scientific research, its full potential often remains unrealised, especially for large datasets. Despite the high quality of public authority data, further alignment with the FAIR principles (Findable, Accessible, Interoperable, Reusable) is necessary to improve its reusability for research. In addition, data protection regulations and legal frameworks may impose restrictions that make it necessary to anonymise the data or to comply with modern data standards. Nevertheless, government data is a valuable resource that contributes significantly to knowledge across scientific disciplines.
As part of a pilot project funded by NFDI4Earth, the Deutscher Wetterdienst (DWD) and the German Climate Computing Centre (DKRZ) worked together to facilitate access to data from public authorities, raise the visibility of this data and broaden its user base across disciplines. The aim was to make the data available in standardised, FAIR-compliant formats for research and other public applications. The DWD's COSMO-REA6 reanalysis dataset (Kaspar et al. 2020), which is of central importance for climate modelling, analyses and energy applications in Europe, was selected as an application example. The standardisation process involved converting the data from regulatory data standards to domain-specific climate research standards and required close collaboration between DWD and DKRZ. After careful curation and quality checking, the dataset was made accessible via the Earth System Grid Federation (ESGF) infrastructure and archived for the long term in the World Data Center for Climate (WDCC), taking licensing and authorship into account.
The project's insights and lessons learned were incorporated into a blueprint (Anders et al. 2024) that provides guidance on making data from other authorities accessible and usable for both research and the public. The overall process can be divided into five sub-steps: (1) determination and classification of the need, (2) feasibility survey, (3) implementation, (4) feedback and follow-up, and (5) dissemination. The blueprint outlines generalizable steps and aspects that apply across domains and collaborators, offering a framework for optimizing the use of governmental data in diverse fields.
Files
E-science-Tage_2025_publish_Anders.pdf
References
- Kaspar, F., et al., 2020: Regional atmospheric reanalysis activities at Deutscher Wetterdienst: review of evaluation results and application examples with a focus on renewable energy. Adv. Sci. Res., 17, 115–128. https://doi.org/10.5194/asr-17-115-2020
- Anders, I., et al., 2024: Standardisation and making public authority data available for research. Zenodo. https://doi.org/10.5281/zenodo.10948876