Thesis (Open Access)
PhD thesis compilation of articles, including an introduction and synopsis, and the defense presentation (one document with slides only and one with speaker notes).
Abstract
Reproducibility of computational research, i.e., research based on code and data, poses enormous challenges to all branches of science. In this dissertation, technologies and practices are developed to increase reproducibility and to connect it better with the process of scholarly communication, with a particular focus on geography, geosciences, and GIScience. Based on containerisation, this body of work creates a platform that connects existing academic infrastructures with a newly established executable research compendium (ERC). It is shown how the ERC can improve transparency, understandability, reproducibility, and reusability of research outcomes, e.g., for peer review, by capturing all parts of a workflow for computational research. The core part of the ERC platform is software that can automatically capture the computing environment, requiring authors only to create computational notebooks, which are digital documents that combine text and analysis code. The work further investigates how containerisation can be applied independently of ERCs to package complex workflows using the example of remote sensing, to support data science in general, and to facilitate diverse use cases within the R language community. Based on these technical foundations, the work concludes that functioning practical solutions exist for making reproducibility possible through infrastructure and making reproducibility easy through user experience. Several downstream applications built on top of ERCs provide novel ways to discover and inspect the next generation of publications.
To understand why reproducible research has not been widely adopted, and to contribute to the propagation of reproducible research practices, the dissertation goes on to investigate the state of reproducibility in GIScience and develops and demonstrates workflows that can better integrate the execution of computational analyses into peer review procedures.
We make recommendations for how to (re)introduce reproducible research into peer review, and how to make practices for achieving the highest possible reproducibility normative, rewarding, and, ultimately, required in science. These recommendations rest upon over 100 GIScience papers that were assessed as irreproducible, the experiences from over 30 successful reproductions of workflows across diverse scientific fields, and the lessons learned from implementing the ERC.
Besides continuing the development of the contributed concepts and infrastructure, the dissertation outlines broader topics for future work, such as surveying practices for code execution during peer review of manuscripts, or reproduction and replication studies of the foundational works in the considered scientific disciplines. The technical and social barriers to higher reproducibility are strongly intertwined with other transformations in academia, and improving reproducibility therefore faces similar challenges around culture change and sustainability. However, we clearly show that reproducible research is achievable today using the newly developed infrastructures and practices. The transferability of cross-disciplinary lessons facilitates the establishment of reproducible research practices, and, more than other transformations, the movement towards greater reproducibility can draw on accessible and convincing arguments for individual researchers as well as for their communities.
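The abstract refers to software that automatically captures the computing environment from a computational notebook; among the publications listed below, this role is filled by the containerit R package (Nüst & Hinz, 2019), which generates Dockerfiles from R sessions. The following sketch illustrates that workflow based on the package's documented usage; treat the exact arguments as assumptions that may differ between containerit versions.

```r
# Sketch: capture the current R session as a Dockerfile, so the
# analysis environment can be rebuilt and re-run in a container.
# Assumes the containerit package is installed
# (https://github.com/o2r-project/containerit).
library("containerit")

# Create a Dockerfile object from the running session, recording
# the R version and the loaded packages
df <- dockerfile(from = utils::sessionInfo())

# Inspect the generated instructions
print(df)

# Write the Dockerfile next to the analysis, ready for `docker build`
write(df, file = "Dockerfile")
```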
Name | MD5 checksum | Size
---|---|---
PhD Daniel Nüst - WWU Münster - 2022.pdf | 548d39b0f35ef107e9da88815a2144e3 | 24.1 MB
PhD Defense Daniel Nüst - WWU Münster - 2022-02-14.pdf | 3e85d1af2b7b79390a5e7fd6f673359b | 24.7 MB
PhD Defense Daniel Nüst incl speaker notes - WWU Münster - 2022-02-14.pdf | 0d4a9fe8b615e05bb2ed408d8a7a8187 | 24.9 MB
Knoth, C., & Nüst, D. (2017). Reproducibility and Practical Adoption of GEOBIA with Open-Source Software in Docker Containers. Remote Sensing, 9(3), 290. https://doi.org/10.3390/rs9030290
Konkol, M., Nüst, D., & Goulier, L. (2020). Publishing computational research - a review of infrastructures for reproducible and transparent scholarly communication. Research Integrity and Peer Review, 5(1), 10. https://doi.org/10.1186/s41073-020-00095-y
Nüst, D. (2021). A web service for executable research compendia enables reproducible publications and transparent reviews in geospatial sciences. Zenodo. https://doi.org/10.5281/zenodo.5108218
Nüst, D., Eddelbuettel, D., Bennett, D., Cannoodt, R., Clark, D., Daróczi, G., Edmondson, M., Fay, C., Hughes, E., Kjeldgaard, L., Lopp, S., Marwick, B., Nolis, H., Nolis, J., Ooi, H., Ram, K., Ross, N., Shepherd, L., Sólymos, P., Swetnam, T. L., Turaga, N., Petegem, C. V., Williams, J., Willis, C., & Xiao, N. (2020). The Rockerverse: Packages and Applications for Containerisation with R. The R Journal, 12(1). https://doi.org/10.32614/RJ-2020-007
Nüst, D., & Hinz, M. (2019). containerit: Generating Dockerfiles for reproducible research with R. Journal of Open Source Software, 4(40), 1603. https://doi.org/10.21105/joss.01603
Nüst, D., Konkol, M., Pebesma, E., Kray, C., Schutzeichel, M., Przibytzin, H., & Lorenz, J. (2017). Opening the Publication Process with Executable Research Compendia. D-Lib Magazine, 23(1/2). https://doi.org/10.1045/january2017-nuest
Nüst, D., & Pebesma, E. (2021). Practical reproducibility in geography and geosciences. Annals of the American Association of Geographers, 111(5), 1300–1310. https://doi.org/10.1080/24694452.2020.1806028
Nüst, D., Sochat, V., Marwick, B., Eglen, S. J., Head, T., Hirst, T., & Evans, B. D. (2020). Ten simple rules for writing Dockerfiles for reproducible data science. PLOS Computational Biology, 16(11), 1–24. https://doi.org/10.1371/journal.pcbi.1008316
Niers, T., & Nüst, D. (2020). Geospatial Metadata for Discovery in Scholarly Publishing. Septentrio Conference Series, 4. https://doi.org/10.7557/5.5590
Nüst, D., Boettiger, C., & Marwick, B. (2018). How to Read a Research Compendium. arXiv:1806.09525 [cs]. http://arxiv.org/abs/1806.09525
Nüst, D., & Eglen, S. J. (2021). CODECHECK: An Open Science initiative for the independent execution of computations underlying research articles during peer review to improve reproducibility. F1000Research, 10, 253. https://doi.org/10.12688/f1000research.51738.1
Nüst, D., Granell, C., Hofer, B., Konkol, M., Ostermann, F. O., Sileryte, R., & Cerutti, V. (2018). Reproducible research and GIScience: An evaluation using AGILE conference papers. PeerJ, 6, e5072. https://doi.org/10.7717/peerj.5072
Nüst, D., Lohoff, L., Einfeldt, L., Gavish, N., Götza, M., Jaswal, S. T., Khalid, S., Meierkort, L., Mohr, M., Rendel, C., & Eek, A. van. (2019). Guerrilla Badges for Reproducible Geospatial Data Science. AGILE Short Papers. https://doi.org/10.31223/osf.io/xtsqh
Ostermann, F. O., Nüst, D., Granell, C., Hofer, B., & Konkol, M. (2020). Reproducible Research and GIScience: An evaluation using GIScience conference papers. EarthArXiv. https://doi.org/10.31223/X5ZK5V
Metric | All versions | This version
---|---|---
Views | 900 | 900
Downloads | 389 | 389
Data volume | 9.4 GB | 9.4 GB
Unique views | 720 | 720
Unique downloads | 332 | 332