hklmirs: Reproducible Research Compendium for "Improving Models to Predict Holocellulose and Klason Lignin Contents for Peat Soil Organic Matter with Mid Infrared Spectra" and "Predicting Absolute Holocellulose and Klason Lignin Contents for Peat Remains Challenging"
Creators
- 1. Biogeochemistry and Ecohydrology Group, Institute for Landscape Ecology, University of Münster
Description
hklmirs
This repository contains the data and code for our two manuscripts (in preparation):
Henning Teickner and Klaus-Holger Knorr (in preparation): Improving Models to Predict Holocellulose and Klason Lignin Contents for Peat Soil Organic Matter with Mid Infrared Spectra.
Henning Teickner and Klaus-Holger Knorr (in preparation): Predicting Absolute Holocellulose and Klason Lignin Contents for Peat Remains Challenging.
How to cite
Please cite this compendium as:
Henning Teickner and Klaus-Holger Knorr, (2022). Compendium of R code and data for “Improving Models to Predict Holocellulose and Klason Lignin Contents for Peat Soil Organic Matter with Mid Infrared Spectra” and “Predicting Absolute Holocellulose and Klason Lignin Contents for Peat Remains Challenging”. Accessed 12 Mar 2022. Online at https://doi.org/10.5281/zenodo.6325760
Contents
The analysis directory contains:
- :file_folder: paper: R Markdown source documents needed to reproduce the manuscripts, including figures and tables. The main script is 001-paper-main.Rmd. This script produces both manuscripts and the corresponding supplementary information. Additional scripts are:
  - 002-paper-m-original-models.Rmd: Computes the original models used in Hodgkins et al. (2018) and models with the same model structure, but as Bayesian models.
  - 003-paper-m-gaussian-beta.Rmd: Computes models assuming a Beta distribution for holocellulose and Klason lignin contents and compares them to the original models.
  - 004-paper-m-reduce-underfitting.Rmd: Extends the Beta regression models by including additional variables (additional peaks) or using a different approach (using measured spectral intensities of binned spectra instead of extracted peaks), and validates these models using LOO-CV.
  - 005-paper-m-minerals.Rmd: Uses the models from 003-paper-m-gaussian-beta.Rmd to test how accurate a model for holocellulose content is when it is also calibrated on training samples with higher mineral contents.
  - 006-paper-m-prediction-domain.Rmd: Analyzes the prediction domain (Wadoux et al. 2021) of the original models and the modified models and identifies under which conditions models extrapolate for peat and vegetation samples from Hodgkins et al. (2018).
  - 007-paper-m-prediction-differences.Rmd: Compares predictions for the training data and the peat and vegetation data from Hodgkins et al. (2018) for the original models from Hodgkins et al. (2018) and the modified models from 004-paper-m-reduce-underfitting.Rmd.
  - 008-paper-supplementary.Rmd: Computes supplementary analyses and figures for the first manuscript.
  - 001-reply-main.Rmd: This is the main script for manuscript 2. It is run from within 001-paper-main.Rmd and produces the supplementary information for manuscript 2.
  - 002-reply-main.Rmd: This script produces the document for manuscript 2. It is run from within 001-reply-main.Rmd.
- :file_folder: data: Data used in the analysis. Note that raw data is not stored in :file_folder: raw_data (empty folder), but in :file_folder: /inst/extdata. :file_folder: derived_data contains derived data computed by the scripts. The raw data are derived from Hodgkins et al. (2018).
- :file_folder: stan_models: The Stan model used in 001-reply-main.Rmd.
The other folders in this directory follow the standard naming scheme and function of folders in R packages. There are the following directories and files:
- README.md / README.Rmd: Readme for the compendium.
- DESCRIPTION: The R package DESCRIPTION file for the compendium.
- NAMESPACE: The R package NAMESPACE file for the compendium.
- LICENSE.md: Details on the license for the code in the compendium.
- CONTRIBUTING.md and CONDUCT.md: Files with information on how to contribute to the compendium.
- Dockerfile: Dockerfile to build a Docker image for the compendium.
- .Rbuildignore, .gitignore, .dockerignore: Files to ignore during R package building, to ignore by Git, and to ignore while building a Docker image, respectively.
- renv.lock: renv lock file (lists all R package dependencies and versions and can be used to restore the R package library using renv). renv.lock was created by running renv::snapshot() in the R package directory and it uses the information included in the DESCRIPTION file.
- .Rprofile: Code to run upon opening the R project.
- R, man, inst, data-raw, data, src: Default folders for making the R package run.
- Folder inst/extdata: Folder with the raw data used for the analyses. All files in this folder are derived from Hodgkins et al. (2018).
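The README states that renv.lock can be used to restore the R package library using renv. A minimal sketch of that step is shown here. It assumes that the working directory is the root of the compendium and that the renv package is already installed; the guard around the call is an illustrative addition, not part of the compendium.

```r
# Sketch: restore the package library recorded in renv.lock.
# Assumption: the working directory is the root of the compendium.
lockfile <- "renv.lock"

if (file.exists(lockfile) && requireNamespace("renv", quietly = TRUE)) {
  # Install the package versions listed in the lock file without prompting.
  renv::restore(lockfile = lockfile, prompt = FALSE)
} else {
  message("renv.lock or the renv package was not found; nothing was restored.")
}
```

Restoring may take a while because every package version listed in the lock file has to be installed.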
How to run in your browser or download and run locally
You can download the compendium as a zip file from this URL: https://github.com/henningte/hklmirs/
Or you can install this compendium as an R package, hklmirs, from GitHub with:
remotes::install_github("henningte/hklmirs")
How to use
Reproduce the analyses
To reproduce the analyses for the paper, open the RStudio project included in this research compendium and run the R Markdown script analysis/paper/001-paper-main.Rmd.
Running the whole script takes about 12 hours and requires about 2 GB of additional disk space.
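One possible way to run the main script from the R console instead of the RStudio interface is sketched here. It assumes that the working directory is the root of the compendium, that the rmarkdown package is installed, and that rendering the file with rmarkdown::render() is equivalent to running it in RStudio; the README itself does not prescribe this call.

```r
# Sketch: render the main R Markdown script from the R console.
# Assumption: the working directory is the root of the compendium.
main_script <- file.path("analysis", "paper", "001-paper-main.Rmd")

if (file.exists(main_script) && requireNamespace("rmarkdown", quietly = TRUE)) {
  # According to the README, a full run takes about 12 hours.
  rmarkdown::render(main_script)
} else {
  message("Main script or the rmarkdown package not found: ", main_script)
}
```

The file extension is written as .Rmd, as in the Contents section; on case-sensitive file systems a path ending in .rmd would not match.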
Alternatively, the Dockerfile can be used to build a Docker image from which all analyses can be reproduced. The Dockerfile ensures that all required dependencies are installed (e.g. specific R packages; this is managed using the R package renv).
The Dockerfile provides instructions on how to build a Docker image from the Dockerfile and how to run the image in a Docker container. The image occupies about 7 GB of disk space.
When the Docker image runs in a container, go to localhost:8787 in your browser. You will find an RStudio interface where you can log in with username rstudio and password hkl. Here you can find the R Markdown scripts (hklmirs/analysis/paper/001-paper-main.Rmd) as described above.
Licenses
Text and figures : CC-BY-4.0
Code : See the DESCRIPTION file
Data : CC-0 attribution requested in reuse. See the sources section for licenses for data derived from external sources and how to give credit to the original author(s) and the source.
Sources
All files in inst/extdata are derived from Hodgkins et al. (2018). These data are licensed under the CC-BY 4.0 license (see https://www.nature.com/articles/s41467-018-06050-2#rightslink).
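If the compendium has been installed as the R package hklmirs, the raw data files can be located with base R, as in the sketch below. It relies on the standard behavior that the contents of inst/ are copied to the top level of an installed package; the file names inside the folder are not listed in this README, so the sketch only prints whatever is found.

```r
# Sketch: locate the raw data shipped in inst/extdata of the installed package.
# system.file() returns "" if the package or the folder is not available.
extdata_dir <- system.file("extdata", package = "hklmirs")

if (nzchar(extdata_dir)) {
  print(list.files(extdata_dir))
} else {
  message("Package 'hklmirs' is not installed or has no extdata folder.")
}
```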
The format of this research compendium is inspired by Marwick, Boettiger, and Mullen (2018) and was created with rrtools (Marwick 2019). The Rmarkdown template for the main article is from the rticles package (Allaire et al. 2020).
Contributions
We welcome contributions from everyone. Before you get started, please see our contributor guidelines. Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.
Funding
This study was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) grant no. KN 929/23-1 to Klaus-Holger Knorr and grant no. PE 1632/18-1 to Edzer Pebesma. We acknowledge support from the Open Access Publication Fund of the University of Münster.
References
Allaire, JJ, Yihui Xie, R Foundation, Hadley Wickham, Journal of Statistical Software, Ramnath Vaidyanathan, Association for Computing Machinery, et al. 2020. Rticles: Article Formats for R Markdown. Manual.
Hodgkins, Suzanne B., Curtis J. Richardson, René Dommain, Hongjun Wang, Paul H. Glaser, Brittany Verbeke, B. Rose Winkler, et al. 2018. “Tropical Peatland Carbon Storage Linked to Global Latitudinal Trends in Peat Recalcitrance.” Nature Communications 9 (1): 3640. https://doi.org/10.1038/s41467-018-06050-2.
Marwick, Ben. 2019. “Rrtools: Creates a Reproducible Research Compendium.”
Marwick, Ben, Carl Boettiger, and Lincoln Mullen. 2018. “Packaging Data Analytical Work Reproducibly Using R (and Friends).” The American Statistician 72 (1): 80–88. https://doi.org/10.1080/00031305.2017.1375986.
Wadoux, Alexandre M. J.-C., Brendan Malone, Budiman Minasny, Mario Fajardo, and Alex B. McBratney. 2021. Soil Spectral Inference with R: Analysing Digital Soil Spectra Using the R Programming Environment. Progress in Soil Science. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-030-64896-1.
Files
henningte/hklmirs-v0.1.2.zip (5.3 MB), md5:462cfa7b623bd6704a426d335ba36406
Related works
Is supplement to: https://github.com/henningte/hklmirs/tree/v0.1.2