Requires a tibble (modern data.frame class) in a
specific format (see details & examples) to write the model configuration
file "config_*.json". Each row in the tibble corresponds to a model run.
The generated "config_*.json" is based on a cjson file
(e.g. "lpjml_config.cjson").
write_config(
  x,
  model_path,
  sim_path = NULL,
  output_list = c(),
  output_list_timestep = "annual",
  output_format = NULL,
  cjson_filename = "lpjml_config.cjson",
  parallel_cores = 4,
  debug = FALSE,
  params = NULL,
  output_path = NULL,
  js_filename = NULL
)
x
A tibble in a defined format (see details).
model_path
Character string providing the path to LPJmL
(equal to the LPJROOT environment variable).
sim_path
Character string defining the path where all simulation data
are written. An output, a restart and a configuration folder are also
created in sim_path to store the respective data. If NULL, model_path is
used.
output_list
Character vector containing the "id" of outputvars.
If defined, only these outputs will be written. Otherwise, all
outputs set in cjson_filename will be written. Defaults to NULL.
output_list_timestep
Single character string or character vector
defining the temporal resolution of the outputs in output_list.
Either provide a single character string for all outputs or
a vector with the length of output_list defining each timestep
individually. Choose between "annual", "monthly" or "daily".
output_format
Character string defining the format of the output.
Defaults to NULL (use default from cjson file). Options: "raw",
"cdf" (NetCDF) or "clm" (file with header).
cjson_filename
Character string providing the name of the main LPJmL
configuration file to be parsed. Defaults to "lpjml_config.cjson".
parallel_cores
Integer defining the number of available CPU cores for
parallelization. Defaults to 4.
debug
Logical. If TRUE, the inner parallelization is switched off
to enable tracebacks and all types of error messages. Defaults to FALSE.
params
Argument is deprecated as of version 1.0; use x instead.
output_path
Argument is deprecated as of version 1.0; use sim_path instead.
js_filename
Argument is deprecated as of version 1.3; use cjson_filename instead.
A tibble with at least one column named "sim_name".
The run parameters "order" and "dependency" are included if defined in
x. A tibble in this format is required by
submit_lpjml().
Supply a tibble for x, in which each row represents
a configuration (config) for an LPJmL simulation.
Here a config refers to a precompiled "lpjml_config.cjson" file (or file
name provided as cjson_filename argument) which already contains all the
information from the mandatory cjson files.
The precompilation is done internally by write_config().
write_config() uses the column names of x as keys for the config
json using the same syntax as lists, e.g. "k_temp" from "param.js"
can be accessed with "param$k_temp" or "param[["k_temp"]]" as the column
name. (The former point-style syntax, "param.k_temp", is still valid but
deprecated.)
For each run and thus each row, this value has to be specified in the
tibble. If the original value should instead be used, insert
NA.
Each run can be identified via the "sim_name", which is mandatory to
specify.
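The key syntax described above can be illustrated with a minimal base R sketch. The helper `set_by_key()` and the small `config` list are hypothetical stand-ins introduced only for illustration; they are not part of lpjmlkit and do not claim to reproduce how write_config() works internally. The sketch only shows how a column name such as "param$k_temp" maps onto an entry of a nested list, and how NA leaves the original value untouched.

```r
# Hypothetical helper (NOT part of lpjmlkit): apply one tibble cell to a
# nested list that stands in for the parsed config json.
set_by_key <- function(config, key, value) {
  # NA means "keep the original value", as described above.
  if (is.na(value)) {
    return(config)
  }
  # Build and evaluate an assignment such as `config$param$k_temp <- value`.
  eval(parse(text = paste0("config$", key, " <- value")))
  config
}

# Toy config with made-up values, for illustration only.
config <- list(
  random_seed = 1,
  param = list(k_temp = 0.02),
  pftpar = list(list(name = "tree"))
)

config <- set_by_key(config, "param$k_temp", 0.03)
config <- set_by_key(config, "pftpar[[1]]$name", "first_tree")
config <- set_by_key(config, "random_seed", NA) # left unchanged
```

Because the column name is evaluated as an R expression in this sketch, it should only ever be used with trusted input.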
my_params1 <- tibble(
  sim_name = c("scenario1", "scenario2"),
  random_seed = c(42, 404),
  `pftpar[[1]]$name` = c("first_tree", NA),
  `param$k_temp` = c(NA, 0.03),
  new_phenology = c(TRUE, FALSE)
)
my_params1
# A tibble: 2 x 5
#   sim_name  random_seed `pftpar[[1]]$name` `param$k_temp` new_phenology
#   <chr>           <dbl> <chr>                       <dbl> <lgl>
# 1 scenario1          42 first_tree                  NA    TRUE
# 2 scenario2         404 NA                           0.03 FALSE
To set up spin-up and transient runs, where transient runs are dependent on
the spin-up(s), a parameter "dependency" has to be defined as a column in
the tibble that links simulations with each other using the
"sim_name".
Do not manually set "-DFROM_RESTART" when using "dependency". The same
applies for LPJmL config settings "restart", "write_restart",
"write_restart_filename", "restart_filename", which are set automatically
by this function.
This way multiple runs can be performed in succession and build a
conceivably endless chain or tree.
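How such a chain or tree translates into a run order can be sketched in base R. The function `run_order()` below is a hypothetical illustration, not the lpjmlkit implementation: it assumes that a run without a dependency gets order 1 and that every dependent run gets the order of its parent plus one.

```r
# Hypothetical sketch (NOT the lpjmlkit implementation): derive a run order
# from the "sim_name" and "dependency" columns by walking up each chain.
run_order <- function(sim_name, dependency) {
  order <- integer(length(sim_name))
  for (i in seq_along(sim_name)) {
    depth <- 1L
    dep <- dependency[i]
    while (!is.na(dep)) {
      depth <- depth + 1L
      # A chain longer than the number of runs can only come from a cycle.
      if (depth > length(sim_name)) {
        stop("cyclic dependency detected for ", sim_name[i])
      }
      # Move on to the dependency of the parent run (NA ends the chain).
      dep <- dependency[match(dep, sim_name)]
    }
    order[i] <- depth
  }
  order
}

sim_name <- c("scen1_spinup", "scen1_transient", "scen1_future")
dependency <- c(NA, "scen1_spinup", "scen1_transient")
run_order(sim_name, dependency)
```

Runs that share the same order value do not depend on each other and could, under this assumption, be executed side by side.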
# With dependent runs.
my_params3 <- tibble(
  sim_name = c("scen1_spinup", "scen1_transient"),
  random_seed = c(42, 404),
  dependency = c(NA, "scen1_spinup")
)
my_params3
# A tibble: 2 x 3
#   sim_name        random_seed dependency
#   <chr>                 <dbl> <chr>
# 1 scen1_spinup             42 NA
# 2 scen1_transient         404 scen1_spinup
Another feature is to define SLURM options for each simulation (row)
separately. For example, users may want to set a lower wall clock limit
(wtime) for the transient run than the spin-up run to get a higher priority
in the SLURM queue. This can be achieved by supplying this option as a
column of x.
Six options are available, namely sclass, ntasks, wtime, blocking,
constraint and slurm_options. They correspond to the arguments of the same name in submit_lpjml().
If specified in x, they overwrite the corresponding function arguments in submit_lpjml().
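The precedence rule can be pictured with a small base R sketch. The helper `resolve_option()` and the variable `wtime_arg` are hypothetical names introduced for illustration, and the assumption that NA in a SLURM column falls back to the function argument is ours; the sketch is not taken from submit_lpjml().

```r
# Hypothetical helper (NOT part of lpjmlkit): a per-run value from the tibble
# takes precedence over the function argument; NA or a missing column falls
# back to the function argument (assumed behaviour).
resolve_option <- function(row_value, argument_value) {
  if (is.null(row_value) || is.na(row_value)) argument_value else row_value
}

runs <- data.frame(
  sim_name = c("scen1_spinup", "scen1_transient"),
  wtime = c("8:00:00", NA),
  stringsAsFactors = FALSE
)
wtime_arg <- "4:00:00" # stands in for the value passed as function argument

effective_wtime <- vapply(
  runs$wtime, resolve_option, character(1),
  argument_value = wtime_arg, USE.NAMES = FALSE
)
effective_wtime
```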
my_params4 <- tibble(
  sim_name = c("scen1_spinup", "scen1_transient"),
  random_seed = c(42, 404),
  dependency = c(NA, "scen1_spinup"),
  wtime = c("8:00:00", "2:00:00")
)
my_params4
# A tibble: 2 x 4
#   sim_name        random_seed dependency   wtime
#   <chr>                 <dbl> <chr>        <chr>
# 1 scen1_spinup             42 NA           8:00:00
# 2 scen1_transient         404 scen1_spinup 2:00:00
To set a macro (e.g. "MY_MACRO" or "CHECKPOINT") provide it as a column of
the tibble as you would do with a flag in the shell:
"-DMY_MACRO" "-DCHECKPOINT".
Wrap macros in backticks or tibble will raise an error, as
starting an object definition with "-" is not allowed in R.
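A short sketch of what such a tibble might look like is given below. The use of logical values (TRUE to set the macro for a run, FALSE to leave it unset) is an assumption made for this illustration, as are the run names; only the backtick-quoted column names follow directly from the text above.

```r
library(tibble)

# Macro columns must be quoted with backticks because "-DMY_MACRO" is not a
# syntactic R name. The logical values are an assumption: TRUE is taken to
# mean "set the macro for this run".
my_params2 <- tibble(
  sim_name = c("scen1_spinup", "scen1_transient"),
  random_seed = c(42, 404),
  `-DMY_MACRO` = c(TRUE, FALSE),
  `-DCHECKPOINT` = c(TRUE, TRUE)
)

# The column names are kept exactly as written.
names(my_params2)
```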
write_config() creates the following subdirectories within the sim_path directory:
"./configurations" to store the config files.
"./output" to store the output within subdirectories for each sim_name.
"./restart" to store the restart files within subdirectories for each sim_name.
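The resulting folder layout can be reproduced with a few lines of base R. This is only an illustration of the structure described above, built in a temporary directory with made-up run names; it does not call write_config() and makes no claim about how the package creates the folders.

```r
# Illustration only: rebuild the described layout in a temporary directory.
sim_path <- file.path(tempdir(), "my_runs")
sim_names <- c("scen1", "scen2") # hypothetical run names

dirs <- c(
  file.path(sim_path, "configurations"),
  file.path(sim_path, "output", sim_names),
  file.path(sim_path, "restart", sim_names)
)
for (d in dirs) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}

# List the created folders relative to sim_path.
list.dirs(sim_path, full.names = FALSE)
```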
The list syntax (e.g. pftpar[[1]]$name) allows you to create column names and
thus keys for accessing values in the config json.
The column "sim_name" is mandatory (used as an identifier).
The run parameter "dependency" is optional but enables interdependent
consecutive runs using submit_lpjml().
SLURM options in x allow different values to be used per run.
If NA is specified as a cell value, the original value is used.
The R logical constants TRUE and FALSE are to be used for
boolean parameters in the config json.
Value types need to be set correctly, e.g. no strings where numeric values are expected.
if (FALSE) {
  library(tibble)
  model_path <- "./LPJmL_internal"
  sim_path <- "./my_runs"
  # Basic usage
  my_params <- tibble(
    sim_name = c("scen1", "scen2"),
    random_seed = c(12, 404),
    `pftpar[[1]]$name` = c("first_tree", NA),
    `param$k_temp` = c(NA, 0.03),
    new_phenology = c(TRUE, FALSE)
  )
  config_details <- write_config(
    x = my_params,
    model_path = model_path,
    sim_path = sim_path
  )
  config_details
  # A tibble: 2 x 1
  #   sim_name
  #   <chr>
  # 1 scen1
  # 2 scen2
  # Usage with dependency
  my_params <- tibble(
    sim_name = c("scen1_spinup", "scen1_transient"),
    random_seed = c(42, 404),
    dependency = c(NA, "scen1_spinup")
  )
  config_details <- write_config(
    x = my_params,
    model_path = model_path,
    sim_path = sim_path
  )
  config_details
  # A tibble: 2 x 3
  #   sim_name        order dependency
  #   <chr>           <dbl> <chr>
  # 1 scen1_spinup        1 NA
  # 2 scen1_transient     2 scen1_spinup
  # Usage with dependency and a SLURM option (wtime)
  my_params <- tibble(
    sim_name = c("scen1_spinup", "scen1_transient"),
    random_seed = c(42, 404),
    dependency = c(NA, "scen1_spinup"),
    wtime = c("8:00:00", "2:00:00")
  )
  config_details <- write_config(
    x = my_params,
    model_path = model_path,
    sim_path = sim_path
  )
  config_details
  # A tibble: 2 x 4
  #   sim_name        order dependency   wtime
  #   <chr>           <dbl> <chr>        <chr>
  # 1 scen1_spinup        1 NA           8:00:00
  # 2 scen1_transient     2 scen1_spinup 2:00:00
}