There is a newer version of the record available.

Published January 16, 2023 | Version v0.2.0
Software Open

epinowcast/epinowcast: Epinowcast 0.2.0

  • 1. @epiforecasts @cmmid
  • 2. ETH Zürich
  • 3. London School of Hygiene & Tropical Medicine
  • 4. University of Turin
  • 5. data.org
  • 6. London School of Hygiene and Tropical Medicine

Description

This release adds several extensions to our modelling framework, including modelling of missing data, flexible modelling of the generative process underlying case counts, an optional renewal equation-based generative process (enabling direct estimation of the effective reproduction number), and convolution-based latent reporting delays (enabling the modelling of both directly observed and unobserved delays as well as partial ascertainment). Much of the methodology used in these extensions is based on work done by Adrian Lison and is currently being evaluated.

On top of the model extensions, this release also adds a range of quality-of-life features, such as helper functions for constructing convolution matrices and combining probability mass functions. It also comes with improved computational efficiency, thanks to a refactoring of the hazard model computations to the log scale and extended parallelisation of the likelihood that is optimised for the structure of the input data. We have also extended the package documentation and streamlined the contribution process.
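As a rough illustration of what keeping hazard computations on the log scale can look like, the following Python sketch converts discrete-time log hazards into log delay probabilities. It is not the package's Stan code: it assumes only the standard discrete-time survival identity log p_d = log h_d + sum over k < d of log(1 - h_k), and the function name is hypothetical.

```python
import math


def log_hazard_to_log_prob(log_hazards):
    """Convert log hazards to log probabilities without leaving the log scale.

    Hypothetical helper; uses log p_d = log h_d + sum_{k<d} log(1 - h_k).
    """
    log_probs = []
    log_survival = 0.0  # log probability of not yet having been reported
    for log_h in log_hazards:
        log_probs.append(log_h + log_survival)
        h = math.exp(log_h)
        # log(1 - h), which is -inf once the hazard reaches 1
        log_survival += math.log1p(-h) if h < 1.0 else float("-inf")
    return log_probs


# Hazards of 0.2, 0.5 and 1 give delay probabilities of 0.2, 0.4 and 0.4.
log_probs = log_hazard_to_log_prob([math.log(0.2), math.log(0.5), 0.0])
probs = [math.exp(lp) for lp in log_probs]
```

Accumulating the survival term as a sum of logs avoids multiplying many small probabilities together, which is one common reason for working on this scale.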

As a large-scale project, this package remains in an experimental state, although it is sufficiently stable for both research and production usage. More core development is needed to improve post-processing, pre-processing, and documentation coverage. Moreover, the optimal configuration for different settings still needs to be further explored and is currently mainly the responsibility of the user. Please see our community site, contributing guide, and list of issues/proposed features if you are interested in getting involved. Any scale of contribution is warmly welcomed, including user feedback, requests to extend our functionality to cover your setting, and evaluations of the package in your context. This is a community project that needs support from its users in order to provide improved tools for real-time infectious disease surveillance.

We thank @adrian-lison, @choi-hannah, @sbfnk, @Bisaloo, @seabbs, @pearsonca, and @pratikunterwegs for code contributions to this release. We also thank all community members for their contributions including @jhellewell14, @FelixGuenther, @parksw3, and @jbracher.

Full details on the changes in this release can be found in the following sections.

Package
  • Added .Rhistory to the .gitignore file. See #132 by @choi-hannah.
  • Fixed indentations for authors and contributors in the DESCRIPTION file. See #132 by @choi-hannah.
  • Renamed enw_new_reports() to enw_cumulative_to_incidence() and added the reverse function enw_incidence_to_cumulative(). Both functions use a by argument to allow specification of variable groupings. See #157 by @seabbs.
  • Switched class checking to inherits(x, "class") rather than class(x) %in% "class". See #155 by @Bisaloo.
  • Changed the enw_add_metaobs_features() interface to take the holidays argument as a series of dates. Changed the interface of enw_preprocess_data() to pass ... to enw_add_metaobs_features(). These interface changes come with an internal rewrite and unit tests. As part of the internal rewrite, introduced coerce_date() in R/utils.R, which wraps data.table::as.IDate() with error handling. See #151 by @pearsonca.
    • Changed the style of using match.arg for validating inputs. Briefly, the preference is now to define options via function arguments and validate with automatic match.arg idiom with corresponding enumerated documentation of the options. For this idiom, the first item in the definition is the default. This approach only applies to string-based arguments; different types of arguments cannot be matched this way, nor can arguments that allow for vector-valued options (e.g., if somearg = c("option1", "option2") were a legal argument indicating to use both options). See #162 by @pearsonca addressing issue #156 by @Bisaloo.
  • Refined the use of data ordering throughout the preprocessing functions. See #147 by @seabbs.
  • Skipped tests that use cmdstan locally to improve the developer/contributor experience. See #147 by @seabbs and @adrian-lison.
  • Added a basic simulator function for missing reference data. See #147 by @seabbs and @adrian-lison.
  • Added support for right-hand-side interactions as syntactic sugar for random effects. This allows the specification of, for example, independent random effects by day for each stratum of another variable. See #169 by @seabbs.
  • Added support for passing cpp_options to cmdstanr::cmdstan_model(). See #182 by @seabbs.
  • Added a function, convolution_matrix(), for constructing convolution matrices. See #183 by @seabbs.
  • Added a pass-through from enw_model() to write_stan_files_no_profile() for the target_dir argument. This allows users to compile the model once and then share the compiled model across sessions rather than having to recompile each time the temporary directory is cleared. See #185 by @seabbs.
  • Added add_pmfs(), to sum probability mass functions into a new probability mass function. Initial implementation by @seabbs in #183, refactored by @pratikunterwegs in #187, following a suggestion in issue #186 by @pearsonca.
  • Added a warning when the observed empirical maximum delay is less than the specified maximum delay. See #190 by @seabbs.
  • Added nested support for converting array syntax in convert_cmdstan_to_rstan(). See #192 by @sbfnk.
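The two helpers mentioned above, for combining probability mass functions and for constructing convolution matrices, can be illustrated with a small Python sketch. This is a conceptual analogue only, not the R implementation of add_pmfs() or convolution_matrix(): it assumes that "summing" PMFs means taking the PMF of a sum of independent delays (a discrete convolution), and that a convolution matrix is the lower-triangular matrix whose product with a time series applies the delay. All function names here are hypothetical.

```python
def sum_pmfs(pmfs):
    """PMF of the sum of independent discrete delays (discrete convolution)."""
    total = [1.0]  # PMF of a delay that is always zero
    for pmf in pmfs:
        out = [0.0] * (len(total) + len(pmf) - 1)
        for i, a in enumerate(total):
            for j, b in enumerate(pmf):
                out[i + j] += a * b
        total = out
    return total


def build_convolution_matrix(pmf, t):
    """t x t lower-triangular matrix M with M[i][j] = pmf[i - j]."""
    return [
        [pmf[i - j] if 0 <= i - j < len(pmf) else 0.0 for j in range(t)]
        for i in range(t)
    ]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Two independent delays, each 0 or 1 day with equal probability.
delay = sum_pmfs([[0.5, 0.5], [0.5, 0.5]])  # [0.25, 0.5, 0.25]
expected = matvec(build_convolution_matrix(delay, 4), [10, 20, 30, 40])
# expected == [2.5, 10.0, 20.0, 30.0]
```

Writing the delay as a matrix means the same convolution can be applied to any series of the same length with a single matrix-vector product.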
Model
  • Added support for parametric log-logistic delay distributions. See #128 by @adrian-lison.
  • Implemented direct specification of parametric baseline hazards. See #134 by @adrian-lison.
  • Refactored the observation model, the combination of logit hazards, and the effects priors to be contained in generic functions to make extending package functionality easier. See #137 by @seabbs.
  • Implemented specification of the parametric baseline hazards and probabilities on the log scale to increase robustness and efficiency. Also includes refactoring of these functions and reorganisation of inst/stan/epinowcast.stan to increase modularity and clarity. See #140 by @seabbs.
  • Introduced two new delay likelihoods, delay_snap_lpmf and delay_group_lpmf. These stratify by either snapshots or groups, which is helpful for some models (such as the missingness module). The choice of function has been exposed to the user in enw_fit_opts() via the likelihood_aggregation argument. Both of these functions rely on a newly added expected_obs_from_snaps function, which vectorises expected_obs_from_index. See #138 by @seabbs and @adrian-lison.
  • Added support for supplying missingness model parameters to the model as well as optional priors and effect estimation. See #138 by @seabbs and @adrian-lison.
  • Refactored model generated quantities to be functional. See #138 by @seabbs and @adrian-lison.
  • Added support for modelling missing reference dates to the likelihood. See #147 by @seabbs and @adrian-lison.
  • Added additional functionality to delay_group_lpmf to support modelling observations missing reference dates. Also updated the generated quantities to support this mode. See #147 by @seabbs and @adrian-lison based on #64 by @adrian-lison.
  • Added a flexible expectation process on the growth rate scale. The default expectation model has been updated to a group-wise random walk on the growth rate. See #152 by @seabbs and @adrian-lison.
  • Added a deterministic renewal equation and a latent reporting process. See #152 and #183 by @seabbs and @adrian-lison.
  • Added support for no intercept in the expectation model and more general formula support to enable this as a feature in other modules going forward. See #170 by @seabbs.
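For readers unfamiliar with the renewal equation and latent reporting process listed above, here is a minimal Python sketch of the textbook forms. It assumes the deterministic renewal equation I_t = R_t * sum_s g_s * I_{t-s}, with g the generation time PMF, and a reporting step that convolves infections with a delay PMF and scales by an ascertainment fraction. The package's Stan implementation, seeding, and parameterisation may well differ; the function and argument names are hypothetical.

```python
def renewal(seed, rt, gen_time):
    """I_t = R_t * sum_s g_s * I_{t-s}, with g_s defined for s >= 1."""
    infections = list(seed)
    for r in rt:
        t = len(infections)
        force = sum(
            g * infections[t - s]
            for s, g in enumerate(gen_time, start=1)
            if t - s >= 0
        )
        infections.append(r * force)
    return infections


def latent_reports(infections, delay_pmf, ascertainment=1.0):
    """Expected reports: ascertainment * sum_d delay_pmf[d] * I_{t-d}."""
    return [
        ascertainment
        * sum(p * infections[t - d] for d, p in enumerate(delay_pmf) if t - d >= 0)
        for t in range(len(infections))
    ]


# One seeding infection day, R_t = 2 throughout, generation time of exactly 1 day.
infections = renewal(seed=[100.0], rt=[2.0, 2.0], gen_time=[1.0])
# Half of infections are ascertained, reported after 0 or 1 days with equal probability.
reports = latent_reports(infections, delay_pmf=[0.5, 0.5], ascertainment=0.5)
```

Because R_t enters the generative process directly, fitting such a model to counts yields estimates of the effective reproduction number rather than requiring a separate post-processing step.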
Documentation
  • Removed explicit links to authors and issues in the NEWS.md file. See #132 by @choi-hannah.
  • Added a new example using simulated data and the enw_missing() model module. See #138 by @seabbs and @adrian-lison.
  • Updated the model definition vignette to include the missing reference date model. See #147 by @seabbs and @adrian-lison.
  • Added the use of an expectation model to the "Hierarchical nowcasting of age stratified COVID-19 hospitalisations in Germany" vignette. See #193 by @seabbs.
Bugs
  • The probability-only model (i.e. only a parametric distribution is used and hence the hazard scale is not needed) was not used due to a mistake specifying ref_as_p in the Stan code. There was an additional issue in that the enw_report() module self-declared as on regardless of whether it was or not. This bug had no impact on results but would have increased runtimes for simple models. Both of these issues were fixed in #142 by @seabbs.
  • The addition of meta features week and month did not properly sequentially number weeks and months when time series crossed year boundaries. This would impact models that included effects expecting those to in fact be sequentially numbered (e.g. random walks). Fixed in #151 by @pearsonca.
    • #151 also corrects a minor issue with enw_example() pointing at an old file name when type="script". By @pearsonca.
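The week and month numbering issue above can be illustrated with a short Python sketch. It is not the fix applied in #151, whose implementation may differ: it simply assumes that "sequential" means counting elapsed weeks or months from a fixed origin date, and the function names are hypothetical.

```python
from datetime import date


def sequential_week(d, origin):
    """Weeks elapsed since origin; keeps increasing across year boundaries."""
    return (d - origin).days // 7


def sequential_month(d, origin):
    """Months elapsed since origin's month; keeps increasing across years."""
    return (d.year - origin.year) * 12 + (d.month - origin.month)


origin = date(2022, 12, 19)
later = date(2023, 1, 9)

# Calendar week numbers reset at the new year: week 51, then week 2.
calendar_weeks = (origin.isocalendar()[1], later.isocalendar()[1])

# Sequential indices continue counting: week 0, then week 3; month 0, then month 1.
seq_weeks = (sequential_week(origin, origin), sequential_week(later, origin))
seq_months = (sequential_month(origin, origin), sequential_month(later, origin))
```

A random walk over calendar week numbers would treat week 2 of the new year as preceding week 51 of the old one, which is why effects of this kind need the sequential index.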

Full Changelog: https://github.com/epinowcast/epinowcast/compare/v0.1.0...v0.2.0

Files

epinowcast/epinowcast-v0.2.0.zip (14.3 MB)
md5:42755ca83eecc312fca1262fd0fa8bf4
