Targeted Data-Adaptive Estimation and Inference for Differential Methylation Analysis
Author: Nima Hejazi
methyvim is an R package that provides facilities for differential methylation analysis based on variable importance measures (VIMs), a class of statistically estimable target parameters that arise in causal inference.
The statistical methodology implemented computes targeted minimum loss-based estimates of several well-characterized variable importance measures:
For discrete-valued treatments or exposures:
The average treatment effect (ATE): The effect of a binary exposure or treatment on the observed methylation at a target CpG site is estimated, controlling for the observed methylation at all other CpG sites in the same neighborhood as the target site, based on an additive form. In particular, the parameter estimate represents the additive difference in methylation that would have been observed at the target site had all observations received the treatment versus the scenario in which none received the treatment.
The relative risk (RR): The effect of a binary exposure or treatment on the observed methylation at a target CpG site is estimated, controlling for the observed methylation at all other CpG sites in the same neighborhood as the target site, based on a geometric form. In particular, the parameter estimate represents the multiplicative difference in methylation that would have been observed at the target site had all observations received the treatment versus the scenario in which none received the treatment.
For continuous-valued treatments or exposures:
In all cases, an estimator of the target parameter is constructed via targeted minimum loss-based estimation.
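As a sketch of what the two discrete-exposure parameters above estimate, they can be written as statistical estimands. The notation is an assumption introduced here for illustration and does not appear in the original text: $Y$ is the observed methylation at the target CpG site, $A \in \{0, 1\}$ is the binary exposure or treatment, and $W$ is the observed methylation at the other CpG sites in the same neighborhood.

```latex
\begin{align*}
\psi_{\mathrm{ATE}} &= \mathbb{E}_W\big[\,\mathbb{E}(Y \mid A = 1, W) - \mathbb{E}(Y \mid A = 0, W)\,\big], \\
\psi_{\mathrm{RR}}  &= \frac{\mathbb{E}_W\big[\,\mathbb{E}(Y \mid A = 1, W)\,\big]}
                            {\mathbb{E}_W\big[\,\mathbb{E}(Y \mid A = 0, W)\,\big]}.
\end{align*}
```

Under this notation, a value of $\psi_{\mathrm{ATE}}$ near 0 or of $\psi_{\mathrm{RR}}$ near 1 would indicate little evidence of differential methylation at the target site.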
These methods allow differential methylation effects to be quantified in a manner that is largely free of assumptions, especially of the variety exploited in parametric models. The statistical algorithm consists of several major steps, which draw on routines from the limma, tmle.npvi, and tmle R packages.
For a general discussion of the framework of targeted minimum loss-based estimation, the many applications of this methodology, and the role the framework plays in statistical causal inference, the recommended references are van der Laan and Rose (2011) and van der Laan and Rose (2017). Hernan and Robins (2018) and Pearl (2009) may be of interest to those desiring a more general introduction to statistical causal inference.
For standard use, install from Bioconductor:
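A minimal sketch of this step, assuming a recent R installation in which the BiocManager package handles Bioconductor installs. The choice of installer is an assumption: Bioconductor releases before 3.8, including 3.6, used biocLite instead. The package must also be available in the Bioconductor release you are running.

```r
# Assumption: BiocManager is the installer for the Bioconductor release in use.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}

# Install methyvim from the Bioconductor release matching this R version.
BiocManager::install("methyvim")
```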
To contribute, install the bleeding-edge development version from GitHub via devtools:
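A sketch of the development install. The repository slug `nhejazi/methyvim` is an assumption, since this document does not state it; substitute the actual repository location if it differs.

```r
# Install devtools first if it is not already available.
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}

# Assumption: the development repository lives at "nhejazi/methyvim" on GitHub.
devtools::install_github("nhejazi/methyvim")
```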
Current and prior Bioconductor releases are available under branches with numbers prefixed by “RELEASE_”. For example, to install the version of this package available via Bioconductor 3.6, use
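A sketch of installing from a release branch. Both the repository slug and the exact branch name `RELEASE_3_6` are assumptions: the branch name follows the "RELEASE_" prefix described above together with the usual Bioconductor naming pattern, so check the repository's branch list before relying on it.

```r
# Assumptions: repository slug "nhejazi/methyvim" and branch name "RELEASE_3_6".
# The `ref` argument of install_github() selects a branch, tag, or commit.
devtools::install_github("nhejazi/methyvim", ref = "RELEASE_3_6")
```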
For details on how to best use the methyvim R package, please consult the most recent package vignette available through the Bioconductor project.
Contributions are very welcome. Interested contributors should consult our contribution guidelines prior to submitting a pull request.
After using the methyvim R package, please cite the following:
@article{hejazi2018methyvim,
doi = {},
url = {},
year = {2018},
month = {},
publisher = {},
volume = {},
author = {Hejazi, Nima S and Phillips, Rachael V and Hubbard, Alan E
and {van der Laan}, Mark J},
title = {methyvim: Targeted and model-free differential methylation
analysis in R},
journal = {}
}
The development of this software was supported in part through grants from the National Institutes of Health: T32 LM012417-02, R01 ES021369-05, and P42 ES004705-29.
© 2017-2018 Nima S. Hejazi
The contents of this repository are distributed under the MIT license. See file LICENSE for details.
Benjamini, Yoav, and Yosef Hochberg. 1995. “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing.” Journal of the Royal Statistical Society. Series B (Methodological). JSTOR, 289–300.
Chambaz, Antoine, Pierre Neuvial, and Mark J van der Laan. 2012. “Estimation of a Non-Parametric Variable Importance Measure of a Continuous Exposure.” Electronic Journal of Statistics 6. NIH Public Access:1059.
Hernan, Miguel A, and James M Robins. 2018. Causal Inference. Chapman & Hall / CRC Texts in Statistical Science. Taylor & Francis.
Pearl, Judea. 2009. Causality: Models, Reasoning, and Inference. Cambridge University Press.
Tuglus, Catherine, and Mark J van der Laan. 2009. “Modified FDR Controlling Procedure for Multi-Stage Analyses.” Statistical Applications in Genetics and Molecular Biology 8 (1). Walter de Gruyter:1–15. https://doi.org/10.2202/1544-6115.1397.
van der Laan, Mark J, and Sherri Rose. 2011. Targeted Learning: Causal Inference for Observational and Experimental Data. Springer Science & Business Media.
———. 2017. Targeted Learning in Data Science: Causal Inference for Complex Longitudinal Studies. Springer Science & Business Media.