The CovBat Family extends the original CovBat methodology to enable flexible covariate modeling, leveraging efficient R implementations of regression models. A method that belongs in the CovBat Family satisfies the following conditions:
By default, CovBat Family uses the original linear model and parameters of CovBat, so its output matches that of CovBat up to floating-point error.
suppressPackageStartupMessages({
library(CovBat)
library(ComBatFamily)
})
# generate toy dataset
set.seed(8888)
n <- 20
p <- 5
bat <- as.factor(c(rep("a", n/2), rep("b", n/2)))
q <- 2
covar <- matrix(rnorm(n*q), n, q)
colnames(covar) <- paste0("x", 1:q)
data <- data.frame(matrix(rnorm(n*p), n, p))
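# harmonize with CovBat Family, using lm as the regression model
cf <- covfam(data, bat, covar, lm, formula = y ~ x1 + x2)
# harmonize with the original CovBat (data transposed so features are in rows)
cb <- covbat(t(data), bat, covar)
# largest difference between the two harmonized outputs
max(cf$dat.covbat - t(cb$dat.covbat))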
#> [1] 3.108624e-15
Modeling covariates via a generalized additive model (GAM) is part of the CovBat Family and can be easily implemented.
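As a rough illustration of why a GAM fits into the family, the sketch below fits a single feature with `lm` and with `mgcv::gam` through the same `model(formula, data = ...)` calling convention. The helper `fit_feature` and the simulated response are hypothetical and are not part of ComBatFamily. They only show that any regression function that takes a formula and a data frame and supports `predict()` can be swapped in. The mgcv package is assumed to be installed.

```r
library(mgcv)  # provides gam() and s()

set.seed(8888)
n <- 20
covar <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
# simulated feature with a nonlinear effect of x1 (illustration only)
y <- sin(covar$x1) + 0.5 * covar$x2 + rnorm(n, sd = 0.1)

# Hypothetical helper: fit any formula-based regression model to one feature
fit_feature <- function(y, covar, model, formula, ...) {
  df <- data.frame(y = y, covar)
  model(formula, data = df, ...)
}

fit_lm  <- fit_feature(y, covar, lm,  y ~ x1 + x2)
fit_gam <- fit_feature(y, covar, gam, y ~ s(x1, k = 5) + x2)

# Both fits give residuals through the same predict() interface
res_lm  <- y - as.numeric(predict(fit_lm,  newdata = covar))
res_gam <- y - as.numeric(predict(fit_gam, newdata = covar))
```

The small basis dimension `k = 5` is chosen only because the toy sample has 20 observations.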
CovBat Family extends the original CovBat method by enabling modeling of covariate effects in the principal component (PC) scores. The model is estimated separately from the model used in the standardization step.
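The idea of a separate score model can be sketched in base R. This is a simplified illustration under stated assumptions, not the package's implementation: the matrix `resid` stands in for the residuals left after the standardization step, the score model is a plain `lm`, and the batch adjustment is simple moment matching of each batch to the pooled scale. The estimators used by `covfam` may differ.

```r
set.seed(8888)
n <- 20
p <- 5
bat <- factor(rep(c("a", "b"), each = n / 2))
covar <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
resid <- matrix(rnorm(n * p), n, p)  # stand-in for standardized residuals

# PCA on the residuals; the columns of pc$x are the PC scores
pc <- prcomp(resid, center = TRUE, scale. = FALSE)
scores <- pc$x

# Score model: fitted to the PC scores, separately from the first-stage model
harmonized <- scores
for (k in seq_len(ncol(scores))) {
  df <- data.frame(y = scores[, k], covar)
  fit <- lm(y ~ x1 + x2, data = df)
  e <- residuals(fit)
  e_adj <- e
  # Match each batch's residual mean and scale to the pooled values
  for (b in levels(bat)) {
    idx <- bat == b
    e_adj[idx] <- (e[idx] - mean(e[idx])) / sd(e[idx]) * sd(e)
  }
  # Covariate effects in the scores are preserved by adding the fit back
  harmonized[, k] <- fitted(fit) + e_adj
}

# Map the harmonized scores back to feature space
resid_adj <- sweep(harmonized %*% t(pc$rotation), 2, pc$center, "+")
```

Because the covariate fit is added back after the batch adjustment, only the part of each score not explained by the covariates is rescaled.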
Chen, A. A., Beer, J. C., Tustison, N. J., Cook, P. A., Shinohara, R. T., Shou, H., & the Alzheimer's Disease Neuroimaging Initiative. (2022). Mitigating site effects in covariance for machine learning in neuroimaging data. Human Brain Mapping, 43(4), 1179–1195. https://doi.org/10.1002/hbm.25688