Published January 1, 1998 | Version v1
Journal article | Open Access

Methods of variable selection in regression modeling. Comm Stat Simul

Description

Simulation was used to evaluate the performance of several methods of variable selection in regression modeling: stepwise regression based on partial F-tests, stepwise minimization of Mallows' Cp statistic and of Schwarz's Bayes Information Criterion (BIC), and regression trees constructed with two kinds of pruning. Five to 25 covariates were generated in multivariate clusters, and responses were obtained from an ordinary linear regression model involving three of the covariates; each data set had 50 observations. The regression-tree approaches were markedly inferior to the other methods in discriminating between informative and noninformative covariates, and their predictions of responses in "new" data sets were much more variable and less accurate than those of the other methods. The F-test, Cp, and BIC approaches were similar in their overall frequencies of "correct" decisions about inclusion or exclusion of covariates, with the Cp method leading to the largest models and the BIC
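For reference, the two stepwise criteria named in the description are usually defined as follows for a Gaussian linear model with n observations, where SSE_p is the residual sum of squares of a candidate model with p fitted coefficients and the error-variance estimate comes from the full model. These are the standard textbook forms, not necessarily the exact parameterization used in the article itself.

\[
C_p = \frac{\mathrm{SSE}_p}{\hat{\sigma}^2} - n + 2p,
\qquad
\mathrm{BIC} = n \ln\!\left(\frac{\mathrm{SSE}_p}{n}\right) + p \ln n .
\]

Smaller values of either criterion favor a candidate subset. BIC's per-parameter penalty of ln n exceeds the 2-per-parameter penalty implicit in Cp whenever n > e^2 (about 7.4), so with n = 50 observations BIC penalizes extra covariates more heavily, which is consistent with the Cp method producing the largest models in this study.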

Files (2.8 MB)

article.pdf (2.8 MB)
md5:35472f2426e9ad3ebb4e9ca4b4922daf