Published September 1, 2009 | Version v1
Journal article Open

Murtaugh, P. A. (2009). Performance of several variable-selection methods applied to real ecological data. Ecology Letters.

Description

I evaluated the predictive ability of statistical models obtained by applying seven methods of variable selection to 12 ecological and environmental data sets. Cross-validation, involving repeated splits of each data set into training and validation subsets, was used to obtain honest estimates of predictive ability that could be fairly compared among methods. There was surprisingly little difference in predictive ability among five methods based on multiple linear regression. Stepwise methods performed similarly to exhaustive algorithms for subset selection, and the choice of criterion for comparing models (Akaike's information criterion, Schwarz's Bayesian information criterion or F statistics) had little effect on predictive ability. For most of the data sets, two methods based on regression trees yielded models with substantially lower predictive ability. I argue that there is no 'best' method of variable selection and that any of the regression-based approaches discussed here is capable of yielding useful predictive models.
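
The cross-validation design described above can be illustrated with a minimal sketch. It is not the author's code. The number of splits, the 70/30 training fraction, the use of mean squared error as the measure of predictive ability, the forward-selection rule and the synthetic data are all assumptions made for illustration, and every function name (`ols_fit`, `forward_aic`, `repeated_split_mse`) is hypothetical. The point it shows is that variable selection is rerun on each training subset, so the validation rows never influence which predictors are chosen.

```python
import math
import random

def take(X, cols):
    """Keep only the listed predictor columns."""
    return [[row[c] for c in cols] for row in X]

def ols_fit(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def predict(beta, X):
    return [beta[0] + sum(b * x for b, x in zip(beta[1:], r)) for r in X]

def mse(y, yhat):
    return sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y)

def aic(X, y, cols):
    """Gaussian AIC up to an additive constant: n*log(RSS/n) + 2k."""
    Z = take(X, cols)
    resid = [a - b for a, b in zip(y, predict(ols_fit(Z, y), Z))]
    rss = max(sum(e * e for e in resid), 1e-12)  # guard against log(0)
    n = len(y)
    return n * math.log(rss / n) + 2 * (len(cols) + 1)

def forward_aic(X, y):
    """Forward stepwise selection: add predictors while AIC keeps dropping."""
    chosen, remaining = [], list(range(len(X[0])))
    best = aic(X, y, chosen)
    while remaining:
        score, c = min((aic(X, y, chosen + [c]), c) for c in remaining)
        if score >= best:
            break
        best = score
        chosen.append(c)
        remaining.remove(c)
    return chosen

def repeated_split_mse(X, y, selector, n_splits=50, train_frac=0.7, seed=0):
    """Mean validation MSE over repeated random training/validation splits."""
    rng = random.Random(seed)
    n = len(y)
    n_train = int(train_frac * n)
    total = 0.0
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        tr, va = idx[:n_train], idx[n_train:]
        X_tr, y_tr = [X[i] for i in tr], [y[i] for i in tr]
        cols = selector(X_tr, y_tr)  # selection sees training rows only
        beta = ols_fit(take(X_tr, cols), y_tr)
        pred = predict(beta, take([X[i] for i in va], cols))
        total += mse([y[i] for i in va], pred)
    return total / n_splits

# Synthetic data (assumed): only the first of three predictors matters.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(60)]
y = [2.0 * r[0] + data_rng.gauss(0, 0.5) for r in X]

cv_selected = repeated_split_mse(X, y, forward_aic)
cv_null = repeated_split_mse(X, y, lambda X_tr, y_tr: [])  # intercept only
```

Because both calls use the same seed, `cv_selected` and `cv_null` are computed on identical splits, which is what makes the comparison between selection methods fair. Any other selector with the same signature (for example, one based on a different criterion) could be passed in its place.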

Files

article.pdf (136.6 kB)
md5:9463fb1617ccc46b1c8de0df39f5ac40