R/non_collinear_vars.R
non_collinear_vars.Rd
Selects a set of predictors with minimal multicollinearity, using the variance
inflation factor (VIF) as the criterion for removing collinear variables. The
algorithm will: (i) compute the VIF values from the correlation matrix of the
variables selected in ...; (ii) sort the VIF values and delete the variable
with the highest VIF; and (iii) repeat step (ii) until the largest VIF value is
less than or equal to max_vif.
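The three steps can be sketched in a few lines of base R. This is an illustrative sketch only, not the package's implementation: the helper name vif_prune and the toy data are hypothetical, and it assumes that the VIF values are taken as the diagonal of the inverse of the correlation matrix, which is a standard way to obtain them.

```r
# Hypothetical helper (not part of the package): iterative VIF-based pruning.
vif_prune <- function(data, max_vif = 10, use = "pairwise.complete.obs") {
  stopifnot(max_vif >= 1)  # a VIF is never below 1
  vars <- names(data)
  removed <- character(0)
  repeat {
    # Step (i): correlation matrix of the current variables and its VIFs.
    cor_mat <- stats::cor(data[, vars, drop = FALSE], use = use)
    vif <- diag(solve(cor_mat))  # assumption: VIF_j = j-th diagonal of inverse
    names(vif) <- vars
    # Step (iii): stop once the largest VIF is within the threshold.
    if (max(vif) <= max_vif) break
    # Step (ii): drop the variable with the highest VIF.
    worst <- names(which.max(vif))
    removed <- c(removed, worst)
    vars <- setdiff(vars, worst)
  }
  list(selected = vars, removed = removed, vif = vif)
}

# Toy data (made up for illustration): x3 is almost a linear combination
# of x1 and x2, so it should be the variable that gets removed.
set.seed(1)
n <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2 + rnorm(n, sd = 0.05)
x4 <- rnorm(n)
toy <- data.frame(x1, x2, x3, x4)

res <- vif_prune(toy, max_vif = 5)
print(res$selected)
print(res$removed)
print(round(res$vif, 3))
```

Note that solve() fails on a singular correlation matrix, so this sketch does not handle perfectly collinear variables.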
non_collinear_vars(.data, ..., max_vif = 10, missingval = "pairwise.complete.obs")
Argument | Description |
---|---|
.data | The data set containing the variables. |
... | Variables to be submitted to selection. If |
max_vif | The maximum value for the Variance Inflation Factor (threshold) that will be accepted in the set of selected predictors. |
missingval | How to deal with missing values. For more information, please see |
A data frame showing the number of selected predictors, maximum VIF value, condition number, determinant value, selected predictors and removed predictors from the original set of variables.
#>   Parameter        values
#> 1 Predictors       10
#> 2 VIF              7.16
#> 3 Condition Number 56.797
#> 4 Determinant      0.0008810515
#> 5 Selected         PERK, EP, CDED, NKR, PH, NR, TKW, EL, CD, ED
#> 6 Removed          EH, CL, CW, KW, NKE

# Select variables and choose a VIF threshold to 5
non_collinear_vars(data_ge2, EH, CL, CW, KW, NKE, max_vif = 5)
#>   Parameter        values
#> 1 Predictors       4
#> 2 VIF              2.934
#> 3 Condition Number 11.248
#> 4 Determinant      0.2400583901
#> 5 Selected         NKE, EH, CL, CW
#> 6 Removed          KW