robust() computes robust standard error for regression models. This method wraps the coeftest-function with robust covariance matrix estimators based on the vcovHC-function, and returns the result as tidy data frame.

svy() is intended to compute standard errors for survey designs (complex samples) fitted with regular lm or glm functions, as alternative to the survey-package. It simulates sampling weights by adjusting the residual degrees of freedom based on the precision weights used to fit x, and then calls robust() with the adjusted model.

robust(x, vcov = c("HC3", "const", "HC", "HC0", "HC1", "HC2", "HC4", "HC4m",
  "HC5"), conf.int = FALSE, exponentiate = FALSE)

svy(x, vcov = c("HC1", "const", "HC", "HC0", "HC2", "HC3", "HC4", "HC4m",
  "HC5"), conf.int = FALSE, exponentiate = FALSE)

Arguments

x

A fitted model of any class that is supported by the coeftest()-function. For svy(), x must be lm object, fitted with weights.

vcov

Character vector, specifying the estimation type for the heteroskedasticity-consistent covariance matrix estimation (see vcovHC for details).

conf.int

Logical, TRUE if confidence intervals based on robust standard errors should be included.

exponentiate

Logical, whether to exponentiate the coefficient estimates and confidence intervals (typical for logistic regression).

Value

A summary of the model, including estimates, robust standard error, p-value and - optionally - the confidence intervals.

Note

svy() simply calls robust(), but first adjusts the residual degrees of freedom based on the model weights. Hence, for svy(), x should be fitted with weights. This simulates sampling weights like in survey designs, though lm and glm implement precision weights. The results from svy() are usually more accurate than simple weighted standard errors for complex samples. However, results from the survey package are still more exactly, especially regarding the estimates.

vcov for svy() defaults to "HC1", because standard errors with this estimation type come closest to the standard errors from the survey-package.

Currently, svy() only works for objects of class lm.

Examples

data(efc) fit <- lm(barthtot ~ c160age + c12hour + c161sex + c172code, data = efc) summary(fit)
#> #> Call: #> lm(formula = barthtot ~ c160age + c12hour + c161sex + c172code, #> data = efc) #> #> Residuals: #> Min 1Q Median 3Q Max #> -74.639 -15.246 4.251 19.009 73.327 #> #> Coefficients: #> Estimate Std. Error t value Pr(>|t|) #> (Intercept) 90.06448 6.17237 14.592 <2e-16 *** #> c160age -0.22156 0.07111 -3.116 0.0019 ** #> c12hour -0.27810 0.01865 -14.915 <2e-16 *** #> c161sex -0.26178 2.08649 -0.125 0.9002 #> c172code -0.76215 1.41971 -0.537 0.5915 #> --- #> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 #> #> Residual standard error: 25.35 on 816 degrees of freedom #> (87 observations deleted due to missingness) #> Multiple R-squared: 0.2696, Adjusted R-squared: 0.266 #> F-statistic: 75.28 on 4 and 816 DF, p-value: < 2.2e-16 #>
robust(fit)
#> # A tibble: 5 x 5 #> term estimate std.error statistic p.value #> <chr> <dbl> <dbl> <dbl> <dbl> #> 1 (Intercept) 90.1 5.58 16.1 4.83e-51 #> 2 c160age -0.222 0.0711 -3.12 1.89e- 3 #> 3 c12hour -0.278 0.0212 -13.1 8.01e-36 #> 4 c161sex -0.262 1.92 -0.136 8.92e- 1 #> 5 c172code -0.762 1.39 -0.547 5.85e- 1
confint(fit)
#> 2.5 % 97.5 % #> (Intercept) 77.9488902 102.18006829 #> c160age -0.3611297 -0.08198647 #> c12hour -0.3146997 -0.24150107 #> c161sex -4.3573007 3.83374416 #> c172code -3.5488594 2.02455439
robust(fit, conf.int = TRUE)
#> # A tibble: 5 x 7 #> term estimate std.error statistic p.value conf.low conf.high #> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> #> 1 (Intercept) 90.1 5.58 16.1 4.83e-51 79.1 101. #> 2 c160age -0.222 0.0711 -3.12 1.89e- 3 -0.361 -0.0821 #> 3 c12hour -0.278 0.0212 -13.1 8.01e-36 -0.320 -0.236 #> 4 c161sex -0.262 1.92 -0.136 8.92e- 1 -4.03 3.50 #> 5 c172code -0.762 1.39 -0.547 5.85e- 1 -3.50 1.98
robust(fit, vcov = "HC1", conf.int = TRUE) # "HC1" should be Stata default
#> # A tibble: 5 x 7 #> term estimate std.error statistic p.value conf.low conf.high #> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> #> 1 (Intercept) 90.1 5.55 16.2 1.79e-51 79.2 101. #> 2 c160age -0.222 0.0707 -3.14 1.78e- 3 -0.360 -0.0829 #> 3 c12hour -0.278 0.0210 -13.2 2.73e-36 -0.319 -0.237 #> 4 c161sex -0.262 1.91 -0.137 8.91e- 1 -4.01 3.49 #> 5 c172code -0.762 1.39 -0.549 5.83e- 1 -3.49 1.96
library(sjmisc) # dichtomozize service usage by "service usage yes/no" efc$services <- sjmisc::dicho(efc$tot_sc_e, dich.by = 0) fit <- glm(services ~ neg_c_7 + c161sex + e42dep, data = efc, family = binomial(link = "logit")) robust(fit)
#> # A tibble: 4 x 5 #> term estimate std.error statistic p.value #> <chr> <dbl> <dbl> <dbl> <dbl> #> 1 (Intercept) -0.520 0.380 -1.37 0.172 #> 2 neg_c_7 0.0419 0.0204 2.06 0.0396 #> 3 c161sex -0.219 0.163 -1.35 0.178 #> 4 e42dep 0.213 0.0797 2.68 0.00740
robust(fit, conf.int = TRUE, exponentiate = TRUE)
#> # A tibble: 4 x 7 #> term estimate std.error statistic p.value conf.low conf.high #> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> #> 1 (Intercept) 0.595 0.380 -1.37 0.172 0.282 1.25 #> 2 neg_c_7 1.04 0.0204 2.06 0.0396 1.00 1.09 #> 3 c161sex 0.803 0.163 -1.35 0.178 0.584 1.11 #> 4 e42dep 1.24 0.0797 2.68 0.00740 1.06 1.45