Weighted statistics for variables

wtd_sd(), wtd_se(), wtd_mean() and wtd_median() compute the weighted standard deviation, standard error, mean or median for a variable or for all variables of a data frame. svy_md() computes the median for a variable in a survey design (see svydesign). wtd_cor() computes a weighted correlation and tests it against a two-sided alternative hypothesis.
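To make the quantities above concrete, here is a minimal base-R sketch of a weighted mean, standard deviation and median. It is an illustration only: the data, the variable names, the frequency-weight denominator `sum(w) - 1` and the "lower" weighted-median rule are assumptions of this sketch, and the package functions may define these statistics differently.

```r
# Base-R sketch of weighted summary statistics (not the package's own code).
x <- c(2, 4, 4, 5, 9)
w <- c(1, 2, 2, 1, 4)   # hypothetical frequency-style weights

# Weighted mean: sum(w * x) / sum(w)
w_mean <- stats::weighted.mean(x, w)

# Weighted variance with a frequency-weight denominator, sum(w) - 1
# (an assumption; other weight types call for other denominators)
w_var <- sum(w * (x - w_mean)^2) / (sum(w) - 1)
w_sd  <- sqrt(w_var)

# Weighted median: smallest x whose cumulative weight share reaches 50%
ord      <- order(x)
share    <- cumsum(w[ord]) / sum(w)
w_median <- x[ord][which(share >= 0.5)[1]]
```

With integer weights, `w_mean` and `w_sd` match `mean(rep(x, w))` and `sd(rep(x, w))`, which is a quick way to sanity-check the formulas.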

Weighted tests

wtd_ttest() computes a weighted t-test, while wtd_mwu() computes a weighted Mann-Whitney-U test or a Kruskal-Wallis test (for more than two groups). wtd_chisqtest() computes a weighted Chi-squared test for contingency tables.

wtd_sd(x, weights = NULL)

wtd_mean(x, weights = NULL)

wtd_se(x, weights = NULL)

wtd_median(x, weights = NULL)

svy_md(x, design)

wtd_chisqtest(data, ...)

# S3 method for default
wtd_chisqtest(data, x, y, weights, ...)

# S3 method for formula
wtd_chisqtest(formula, data, ...)

wtd_ttest(data, ...)

# S3 method for default
wtd_ttest(data, x, y = NULL, weights, mu = 0,
  paired = FALSE, ci.lvl = 0.95, alternative = c("two.sided", "less",
  "greater"), ...)

# S3 method for formula
wtd_ttest(formula, data, mu = 0, paired = FALSE,
  ci.lvl = 0.95, alternative = c("two.sided", "less", "greater"), ...)

wtd_mwu(data, ...)

# S3 method for default
wtd_mwu(data, x, grp, weights, ...)

# S3 method for formula
wtd_mwu(formula, data, ...)

wtd_cor(data, ...)

# S3 method for default
wtd_cor(data, x, y, weights, ci.lvl = 0.95, ...)

# S3 method for formula
wtd_cor(formula, data, ci.lvl = 0.95, ...)

Arguments

x

(Numeric) vector or a data frame. For svy_md(), wtd_ttest(), wtd_mwu() and wtd_chisqtest() the bare (unquoted) variable name, or a character vector with the variable name.

weights

Bare (unquoted) variable name, or a character vector with the variable name, of the numeric vector of weights. If weights = NULL, the unweighted statistic is reported.

design

An object of class svydesign, providing a specification of the survey design.

data

A data frame.

...

For wtd_ttest() and wtd_mwu(), currently not used. For wtd_chisqtest(), further arguments passed down to chisq.test.

y

Optional, bare (unquoted) variable name, or a character vector with the variable name.

formula

A formula of the form lhs ~ rhs1 + rhs2, where lhs is a numeric variable giving the data values, rhs1 is a factor with two levels giving the corresponding groups, and rhs2 is a variable with weights.

mu

A number indicating the true value of the mean (or the difference in means if you are performing a two-sample test).

paired

Logical, whether to compute a paired t-test.

ci.lvl

Confidence level of the interval.

alternative

A character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less". You can specify just the initial letter.

grp

Bare (unquoted) name of the cross-classifying variable, where x is grouped into the categories represented by grp, or a character vector with the variable name.
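The layout of the formula argument (lhs ~ rhs1 + rhs2) can be illustrated with a small base-R sketch. The helper `split_weighted_formula()` is hypothetical and exists only for this illustration; it is not part of the package and does not claim to mirror how the package parses formulas.

```r
# Hypothetical helper, for illustration only: split a formula of the form
# lhs ~ rhs1 + rhs2 into the three roles described for the formula argument.
split_weighted_formula <- function(formula) {
  vars <- all.vars(formula)          # variable names in order of appearance
  stopifnot(length(vars) == 3)       # expect value, group and weights
  list(value = vars[1], group = vars[2], weights = vars[3])
}

parts <- split_weighted_formula(e17age ~ e16sex + weight)
parts$value    # "e17age"
parts$group    # "e16sex"
parts$weights  # "weight"
```

The point of the sketch is the positional convention: the last term on the right-hand side is read as the weights variable, not as a second grouping factor.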

Value

The weighted (test) statistic.

Note

wtd_chisqtest() is a convenient wrapper for xtab_statistics. For a weighted one-way Anova, use grpmean() with the weights argument.

Examples

# weighted sd and se ----
wtd_sd(rnorm(n = 100, mean = 3), runif(n = 100))
#> [1] 0.8122863
data(efc)
wtd_sd(efc[, 1:3], runif(n = nrow(efc)))
#>    c12hour   e15relat     e16sex
#> 50.2236821  2.0569434  0.4654878
wtd_se(efc[, 1:3], runif(n = nrow(efc)))
#>    c12hour   e15relat     e16sex
#> 1.68858848 0.06873935 0.01573668
# svy_md ----
# median for variables from weighted survey designs
library(survey)
data(nhanes_sample)
des <- svydesign(
  id = ~SDMVPSU,
  strat = ~SDMVSTRA,
  weights = ~WTINT2YR,
  nest = TRUE,
  data = nhanes_sample
)
svy_md(total, des)
#> [1] 6
svy_md("total", des)
#> [1] 6
# weighted t-test ----
efc$weight <- abs(rnorm(nrow(efc), 1, .3))
wtd_ttest(efc, e17age, weights = weight)
#>
#> One Sample t-test (two.sided)
#> # t=292.76 df=890 p-value=0.000
#>
#> mean of e17age: 79.222 [78.691 79.753]
#>
wtd_ttest(efc, e17age, c160age, weights = weight)
#>
#> Two-Sample t-test (two.sided)
#>
#> # comparison between e17age and c160age
#> # t=49.62 df=1469 p-value=0.000
#>
#> mean of e17age    : 79.221
#> mean of c160age   : 53.399
#> difference of mean: 25.822 [24.801 26.843]
#>
wtd_ttest(e17age ~ e16sex + weight, efc)
#>
#> Two-Sample t-test (two.sided)
#>
#> # comparison of e17age by e16sex
#> # t=-7.86 df=598 p-value=0.000
#>
#> mean in group [1] male  : 76.276
#> mean in group [2] female: 80.618
#> difference of mean      : -4.342 [-5.426 -3.257]
#>
# weighted Mann-Whitney-U-test ----
wtd_mwu(c12hour ~ c161sex + weight, efc)
#>
#> Weighted Mann-Whitney-U test (two.sided)
#>
#> # comparison of c12hour by c161sex
#> # Chisq=2.25 df=899 p-value=0.025
#>
#> difference in mean rank score: 0.054
#>
# weighted Chi-squared-test ----
wtd_chisqtest(efc, c161sex, e16sex, weights = weight, correct = FALSE)
#>
#> # Measure of Association for Contingency Tables
#>
#> Chi-squared: 2.0244
#> Phi: 0.0475
#> p-value: 0.1548
wtd_chisqtest(c172code ~ c161sex + weight, efc)
#>
#> # Measure of Association for Contingency Tables
#>
#> Chi-squared: 4.4437
#> Cramer's V: 0.0728
#> p-value: 0.1084
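
For readers who want to see the idea behind a weighted Chi-squared test without the package, the following base-R sketch builds a weighted contingency table with xtabs() and passes it to chisq.test(). The toy data, the column names and the rounding of weighted cell sums to whole counts are assumptions of this sketch; it is not presented as the implementation behind wtd_chisqtest() or xtab_statistics, and its numbers will not match the output shown above.

```r
# Toy data; all names and values are hypothetical
set.seed(1)
d <- data.frame(
  a = factor(sample(c("x", "y"), 200, replace = TRUE)),
  b = factor(sample(c("u", "v", "w"), 200, replace = TRUE)),
  w = runif(200, 0.5, 1.5)
)

# Weighted contingency table: each cell sums the weights of its observations
tab <- stats::xtabs(w ~ a + b, data = d)

# Round the weighted cell sums to whole counts before testing
# (an assumption of this sketch, since chisq.test() expects counts)
res <- stats::chisq.test(round(tab))

res$statistic   # Chi-squared statistic
res$parameter   # degrees of freedom: (2 - 1) * (3 - 1) = 2
res$p.value
```

Further arguments such as correct = FALSE, used in the example above, are arguments of chisq.test() itself and could be supplied in the same call.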