desc_stat()
computes the most commonly used measures of central tendency,
position, and dispersion.
desc_wider()
reshapes the output of desc_stat() so that variables are in columns and grouping
variables are in rows. The table is filled with the statistic chosen with the
argument which.
desc_stat(
  .data = NULL,
  ...,
  by = NULL,
  stats = "main",
  hist = FALSE,
  level = 0.95,
  digits = 4,
  na.rm = FALSE,
  verbose = TRUE,
  plot_theme = theme_metan()
)

desc_wider(.data, which)
.data | The data to be analyzed. It can be a data frame (possibly with
grouped data passed from |
---|---|
... | A single variable name or a comma-separated list of unquoted
variable names. If no variable is given, all the numeric variables from
|
by | One variable (factor) to compute the function by. It is a shortcut
to |
stats | The descriptive statistics to show. This is used to filter the
output after computation. Defaults to
Use names to select the statistics. For example, |
hist | Logical argument defaults to |
level | The confidence level to compute the confidence interval of mean. Defaults to 0.95. |
digits | The number of significant digits. |
na.rm | Logical. Should missing values be removed? Defaults to |
verbose | Logical argument. If |
plot_theme | The graphical theme of the plot. Default is
|
which | A statistic to fill the table. |
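
As a rough illustration of what a few of the statistics selected through stats represent, the following base-R sketch recomputes mean, se, cv, n and valid.n for the vector used in Example 3 of this page. The helper basic_stats() is hypothetical and is not part of metan. It also assumes that ci is the half-width of a t-based confidence interval for the mean at the given level, which appears consistent with the printed output but is not stated in this page.

```r
# Hypothetical helper (NOT metan's implementation): base-R versions of
# a few descriptive statistics.
basic_stats <- function(x, level = 0.95, na.rm = FALSE) {
  n <- length(x)                      # total number of observations
  if (na.rm) x <- x[!is.na(x)]        # drop missing values on request
  valid_n <- sum(!is.na(x))           # observations actually used
  m  <- mean(x)
  s  <- sd(x)                         # sample standard deviation
  se <- s / sqrt(valid_n)             # standard error of the mean
  # Assumption: ci is the half-width of the t-based interval of the mean
  ci <- qt(1 - (1 - level) / 2, df = valid_n - 1) * se
  c(mean = m, se = se, cv = 100 * s / m, n = n, valid.n = valid_n, ci = ci)
}

x <- c(12, 13, 19, 21, 8, NA, 23, NA)
res <- basic_stats(x, na.rm = TRUE)
print(round(res, 2))
```

After rounding, the values for mean, se, cv, n and valid.n should agree with the tibble printed in Example 3.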
desc_stat()
returns a tibble with the statistics in the columns and
variables (with possible grouping factors) in rows.
desc_wider()
returns a tibble with variables in columns and grouping
factors in rows.
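
To make the difference between the two return shapes concrete, here is a minimal base-R sketch of the long-to-wide step. It does not call desc_wider() and is not how metan implements it. The small data frame is built by hand from the maxima printed in the examples on this page (genotypes H1 and H10, variables CD and ED).

```r
# Long format: one row per GEN x variable combination, as in the
# desc_stat() output (only the 'max' statistic is kept here).
long <- data.frame(
  GEN      = c("H1", "H1", "H10", "H10"),
  variable = c("CD", "ED", "CD", "ED"),
  max      = c(17.9, 53.3, 17.5, 54.1)
)

# Base-R pivot: one row per GEN, one column per variable
wide <- reshape(long, idvar = "GEN", timevar = "variable", direction = "wide")

# reshape() prefixes the new columns with "max."; strip that prefix
names(wide) <- sub("^max\\.", "", names(wide))
rownames(wide) <- NULL
print(wide)
```

The result has the grouping factor GEN in rows and the variables CD and ED in columns, which is the layout described for desc_wider().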
Tiago Olivoto tiagoolivoto@gmail.com
library(metan)

#===============================================================#
# Example 1: main statistics (coefficient of variation, maximum,#
# mean, median, minimum, sample standard deviation, standard    #
# error and confidence interval of the mean) for all numeric    #
# variables in data                                             #
#===============================================================#
desc_stat(data_ge2)
#> # A tibble: 15 x 9
#>    variable      cv     max    mean  median     min  sd.amo      se      ci
#>    <chr>      <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
#>  1 CD          7.34    18.6    16.0      16    12.9    1.17  0.0939   0.186
#>  2 CDED        5.71   0.694   0.586   0.588   0.495  0.0334  0.0027  0.0053
#>  3 CL          7.95    34.7    29.0    28.7    23.5    2.31   0.185   0.365
#>  4 CW          25.2    38.5    24.8    24.5    11.1    6.26   0.501    0.99
#>  5 ED          5.58    54.9    49.5    49.9    43.5    2.76   0.221   0.437
#>  6 EH          21.2    1.88    1.34    1.41   0.752   0.284  0.0228   0.045
#>  7 EL          8.28    17.9    15.2    15.1    11.5    1.26   0.101   0.199
#>  8 EP          10.5   0.660   0.537   0.544   0.386  0.0564  0.0045  0.0089
#>  9 KW          18.9    251.    173.    175.    106.    32.8    2.62    5.18
#> 10 NKE         14.2    697.    512.    509.    332.    72.6    5.82    11.5
#> 11 NKR         10.7      42    32.2      32    23.2    3.47   0.277   0.548
#> 12 NR          10.2    21.2    16.1      16    12.4    1.64   0.131   0.259
#> 13 PERK        2.17    91.8    87.4    87.5    81.2    1.90   0.152   0.300
#> 14 PH          13.4    3.04    2.48    2.52    1.71   0.334  0.0267  0.0528
#> 15 TKW         13.9    452.    339.    342.    218.    47.1    3.77    7.44

#===============================================================#
# Example 2: robust statistics using a numeric vector as input  #
# data                                                          #
#===============================================================#
vect <- data_ge2$TKW
desc_stat(vect, stats = "robust")
#> # A tibble: 1 x 5
#>   variable     n median   iqr    ps
#>   <chr>    <dbl>  <dbl> <dbl> <dbl>
#> 1 val        156   342.  57.8  42.8

#===============================================================#
# Example 3: Select specific statistics. In this example, NAs   #
# are removed before analysis with a warning message            #
#===============================================================#
desc_stat(c(12, 13, 19, 21, 8, NA, 23, NA),
          stats = c('mean, se, cv, n, valid.n'),
          na.rm = TRUE)
#> # A tibble: 1 x 6
#>   variable  mean    se    cv     n valid.n
#>   <chr>    <dbl> <dbl> <dbl> <dbl>   <dbl>
#> 1 val         16  2.39  36.7     8       6

#===============================================================#
# Example 4: Select specific variables and compute statistics by#
# levels of a factor variable (GEN)                             #
#===============================================================#
stats <- desc_stat(data_ge2, EP, EL, EH, ED, PH, CD, by = GEN)
stats
#> # A tibble: 78 x 10
#>    GEN   variable      cv     max    mean  median     min  sd.amo      se      ci
#>    <fct> <chr>      <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
#>  1 H1    CD          6.44    17.9    15.7    15.7    14.5    1.01   0.292   0.643
#>  2 H1    ED          2.66    53.3    51.2    50.8    49.2    1.36   0.393   0.864
#>  3 H1    EH          19.5    1.88    1.50    1.56    1.05   0.294  0.0848   0.187
#>  4 H1    EL          6.27    16.9    15.1    15.1    13.7   0.947   0.273   0.602
#>  5 H1    EP          9.91   0.658   0.570   0.574   0.492  0.0565  0.0163  0.0359
#>  6 H1    PH          11.7    3.00    2.62    2.70    2.11   0.307  0.0885   0.195
#>  7 H10   CD          6.32    17.5    15.9    15.7    14.4    1.00   0.290   0.638
#>  8 H10   ED          7.70    54.1    48.4    47.7    43.7    3.73    1.08    2.37
#>  9 H10   EH          23.2    1.71    1.26    1.25   0.888   0.293  0.0845   0.186
#> 10 H10   EL          6.83    16.7    15.1    14.9    13.6    1.03   0.298   0.656
#> # ... with 68 more rows

# To get a 'wide' format with the maximum values for all variables
desc_wider(stats, max)
#> # A tibble: 13 x 7
#>    GEN      CD    ED    EH    EL    EP    PH
#>    <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1 H1     17.9  53.3  1.88  16.9 0.658  3.00
#>  2 H10    17.5  54.1  1.71  16.7 0.660  2.83
#>  3 H11    18.0  52.3  1.67  17.4 0.600  2.77
#>  4 H12    16.2  52.7  1.58  15.7 0.616  2.79
#>  5 H13    17.8  54.0  1.77  16.3 0.615  2.93
#>  6 H2     17.0  53.6  1.87  16.1 0.615  3.03
#>  7 H3     18.0  52.2  1.80  17.6 0.640  3.04
#>  8 H4     17.7  52.8  1.82  16.8 0.617  3.02
#>  9 H5     17.4  52.7  1.76  16.6 0.632  2.90
#> 10 H6     18.3  54.9  1.69  17.9 0.631  2.94
#> 11 H7     18.6  52.1  1.67  17.5 0.617  2.87
#> 12 H8     18.4  53.3  1.57  17.7 0.585  2.76
#> 13 H9     18.1  53.6  1.71  17.5 0.630  3.00

#===============================================================#
# Example 5: Compute all statistics for all numeric variables   #
# by two or more factors. Note that group_by() was used to pass #
# grouped data to the function desc_stat()                      #
#===============================================================#
data_ge2 %>%
  group_by(ENV, GEN) %>%
  desc_stat()
#> # A tibble: 780 x 11
#>    ENV   GEN   variable      cv     max    mean  median     min  sd.amo      se
#>    <fct> <fct> <chr>      <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
#>  1 A1    H1    CD          6.91    16.4    15.7    16.3    14.5    1.09   0.627
#>  2 A1    H1    CDED        2.04   0.561   0.550   0.551   0.538  0.0112  0.0065
#>  3 A1    H1    CL          1.48    28.4    28.1    28.1    27.6   0.415   0.239
#>  4 A1    H1    CW          7.93    25.1    23.5    24.0    21.4    1.86    1.08
#>  5 A1    H1    ED          1.98    52.2    51.1    50.7    50.3    1.01   0.583
#>  6 A1    H1    EH          5.36    1.76    1.68    1.71    1.58  0.0902  0.0521
#>  7 A1    H1    EL          7.15    16.1    15.4    16.0    14.2    1.10   0.637
#>  8 A1    H1    EP          5.34   0.658   0.626   0.628   0.591  0.0334  0.0193
#>  9 A1    H1    KW          8.31    217.    203.    208.    184.    16.8    9.72
#> 10 A1    H1    NKE         6.80    565.    527.    521.    494.    35.8    20.7
#> # ... with 770 more rows, and 1 more variable: ci <dbl>
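
The robust statistics requested in Example 2 (n, median, iqr, ps) can be approximated in base R as sketched below. This is an illustration, not metan's code. It assumes that ps is the pseudo-sigma, taken here as IQR / 1.35, which is consistent with the printed output (57.8 / 1.35 is roughly 42.8). The input vector is a small made-up sample rather than data_ge2$TKW.

```r
# Small illustrative sample (not taken from data_ge2)
x <- c(12, 13, 19, 21, 8, 23)

robust <- c(
  n      = length(x),
  median = median(x),
  iqr    = IQR(x),          # interquartile range (default quantile type 7)
  ps     = IQR(x) / 1.35    # assumption: pseudo-sigma defined as IQR / 1.35
)
print(round(robust, 2))
```

Because both the median and the IQR depend only on the central part of the ordered sample, these measures are much less sensitive to outliers than the mean and standard deviation.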