Interpreting missingness results from wide datasets is difficult. This function summarizes the output of missingness by reporting: the percent of variables that contain missingness; the name of the variable with the most missingness, along with the percent of its observations that are missing; and a tibble of the top 5 missingness levels with the number of variables at each level (the 0 missingness level is ignored). If no variables contain missingness, a message reporting this is printed and NULL is returned instead.

# S3 method for missingness
summary(object, ...)

Arguments

object

Data frame returned by missingness

...

Unused

Value

A tibble of the top 5 missingness percentage levels, with the number of variables associated with each level. If no missingness is found, NULL is returned instead.

Examples

missingness(pima_diabetes) %>% summary()
#> Missingness summary:
#> 50% of data variables contain missingness.
#> `insulin` contains the most missingness with 48.7% of observations containing missing values.
#>
#> Number of variables with levels of missingness:
#> # A tibble: 5 x 2
#>   percent_missing n_variables
#>             <dbl>       <int>
#> 1          48.7             1
#> 2          29.6             1
#> 3           4.56            1
#> 4           1.43            1
#> 5           0.651           1
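
To make the three reported quantities concrete, the following is a minimal base R sketch of how they could be computed by hand. It is not the package's implementation: the function name summarize_missing_sketch and the toy data frame are hypothetical, only NA is counted as missing here (missingness itself may recognize additional values), and a plain list and data frame are returned rather than a tibble.

```r
# Hypothetical helper, for illustration only; not part of the package.
summarize_missing_sketch <- function(df) {
  # Percent of observations that are NA in each variable.
  pct_missing <- colMeans(is.na(df)) * 100
  has_missing <- pct_missing > 0

  # Mirror the documented behavior: report and return NULL if nothing is missing.
  if (!any(has_missing)) {
    message("No missingness found.")
    return(NULL)
  }

  # Count how many variables share each nonzero missingness level.
  counts <- table(pct_missing[has_missing])
  levels_df <- data.frame(
    percent_missing = as.numeric(names(counts)),
    n_variables = as.integer(counts)
  )

  # Keep the 5 highest levels, largest first.
  levels_df <- levels_df[order(-levels_df$percent_missing), ]
  levels_df <- head(levels_df, 5)
  rownames(levels_df) <- NULL

  list(
    pct_vars_with_missing = mean(has_missing) * 100,
    worst_variable = names(which.max(pct_missing)),
    worst_percent = max(pct_missing),
    levels = levels_df
  )
}

# Toy data (an assumption, not a package dataset): `b` is half missing,
# `a` and `d` are each one-quarter missing, and `c` is complete.
toy <- data.frame(
  a = c(1, NA, 3, 4),
  b = c(NA, NA, 3, 4),
  c = c(1, 2, 3, 4),
  d = c("x", NA, "z", "w")
)

result <- summarize_missing_sketch(toy)
result$worst_variable
result$levels
```

With this toy data, three of the four variables contain missingness, `b` is the variable with the most, and the levels table has two rows because `a` and `d` share the same level.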