inspect() scans a data.frame object for errors that may affect the use
of functions in metan. By default, all variables are checked for
class (numeric or factor), missing values, and possible
outliers. The function issues a warning if the data appear
unbalanced, contain missing values, or contain possible outliers.
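To make the per-variable checks concrete, here is a minimal base R sketch of the same idea. It is not metan's implementation: `check_columns()` is a hypothetical helper, the toy data are made up, and the 1.5 * IQR boxplot rule used to flag outliers is an assumption, since this page does not state which rule inspect() applies.

```r
# Hypothetical helper (not part of metan): one summary row per column.
check_columns <- function(data) {
  rows <- lapply(names(data), function(nm) {
    x <- data[[nm]]
    # Assumption: outliers are flagged with the 1.5 * IQR boxplot rule.
    n_out <- if (is.numeric(x)) length(boxplot.stats(x)$out) else NA_integer_
    data.frame(
      Variable = nm,
      Class    = class(x)[1],
      Missing  = if (anyNA(x)) "Yes" else "No",
      Levels   = if (is.factor(x)) as.character(nlevels(x)) else "-",
      Valid_n  = sum(!is.na(x)),   # observations that are not NA
      Outlier  = n_out,
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Made-up data: one missing value and one suspiciously large value in GY.
toy <- data.frame(
  GEN = factor(rep(c("G1", "G2"), each = 5)),
  GY  = c(2.1, 2.3, 2.2, NA, 2.4, 2.0, 2.2, 2.3, 2.1, 9.9)
)
res <- check_columns(toy)
print(res)
```

For real analyses, inspect() itself is the better choice; the sketch only shows what kind of information each output column carries.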
inspect(.data, ..., plot = FALSE, threshold = 15, verbose = TRUE)
.data | The data to be analyzed |
---|---|
... | The variables in .data to check. By default, all variables are checked. |
plot | Create a plot to show the check? Defaults to FALSE. |
threshold | Maximum number of levels allowed in a character / factor column to produce a plot. Defaults to 15. |
verbose | Logical argument. If TRUE (the default), the results of the checks are shown in the console. |
A tibble with the following variables:
Variable The name of the variable
Class The class of the variable
Missing Contains missing values?
Levels The number of levels of a factor variable
Valid_n The number of valid (non-missing) observations
Min, Median, Max The minimum, median, and maximum of a numeric variable
Outlier Contains possible outliers?
Tiago Olivoto tiagoolivoto@gmail.com
#> # A tibble: 5 x 9
#>   Variable Class   Missing Levels Valid_n   Min Median   Max Outlier
#>   <chr>    <chr>   <chr>   <chr>    <int> <dbl>  <dbl> <dbl>   <dbl>
#> 1 ENV      factor  No      14         420 NA     NA    NA         NA
#> 2 GEN      factor  No      10         420 NA     NA    NA         NA
#> 3 REP      factor  No      3          420 NA     NA    NA         NA
#> 4 GY       numeric No      -          420  0.67   2.61  5.09       0
#> 5 HM       numeric No      -          420 38     48    58          0

# Create a toy example with messy data
df <- data_ge2[-c(2, 30, 45, 134), c(1:5)]
df[c(1, 20, 50), c(4, 5)] <- NA
df[40, 5] <- df[40, 5] * 2
inspect(df, plot = TRUE)
#> # A tibble: 5 x 9
#>   Variable Class   Missing Levels Valid_n   Min Median   Max Outlier
#>   <chr>    <chr>   <chr>   <chr>    <int> <dbl>  <dbl> <dbl>   <dbl>
#> 1 ENV      factor  No      4          152 NA     NA    NA         NA
#> 2 GEN      factor  No      13         152 NA     NA    NA         NA
#> 3 REP      factor  No      3          152 NA     NA    NA         NA
#> 4 PH       numeric Yes     -          149  1.71   2.52  3.04       0
#> 5 EH       numeric Yes     -          149  0.75   1.41  3.35       1
#> Warning: Considering the levels of factors, .data should have 156 rows, but it has 152. Use 'to_factor()' for coercing a variable to a factor.
#> Warning: Missing values in variable(s) PH, EH.
#> Warning: Possible outliers in variable(s) EH. Use 'find_outliers()' for more details.
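The first warning in the output compares the actual row count with the count implied by the factor levels: 4 environments x 13 genotypes x 3 replicates = 156 rows, against 152 present. The sketch below reproduces that arithmetic in base R. It assumes the expected count is simply the product of the level counts of all factor columns, which matches the numbers in the warning but is not confirmed as metan's exact rule. `expected_rows()` is a hypothetical helper and the data are made up.

```r
# Hypothetical helper (not part of metan): rows implied by a fully
# crossed, balanced design, taken as the product of factor level counts.
expected_rows <- function(data) {
  is_fac <- vapply(data, is.factor, logical(1))
  prod(vapply(data[is_fac], nlevels, integer(1)))
}

# Made-up balanced design: 2 environments x 3 genotypes x 2 replicates.
toy <- expand.grid(
  ENV = factor(c("E1", "E2")),
  GEN = factor(c("G1", "G2", "G3")),
  REP = factor(1:2)
)
toy$Y <- seq_len(nrow(toy))

# Drop two rows to make the data unbalanced.
toy <- toy[-c(2, 7), ]

n_expected <- expected_rows(toy)
n_actual   <- nrow(toy)
if (n_actual != n_expected) {
  message("Data look unbalanced: expected ", n_expected,
          " rows, found ", n_actual, ".")
}
```

A mismatch does not always indicate an error; it can also arise when a numeric column is meant to be a factor (or the reverse), which is why the warning points to to_factor().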