NAs and zeros can increase the noise in multi-environment trial analysis. This collection of functions will make it easier to deal with them.

  • has_na(), has_zero(): Check for NAs and 0s in the data and return a logical value.

  • random_na(): Generate random NA values in a two-way table based on a desired proportion.

  • remove_cols_na(), remove_cols_zero(): Remove columns with NAs and 0s, respectively.

  • remove_rows_na(), remove_rows_zero(): Remove rows with NAs and 0s, respectively.

  • select_cols_na(), select_cols_zero(): Select columns with NAs and 0s, respectively.

  • select_rows_na(), select_rows_zero(): Select rows with NAs and 0s, respectively.

  • replace_na(), replace_zero(): Replace NAs and 0s, respectively, with a replacement value.
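
The check, remove, and select operations listed above can be pictured with a few lines of base R. This is only a sketch of the idea on a hypothetical toy data frame (df and all object names below are made up for illustration); it is not the package's implementation and does not print the messages controlled by verbose.

```r
# Toy data frame (hypothetical, not the package's data_g)
df <- data.frame(
  GEN = c("H1", "H2", "H3", "H4"),
  PH  = c(2.1, NA, 2.4, 2.0),
  EL  = c(15.7, 14.2, 0, 13.9),
  CL  = c(30.5, 28.1, 27.9, 29.3)
)

# Logical checks, analogous in spirit to has_na() and has_zero()
any_na   <- anyNA(df)
any_zero <- any(df == 0, na.rm = TRUE)

# Flag rows and columns holding at least one NA or at least one zero
na_rows   <- !complete.cases(df)
na_cols   <- colSums(is.na(df)) > 0
zero_rows <- rowSums(df == 0, na.rm = TRUE) > 0
zero_cols <- colSums(df == 0, na.rm = TRUE) > 0

# "Remove" keeps the unflagged part; "select" keeps the flagged part
rows_without_na <- df[!na_rows, , drop = FALSE]
cols_without_na <- df[, !na_cols, drop = FALSE]
rows_with_zero  <- df[zero_rows, , drop = FALSE]
cols_with_zero  <- df[, zero_cols, drop = FALSE]
```

Removing and selecting are complements of each other: the same logical flag is used in both cases, negated for removal.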

has_na(.data)

remove_rows_na(.data, verbose = TRUE)

remove_cols_na(.data, verbose = TRUE)

select_cols_na(.data, verbose = TRUE)

select_rows_na(.data, verbose = TRUE)

replace_na(.data, ..., replace = 0, replacement = 0)

random_na(.data, prop)

has_zero(.data)

remove_rows_zero(.data, verbose = TRUE)

remove_cols_zero(.data, verbose = TRUE)

select_cols_zero(.data, verbose = TRUE)

select_rows_zero(.data, verbose = TRUE)

replace_zero(.data, ..., replacement = NA)

Arguments

.data

A data frame or tibble

verbose

Logical argument. If TRUE (default), the deleted rows or columns are shown in the console.

...

Variables in which to replace NAs (replace_na()) or zeros (replace_zero()). If ... is empty, NAs or 0s in all variables are replaced by the value of the replacement argument. It must be a single variable name or a comma-separated list of unquoted variable names. Select helpers are also allowed.

replace

Deprecated argument as of 1.8.0. Use replacement instead.

replacement

The value used for replacement. Defaults to 0 in replace_na() and to NA in replace_zero(). Use replacement = "colmean" to replace missing values with the column mean.
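
As an illustration of what replacing missing values with the column mean amounts to, here is a minimal base R sketch. The toy data frame and object names are hypothetical, and the package may handle details (for example non-numeric columns) differently.

```r
# Toy data with one NA in each numeric column (hypothetical values)
df <- data.frame(
  GEN = c("H1", "H2", "H3"),
  PH  = c(2.0, NA, 3.0),
  EH  = c(1.0, 1.5, NA)
)

# Only numeric columns have a meaningful mean
num <- vapply(df, is.numeric, logical(1))

# Replace each NA with the mean of the non-missing values in its column
df[num] <- lapply(df[num], function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
})
```

Here the missing PH value becomes 2.5 and the missing EH value becomes 1.25, the means of the observed values in each column.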

prop

The proportion (percentage) of NA values to generate in .data.
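
One way to picture generating NAs at a given proportion in a two-way table is sketched below in base R. It assumes the proportion is expressed as a percentage of all cells, which the description above suggests but does not spell out; prop_pct, tab, and the other names are hypothetical and the sketch is not the package's implementation.

```r
set.seed(1)  # reproducible toy example

# Two-way table: 5 genotypes (rows) by 4 environments (columns)
tab <- matrix(
  round(runif(20, min = 1, max = 10), 1),
  nrow = 5,
  dimnames = list(paste0("G", 1:5), paste0("E", 1:4))
)

prop_pct <- 20                                  # assumed: percentage of cells
n_miss   <- round(length(tab) * prop_pct / 100) # number of cells to blank out

# Draw distinct cell positions and set them to NA
idx    <- sample(length(tab), n_miss)
tab_na <- tab
tab_na[idx] <- NA
```

With 20 cells and a proportion of 20 percent, four cells are set to NA; which ones depends on the random draw.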

Value

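has_na() and has_zero() return a logical value. The remove_*() functions return a data frame with the rows or columns containing NAs or 0s deleted, the select_*() functions return only those rows or columns, replace_na() and replace_zero() return the data with NAs or 0s replaced, and random_na() returns the data with randomly generated NA values.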

Author

Tiago Olivoto tiagoolivoto@gmail.com

Examples

# \donttest{
library(metan)
data_naz <- data_g
data_naz[c(1, 5, 10), c(3:5, 10:15)] <- NA
data_naz[c(2, 6, 9), c(6:7, 12:13)] <- 0
has_na(data_naz)
#> [1] TRUE
has_zero(data_naz)
#> [1] TRUE
# Remove columns
remove_cols_na(data_naz)
#> Warning: Column(s) PH, EH, EP, CW, KW, NR, NKR, CDED, PERK with NA values deleted.
#> # A tibble: 39 x 8
#>    GEN   REP       EL     ED     CL     CD    TKW    NKE
#>    <fct> <fct>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
#>  1 H1    1       15.7   49.9   30.5   16.6   347.   458.
#>  2 H1    2          0      0   30.5   14.7   337.   386.
#>  3 H1    3       15.1   52.6   31.7   16.2   422.   431.
#>  4 H10   1       13.9   44.1   26.2   15.0   258.   446.
#>  5 H10   2       13.6   43.9   23.5   14.4   233.   496.
#>  6 H10   3          0      0   24.6   16.1   251.   524.
#>  7 H11   1       15.5   45.2   25.0   16.7   264.   535.
#>  8 H11   2       12.2   46.9   26.5   14.3   288.    397
#>  9 H11   3          0      0   27.5   15.2   315.   532.
#> 10 H12   1       14.4   49.2   28.4     15   291.   525.
#> # ... with 29 more rows
remove_cols_zero(data_naz)
#> Warning: Column(s) EL, ED, NR, NKR with 0s deleted.
#> # A tibble: 39 x 13
#>    GEN   REP       PH     EH     EP     CL     CD     CW     KW   CDED   PERK    TKW
#>    <fct> <fct>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
#>  1 H1    1         NA     NA     NA   30.5   16.6     NA     NA     NA     NA   347.
#>  2 H1    2       2.20   1.09  0.492   30.5   14.7   22.3   130.  0.619   85.2   337.
#>  3 H1    3       2.29   1.15  0.502   31.7   16.2   29.6   176.  0.603   85.9   422.
#>  4 H10   1       1.79  0.888  0.514   26.2   15.0   12.9   116.  0.596   89.8   258.
#>  5 H10   2         NA     NA     NA   23.5   14.4     NA     NA     NA     NA   233.
#>  6 H10   3       2.27   1.11  0.491   24.6   16.1   12.5   128.  0.566   90.7   251.
#>  7 H11   1       1.71  0.808  0.489   25.0   16.7   15.2   140.  0.552   90.3   264.
#>  8 H11   2       2.09   1.06  0.509   26.5   14.3   13.5   114.  0.566   89.3   288.
#>  9 H11   3        2.5   1.44  0.577   27.5   15.2   19.4   168.  0.562   89.6   315.
#> 10 H12   1         NA     NA     NA   28.4     15     NA     NA     NA     NA   291.
#> # ... with 29 more rows, and 1 more variable: NKE <dbl>
remove_rows_na(data_naz)
#> Warning: Row(s) 1, 5, 10 with NA values deleted.
#> # A tibble: 36 x 17
#>    GEN   REP       PH     EH     EP     EL     ED     CL     CD     CW     KW     NR    NKR
#>    <fct> <fct>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
#>  1 H1    2       2.20   1.09  0.492      0      0   30.5   14.7   22.3   130.      0      0
#>  2 H1    3       2.29   1.15  0.502   15.1   52.6   31.7   16.2   29.6   176.   15.6   29.2
#>  3 H10   1       1.79  0.888  0.514   13.9   44.1   26.2   15.0   12.9   116.   14.8     33
#>  4 H10   3       2.27   1.11  0.491      0      0   24.6   16.1   12.5   128.      0      0
#>  5 H11   1       1.71  0.808  0.489   15.5   45.2   25.0   16.7   15.2   140.   15.6     36
#>  6 H11   2       2.09   1.06  0.509   12.2   46.9   26.5   14.3   13.5   114.   16.8   26.2
#>  7 H11   3        2.5   1.44  0.577      0      0   27.5   15.2   19.4   168.      0      0
#>  8 H12   2       2.77   1.58  0.572   13.8   46.5   23.8   14.6   16.3   153.   17.6   31.4
#>  9 H12   3       2.00  0.782  0.386   13.7   47.5   25.3   14.3   18.9   139.   14.8   25.4
#> 10 H13   1       2.52   1.09  0.434   16.1   51.7   28.2   16.6   23.9   199.     18   30.8
#> # ... with 26 more rows, and 4 more variables: CDED <dbl>, PERK <dbl>,
#> #   TKW <dbl>, NKE <dbl>
remove_rows_zero(data_naz)
#> Warning: Row(s) 2, 6, 9 with 0s deleted.
#> # A tibble: 36 x 17
#>    GEN   REP       PH     EH     EP     EL     ED     CL     CD     CW     KW     NR
#>    <fct> <fct>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
#>  1 H1    1         NA     NA     NA   15.7   49.9   30.5   16.6     NA     NA     NA
#>  2 H1    3       2.29   1.15  0.502   15.1   52.6   31.7   16.2   29.6   176.   15.6
#>  3 H10   1       1.79  0.888  0.514   13.9   44.1   26.2   15.0   12.9   116.   14.8
#>  4 H10   2         NA     NA     NA   13.6   43.9   23.5   14.4     NA     NA     NA
#>  5 H11   1       1.71  0.808  0.489   15.5   45.2   25.0   16.7   15.2   140.   15.6
#>  6 H11   2       2.09   1.06  0.509   12.2   46.9   26.5   14.3   13.5   114.   16.8
#>  7 H12   1         NA     NA     NA   14.4   49.2   28.4     15     NA     NA     NA
#>  8 H12   2       2.77   1.58  0.572   13.8   46.5   23.8   14.6   16.3   153.   17.6
#>  9 H12   3       2.00  0.782  0.386   13.7   47.5   25.3   14.3   18.9   139.   14.8
#> 10 H13   1       2.52   1.09  0.434   16.1   51.7   28.2   16.6   23.9   199.     18
#> # ... with 26 more rows, and 5 more variables: NKR <dbl>, CDED <dbl>,
#> #   PERK <dbl>, TKW <dbl>, NKE <dbl>
# Select columns
select_cols_na(data_naz)
#> Warning: Column(s) with NAs: PH, EH, EP, CW, KW, NR, NKR, CDED, PERK
#> # A tibble: 39 x 9
#>        PH     EH     EP     CW     KW     NR    NKR   CDED   PERK
#>     <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
#>  1     NA     NA     NA     NA     NA     NA     NA     NA     NA
#>  2   2.20   1.09  0.492   22.3   130.      0      0  0.619   85.2
#>  3   2.29   1.15  0.502   29.6   176.   15.6   29.2  0.603   85.9
#>  4   1.79  0.888  0.514   12.9   116.   14.8     33  0.596   89.8
#>  5     NA     NA     NA     NA     NA     NA     NA     NA     NA
#>  6   2.27   1.11  0.491   12.5   128.      0      0  0.566   90.7
#>  7   1.71  0.808  0.489   15.2   140.   15.6     36  0.552   90.3
#>  8   2.09   1.06  0.509   13.5   114.   16.8   26.2  0.566   89.3
#>  9    2.5   1.44  0.577   19.4   168.      0      0  0.562   89.6
#> 10     NA     NA     NA     NA     NA     NA     NA     NA     NA
#> # ... with 29 more rows
select_cols_zero(data_naz)
#> Warning: Column(s) with 0s: EL, ED, NR, NKR
#> # A tibble: 39 x 4
#>        EL     ED     NR    NKR
#>     <dbl>  <dbl>  <dbl>  <dbl>
#>  1   15.7   49.9     NA     NA
#>  2      0      0      0      0
#>  3   15.1   52.6   15.6   29.2
#>  4   13.9   44.1   14.8     33
#>  5   13.6   43.9     NA     NA
#>  6      0      0      0      0
#>  7   15.5   45.2   15.6     36
#>  8   12.2   46.9   16.8   26.2
#>  9      0      0      0      0
#> 10   14.4   49.2     NA     NA
#> # ... with 29 more rows
select_rows_na(data_naz)
#> Warning: Rows(s) with NAs: 1, 5, 10
#> # A tibble: 3 x 17
#>    GEN   REP       PH     EH     EP     EL     ED     CL     CD     CW     KW     NR    NKR
#>    <fct> <fct>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
#>  1 H1    1         NA     NA     NA   15.7   49.9   30.5   16.6     NA     NA     NA     NA
#>  2 H10   2         NA     NA     NA   13.6   43.9   23.5   14.4     NA     NA     NA     NA
#>  3 H12   1         NA     NA     NA   14.4   49.2   28.4     15     NA     NA     NA     NA
#> # ... with 4 more variables: CDED <dbl>, PERK <dbl>, TKW <dbl>, NKE <dbl>
select_rows_zero(data_naz)
#> Warning: Rows(s) with 0s: 1, 2, 3
#> # A tibble: 3 x 17
#>    GEN   REP       PH     EH     EP     EL     ED     CL     CD     CW     KW     NR    NKR
#>    <fct> <fct>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
#>  1 H1    2       2.20   1.09  0.492      0      0   30.5   14.7   22.3   130.      0      0
#>  2 H10   3       2.27   1.11  0.491      0      0   24.6   16.1   12.5   128.      0      0
#>  3 H11   3        2.5   1.44  0.577      0      0   27.5   15.2   19.4   168.      0      0
#> # ... with 4 more variables: CDED <dbl>, PERK <dbl>, TKW <dbl>, NKE <dbl>
# Replace values
replace_na(data_naz)
#> # A tibble: 39 x 17
#>    GEN   REP       PH     EH     EP     EL     ED     CL     CD     CW     KW     NR    NKR
#>    <fct> <fct>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
#>  1 H1    1          0      0      0   15.7   49.9   30.5   16.6      0      0      0      0
#>  2 H1    2       2.20   1.09  0.492      0      0   30.5   14.7   22.3   130.      0      0
#>  3 H1    3       2.29   1.15  0.502   15.1   52.6   31.7   16.2   29.6   176.   15.6   29.2
#>  4 H10   1       1.79  0.888  0.514   13.9   44.1   26.2   15.0   12.9   116.   14.8     33
#>  5 H10   2          0      0      0   13.6   43.9   23.5   14.4      0      0      0      0
#>  6 H10   3       2.27   1.11  0.491      0      0   24.6   16.1   12.5   128.      0      0
#>  7 H11   1       1.71  0.808  0.489   15.5   45.2   25.0   16.7   15.2   140.   15.6     36
#>  8 H11   2       2.09   1.06  0.509   12.2   46.9   26.5   14.3   13.5   114.   16.8   26.2
#>  9 H11   3        2.5   1.44  0.577      0      0   27.5   15.2   19.4   168.      0      0
#> 10 H12   1          0      0      0   14.4   49.2   28.4     15      0      0      0      0
#> # ... with 29 more rows, and 4 more variables: CDED <dbl>, PERK <dbl>,
#> #   TKW <dbl>, NKE <dbl>
replace_zero(data_naz)
#> [1] FALSE
#> # A tibble: 39 x 17
#>    GEN   REP       PH     EH     EP     EL     ED     CL     CD     CW     KW     NR
#>    <fct> <fct>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
#>  1 H1    1         NA     NA     NA   15.7   49.9   30.5   16.6     NA     NA     NA
#>  2 H1    2       2.20   1.09  0.492     NA     NA   30.5   14.7   22.3   130.     NA
#>  3 H1    3       2.29   1.15  0.502   15.1   52.6   31.7   16.2   29.6   176.   15.6
#>  4 H10   1       1.79  0.888  0.514   13.9   44.1   26.2   15.0   12.9   116.   14.8
#>  5 H10   2         NA     NA     NA   13.6   43.9   23.5   14.4     NA     NA     NA
#>  6 H10   3       2.27   1.11  0.491     NA     NA   24.6   16.1   12.5   128.     NA
#>  7 H11   1       1.71  0.808  0.489   15.5   45.2   25.0   16.7   15.2   140.   15.6
#>  8 H11   2       2.09   1.06  0.509   12.2   46.9   26.5   14.3   13.5   114.   16.8
#>  9 H11   3        2.5   1.44  0.577     NA     NA   27.5   15.2   19.4   168.     NA
#> 10 H12   1         NA     NA     NA   14.4   49.2   28.4     15     NA     NA     NA
#> # ... with 29 more rows, and 5 more variables: NKR <dbl>, CDED <dbl>,
#> #   PERK <dbl>, TKW <dbl>, NKE <dbl>
# }