Helper function that rescales a continuous variable to have specified minimum and maximum values.
resca(.data = NULL, ..., values = NULL, new_min = 0, new_max = 100, na.rm = TRUE, keep = TRUE)
.data | The dataset. Grouped data is allowed. |
---|---|
... | Comma-separated list of unquoted variable names that will be rescaled. |
values | Optional vector of values to rescale. |
new_min | The minimum value of the new scale. Default is 0. |
new_max | The maximum value of the new scale. Default is 100. |
na.rm | Remove NA values? Default is TRUE. |
keep | Should all variables be kept after rescaling? If FALSE, only the rescaled variables will be kept. Default is TRUE. |
A numeric vector if values is used as input data, or a tibble if a data frame is used as input in .data.
The function rescales a continuous variable as follows: $$Rv_i = \frac{Nmax - Nmin}{Omax - Omin} \times (O_i - Omax) + Nmax$$ where \(Rv_i\) is the rescaled value at the ith position of the variable/vector; \(Nmax\) and \(Nmin\) are the new maximum and minimum values; \(Omax\) and \(Omin\) are the maximum and minimum values of the original data; and \(O_i\) is the ith value of the original data.
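The formula above can be sketched in a few lines of base R. This is only an illustration of the arithmetic, not the package's internal code: the helper name rescale_linear and its argument handling are assumptions made for this example.

```r
# Hypothetical helper that applies the rescaling formula directly.
# It is NOT the implementation used by resca(); it only mirrors the equation.
rescale_linear <- function(x, new_min = 0, new_max = 100, na.rm = TRUE) {
  old_min <- min(x, na.rm = na.rm)  # Omin
  old_max <- max(x, na.rm = na.rm)  # Omax
  # Rv_i = (Nmax - Nmin) / (Omax - Omin) * (O_i - Omax) + Nmax
  (new_max - new_min) / (old_max - old_min) * (x - old_max) + new_max
}

# The original maximum maps to new_max and the original minimum to new_min.
res_default <- rescale_linear(1:5)
print(res_default)
#> [1]   0  25  50  75 100

# Same data shape, different target range.
res_unit <- rescale_linear(c(2, 4, 10), new_min = 0, new_max = 1)
print(res_unit)
#> [1] 0.00 0.25 1.00
```

Note that in this sketch a constant vector (Omax equal to Omin) leads to a division by zero and returns NaN; how resca itself handles that case is not stated here.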
There are two ways to use resca to rescale a variable. The first is to pass a data frame to the .data argument and select one or more variables to be rescaled using ... . With keep = TRUE (the default), the function returns the original variables in .data plus the rescaled variable(s), named with the suffix _res. By using the function group_by from the dplyr package, it is possible to rescale the variable(s) within each level of the grouping factor. The second option is to pass a numeric vector in the argument values. In this case, the output is a numeric vector of rescaled values.
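To make the two modes concrete without depending on the package, the sketch below mimics them in base R. Everything here is an assumption for illustration: the helper rescale_linear, the toy data frame, and the use of ave() for the per-group step are stand-ins, not how resca is implemented.

```r
# Hypothetical helper mirroring the rescaling formula (not the package code).
rescale_linear <- function(x, new_min = 0, new_max = 100) {
  rng <- range(x, na.rm = TRUE)  # rng[1] = Omin, rng[2] = Omax
  (new_max - new_min) / (rng[2] - rng[1]) * (x - rng[2]) + new_max
}

# Toy data made up for this example.
toy <- data.frame(
  ENV = c("A", "A", "A", "B", "B", "B"),
  EL  = c(10, 15, 20, 1, 2, 3)
)

# Analogue of the first option with grouping: rescale EL within each level
# of ENV and append the result as a new column with the "_res" suffix.
toy$EL_res <- ave(toy$EL, toy$ENV, FUN = rescale_linear)
print(toy)

# Analogue of the second option: rescale a bare numeric vector.
vec_res <- rescale_linear(c(1, 2, 3, 4, 5))
print(vec_res)
```

Because the rescaling is done per group, each level of ENV spans the full 0 to 100 range on its own, which is the behaviour described for grouped input.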
Tiago Olivoto tiagoolivoto@gmail.com
#> [1]   0  25  50  75 100
#> # A tibble: 6 x 7
#>   ENV   GEN   REP      GY    HM GY_res HM_res
#>   <fct> <fct> <fct> <dbl> <dbl>  <dbl>  <dbl>
#> 1 E1    G1    1      2.17  44.9  0.338  0.346
#> 2 E1    G1    2      2.50  46.9  0.414  0.445
#> 3 E1    G1    3      2.43  47.8  0.397  0.487
#> 4 E1    G2    1      3.21  45.2  0.574  0.36
#> 5 E1    G2    2      2.93  45.3  0.512  0.365
#> 6 E1    G2    3      2.56  45.5  0.428  0.375

# Rescale within factors;
# Select variables that start with 'N' and end with 'L';
# Compute the mean of these variables by ENV and GEN;
# Rescale the variables that end with 'L' within ENV;
data_ge2 %>%
  select(ENV, GEN, starts_with("N"), ends_with("L")) %>%
  means_by(ENV, GEN) %>%
  group_by(ENV) %>%
  resca(ends_with("L")) %>%
  head(n = 13)
#> # A tibble: 13 x 9
#> # Groups:   ENV [1]
#>    ENV   GEN      NR   NKR   NKE    EL    CL EL_res   CL_res
#>    <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>    <dbl>
#>  1 A1    H1     16.3  33.3  527.  15.4  28.1   34.4 2.50e+ 1
#>  2 A1    H10    16.7  31.2  515.  16.1  31.4   69.9 8.17e+ 1
#>  3 A1    H11    15.2  34.6  530.  16.6  29.0   98.2 4.09e+ 1
#>  4 A1    H12    17.3  32.7  553.  15.2  29.8   20.9 5.52e+ 1
#>  5 A1    H13    18.7  32.9  611.  14.8  31.2    0   7.85e+ 1
#>  6 A1    H2     19.2  33.5  622.  15.0  26.6   11.0 1.42e-14
#>  7 A1    H3     18.5  34.1  610.  15.5  28.0   35.5 2.41e+ 1
#>  8 A1    H4     16.4  38.3  603.  16.0  27.4   63.1 1.42e+ 1
#>  9 A1    H5     14.5  37.4  539.  15.8  28.3   51.8 2.94e+ 1
#> 10 A1    H6     16.8  35.5  569.  16.7  31.7  100   8.74e+ 1
#> 11 A1    H7     17.9  31.5  544   15.4  30.6   34.8 6.94e+ 1
#> 12 A1    H8     17.2  32.8  542.  15.1  32.4   18.8 1.00e+ 2
#> 13 A1    H9     14.9  32.3  489.  15.5  32.0   38.7 9.17e+ 1