workflowTransfer.Rd
Transfers an attribute (generally time/age, but any other is possible) from one sequence (defined by the argument transfer.from) to another (defined by the argument transfer.to) that lacks it. The transfer rests on the assumption that similar samples have similar attributes, which might not hold for noisy multivariate time series. The attribute can be transferred in two ways, selected with the mode argument:
Direct: transfers the selected attribute between samples with the maximum similarity. This option will likely generate duplicated attribute values in the output.
Interpolate: obtains new attribute values through weighted interpolation, with the weights derived from the distances between samples.
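The difference between the two modes can be sketched in base R. This is only an illustration on toy data, not the package's implementation: the data frames from and to, the helper interpolate_one, and the choice of inverse-distance weights over the two nearest reference samples are all assumptions made for the example.

```r
# Toy data (hypothetical): a reference sequence with known ages and a
# target sequence without them. One numeric variable per sample for brevity.
from <- data.frame(age = c(10, 20, 30, 40), x = c(1, 2, 3, 4))
to <- data.frame(x = c(1.1, 1.2, 2.5, 3.9))

# Manhattan distance between every target sample (rows) and every
# reference sample (columns).
d <- abs(outer(to$x, from$x, "-"))

# Direct: copy the age of the most similar (least distant) reference sample.
# Two target samples share the same nearest neighbour, so an age is duplicated.
direct_age <- from$age[apply(d, 1, which.min)]

# Interpolate (assumed scheme): inverse-distance weighted mean of the ages
# of the two nearest reference samples.
interpolate_one <- function(dist_row, ages) {
  nearest <- order(dist_row)[1:2]
  if (dist_row[nearest[1]] == 0) {
    return(ages[nearest[1]]) # exact match, no interpolation needed
  }
  w <- 1 / dist_row[nearest]
  sum(w * ages[nearest]) / sum(w)
}
interpolated_age <- apply(d, 1, interpolate_one, ages = from$age)

print(direct_age)       # 10 10 20 40
print(interpolated_age) # approximately 11 12 25 39
```

In this sketch the direct result repeats the value 10, while the interpolated result gives each target sample its own age.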
workflowTransfer(
  sequences = NULL,
  grouping.column = NULL,
  time.column = NULL,
  exclude.columns = NULL,
  method = "manhattan",
  transfer.what = NULL,
  transfer.from = NULL,
  transfer.to = NULL,
  mode = "direct",
  plot = FALSE
)
Argument | Description |
---|---|
sequences | dataframe with multiple sequences identified by a grouping column generated by |
grouping.column | character string, name of the column in |
time.column | character string, name of the column with time/depth/rank data. |
exclude.columns | character string or character vector with column names in |
method | character string naming a distance metric. Valid entries are: "manhattan", "euclidean", "chi", and "hellinger". Invalid entries will throw an error. |
transfer.what | character string, column of |
transfer.from | character string, group available in |
transfer.to | character string, group available in |
mode | character string, one of: "direct" (default), "interpolate". |
plot | boolean, if |
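As a reminder of what the method argument selects, the two most common metrics can be computed by hand in base R. This is a sketch with the textbook definitions on two hypothetical samples a and b; it does not show how the package computes them internally, and "chi" and "hellinger" are left out because their exact definitions are not given here.

```r
# Two hypothetical samples described by three numeric variables.
a <- c(10, 0, 5)
b <- c(7, 4, 5)

# Manhattan: sum of absolute differences, |3| + |-4| + |0| = 7
manhattan <- sum(abs(a - b))

# Euclidean: square root of the sum of squared differences, sqrt(9 + 16 + 0) = 5
euclidean <- sqrt(sum((a - b)^2))

# Base R's dist() returns the same values.
manhattan_check <- as.numeric(dist(rbind(a, b), method = "manhattan"))
euclidean_check <- as.numeric(dist(rbind(a, b), method = "euclidean"))
```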
A dataframe with the sequence transfer.to, with a column named after transfer.what with the attribute values.
#loading sample dataset
data(pollenGP)

#subset pollenGP to make a shorter dataset
pollenGP <- pollenGP[1:50, ]

#generating a subset of pollenGP
set.seed(10)
pollenX <- pollenGP[sort(sample(1:50, 40)), ]

#we separate the age column
pollenX.age <- pollenX$age

#and remove the age values from pollenX
pollenX$age <- NULL
pollenX$depth <- NULL

#removing some samples from pollenGP
#so pollenX is not a perfect subset of pollenGP
pollenGP <- pollenGP[-sample(1:50, 10), ]

#prepare sequences
GP.X <- prepareSequences(
  sequence.A = pollenGP,
  sequence.A.name = "GP",
  sequence.B = pollenX,
  sequence.B.name = "X",
  grouping.column = "id",
  time.column = "age",
  exclude.columns = "depth",
  transformation = "none"
)
#> Warning: I couldn't find 'time.column' in 'sequenceB'. Added one and filled it with NA.

#transferring age
X.new <- workflowTransfer(
  sequences = GP.X,
  grouping.column = "id",
  time.column = "age",
  method = "manhattan",
  transfer.what = "age",
  transfer.from = "GP",
  transfer.to = "X",
  mode = "interpolated"
)