Transfers an attribute (usually time or age, but any other attribute is possible) from one sequence (defined by the argument transfer.from) to another sequence that lacks it (defined by the argument transfer.to). The transfer assumes that similar samples have similar attribute values, an assumption that might not hold for noisy multivariate time series. The attribute can be transferred in two ways, selected with the mode argument:
Direct: transfers the selected attribute between samples with the maximum similarity. This option will likely generate duplicated attribute values in the output.
Interpolate: obtains new attribute values through weighted interpolation, with the weights derived from the distances between samples.
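The difference between the two modes can be illustrated with a minimal base R sketch. The toy data, the helper interpolate_one, and the inverse-distance weighting over the two nearest samples are assumptions made for illustration only; the weighting scheme used internally by workflowTransfer is not described here and may differ.

```r
# Hypothetical toy data: ages are known only for the reference sequence.
ref <- data.frame(age = c(100, 200, 300), pollen = c(10, 20, 40))
target <- data.frame(pollen = c(12, 30))

# Manhattan distance between every target sample (rows)
# and every reference sample (columns).
d <- abs(outer(target$pollen, ref$pollen, "-"))

# "direct": copy the attribute of the most similar reference sample.
direct <- ref$age[apply(d, 1, which.min)]

# "interpolate" (assumed scheme): weight the two nearest reference
# samples by the inverse of their distance to the target sample.
interpolate_one <- function(dist_row, values) {
  nearest <- order(dist_row)[1:2]
  # An exact match needs no interpolation (and avoids dividing by zero).
  if (dist_row[nearest[1]] == 0) return(values[nearest[1]])
  w <- 1 / dist_row[nearest]
  sum(w * values[nearest]) / sum(w)
}
interpolated <- apply(d, 1, interpolate_one, values = ref$age)

direct        # 100 200
interpolated  # 120 250
```

In this sketch the direct mode can only return ages that already exist in the reference sequence, which is why duplicated values are likely with real data, while the interpolated ages fall between those of the neighbouring reference samples.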
workflowTransfer(
  sequences = NULL,
  grouping.column = NULL,
  time.column = NULL,
  exclude.columns = NULL,
  method = "manhattan",
  transfer.what = NULL,
  transfer.from = NULL,
  transfer.to = NULL,
  mode = "direct",
  plot = FALSE
)
| Argument | Description |
|---|---|
| sequences | dataframe with multiple sequences identified by a grouping column generated by |
| grouping.column | character string, name of the column in |
| time.column | character string, name of the column with time/depth/rank data. |
| exclude.columns | character string or character vector with column names in |
| method | character string naming a distance metric. Valid entries are: "manhattan", "euclidean", "chi", and "hellinger". Invalid entries will throw an error. |
| transfer.what | character string, column of |
| transfer.from | character string, group available in |
| transfer.to | character string, group available in |
| mode | character string, one of: "direct" (default), "interpolate". |
| plot | boolean, if |
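As a point of reference for the method argument, the sketch below computes the first two metrics between two samples using their standard textbook definitions. The vectors a and b are hypothetical, and "chi" and "hellinger" are left out because their exact formulation in the package is not stated here.

```r
# Two hypothetical samples described by the same three variables.
a <- c(10, 0, 5)
b <- c(7, 4, 5)

# Manhattan distance: sum of absolute differences.
manhattan <- sum(abs(a - b))       # 3 + 4 + 0 = 7

# Euclidean distance: square root of the sum of squared differences.
euclidean <- sqrt(sum((a - b)^2))  # sqrt(9 + 16 + 0) = 5
```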
A dataframe with the sequence transfer.to, including a column named after transfer.what that holds the transferred attribute values.
#the function and the sample data come from the distantia package
library(distantia)

#loading sample dataset
data(pollenGP)

#subset pollenGP to make a shorter dataset
pollenGP <- pollenGP[1:50, ]

#generating a subset of pollenGP
set.seed(10)
pollenX <- pollenGP[sort(sample(1:50, 40)), ]

#we separate the age column
pollenX.age <- pollenX$age

#and remove the age values from pollenX
pollenX$age <- NULL
pollenX$depth <- NULL

#removing some samples from pollenGP
#so pollenX is not a perfect subset of pollenGP
pollenGP <- pollenGP[-sample(1:50, 10), ]

#prepare sequences
GP.X <- prepareSequences(
  sequence.A = pollenGP,
  sequence.A.name = "GP",
  sequence.B = pollenX,
  sequence.B.name = "X",
  grouping.column = "id",
  time.column = "age",
  exclude.columns = "depth",
  transformation = "none"
)
#> Warning: I couldn't find 'time.column' in 'sequenceB'. Added one and filled it with NA.

#transferring age
#mode uses "interpolate", the value listed in the arguments table
X.new <- workflowTransfer(
  sequences = GP.X,
  grouping.column = "id",
  time.column = "age",
  method = "manhattan",
  transfer.what = "age",
  transfer.from = "GP",
  transfer.to = "X",
  mode = "interpolate"
)