prepareSequences.Rd
This function prepares two or more multivariate time-series for comparison. It handles two scenarios:
Two dataframes: The user provides two separate dataframes, each containing a multivariate time-series. These time-series can be regular or irregular, aligned or unaligned, but must share at least a few columns with the same names and units (watch for differences in case between column names representing the same entity). This mode uses exclusively the following arguments: sequence.A, sequence.A.name (optional), sequence.B, sequence.B.name (optional), and merge.mode.
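The column-name requirement above can be checked before calling the function. The following is a minimal base R sketch, not part of the package; the dataframes `seq.a` and `seq.b` and their columns are hypothetical toy data used only for illustration.

```r
# Two toy multivariate time-series (hypothetical data, not shipped with the package)
seq.a <- data.frame(
  age = c(1, 2, 3),
  Pinus = c(10, 12, 9),
  Quercus = c(5, 7, 6)
)
seq.b <- data.frame(
  age = c(1.5, 2.5),
  pinus = c(8, 11),
  Quercus = c(4, 3),
  Betula = c(2, 1)
)

# Columns with exactly the same name in both dataframes
common.columns <- intersect(colnames(seq.a), colnames(seq.b))

# Columns of seq.a that match a column of seq.b only when case is ignored:
# probably the same entity spelled differently, to be renamed before merging
only.a <- setdiff(colnames(seq.a), common.columns)
only.b <- setdiff(colnames(seq.b), common.columns)
case.mismatch <- only.a[tolower(only.a) %in% tolower(only.b)]

print(common.columns)
print(case.mismatch)
```

Here `Pinus` and `pinus` differ only in case, so they are flagged for renaming rather than silently treated as different variables.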
One long dataframe: The user provides a single dataframe, through the sequences argument, with two or more multivariate time-series identified by a grouping.column.
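As an illustration of the long format, the sketch below stacks two dataframes into one and adds an identifier column using base R only. The names `site.1`, `site.2`, and the `site` column are assumptions made for this example, not objects provided by the package.

```r
# Two toy time-series with identical columns (hypothetical data)
site.1 <- data.frame(age = c(1, 2, 3), Pinus = c(10, 12, 9), Quercus = c(5, 7, 6))
site.2 <- data.frame(age = c(1, 2), Pinus = c(8, 11), Quercus = c(4, 3))

# Add the column that identifies each time-series
site.1$site <- "site.1"
site.2$site <- "site.2"

# Stack both into a single long dataframe
long.df <- rbind(site.1, site.2)

# Number of samples per time-series
print(table(long.df$site))
```

A dataframe shaped like `long.df` would presumably be passed as `sequences = long.df` with `grouping.column = "site"` and `time.column = "age"`.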
prepareSequences(
  sequence.A = NULL,
  sequence.A.name = "A",
  sequence.B = NULL,
  sequence.B.name = "B",
  merge.mode = "complete",
  sequences = NULL,
  grouping.column = NULL,
  time.column = NULL,
  exclude.columns = NULL,
  if.empty.cases = "zero",
  transformation = "none",
  paired.samples = FALSE,
  same.time = FALSE
)
Argument | Description |
---|---|
sequence.A | dataframe containing a multivariate time-series. |
sequence.A.name | character string with the name of |
sequence.B | dataframe containing a multivariate time-series. Must have overlapping columns with |
sequence.B.name | character string with the name of |
merge.mode | character string, one of: "overlap", "complete" (default option). If "overlap", |
sequences | dataframe with multiple sequences identified by a grouping column. |
grouping.column | character string, name of the column in |
time.column | character string, name of the column with time/depth/rank data. If |
exclude.columns | character string or character vector with column names in |
if.empty.cases | character string with two possible values: "omit", or "zero". If "zero" (default), |
transformation | character string. Defines what data transformation is to be applied to the sequences. One of: "none" (default), "percentage", "proportion", "hellinger", and "scale" (the latter centers and scales the data using the |
paired.samples | boolean. If |
same.time | boolean. If |
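To make the transformation options more concrete, here is a base R sketch of the standard definitions behind the names "proportion", "percentage", "hellinger", and "scale". It assumes the first three are computed per sample (row) and that scaling is done per column with `scale()`. This is an illustration of the general techniques, not the package's own code, and the `counts` matrix is hypothetical.

```r
# Toy abundance matrix: rows are samples, columns are variables (hypothetical data)
counts <- matrix(
  c(10, 5, 5,
     0, 8, 2),
  nrow = 2,
  byrow = TRUE,
  dimnames = list(NULL, c("Pinus", "Quercus", "Betula"))
)

# Proportion: each value divided by its row total, so every row sums to 1
proportion <- counts / rowSums(counts)

# Percentage: proportions expressed over 100
percentage <- proportion * 100

# Hellinger: square root of the proportions
hellinger <- sqrt(proportion)

# Scale: center each column to mean 0 and scale it to standard deviation 1
scaled <- scale(counts)

print(round(hellinger, 3))
```

After the Hellinger transformation, the squared values of each row sum to 1, which is a convenient sanity check.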
A dataframe with the multivariate time-series. If sequence.A and sequence.B are provided, the column identifying the sequences is named "id". If sequences is provided, the time-series are identified by grouping.column.
#two sequences as inputs
data(sequenceA)
data(sequenceB)

AB.sequences <- prepareSequences(
  sequence.A = sequenceA,
  sequence.A.name = "A",
  sequence.B = sequenceB,
  sequence.B.name = "B",
  merge.mode = "complete",
  if.empty.cases = "zero",
  transformation = "hellinger"
)

#several sequences in a single dataframe
data(sequencesMIS)

MIS.sequences <- prepareSequences(
  sequences = sequencesMIS,
  grouping.column = "MIS",
  if.empty.cases = "zero",
  transformation = "hellinger"
)