prepareSequences.Rd

This function prepares two or more multivariate time-series that are to be compared. It works in two different scenarios:
Two dataframes: The user provides two separate dataframes, each containing a multivariate time-series. These time-series can be regular or irregular, aligned or unaligned, but must share at least a few columns with the same names and units (column names are case-sensitive, so watch for differences in case between columns representing the same entity). This mode uses exclusively the following arguments: sequence.A, sequence.A.name (optional), sequence.B, sequence.B.name (optional), and merge.mode.
One long dataframe: The user provides a single dataframe, through the sequences argument, with two or more multivariate time-series identified by a grouping.column.
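Before choosing a scenario, it can help to check which column names the inputs actually share. The following is a minimal base R sketch, not part of the package: the dataframes `seq.a`, `seq.b`, and `long`, and all of their column names, are invented for illustration.

```r
# Hypothetical toy data: two multivariate time-series with partly shared columns
seq.a <- data.frame(
  time = c(1, 2, 3),
  Pinus = c(10, 12, 8),
  Quercus = c(5, 3, 7)
)
seq.b <- data.frame(
  time = c(1.5, 2.5),
  pinus = c(9, 11),    # lowercase: does NOT match "Pinus"
  Quercus = c(4, 6),
  Betula = c(1, 2)
)

# Columns with exactly the same name in both dataframes
shared <- intersect(colnames(seq.a), colnames(seq.b))

# Columns that match only when case is ignored (likely naming inconsistencies)
shared.nocase <- intersect(tolower(colnames(seq.a)), tolower(colnames(seq.b)))
case.mismatch <- setdiff(shared.nocase, tolower(shared))

# Second scenario: one long dataframe, with a grouping column ("site" here)
# identifying which rows belong to which time-series
long <- data.frame(
  site = c("s1", "s1", "s2", "s2", "s3"),
  time = c(1, 2, 1, 2, 1),
  Quercus = c(5, 3, 4, 6, 2)
)
n.sequences <- length(unique(long$site))
```

Here `case.mismatch` flags `pinus`, which would need renaming before the two dataframes could be compared on that column.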
prepareSequences(
  sequence.A = NULL,
  sequence.A.name = "A",
  sequence.B = NULL,
  sequence.B.name = "B",
  merge.mode = "complete",
  sequences = NULL,
  grouping.column = NULL,
  time.column = NULL,
  exclude.columns = NULL,
  if.empty.cases = "zero",
  transformation = "none",
  paired.samples = FALSE,
  same.time = FALSE
)
| Argument | Description |
|---|---|
| sequence.A | dataframe containing a multivariate time-series. |
| sequence.A.name | character string with the name of |
| sequence.B | dataframe containing a multivariate time-series. Must have overlapping columns with |
| sequence.B.name | character string with the name of |
| merge.mode | character string, one of: "overlap", "complete" (default option). If "overlap", |
| sequences | dataframe with multiple sequences identified by a grouping column. |
| grouping.column | character string, name of the column in |
| time.column | character string, name of the column with time/depth/rank data. If |
| exclude.columns | character string or character vector with column names in |
| if.empty.cases | character string with two possible values: "omit", or "zero". If "zero" (default), |
| transformation | character string. Defines what data transformation is to be applied to the sequences. One of: "none" (default), "percentage", "proportion", "hellinger", and "scale" (the latter centers and scales the data using the |
| paired.samples | boolean. If |
| same.time | boolean. If |
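
To make the `transformation` options more concrete, here is a base R sketch of what transformations with these names conventionally compute. It assumes the usual definitions: row-wise proportions, percentages as proportions times 100, Hellinger as the square root of row-wise proportions, and column-wise centering and scaling with base R's `scale()`. The package's own implementation may differ in detail, and the matrix `x` is invented for illustration.

```r
# Hypothetical abundance matrix: rows are samples, columns are variables
x <- matrix(
  c(1, 3,
    4, 4,
    0, 5),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, c("taxon.1", "taxon.2"))
)

# "proportion": each row divided by its row total, so every row sums to 1
proportion <- x / rowSums(x)

# "percentage": the same proportions expressed on a 0-100 scale
percentage <- proportion * 100

# "hellinger": square root of the row-wise proportions
hellinger <- sqrt(proportion)

# "scale": center each column to mean 0 and scale it to standard deviation 1
scaled <- scale(x)
```

Under these assumptions, the proportion, percentage, and Hellinger variants operate row by row (per sample), while scaling operates column by column (per variable).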
A dataframe with the multivariate time-series. If sequence.A and sequence.B are provided, the column identifying the sequences is named "id". If sequences is provided, the time-series are identified by grouping.column.
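
The shape of the returned dataframe in the two-dataframe scenario can be pictured with a small base R sketch. This is not the package's code: it assumes that a "complete" merge keeps the union of columns and that empty cases are filled with zeros, as the option names suggest. The helper `stack_sequences()` and the toy data are hypothetical.

```r
# Hypothetical helper: stack two dataframes and label rows with an "id" column
stack_sequences <- function(a, b, a.name = "A", b.name = "B") {
  # Union of column names across both inputs (assumed "complete" behaviour)
  all.cols <- union(colnames(a), colnames(b))

  # Add any column a dataframe lacks, filled with zeros (assumed "zero" behaviour)
  fill <- function(df) {
    missing.cols <- setdiff(all.cols, colnames(df))
    for (col in missing.cols) df[[col]] <- 0
    df[, all.cols, drop = FALSE]
  }

  out <- rbind(
    cbind(id = a.name, fill(a)),
    cbind(id = b.name, fill(b))
  )
  rownames(out) <- NULL
  out
}

seq.a <- data.frame(time = 1:2, Pinus = c(10, 12))
seq.b <- data.frame(time = 1:3, Pinus = c(9, 11, 7), Betula = c(1, 2, 0))

stacked <- stack_sequences(seq.a, seq.b)
```

In this sketch, `stacked` has one row per sample from either input, an `id` column telling the two series apart, and a `Betula` column that is zero for every row coming from `seq.a`.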
# two sequences as inputs
data(sequenceA)
data(sequenceB)

AB.sequences <- prepareSequences(
  sequence.A = sequenceA,
  sequence.A.name = "A",
  sequence.B = sequenceB,
  sequence.B.name = "B",
  merge.mode = "complete",
  if.empty.cases = "zero",
  transformation = "hellinger"
)

# several sequences in a single dataframe
data(sequencesMIS)

MIS.sequences <- prepareSequences(
  sequences = sequencesMIS,
  grouping.column = "MIS",
  if.empty.cases = "zero",
  transformation = "hellinger"
)