Split data into training and test sets by group

createGroupedDataPartition(group, p)

Arguments

group

vector of group labels, one per row of the dataset

p

maximum fraction of the data (between 0 and 1) that goes to training; the actual fraction may be lower, depending on group sizes

Value

integer vector of row positions that make up the training data

Examples

group <- c("A", "B", "A", "B", "C", "C", "A", "A", "D")
set.seed(0)
train_ind <- createGroupedDataPartition(group, 0.8)
#> Fraction of data in the training set: 0.777777777777778.
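
The following is a minimal sketch of how a grouped split of this kind could work. It is not the package's implementation. It assumes that every group is assigned as a whole to either training or test, which is inferred from the note on p above. The function name grouped_partition_sketch and the data frame df are hypothetical and exist only for illustration.

```r
# Hypothetical sketch: NOT the implementation of createGroupedDataPartition.
# Assumption: each group goes entirely to training or entirely to test.
grouped_partition_sketch <- function(group, p) {
  stopifnot(length(group) > 0, p >= 0, p <= 1)
  # Number of rows in each group
  sizes <- table(group)
  # Visit the groups in random order
  shuffled <- sample(names(sizes))
  # Running share of rows covered if groups are taken in that order
  cum_frac <- cumsum(as.vector(sizes[shuffled])) / length(group)
  # Keep whole groups for as long as the cap p is not exceeded
  train_groups <- shuffled[cum_frac <= p]
  which(group %in% train_groups)
}

group <- c("A", "B", "A", "B", "C", "C", "A", "A", "D")
df <- data.frame(id = seq_along(group), group = group)  # hypothetical dataset

set.seed(0)
train_ind <- grouped_partition_sketch(group, 0.8)

# Use the returned row positions to split the dataset.
# setdiff() is used instead of negative indexing so that an empty
# train_ind would still yield the full dataset as the test set.
train <- df[train_ind, ]
test  <- df[setdiff(seq_len(nrow(df)), train_ind), ]
```

Because groups are never divided, the training share can only take values that are sums of whole group sizes. With group sizes 4, 2, 2 and 1 out of 9 rows, the largest reachable share not exceeding 0.8 is 7/9, which matches the fraction reported in the example above.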