R/split_by_group.R
createGroupedDataPartition.Rd
Split data into training and test sets, partitioning by group
createGroupedDataPartition(group, p)
Argument | Description |
---|---|
group | vector of group labels, one per row of the dataset |
p | maximum proportion of the data assigned to training, e.g. 0.8 (the actual share may be lower, depending on group sizes) |
Integer row positions corresponding to the training data
group <- c("A", "B", "A", "B", "C", "C", "A", "A", "D")
set.seed(0)
train_ind <- createGroupedDataPartition(group, 0.8)
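To illustrate how a grouped split of this kind could work, here is a minimal sketch. It is not the package's implementation: the function name `grouped_partition_sketch` is hypothetical, and the strategy is an assumption (shuffle the groups, then add whole groups to the training set until the next one would exceed the `p` cap). It shows why the training share can end up below `p`, and how the test rows can be derived from the returned positions.

```r
# Hypothetical sketch of a grouped train/test split (not the package's code).
# Assumption: whole groups are added in random order until the next group
# would push the training set above p * (number of rows).
grouped_partition_sketch <- function(group, p) {
  stopifnot(length(group) > 0, p >= 0, p <= 1)
  sizes <- table(group)                        # number of rows per group
  shuffled <- sample(names(sizes))             # random order of group labels
  limit <- floor(p * length(group))            # maximum number of training rows
  cum_rows <- cumsum(as.integer(sizes[shuffled]))
  train_groups <- shuffled[cum_rows <= limit]  # leading groups that fit under the cap
  which(as.character(group) %in% train_groups) # row positions of the training data
}

group <- c("A", "B", "A", "B", "C", "C", "A", "A", "D")
set.seed(0)
train_ind <- grouped_partition_sketch(group, 0.8)

# Every row not selected for training forms the test set.
test_ind <- setdiff(seq_along(group), train_ind)
```

Because groups are assigned as whole units in this sketch, no group label appears in both `group[train_ind]` and `group[test_ind]`.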