Split data into training and test sets, splitting by groups

create_grouped_data_partition(groups, p)

Arguments

groups

vector of group labels, one per row. Its length must match the number of rows in the dataset.

p

maximum fraction of the data to assign to the training set (may be less, depending on group sizes)

Value

vector of row indices for the training set

Author

Zena Lapp, zenalapp@umich.edu

Examples

groups <- c("A", "B", "A", "B", "C", "C", "A", "A", "D")
set.seed(0)
train_ind <- create_grouped_data_partition(groups, 0.8)
#> Fraction of data in the training set: 0.777777777777778.
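
To illustrate the idea of a grouped split, here is a minimal base-R sketch. It is not the package's implementation: the helper `sketch_grouped_partition` is a hypothetical name, and the strategy (shuffle the groups, then add whole groups while the training fraction stays at or below `p`) is an assumption chosen to match the documented behaviour that `p` is an upper bound and that rows from one group are not divided between the two sets.

```r
# Hypothetical helper, not part of the package: assigns whole groups to the
# training set so that the fraction of training rows never exceeds p.
sketch_grouped_partition <- function(groups, p) {
  max_rows <- floor(p * length(groups))
  unique_groups <- unique(groups)
  # Shuffle the groups; sample.int() avoids the sample() quirk for a
  # single numeric value.
  shuffled <- unique_groups[sample.int(length(unique_groups))]
  train_groups <- unique_groups[0]
  n_train <- 0
  for (g in shuffled) {
    size <- sum(groups == g)
    # Add the whole group only if it still fits under the row budget.
    if (n_train + size <= max_rows) {
      train_groups <- c(train_groups, g)
      n_train <- n_train + size
    }
  }
  which(groups %in% train_groups)
}

groups <- c("A", "B", "A", "B", "C", "C", "A", "A", "D")
set.seed(0)
train_ind <- sketch_grouped_partition(groups, 0.8)
test_ind <- setdiff(seq_along(groups), train_ind)

# Fraction of rows that ended up in the training set (at most 0.8).
train_frac <- length(train_ind) / length(groups)

# Groups present in both sets; this should be empty.
shared_groups <- intersect(groups[train_ind], groups[test_ind])
```

Because only whole groups are moved, the achieved fraction can fall below `p`, which is why `p` is described as a maximum. The returned indices can then be used to subset a dataset, for example `dataset[train_ind, ]` for training and `dataset[-train_ind, ]` for testing, where `dataset` stands for a data frame with one row per element of `groups`.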