the function for creating a document term matrix
(tibble) the data frame containing the text data
(string) the name of the column containing the unique id
(string) the name of the column containing the text data
(numeric vector) the minimum and maximum n-gram length, e.g. c(1, 3)
(stopwords) the stopwords to remove, e.g. stopwords::stopwords("en", source = "snowball")
(string) the word to remove
(integer) the rate of occurrence of a word to be removed
(string) the mode of removal, either "most" or "least"
(integer) the rate of most frequent words to be removed
(integer) the rate of least frequent words to be removed
(float) the proportion of the data to be used for training
(integer) the random seed for reproducibility
(string) the directory in which to save the results; default is "./results". If NULL, no results are saved
the document term matrix
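
To make the n-gram range and stopword arguments concrete, here is a minimal base-R sketch of how a document-term matrix could be assembled from an id column and a text column. The function name build_dtm_sketch, its argument names, and the tokenisation rule (lowercase, split on non-letters) are all assumptions made for illustration; this is not the documented function, and it leaves out the frequency-based removal, the train split, the seed, and saving results.

```r
# Hypothetical helper: names, defaults, and tokenisation are assumptions.
build_dtm_sketch <- function(data, id_col, text_col,
                             ngram_range = c(1, 2),
                             stop_words = character(0)) {
  # Lowercase each document, split on non-letters, drop empty tokens and stopwords.
  tokens <- lapply(tolower(data[[text_col]]), function(txt) {
    words <- strsplit(txt, "[^a-z]+")[[1]]
    words[nzchar(words) & !(words %in% stop_words)]
  })

  # Build every n-gram whose length lies between ngram_range[1] and ngram_range[2].
  ngrams <- lapply(tokens, function(words) {
    unlist(lapply(seq(ngram_range[1], ngram_range[2]), function(n) {
      if (length(words) < n) return(character(0))
      starts <- seq_len(length(words) - n + 1)
      vapply(starts,
             function(i) paste(words[i:(i + n - 1)], collapse = " "),
             character(1))
    }))
  })

  # One row per document id, one column per distinct n-gram, cells hold counts.
  vocab <- sort(unique(unlist(ngrams)))
  dtm <- matrix(0L, nrow = nrow(data), ncol = length(vocab),
                dimnames = list(as.character(data[[id_col]]), vocab))
  for (i in seq_along(ngrams)) {
    if (length(ngrams[[i]]) == 0) next
    counts <- table(ngrams[[i]])
    dtm[i, names(counts)] <- as.integer(counts)
  }
  dtm
}

# Example call on toy data (a plain data frame; a tibble is indexed the same way).
docs <- data.frame(id = c("a", "b"),
                   text = c("The cat sat", "the cat"),
                   stringsAsFactors = FALSE)
dtm <- build_dtm_sketch(docs, "id", "text",
                        ngram_range = c(1, 2), stop_words = "the")
```

A dense integer matrix keeps the sketch dependency-free; for a real corpus a sparse representation would likely be preferable.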