"Compresses" a dfm whose dimension names are the same, for either documents
or features. This may happen, for instance, if features are made equivalent
through application of a thesaurus. It may also occur after lower-casing or
stemming the features of a dfm, but this should be done only in very rare
cases (approaching never: it is better to do this before constructing
the dfm). It could also be needed after a cbind.dfm or rbind.dfm operation.
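As a rough illustration of what compression means here, the sketch below uses a plain base-R matrix rather than a real dfm, so it is an assumption about the behaviour and not quanteda's implementation. Lower-casing the feature names after the matrix is built leaves two columns with the same name, and those columns are then summed into one.

```r
# Toy counts in a plain base-R matrix; this is not a quanteda dfm.
counts <- matrix(
  c(1, 2, 0,
    1, 1, 2),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("doc1", "doc2"), c("A", "a", "b"))
)

# Lower-casing the feature names after the fact leaves two columns named "a".
colnames(counts) <- tolower(colnames(counts))

# rowsum() adds up rows by group, so transpose, group by feature name,
# and transpose back to get one column per distinct feature.
compressed <- t(rowsum(t(counts), group = colnames(counts), reorder = FALSE))
compressed
```

The total count is unchanged by this step; only the number of columns shrinks.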
compress(x, ...)

# S3 method for dfm
compress(x, margin = c("both", "documents", "features"), ...)
x | input object, a dfm |
---|---|
... | additional arguments passed from generic to specific methods |
margin | character indicating which margin to compress on, either "documents", "features", or "both" (the default) |
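The effect of the margin argument can be sketched with a small base-R stand-in. The helper `compress_matrix` below is hypothetical and works on an ordinary matrix, not a dfm; it only assumes that the argument is resolved with the usual `match.arg()` idiom, so that the first choice, "both", applies when margin is not supplied.

```r
# Hypothetical helper, for illustration only; not part of quanteda.
compress_matrix <- function(m, margin = c("both", "documents", "features")) {
  margin <- match.arg(margin)  # resolves to "both" when margin is not supplied
  if (margin %in% c("both", "documents")) {
    # Sum rows that share a document name.
    m <- rowsum(m, group = rownames(m), reorder = FALSE)
  }
  if (margin %in% c("both", "features")) {
    # Sum columns that share a feature name.
    m <- t(rowsum(t(m), group = colnames(m), reorder = FALSE))
  }
  m
}

# A 3 x 3 matrix of ones with one duplicated document and one duplicated feature.
m <- matrix(1, nrow = 3, ncol = 3,
            dimnames = list(c("d1", "d1", "d2"), c("a", "b", "a")))

dim(compress_matrix(m, margin = "documents"))  # rows combined, columns untouched
dim(compress_matrix(m, margin = "features"))   # columns combined, rows untouched
dim(compress_matrix(m))                        # both margins combined
```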
This function is deprecated: use dfm_compress instead.
not_run({
mat <- rbind(dfm(c("b A A", "C C a b B"), tolower = FALSE, verbose = FALSE),
             dfm("A C C C C C", tolower = FALSE, verbose = FALSE))
colnames(mat) <- char_tolower(featnames(mat))
mat
compress(mat, margin = "documents")
compress(mat, margin = "features")
compress(mat)

# no effect if no compression needed
compress(dfm(data_corpus_inaugural, verbose = FALSE))
})