R/sparse_tidiers.R
tdm_tidiers.Rd
Tidy a DocumentTermMatrix or TermDocumentMatrix into
a three-column data frame: document, term, and value (with
zeros omitted), with one row per term per document.
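To make the target shape concrete, here is a minimal base-R sketch of turning a small dense count matrix into the same long layout, keeping only the non-zero cells. It is an illustration only, not the package's implementation; the toy matrix `counts` and the column names `document`, `term`, and `count` are assumptions chosen to mirror the example output on this page.

```r
# Hypothetical toy counts: 2 documents x 3 terms (filled column by column)
counts <- matrix(
  c(2, 0, 0, 4, 1, 0),
  nrow = 2,
  dimnames = list(c("doc1", "doc2"), c("apple", "banana", "cherry"))
)

# Row/column positions of the non-zero cells
nonzero <- which(counts != 0, arr.ind = TRUE)

# One row per (document, term) pair with a non-zero count
long <- data.frame(
  document = rownames(counts)[nonzero[, "row"]],
  term = colnames(counts)[nonzero[, "col"]],
  count = counts[nonzero],
  stringsAsFactors = FALSE
)

long
```

The zero cells never appear in `long`, which is what "with zeros omitted" means for the tidied result.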
# S3 method for DocumentTermMatrix
tidy(x, ...)

# S3 method for TermDocumentMatrix
tidy(x, ...)

# S3 method for dfm
tidy(x, ...)

# S3 method for dfmSparse
tidy(x, ...)

# S3 method for simple_triplet_matrix
tidy(x, row_names = NULL, col_names = NULL, ...)
Argument | Description |
---|---|
x | A DocumentTermMatrix or TermDocumentMatrix object |
... | Extra arguments, not used |
row_names | Specify row names |
col_names | Specify column names |
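The `row_names` and `col_names` arguments only apply to the `simple_triplet_matrix` method. The sketch below shows one way they might be used, assuming the slam and tidytext packages are installed and that `library(tidytext)` makes the `tidy()` generic available. The matrix `m` and the label vectors are hypothetical, and no output is shown because the result's column names may differ between package versions.

```r
if (requireNamespace("slam", quietly = TRUE) &&
    requireNamespace("tidytext", quietly = TRUE)) {
  library(tidytext)

  # Hypothetical 2 x 3 sparse matrix with three non-zero cells
  m <- slam::simple_triplet_matrix(
    i = c(1, 1, 2),
    j = c(1, 3, 2),
    v = c(2, 1, 4),
    nrow = 2,
    ncol = 3
  )

  # Supply labels for rows and columns, since m carries no dimnames
  tidied <- tidy(
    m,
    row_names = c("doc_a", "doc_b"),
    col_names = c("apple", "banana", "cherry")
  )

  tidied
}
```

Passing the labels explicitly is useful when the sparse matrix was built without dimnames; otherwise the tidied rows would be identified only by integer positions.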
if (requireNamespace("topicmodels", quietly = TRUE)) {
  data("AssociatedPress", package = "topicmodels")
  AssociatedPress

  tidy(AssociatedPress)
}
#> # A tibble: 302,031 x 3
#>    document term      count
#>       <int> <chr>     <dbl>
#>  1        1 adding        1
#>  2        1 adult         2
#>  3        1 ago           1
#>  4        1 alcohol       1
#>  5        1 allegedly     1
#>  6        1 allen         1
#>  7        1 apparently    2
#>  8        1 appeared      1
#>  9        1 arrested      1
#> 10        1 assault       1
#> # … with 302,021 more rows