To make the usage as consistent as possible with other packages, quanteda
also provides shortcut wrappers to convert, designed to be
similar in syntax to analogous commands in the packages to whose format they
are converting.
as.wfm(x)
as.DocumentTermMatrix(x, ...)
dfm2ldaformat(x)
quantedaformat2dtm(x)
Argument | Description |
---|---|
x | the dfm to be converted |
... | additional arguments used only by as.DocumentTermMatrix |
A converted object determined by the value of to
(see above).
See conversion target package documentation for more detailed descriptions
of the return formats.
as.wfm converts a quanteda dfm into the wfm format used by the austin
package.
as.DocumentTermMatrix will convert a quanteda dfm into
the tm package's DocumentTermMatrix format. Note: The
tm package version of as.TermDocumentMatrix allows a weighting
argument, which supplies a weighting function for
TermDocumentMatrix. Here the default is for term frequency
weighting. If you want a different weighting, apply the weights after
converting using one of the tm functions. For other available
weighting functions from the tm package, see TermDocumentMatrix.
dfm2ldaformat converts a dfm into the list representation
of terms in documents used by the lda package (a list with components
"documents" and "vocab" as needed by lda.collapsed.gibbs.sampler).
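To make the shape of that list concrete, here is a minimal base R sketch. It is not how dfm2ldaformat is implemented. The toy count matrix and the helper to_lda_list are hypothetical names introduced only for illustration. It assumes, following my reading of the lda package documentation, that each document is a two-row integer matrix whose first row holds 0-based indices into "vocab" and whose second row holds the matching counts.

```r
# Hypothetical toy counts: 2 documents x 3 terms (not produced by quanteda).
counts <- matrix(c(2L, 0L, 1L,
                   0L, 3L, 1L),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("doc1", "doc2"),
                                 c("apple", "pear", "plum")))

# Hypothetical helper: build an lda-style list from a dense count matrix.
to_lda_list <- function(m) {
  documents <- lapply(seq_len(nrow(m)), function(i) {
    present <- which(m[i, ] > 0)
    # Row 1: 0-based position of each term in vocab; row 2: its count.
    rbind(as.integer(present - 1L), as.integer(m[i, present]))
  })
  names(documents) <- rownames(m)
  list(documents = documents, vocab = colnames(m))
}

lda_input <- to_lda_list(counts)
str(lda_input)
```

A list of this shape has the two components named in the description above. Whether the real wrapper also names the documents is not something this sketch establishes.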
quantedaformat2dtm converts a dfm into the
sparse simple triplet matrix representation of terms in documents used by the
topicmodels package.
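As a rough illustration of what a simple triplet representation stores, the base R sketch below records only the non-zero cells of a small count matrix as row index, column index and value. The toy matrix and the plain list named triplet are hypothetical. The component names i, j, v, nrow, ncol and dimnames follow my understanding of the slam package's simple_triplet_matrix, but this is not the object quantedaformat2dtm returns.

```r
# Hypothetical toy counts: 2 documents x 3 terms (not produced by quanteda).
counts <- matrix(c(2L, 0L, 1L,
                   0L, 3L, 1L),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("doc1", "doc2"),
                                 c("apple", "pear", "plum")))

# Locate the non-zero cells (returned in column-major order).
nz <- which(counts != 0L, arr.ind = TRUE)

# Keep only (row, column, value) triplets plus the matrix dimensions.
triplet <- list(
  i = unname(nz[, "row"]),
  j = unname(nz[, "col"]),
  v = counts[nz],
  nrow = nrow(counts),
  ncol = ncol(counts),
  dimnames = dimnames(counts)
)

# Rebuild the dense matrix to confirm no information was lost.
dense <- matrix(0L, nrow = triplet$nrow, ncol = triplet$ncol,
                dimnames = triplet$dimnames)
dense[cbind(triplet$i, triplet$j)] <- triplet$v
stopifnot(identical(dense, counts))
```

For a document-term matrix, where most cells are zero, storing triplets in this way is what keeps the representation sparse.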
Additional coercion methods to base R objects are also available:
mycorpus <- corpus_subset(data_corpus_inaugural, Year > 1970)
quantdfm <- dfm(mycorpus, verbose = FALSE)

# shortcut conversion to austin package's wfm format
identical(as.wfm(quantdfm), convert(quantdfm, to = "austin"))
#> [1] TRUE

# The remaining examples are not run: they need the tm, lda and
# topicmodels packages to be installed.
if (FALSE) {
  # shortcut conversion to tm package's DocumentTermMatrix format
  identical(as.DocumentTermMatrix(quantdfm), convert(quantdfm, to = "tm"))
}

if (FALSE) {
  # shortcut conversion to lda package list format
  identical(dfm2ldaformat(quantdfm), convert(quantdfm, to = "lda"))
}

if (FALSE) {
  # shortcut conversion to topicmodels package format
  identical(quantedaformat2dtm(quantdfm),
            convert(quantdfm, to = "topicmodels"))
}