quanteda/quanteda: CRAN v1.1.0

Published March 6, 2018 | Version v1.1.0

New features

Added as.dfm() methods for tm DocumentTermMatrix and TermDocumentMatrix objects. (#1222)
predict.textmodel_wordscores() now includes an include_reftexts argument for excluding the training (reference) texts from the predicted model object (#1229). The default is include_reftexts = TRUE, which reproduces the behaviour that existed before this argument was introduced. Rescaling still works because it requires prediction on the reference documents, but setting include_reftexts = FALSE provides an easy way to drop those documents from the predicted quantities.
textplot_wordcloud() now uses code entirely internal to quanteda, instead of using the wordcloud package.
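
As an illustration of the include_reftexts argument described above, here is a minimal sketch. It assumes the data_dfm_lbgexample dfm bundled with quanteda (five reference texts plus one unscored text); the reference scores and the object names ref_scores, ws, pred_all and pred_new are chosen for the example only.

```r
library(quanteda)

# Reference scores for the five reference texts; NA marks the unscored text.
# Assumption: the bundled data_dfm_lbgexample dfm has six documents in this order.
ref_scores <- c(seq(-1.5, 1.5, 0.75), NA)
ws <- textmodel_wordscores(data_dfm_lbgexample, y = ref_scores)

# Default (include_reftexts = TRUE): predictions cover the reference
# texts as well as the unscored text.
pred_all <- predict(ws)

# Drop the reference texts from the returned predictions.
pred_new <- predict(ws, include_reftexts = FALSE)
```

Comparing pred_all with pred_new should show that only the documents without a reference score remain in the second result.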

Bug fixes and stability enhancements

Eliminated unnecessary dependency on the digest package.
Updated the vignette title to be less generic.
Improved the robustness of dfm_trim() and dfm_weight() for previously weighted dfm objects and when supplied thresholds are proportions instead of counts. (#1237)
Fixed a problem in summary.corpus(x, n = 101) when ndoc(x) > 100 (#1242).
Fixed a problem in predict.textmodel_wordscores(x, rescaling = "mv") that always reset the reference values for rescaling to the first and second documents (#1251).
Resolved issues in the color generation and labels for textplot_keyness() (#1233).

Performance improvements

textmodel methods are now exported, to facilitate extension packages for other textmodel methods (e.g. wordshoal).

Behaviour changes

Changed the default in textmodel_wordfish() to sparse = FALSE, in response to #1216.
dfm_group() now preserves docvars that are constant for the group aggregation (#1228).

Files

Name	Size
quanteda/quanteda-v1.1.0.zip (md5:0879745f05b90396ed4abf220bab5cff)	24.1 MB