Software Open Access
Kenneth Benoit; Kohei Watanabe; Haiyan Wang; Paul Nulty; Adam Obeng; Stefan Müller; Jiong Wei Lua; Aki Matsuo; Christian Mueller; José Tomás Atria; Will Lowe; Pablo Barberá; Christopher Gandrud; mark padgham; Tyler Rinker; Johannes Gruber; Katrin Leinweber; Kevin Reuning; Michael Chirico; Michael W. Kearney; Stas Malavin; Thomas J. Leeper; hotzeplotz; Chung-hong Chan; etienne-s; hofaichan; lindbrook; mmzmm; nicmer; Tom Paskhalis
- `dfm()` now returns a dfm with identical column order even if `tokens_compound()` or `tokens_ngrams()` is used upstream (#2100).
- `dfm_group()` with `NA` values in a grouping variable now drops them, similar to the behaviour of `tokens_group()` and `corpus_group()` (#2134).
- `char_wordstem()` now has a new argument `check_whitespace`, which will not throw an error when lower-casing text containing a whitespace character.
- `dfm_remove()` now has a new argument `padding = FALSE` that, when `TRUE`, collects counts of the removed features in the first column. This makes the result consistent with a dfm built from tokens from which some have been removed with `padding = TRUE` (#2152).

Name | Size | |
---|---|---|
quanteda/quanteda-v3.2.0.zip (md5:8dc5bd23b0203808e2ca2251318960db) | 37.6 MB | Download |
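
The `padding` argument of `dfm_remove()` described in the release notes can be illustrated with a minimal sketch. It assumes quanteda 3.2.0 or later is installed; the two toy documents and the variable names are made up for illustration and are not part of the release.

```r
library(quanteda)

# Toy documents, invented for this illustration.
toks <- tokens(c(d1 = "a b c a", d2 = "b c d"))
dfmat <- dfm(toks)

# Default (padding = FALSE): the removed feature simply disappears.
dropped <- dfm_remove(dfmat, pattern = "a")

# padding = TRUE: counts of the removed features are collected
# in the first column of the result.
padded <- dfm_remove(dfmat, pattern = "a", padding = TRUE)

# For comparison: remove from the tokens with padding, then build the dfm.
from_tokens <- dfm(tokens_remove(toks, pattern = "a", padding = TRUE))

print(dropped)
print(padded)
print(from_tokens)
```

Printing `padded` next to `from_tokens` is one way to check that the two routes give matching results, as the release note states.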
Metric | All versions | This version |
---|---|---|
Views | 3,050 | 86 |
Downloads | 310 | 2 |
Data volume | 9.1 GB | 75.2 MB |
Unique views | 2,883 | 81 |
Unique downloads | 195 | 2 |