Software Open Access
Kenneth Benoit; Kohei Watanabe; Haiyan Wang; Paul Nulty; Adam Obeng; Stefan Müller; Jiong Wei Lua; Aki Matsuo; Christian Mueller; Will Lowe; Pablo Barberá; Tyler Rinker; mark padgham; Christopher Gandrud; Tom Paskhalis; nicmer; lindbrook; hofaichan; etienne-s; hotzeplotz; Thomas J. Leeper; Stas Malavin; Michael W. Kearney; Michael Chirico; Katrin Leinweber
Bug fixes and stability enhancements
Fixed a bug in dfm_group() that changed or deleted docvars attributes of dfm objects (#1506).
Fixed a bug in textplot_xray() that caused incorrect facet labels when a pattern contained multiple list elements or values (#1514).
kwic() now correctly returns the pattern associated with each match as the "keywords" attribute, for all
textstat_lexdiv() now works on tokens objects, not just dfm objects. New methods of lexical diversity now include MATTR (the Moving-Average Type-Token Ratio, Covington & McFall 2010) and MSTTR (Mean Segmental Type-Token Ratio).
tokens_split() allows splitting a single token into multiple tokens based on a pattern match. (#1500)
tokens_chunk() allows splitting tokens into new documents of equally-sized "chunks". (#1520)
textstat_entropy() now computes entropy for a dfm across feature or document margins.
The documentation for textstat_readability() is vastly improved, now detailing all formulas and providing full references.
dfm_match() allows a user to specify the features in a dfm according to a fixed vector of feature names, including those of another dfm. Replaces
pattern was a dfm.
textplot_network() now allows more precise control of label sizes, either globally or individually.
tokens.tokens(x, remove_hyphens = TRUE) where x was generated with remove_hyphens = FALSE now behaves similarly to how the same tokens would be handled had this option been called on character input as tokens.character(x, remove_hyphens = TRUE). (#1498)
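For readers unfamiliar with the two lexical diversity measures named in the textstat_lexdiv() entry above, the following is a minimal, language-neutral sketch in plain Python. It is not quanteda's implementation and calls no quanteda functions. The function names (ttr, mattr, msttr), the parameter names (window, segment), and their defaults are hypothetical. It also makes two assumptions: MATTR falls back to the plain type-token ratio when the text is shorter than the window, and MSTTR drops a trailing partial segment.

```python
def ttr(tokens):
    """Type-token ratio: distinct tokens divided by total tokens."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    return len(set(tokens)) / len(tokens)


def mattr(tokens, window=100):
    """Moving-Average TTR: mean TTR over every window of `window`
    consecutive tokens, sliding one token at a time.

    Assumption: if the text is no longer than the window, return the
    plain TTR of the whole text.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    if len(tokens) <= window:
        return ttr(tokens)
    ratios = [
        ttr(tokens[i:i + window])
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


def msttr(tokens, segment=100):
    """Mean Segmental TTR: mean TTR over successive non-overlapping
    segments of `segment` tokens.

    Assumption: a trailing partial segment is dropped.
    """
    if segment < 1:
        raise ValueError("segment must be at least 1")
    n_segments = len(tokens) // segment
    if n_segments == 0:
        raise ValueError("text is shorter than one segment")
    ratios = [
        ttr(tokens[k * segment:(k + 1) * segment])
        for k in range(n_segments)
    ]
    return sum(ratios) / n_segments


if __name__ == "__main__":
    toks = "a b a c a b d d".split()
    print("TTR:  ", ttr(toks))
    print("MATTR:", mattr(toks, window=4))
    print("MSTTR:", msttr(toks, segment=4))
```

Both measures average the type-token ratio over fixed-size spans, which makes them less sensitive to document length than the raw ratio. MATTR uses overlapping windows, while MSTTR uses disjoint segments.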