There is a newer version of this record available.

Software Open Access

quanteda/quanteda: CRAN v1.5.0

Kenneth Benoit; Kohei Watanabe; Haiyan Wang; Paul Nulty; Adam Obeng; Stefan Müller; Jiong Wei Lua; Aki Matsuo; Christian Mueller; Will Lowe; Pablo Barberá; Tyler Rinker; mark padgham; Christopher Gandrud; José Tomás Atria; Tom Paskhalis; nicmer; lindbrook; hofaichan; etienne-s; hotzeplotz; Thomas J. Leeper; Stas Malavin; Michael W. Kearney; Michael Chirico; Katrin Leinweber; Johannes Gruber


Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:creator>Kenneth Benoit</dc:creator>
  <dc:creator>Kohei Watanabe</dc:creator>
  <dc:creator>Haiyan Wang</dc:creator>
  <dc:creator>Paul Nulty</dc:creator>
  <dc:creator>Adam Obeng</dc:creator>
  <dc:creator>Stefan Müller</dc:creator>
  <dc:creator>Jiong Wei Lua</dc:creator>
  <dc:creator>Aki Matsuo</dc:creator>
  <dc:creator>Christian Mueller</dc:creator>
  <dc:creator>Will Lowe</dc:creator>
  <dc:creator>Pablo Barberá</dc:creator>
  <dc:creator>Tyler Rinker</dc:creator>
  <dc:creator>mark padgham</dc:creator>
  <dc:creator>Christopher Gandrud</dc:creator>
  <dc:creator>José Tomás Atria</dc:creator>
  <dc:creator>Tom Paskhalis</dc:creator>
  <dc:creator>nicmer</dc:creator>
  <dc:creator>lindbrook</dc:creator>
  <dc:creator>hofaichan</dc:creator>
  <dc:creator>etienne-s</dc:creator>
  <dc:creator>hotzeplotz</dc:creator>
  <dc:creator>Thomas J. Leeper</dc:creator>
  <dc:creator>Stas Malavin</dc:creator>
  <dc:creator>Michael W. Kearney</dc:creator>
  <dc:creator>Michael Chirico</dc:creator>
  <dc:creator>Katrin Leinweber</dc:creator>
  <dc:creator>Johannes Gruber</dc:creator>
  <dc:date>2019-07-04</dc:date>
  <dc:description>New features

Add flatten and levels arguments to as.list.dictionary2() to enable more flexible conversion of dictionary objects. (#1661)
In corpus_sample(), the size argument now works with the by argument, to control the size of units sampled from each group.
Improvements to textstat_dist() and textstat_simil(), see below.
Long tokens are no longer discarded automatically in the call to tokens(). (#1713)

Behaviour changes

textstat_dist() and textstat_simil() now return sparse symmetric matrix objects using classes from the Matrix package. This replaces the former structure based on the dist class. Computation is now also based on the fast implementation in the proxyC package. When computing similarities, the new min_simil argument allows a user to ignore values below a specified similarity threshold. A new coercion method, as.data.frame.textstat_simildist(), now exists for converting these returns into a data.frame of pairwise comparisons. Existing methods such as as.matrix(), as.dist(), and as.list() work as they did before.
We have removed the "faith", "chi-squared", and "kullback" methods from textstat_dist() and textstat_simil() because these were either not symmetric or not invariant to document or feature ordering. Finally, the selection argument has been deprecated in favour of a new y argument.
textstat_readability() now defaults to measure = "Flesch" if no measure is supplied. This makes it consistent with textstat_lexdiv(), which also takes a default measure ("TTR") if none is supplied. (#1715)
The default values for max_nchar and min_nchar in tokens_select() are now NULL, meaning they are not applied if the user does not supply values. Fixes #1713.

Bug fixes and stability enhancements

The behaviour of kwic.corpus() and kwic.tokens() is now aligned, meaning that dictionaries are correctly faceted by key instead of by value. (#1684)
Improved formatting of tokens() verbose output. (#1683)
Subsetting and printing of subsetted kwic objects is more robust. (#1665)
The "Bormuth" and "DRP" measures are now fixed for textstat_readability(). (#1701)
</dc:description>
  <dc:identifier>https://zenodo.org/record/3268686</dc:identifier>
  <dc:identifier>10.5281/zenodo.3268686</dc:identifier>
  <dc:identifier>oai:zenodo.org:3268686</dc:identifier>
  <dc:relation>url:https://github.com/quanteda/quanteda/tree/v1.5.0</dc:relation>
  <dc:relation>doi:10.5281/zenodo.596731</dc:relation>
  <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
  <dc:title>quanteda/quanteda: CRAN v1.5.0</dc:title>
  <dc:type>info:eu-repo/semantics/other</dc:type>
  <dc:type>software</dc:type>
</oai_dc:dc>
