There is a newer version of this record available.

Software Open Access

quanteda/quanteda: CRAN v1.5.0

Kenneth Benoit; Kohei Watanabe; Haiyan Wang; Paul Nulty; Adam Obeng; Stefan Müller; Jiong Wei Lua; Aki Matsuo; Christian Mueller; Will Lowe; Pablo Barberá; Tyler Rinker; mark padgham; Christopher Gandrud; José Tomás Atria; Tom Paskhalis; nicmer; lindbrook; hofaichan; etienne-s; hotzeplotz; Thomas J. Leeper; Stas Malavin; Michael W. Kearney; Michael Chirico; Katrin Leinweber; Johannes Gruber


JSON-LD (schema.org) Export

{
  "description": "New features\n<ul>\n<li>Add <code>flatten</code> and <code>levels</code> arguments to <code>as.list.dictionary2()</code> to enable more flexible conversion of dictionary objects. (#1661)</li>\n<li>In <code>corpus_sample()</code>, the <code>size</code> now works with the <code>by</code> argument, to control the size of units sampled from each group.</li>\n<li>Improvements to <code>textstat_dist()</code> and <code>textstat_simil()</code>, see below.</li>\n<li>Long tokens are not discarded automatically in the call to <code>tokens()</code>. (#1713)</li>\n</ul>\nBehaviour changes\n<ul>\n<li><code>textstat_dist()</code> and <code>textstat_simil()</code> now return sparse symmetric matrix objects using classes from the <strong>Matrix</strong> package.  This replaces the former structure based on the <code>dist</code> class.  Computation of these classes is now also based on the fast implementation in the <strong>proxyC</strong> package.  When computing similarities, the new <code>min_simil</code> argument allows a user to ignore certain values below a specified similarity threshold.  A new coercion method <code>as.data.frame.textstat_simildist()</code> now exists for converting these returns into a data.frame of pairwise comparisons.  Existing methods such as <code>as.matrix()</code>, <code>as.dist()</code>, and <code>as.list()</code> work as they did before.</li>\n<li>We have removed the \"faith\", \"chi-squared\", and \"kullback\" methods from <code>textstat_dist()</code> and <code>textstat_simil()</code> because these were either not symmetric or not invariant to document or feature ordering. Finally, the <code>selection</code> argument has been deprecated in favour of a new <code>y</code> argument.  </li>\n<li><code>textstat_readability()</code> now defaults to <code>measure = \"Flesch\"</code> if no measure is supplied.  This makes it consistent with <code>textstat_lexdiv()</code> that also takes a default measure (\"TTR\") if none is supplied.  
(#1715)</li>\n<li>The default values for <code>max_nchar</code> and <code>min_nchar</code> in <code>tokens_select()</code> are now NULL, meaning they are not applied if the user does not supply values.  Fixes #1713.</li>\n</ul>\nBug fixes and stability enhancements\n<ul>\n<li><code>kwic.corpus()</code> and <code>kwic.tokens()</code> behaviour now aligned, meaning that dictionaries are correctly faceted by key instead of by value. (#1684)</li>\n<li>Improved formatting of <code>tokens()</code> verbose output. (#1683)</li>\n<li>Subsetting and printing of subsetted kwic objects is more robust. (#1665)</li>\n<li>The \"Bormuth\" and \"DRP\" measures are now fixed for <code>textstat_readability()</code>. (#1701)</li>\n</ul>", 
  "license": "", 
  "creator": [
    {
      "affiliation": "London School of Economics and Political Science", 
      "@type": "Person", 
      "name": "Kenneth Benoit"
    }, 
    {
      "affiliation": "Waseda University", 
      "@type": "Person", 
      "name": "Kohei Watanabe"
    }, 
    {
      "affiliation": "Tracr", 
      "@type": "Person", 
      "name": "Haiyan Wang"
    }, 
    {
      "affiliation": "University College Dublin", 
      "@type": "Person", 
      "name": "Paul Nulty"
    }, 
    {
      "affiliation": "Columbia University, London School of Economics", 
      "@type": "Person", 
      "name": "Adam Obeng"
    }, 
    {
      "affiliation": "University of Zurich", 
      "@type": "Person", 
      "name": "Stefan M\u00fcller"
    }, 
    {
      "affiliation": "London School of Economics", 
      "@type": "Person", 
      "name": "Jiong Wei Lua"
    }, 
    {
      "affiliation": "Institute for Analytics and Data Science, University of Essex", 
      "@type": "Person", 
      "name": "Aki Matsuo"
    }, 
    {
      "affiliation": "London School of Economics and Political Science", 
      "@type": "Person", 
      "name": "Christian Mueller"
    }, 
    {
      "affiliation": "Princeton University", 
      "@type": "Person", 
      "name": "Will Lowe"
    }, 
    {
      "affiliation": "University of Southern California", 
      "@type": "Person", 
      "name": "Pablo Barber\u00e1"
    }, 
    {
      "affiliation": "Campus Labs", 
      "@type": "Person", 
      "name": "Tyler Rinker"
    }, 
    {
      "affiliation": "@ATFutures", 
      "@type": "Person", 
      "name": "mark padgham"
    }, 
    {
      "affiliation": "@zalando", 
      "@type": "Person", 
      "name": "Christopher Gandrud"
    }, 
    {
      "affiliation": "", 
      "@type": "Person", 
      "name": "Jos\u00e9 Tom\u00e1s Atria"
    }, 
    {
      "affiliation": "London School of Economics and Political Science", 
      "@type": "Person", 
      "name": "Tom Paskhalis"
    }, 
    {
      "affiliation": "", 
      "@type": "Person", 
      "name": "nicmer"
    }, 
    {
      "affiliation": "", 
      "@type": "Person", 
      "name": "lindbrook"
    }, 
    {
      "affiliation": "", 
      "@type": "Person", 
      "name": "hofaichan"
    }, 
    {
      "affiliation": "", 
      "@type": "Person", 
      "name": "etienne-s"
    }, 
    {
      "affiliation": "", 
      "@type": "Person", 
      "name": "hotzeplotz"
    }, 
    {
      "affiliation": "", 
      "@type": "Person", 
      "name": "Thomas J. Leeper"
    }, 
    {
      "affiliation": "Soil Cryology Lab", 
      "@type": "Person", 
      "name": "Stas Malavin"
    }, 
    {
      "affiliation": "@MUDSA", 
      "@type": "Person", 
      "name": "Michael W. Kearney"
    }, 
    {
      "affiliation": "@myteksi", 
      "@type": "Person", 
      "name": "Michael Chirico"
    }, 
    {
      "affiliation": "@TIBHannover", 
      "@type": "Person", 
      "name": "Katrin Leinweber"
    }, 
    {
      "affiliation": "University of Glasgow", 
      "@type": "Person", 
      "name": "Johannes Gruber"
    }
  ], 
  "url": "https://zenodo.org/record/3268686", 
  "codeRepository": "https://github.com/quanteda/quanteda/tree/v1.5.0", 
  "datePublished": "2019-07-04", 
  "version": "v1.5.0", 
  "@context": "https://schema.org/", 
  "identifier": "https://doi.org/10.5281/zenodo.3268686", 
  "@id": "https://doi.org/10.5281/zenodo.3268686", 
  "@type": "SoftwareSourceCode", 
  "name": "quanteda/quanteda: CRAN v1.5.0"
}
Usage statistics

                    All versions    This version
Views               671             15
Downloads           135             2
Data volume         3.7 GB          74.1 MB
Unique views        627             15
Unique downloads    52              2
