textstat_readability.Rd
Calculate the readability of text(s) using one of a variety of computed indexes.
textstat_readability(x, measure = c("all", "ARI", "ARI.simple", "Bormuth", "Bormuth.GP", "Coleman", "Coleman.C2", "Coleman.Liau", "Coleman.Liau.grade", "Coleman.Liau.short", "Dale.Chall", "Dale.Chall.old", "Dale.Chall.PSK", "Danielson.Bryan", "Danielson.Bryan.2", "Dickes.Steiwer", "DRP", "ELF", "Farr.Jenkins.Paterson", "Flesch", "Flesch.PSK", "Flesch.Kincaid", "FOG", "FOG.PSK", "FOG.NRI", "FORCAST", "FORCAST.RGL", "Fucks", "Linsear.Write", "LIW", "nWS", "nWS.2", "nWS.3", "nWS.4", "RIX", "Scrabble", "SMOG", "SMOG.C", "SMOG.simple", "SMOG.de", "Spache", "Spache.old", "Strain", "Traenkle.Bailer", "Traenkle.Bailer.2", "Wheeler.Smith", "meanSentenceLength", "meanWordSyllables"), remove_hyphens = TRUE, min_sentence_length = 1, max_sentence_length = 10000, intermediate = FALSE, ...)
Argument | Description |
---|---|
x | a character or corpus object containing the texts |
measure | character vector defining the readability measure to calculate. Matches are case-insensitive. |
remove_hyphens | if |
min_sentence_length, max_sentence_length | set the minimum and maximum sentence lengths (in tokens, excluding punctuation) to include in the computation of readability. This makes it easy to exclude "sentences" that may not really be sentences, such as section titles, table elements, and other cruft that might be in the texts following conversion. For finer-grained control, consider filtering sentences first, including through pattern-matching, using |
intermediate | if |
... | not used |
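
To illustrate what the min_sentence_length and max_sentence_length arguments are meant to achieve, here is a minimal base R sketch of length-based sentence filtering. The helper `filter_sentences()` is hypothetical and not part of the package; its regular-expression sentence splitter and whitespace tokenizer are simplifying assumptions and will not match the package's own segmentation exactly.

```r
# Hypothetical helper (not part of the package): keep only sentences whose
# token count, punctuation excluded, lies within [min_len, max_len].
filter_sentences <- function(text, min_len = 1, max_len = 10000) {
  # Naive sentence split: break on whitespace that follows ., ! or ?
  sentences <- unlist(strsplit(text, "(?<=[.!?])\\s+", perl = TRUE))
  # Count tokens: strip punctuation, split on whitespace, drop empty strings
  n_tokens <- vapply(sentences, function(s) {
    words <- strsplit(gsub("[[:punct:]]", "", s), "\\s+")[[1]]
    sum(nzchar(words))
  }, integer(1), USE.NAMES = FALSE)
  sentences[n_tokens >= min_len & n_tokens <= max_len]
}

txt <- "Table 1. This is a proper sentence with enough words. Ok."
# With a minimum of three tokens, the heading-like fragments are dropped
kept <- filter_sentences(txt, min_len = 3)
print(kept)
```

A higher minimum removes short fragments such as table labels, while a lower maximum would remove run-on text produced by a failed sentence split.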
textstat_readability returns a data.frame of documents and their readability scores.
txt <- c(doc1 = "Readability zero one. Ten, Eleven.", doc2 = "The cat in a dilapidated tophat.")
textstat_readability(txt, "Flesch")
#>   document  Flesch
#> 1     doc1  1.2575
#> 2     doc2 45.6450
textstat_readability(txt, c("FOG", "FOG.PSK", "FOG.NRI"))
#>   document       FOG  FOG.PSK FOG.NRI
#> 1     doc1 17.000000 4.608659 -1.3875
#> 2     doc2  9.066667 3.254382 -1.2600
textstat_readability(data_corpus_inaugural[48:58], measure = c("Flesch.Kincaid", "Dale.Chall.old"))
#>        document Flesch.Kincaid Dale.Chall.old
#> 1   1977-Carter      11.670742       8.218925
#> 2   1981-Reagan       9.798608       7.588069
#> 3   1985-Reagan      10.420294       7.430830
#> 4     1989-Bush       7.147029       6.584037
#> 5  1993-Clinton      10.374204       7.340028
#> 6  1997-Clinton       9.828863       7.388557
#> 7     2001-Bush       8.918201       7.216451
#> 8     2005-Bush      11.036277       7.622865
#> 9    2009-Obama      10.229426       7.456305
#> 10   2013-Obama      11.734767       7.845061
#> 11   2017-Trump       9.163084       6.777431
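
As a sanity check on the Flesch and FOG scores for the two short texts, the arithmetic can be reproduced in base R. This sketch assumes the standard published formulas (Flesch Reading Ease = 206.835 - 1.015 * ASL - 84.6 * ASW; Gunning FOG = 0.4 * (ASL + 100 * complex words / words)). The word, sentence and syllable counts are tallied by hand here, not taken from the package's syllable dictionary, so treat them as assumptions.

```r
# Counts tallied by hand for the two example texts (assumed, not package output)
counts <- data.frame(
  document  = c("doc1", "doc2"),
  sentences = c(2, 1),
  words     = c(5, 6),
  syllables = c(12, 11),
  complex   = c(2, 1)   # words with three or more syllables
)

asl <- counts$words / counts$sentences   # average sentence length in words
asw <- counts$syllables / counts$words   # average syllables per word

# Standard Flesch Reading Ease and Gunning FOG formulas
flesch <- 206.835 - 1.015 * asl - 84.6 * asw
fog    <- 0.4 * (asl + 100 * counts$complex / counts$words)

result <- data.frame(document = counts$document, Flesch = flesch, FOG = fog)
print(result)
```

Under these hand counts the results agree with the Flesch and FOG columns shown above (1.2575 and 45.645; 17 and about 9.0667), which suggests that doc1 is treated as two sentences of five words in total.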