Use this method to subset the output object of textstat_collocations, textstat_keyness, or textstat_frequency, keeping or removing rows according to "glob", "regex", or "fixed" patterns.

textstat_select(x, pattern = NULL, selection = c("keep", "remove"),
  valuetype = c("glob", "regex", "fixed"), case_insensitive = TRUE)

Arguments

x

a textstat object

pattern

a character vector, list of character vectors, dictionary, or collocations object. See pattern for details.

selection

whether to "keep" or "remove" the rows that match the pattern

valuetype

the type of pattern matching: "glob" for "glob"-style wildcard expressions; "regex" for regular expressions; or "fixed" for exact matching. See valuetype for details.

case_insensitive

if TRUE, ignore case when matching
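
As a rough illustration of how selection, valuetype, and case_insensitive interact, the sketch below imitates the documented behaviour on a plain data frame using base R only. The helper select_rows() and the toy data are hypothetical and are not part of quanteda. The matching rules are assumptions made for illustration (glob and fixed patterns match the whole feature, regex patterns match anywhere in it), not the package's actual implementation.

```r
# Hypothetical helper (NOT part of quanteda): filters a data frame on its
# `feature` column, imitating the documented arguments of textstat_select().
select_rows <- function(x, pattern,
                        selection = c("keep", "remove"),
                        valuetype = c("glob", "regex", "fixed"),
                        case_insensitive = TRUE) {
  selection <- match.arg(selection)
  valuetype <- match.arg(valuetype)
  features <- as.character(x$feature)

  if (valuetype == "fixed") {
    # Assumption: "fixed" means the whole feature equals the pattern.
    if (case_insensitive) {
      hit <- tolower(features) %in% tolower(pattern)
    } else {
      hit <- features %in% pattern
    }
  } else {
    # glob2rx() turns a wildcard pattern into an anchored regular expression,
    # e.g. "america*" becomes "^america"; regex patterns are used as given.
    regex <- if (valuetype == "glob") utils::glob2rx(pattern) else pattern
    # A row matches if any of the supplied patterns matches its feature.
    hit <- Reduce(`|`, lapply(regex, grepl, x = features,
                              ignore.case = case_insensitive))
  }

  if (selection == "keep") x[hit, , drop = FALSE] else x[!hit, , drop = FALSE]
}

# Made-up toy data standing in for a textstat object.
toy <- data.frame(feature = c("america", "American", "americas", "war"),
                  n = c(3, 2, 1, 5),
                  stringsAsFactors = FALSE)

select_rows(toy, "america*")$feature
# "america" "American" "americas"

select_rows(toy, "^america$", valuetype = "regex",
            case_insensitive = FALSE)$feature
# "america"

select_rows(toy, "america*", selection = "remove")$feature
# "war"
```

With the defaults, the glob pattern keeps every feature beginning with "america" regardless of case. Switching to a case-sensitive regex narrows the match to one row, and selection = "remove" returns the complement.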

Examples

period <- ifelse(docvars(data_corpus_inaugural, "Year") < 1945, "pre-war", "post-war")
dfmat <- dfm(data_corpus_inaugural, groups = period)
tstat <- textstat_keyness(dfmat)
textstat_select(tstat, 'america*')
#>          feature        chi2            p n_target n_reference
#> 7        america 176.6114029 0.000000e+00      130          54
#> 9      americans 150.6479704 0.000000e+00       67           7
#> 16     america's  94.0689468 0.000000e+00       35           0
#> 107     american  19.0970791 1.242349e-05       69          94
#> 1134    americas   0.7944222 3.727663e-01        2           1
#> 2290  american's   0.2648039 6.068389e-01        1           0
#> 6902 americanism  -0.3721574 5.418307e-01        0           1