Users can subset the output object of textstat_collocations, textstat_keyness or textstat_frequency based on "glob", "regex" or "fixed" patterns using this method.
textstat_select(x, pattern = NULL, selection = c("keep", "remove"), valuetype = c("glob", "regex", "fixed"), case_insensitive = TRUE)
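Argument | Description |
---|---|
x | a textstat object returned by textstat_collocations, textstat_keyness or textstat_frequency |
pattern | a character vector, list of character vectors, dictionary, collocations, or dfm. See pattern for details. |
selection | whether to "keep" or "remove" the rows whose features match pattern |
valuetype | the type of pattern matching: "glob", "regex" or "fixed" |
case_insensitive | ignore case when matching, if TRUE |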
library(quanteda)

# label each inaugural address as pre-war or post-war
period <- ifelse(docvars(data_corpus_inaugural, "Year") < 1945, "pre-war", "post-war")
# build a document-feature matrix grouped by period
mydfm <- dfm(data_corpus_inaugural, groups = period)
keyness <- textstat_keyness(mydfm)
# keep only the features matching the glob pattern
textstat_select(keyness, 'america*')
#>          feature        chi2            p n_target n_reference
#> 7        america 176.6114029 0.000000e+00      130          54
#> 9      americans 150.6479704 0.000000e+00       67           7
#> 16     america's  94.0689468 0.000000e+00       35           0
#> 107     american  19.0970791 1.242349e-05       69          94
#> 1134    americas   0.7944222 3.727663e-01        2           1
#> 2290  american's   0.2648039 6.068389e-01        1           0
#> 6902 americanism  -0.3721574 5.418307e-01        0           1
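
To make the difference between the valuetype and selection options more concrete, here is a rough base-R sketch that mimics them on a plain data frame. It is an illustration only, not quanteda's implementation: `select_features_sketch` and `toy` are hypothetical names, the glob translation relies on `utils::glob2rx`, and the regex branch assumes unanchored matching. The features and `n_target` counts are taken from the output above.

```r
# Hypothetical helper, NOT part of quanteda: filters a data frame that has a
# `feature` column, imitating the arguments of textstat_select().
select_features_sketch <- function(x, pattern,
                                   selection = c("keep", "remove"),
                                   valuetype = c("glob", "regex", "fixed"),
                                   case_insensitive = TRUE) {
  selection <- match.arg(selection)
  valuetype <- match.arg(valuetype)
  feature <- x$feature

  if (valuetype == "fixed") {
    # exact string comparison, optionally ignoring case
    if (case_insensitive) {
      hit <- tolower(feature) %in% tolower(pattern)
    } else {
      hit <- feature %in% pattern
    }
  } else {
    # glob patterns are first translated to anchored regular expressions
    rx <- if (valuetype == "glob") utils::glob2rx(pattern) else pattern
    # a feature is a hit if it matches any of the patterns
    hit <- Reduce(`|`,
                  lapply(rx, grepl, x = feature, ignore.case = case_insensitive),
                  rep(FALSE, length(feature)))
  }

  if (selection == "keep") x[hit, , drop = FALSE] else x[!hit, , drop = FALSE]
}

# Toy table reusing four rows from the keyness output above
toy <- data.frame(feature = c("america", "americans", "american", "americanism"),
                  n_target = c(130, 67, 69, 0),
                  stringsAsFactors = FALSE)

select_features_sketch(toy, "american*")$feature
select_features_sketch(toy, "american", valuetype = "fixed")$feature
select_features_sketch(toy, "s$", valuetype = "regex", selection = "remove")$feature
```

In this sketch the glob pattern `american*` keeps every feature that starts with "american", the fixed pattern keeps only the exact feature "american", and the regex `s$` combined with `selection = "remove"` drops the one feature ending in "s".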