Convert the various kinds of input accepted as pattern into a vector used by tokens_select, tokens_compound, and kwic.
pattern2id(pattern, types, valuetype, case_insensitive, concatenator = "_", remove_unigram = FALSE)
Argument | Description |
---|---|
pattern | a character vector, list of character vectors, dictionary, collocations, or dfm. See pattern for details. |
valuetype | the type of pattern matching: "glob" for glob-style wildcard expressions, "regex" for regular expressions, or "fixed" for exact matching. See valuetype for details. |
case_insensitive | ignore the case of dictionary values if TRUE |
concatenator | the character that joins multi-word expressions in the tokens object |
remove_unigram | ignore single-word patterns if TRUE |
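
To illustrate how valuetype and case_insensitive interact, here is a minimal base R sketch of matching single-word patterns against a vector of types. The helper match_types is hypothetical and is not part of quanteda. It assumes the result is a list with one integer vector of positions in types per pattern, and it does not handle multi-word patterns, concatenator, or remove_unigram.

```r
# Hypothetical helper (not quanteda's implementation): for each pattern,
# return the positions of the matching entries in `types`.
match_types <- function(pattern, types,
                        valuetype = c("glob", "regex", "fixed"),
                        case_insensitive = TRUE) {
  valuetype <- match.arg(valuetype)
  lapply(pattern, function(p) {
    if (valuetype == "fixed") {
      # exact matching, optionally ignoring case
      if (case_insensitive) {
        which(tolower(types) == tolower(p))
      } else {
        which(types == p)
      }
    } else {
      # glob patterns are first translated to an anchored regular expression
      rx <- if (valuetype == "glob") utils::glob2rx(p) else p
      grep(rx, types, ignore.case = case_insensitive)
    }
  })
}

types <- c("A", "a", "b", "BB", "bb", "c")
ids <- match_types(c("a", "b*"), types, valuetype = "glob")
```

With case_insensitive = TRUE, the glob pattern "a" selects both "A" and "a", and "b*" selects every type beginning with "b" or "B". Note that with valuetype = "regex" the pattern is used as given, so it is not anchored unless the pattern itself includes ^ and $.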
regex2id