stri_opts_brkiter: Generate a List with BreakIterator Settings
Description
A convenience function to tune the ICU BreakIterator's behavior in some text boundary analysis functions, see stringi-search-boundaries.
Usage
stri_opts_brkiter(
  type,
  locale,
  skip_word_none,
  skip_word_number,
  skip_word_letter,
  skip_word_kana,
  skip_word_ideo,
  skip_line_soft,
  skip_line_hard,
  skip_sentence_term,
  skip_sentence_sep,
  ...
)
Arguments
| type | single string; either the break iterator type, one of |
| locale | single string, |
| skip_word_none | logical; perform no action for 'words' that do not fit into any other categories |
| skip_word_number | logical; perform no action for words that appear to be numbers |
| skip_word_letter | logical; perform no action for words that contain letters, excluding hiragana, katakana, or ideographic characters |
| skip_word_kana | logical; perform no action for words containing kana characters |
| skip_word_ideo | logical; perform no action for words containing ideographic characters |
| skip_line_soft | logical; perform no action for soft line breaks, i.e., positions where a line break is acceptable but not required |
| skip_line_hard | logical; perform no action for hard, or mandatory line breaks |
| skip_sentence_term | logical; perform no action for sentences ending with a sentence terminator (‘ |
| skip_sentence_sep | logical; perform no action for sentences that do not contain an ending sentence terminator, but are ended by a hard separator or end of input |
| ... | [DEPRECATED] any other arguments passed to this function generate a warning; this argument will be removed in the future |
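The deprecated `...` argument can be illustrated with a minimal sketch. It assumes that stringi is installed and that "word" is an accepted value of `type`; `not_an_option` is a made-up argument name used only to trigger the warning described above.

```r
library(stringi)

# `not_an_option` is a hypothetical name; it is not a real setting.
# Passing it through `...` should raise a warning, which is captured
# here instead of being printed.
res <- tryCatch(
    stri_opts_brkiter(type = "word", not_an_option = TRUE),
    warning = function(w) conditionMessage(w)
)

# `res` now holds the warning message as a character string
print(res)
```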
Details
The skip_* family of settings may be used to suppress special actions on particular types of text boundaries, for example in the stri_locate_all_boundaries and stri_split_boundaries functions.
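As a rough illustration of the skip_* settings, the sketch below splits a short string at word boundaries. It assumes that stringi is installed, that "word" is an accepted value of `type`, and that `stri_split_boundaries` accepts the settings list through its `opts_brkiter` argument; the sample text and variable names are made up for the example.

```r
library(stringi)

x <- "Hello, world! 123"

# Word boundaries, ignoring 'words' that fit no other category
# (here: punctuation and spaces).
opts_words <- stri_opts_brkiter(type = "word", skip_word_none = TRUE)
words <- stri_split_boundaries(x, opts_brkiter = opts_words)

# Additionally ignore tokens that appear to be numbers.
opts_letters <- stri_opts_brkiter(
    type = "word",
    skip_word_none = TRUE,
    skip_word_number = TRUE
)
letters_only <- stri_split_boundaries(x, opts_brkiter = opts_letters)

print(words[[1]])
print(letters_only[[1]])
```

With the first settings list the result should contain the two words and the number; with the second, the number should be dropped as well.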
Note that custom break iterator rules (advanced users only) should be specified as a single string. For a detailed description of the syntax of RBBI rules, please refer to the ICU User Guide on Boundary Analysis.
Value
Returns a named list object. Omitted skip_* values act as if they had been set to FALSE.
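To make the shape of the return value concrete, here is a small sketch. It assumes that stringi is installed and that "word" is an accepted value of `type`; only the settings that were actually supplied are expected to appear in the list.

```r
library(stringi)

opts <- stri_opts_brkiter(type = "word", skip_word_none = TRUE)

# Inspect the structure of the returned list
str(opts)

# Supplied settings can be read back by name; an omitted one such as
# skip_word_number is absent and is treated as FALSE by the
# boundary analysis functions.
opts$type
opts$skip_word_none
is.null(opts$skip_word_number)
```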
References
ubrk.h File Reference – ICU4C API Documentation, https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/ubrk_8h.html
Boundary Analysis – ICU User Guide, http://userguide.icu-project.org/boundaryanalysis
See Also
Other text_boundaries: about_search_boundaries, about_search, stri_count_boundaries(), stri_extract_all_boundaries(), stri_locate_all_boundaries(), stri_split_boundaries(), stri_split_lines(), stri_trans_tolower(), stri_wrap()