stri_unique: Extract Unique Elements¶
Description¶
This function returns a character vector like str, but with duplicate elements removed.
Usage¶
stri_unique(str, ..., opts_collator = NULL)
Arguments¶
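| Argument | Description |
|---|---|
| str | a character vector |
| ... | additional settings for opts_collator |
| opts_collator | a named list with ICU Collator's options, see stri_opts_collator, NULL for the default collation options |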
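The usage signature suggests two ways of supplying collator settings: directly through ..., or bundled in a list built with stri_opts_collator. The following is a minimal sketch, assuming the stringi package is installed; the vector name words is only illustrative.

```r
library(stringi)

# Illustrative input: spelling variants that differ only in case or in sharp s vs. 'ss'
words <- c("gro\u00df", "GROSS", "Gro\u00df", "Gross")

# Collator setting passed directly through ...
via_dots <- stri_unique(words, strength = 1)

# The same setting bundled in a list created by stri_opts_collator()
via_opts <- stri_unique(words, opts_collator = stri_opts_collator(strength = 1))

# Both call forms are expected to give the same result
identical(via_dots, via_opts)
```

The list form may be more convenient when the same settings are reused across several calls, for example with stri_duplicated or stri_sort.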
Details¶
As usual in stringi, no attributes are copied. Unlike unique, this function tests for canonical equivalence of strings (and not whether the strings are just bytewise equal). Such an operation is locale-dependent. Hence, stri_unique is significantly slower (but much better suited for natural language processing) than its base R counterpart.
See also stri_duplicated for indicating non-unique elements.
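The difference from base R's unique, and the relationship to stri_duplicated, can be sketched as follows. This is a small illustration rather than a definitive reference; it assumes the stringi package is installed, and the vector x is made up for the example.

```r
library(stringi)

# U+0105 as one precomposed code point, the same letter written as
# 'a' followed by a combining ogonek (U+0328), and an unrelated string
x <- c("\u0105", "a\u0328", "b")

# Base R compares the strings as stored, so the two spellings stay distinct
length(unique(x))

# stri_unique() treats canonically equivalent strings as duplicates
length(stri_unique(x))

# stri_duplicated() marks the elements that repeat an earlier one,
# so negating it selects the first occurrence of each distinct string
x[!stri_duplicated(x)]
```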
Value¶
Returns a character vector.
References¶
Collation - ICU User Guide, http://userguide.icu-project.org/collation
See Also¶
Other locale_sensitive: %s<%(), about_locale, about_search_boundaries, about_search_coll, stri_compare(), stri_count_boundaries(), stri_duplicated(), stri_enc_detect2(), stri_extract_all_boundaries(), stri_locate_all_boundaries(), stri_opts_collator(), stri_order(), stri_rank(), stri_sort_key(), stri_sort(), stri_split_boundaries(), stri_trans_tolower(), stri_wrap()
Examples¶
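# precomposed and NFKD-decomposed forms of the same character (U+0105):
stri_unique(c('\u0105', stri_trans_nfkd('\u0105')))  # canonically equivalent strings count as duplicates
unique(c('\u0105', stri_trans_nfkd('\u0105')))       # base R sees two different code point sequences
# strength=1 is passed on to the collator via ...; only primary (base letter) differences are compared:
stri_unique(c('gro\u00df', 'GROSS', 'Gro\u00df', 'Gross'), strength=1)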