stringi: THE String Processing Package for R

stringi (pronounced “stringy”, IPA [strinɡi]) is THE R package for very fast, portable, correct, consistent, and convenient string/text processing in any locale or character encoding.

—by Marek Gagolewski

Thanks to ICU, stringi fully supports a wide range of Unicode standards (see also this video).

It gives you a multitude of functions for:

  • string concatenation, padding, wrapping,

  • substring extraction,

  • pattern searching (e.g., with ICU Java-like regular expressions),

  • collation and sorting,

  • random string generation,

  • case mapping and folding,

  • string transliteration,

  • Unicode normalisation,

  • date-time formatting and parsing,

and many more.
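As a rough illustration of a few of the feature areas listed above, here is a minimal sketch. The input strings and variable names are made up for this example; only the `stri_*` functions come from the package.

```r
library(stringi)

# Concatenation and padding
joined <- stri_join("string", "i")                    # "stringi"
padded <- stri_pad_left("42", width = 5, pad = "0")   # "00042"

# Substring extraction (positions are 1-based and inclusive)
hello <- stri_sub("hello world", from = 1, to = 5)    # "hello"

# Pattern searching with ICU regular expressions
has_digits <- stri_detect_regex(c("abc123", "xyz"), "[0-9]+")   # TRUE FALSE
numbers <- stri_extract_all_regex("a1b22c333", "[0-9]+")[[1]]   # "1" "22" "333"

# Locale-aware sorting (the locale is an assumption for this example)
sorted <- stri_sort(c("b", "a", "C"), locale = "en_US")         # "a" "b" "C"

# Case mapping
upper <- stri_trans_toupper("stringi")                # "STRINGI"

# Unicode normalisation: "a" followed by a combining acute accent
# is composed into a single code point under NFC
nfc_len <- stri_length(stri_trans_nfc("a\u0301"))     # 1
```

Most functions are vectorised over their arguments, so the same calls work unchanged on whole character vectors.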

stringi is among the most often downloaded R packages.

You can obtain it from CRAN by calling:

install.packages("stringi")
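After installation, one way to confirm that the package loads is sketched below. The variable names are illustrative, and the exact text of the reported information will differ between systems.

```r
# Check that stringi is installed and can be loaded
installed <- requireNamespace("stringi", quietly = TRUE)

if (installed) {
  library(stringi)
  # Installed package version
  pkg_version <- as.character(packageVersion("stringi"))
  print(pkg_version)
  # One-line summary of the stringi build, including the ICU version in use
  print(stri_info(short = TRUE))
}
```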

stringi’s source code is hosted on GitHub. It has been released under the open source BSD-3-clause license.

The package’s API was inspired by that of the early (pre-tidyverse; v0.6.2) version of Hadley Wickham’s stringr package; since v1.0.0, released in 2015, stringr has itself been powered by stringi. Moreover, Hadley suggested quite a few new package features. The contributions from Bartłomiej Tartanus and many others are greatly appreciated. Thanks!

Reference Manual