Convert texts or tokens to lower (or upper) case
    toLower(x, keep_acronyms = FALSE, ...)

    # S3 method for character
    toLower(x, keep_acronyms = FALSE, ...)

    # S3 method for NULL
    toLower(x, ...)

    # S3 method for tokenizedTexts
    toLower(x, keep_acronyms = FALSE, ...)

    # S3 method for tokens
    toLower(x, ...)

    # S3 method for tokens
    toUpper(x, ...)

    # S3 method for corpus
    toLower(x, ...)

    toUpper(x, ...)

    # S3 method for character
    toUpper(x, ...)

    # S3 method for NULL
    toUpper(x, ...)

    # S3 method for tokenizedTexts
    toUpper(x, ...)

    # S3 method for corpus
    toUpper(x, ...)
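Argument | Description |
---|---|
x | texts to be lower-cased (or upper-cased) |
keep_acronyms | if TRUE, words written entirely in upper case (acronyms such as NATO) are left unchanged when lower-casing |
... | additional arguments passed to stringi functions |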
Texts transformed into their lower- (or upper-)cased versions. If x is a character vector or a corpus, a character vector is returned. If x is a list of tokenized texts, a list of tokenized texts is returned.
    test1 <- c(text1 = "England and France are members of NATO and UNESCO",
               text2 = "NASA sent a rocket into space.")
    toLower(test1)
    #> Warning: 'toLower.character' is deprecated.
    #> Use 'char_tolower' instead.
    #> See help("Deprecated")
    #> text1
    #> "england and france are members of nato and unesco"
    #> text2
    #> "nasa sent a rocket into space."
    toLower(test1, keep_acronyms = TRUE)
    #> Warning: 'toLower.character' is deprecated.
    #> Use 'char_tolower' instead.
    #> See help("Deprecated")
    #> text1
    #> "england and france are members of NATO and UNESCO"
    #> text2
    #> "NASA sent a rocket into space."
    #> tokenizedTexts from 2 documents.
    #> text1 :
    #> [1] "england" "and" "france" "are" "members" "of" "nato"
    #> [8] "and" "unesco"
    #>
    #> text2 :
    #> [1] "nasa" "sent" "a" "rocket" "into" "space"
    #>
    toLower(test2, keep_acronyms = TRUE)
    #> tokenizedTexts from 2 documents.
    #> text1 :
    #> [1] "england" "and" "france" "are" "members" "of" "NATO"
    #> [8] "and" "UNESCO"
    #>
    #> text2 :
    #> [1] "NASA" "sent" "a" "rocket" "into" "space"
    #>
    test1 <- c(text1 = "England and France are members of NATO and UNESCO",
               text2 = "NASA sent a rocket into space.")
    toUpper(test1)
    #> Warning: 'toUpper.character' is deprecated.
    #> Use 'char_toupper' instead.
    #> See help("Deprecated")
    #> text1
    #> "ENGLAND AND FRANCE ARE MEMBERS OF NATO AND UNESCO"
    #> text2
    #> "NASA SENT A ROCKET INTO SPACE."
    #> Warning: 'toUpper.character' is deprecated.
    #> Use 'char_toupper' instead.
    #> See help("Deprecated")
    #> Warning: 'toUpper.character' is deprecated.
    #> Use 'char_toupper' instead.
    #> See help("Deprecated")
    #> tokenizedTexts from 2 documents.
    #> text1 :
    #> [1] "ENGLAND" "AND" "FRANCE" "ARE" "MEMBERS" "OF" "NATO"
    #> [8] "AND" "UNESCO"
    #>
    #> text2 :
    #> [1] "NASA" "SENT" "A" "ROCKET" "INTO" "SPACE"
    #>
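The acronym-preserving behaviour seen in the `keep_acronyms = TRUE` examples can be approximated in base R. The following is only an illustrative sketch, not the package's implementation: the helper name `lower_keep_acronyms` is hypothetical, and the rule that an acronym is a whole word of two or more ASCII capital letters is an assumption made for this example.

```r
# Hypothetical helper (not part of any package): lower-case a character
# vector while leaving all-caps words untouched.
# Assumption: an "acronym" is a whole word of 2+ ASCII capital letters.
lower_keep_acronyms <- function(x) {
  # Locate every all-caps word in each element of x
  m <- gregexpr("\\b[A-Z]{2,}\\b", x, perl = TRUE)
  acronyms <- regmatches(x, m)

  # Lower-case everything, then restore the acronyms at the same positions
  # (lower-casing ASCII text does not change string lengths)
  out <- tolower(x)
  regmatches(out, m) <- acronyms
  out
}

test1 <- c(text1 = "England and France are members of NATO and UNESCO",
           text2 = "NASA sent a rocket into space.")
lower_keep_acronyms(test1)
```

The sketch records the match positions on the original text and reuses them on the lower-cased copy, which avoids a second pattern search; it would need a different pattern for acronyms containing digits or non-ASCII letters.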