Removes features from a variety of objects, such as text, a dfm, or a list of collocations. The most common usage for removeFeatures will be to eliminate stop words from a text or text-based object. This function simply provides a convenience wrapper for selectFeatures where selection = "remove".
removeFeatures(x, features, ...)
Argument | Description |
---|---|
x | object from which stop words will be removed |
features | character vector of features to remove |
... | additional arguments passed to selectFeatures |
an object with matching features removed
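The wrapper relationship described above can be illustrated with a minimal base R sketch. It works on a plain character vector of tokens rather than on quanteda objects, and the names `select_features_sketch` and `remove_features_sketch` are hypothetical stand-ins invented for this illustration; they are not part of the package and do not reproduce its actual implementation.

```r
# Hypothetical stand-in for selectFeatures(): operates on a plain character
# vector of tokens, not on quanteda objects.
select_features_sketch <- function(x, features,
                                   selection = c("keep", "remove"),
                                   case_insensitive = TRUE) {
  selection <- match.arg(selection)
  # Compare in lower case when case-insensitive matching is requested.
  key <- if (case_insensitive) tolower(x) else x
  target <- if (case_insensitive) tolower(features) else features
  matched <- key %in% target
  if (selection == "keep") x[matched] else x[!matched]
}

# Hypothetical stand-in for removeFeatures(): fixes selection = "remove"
# and forwards any further arguments through `...`.
remove_features_sketch <- function(x, features, ...) {
  select_features_sketch(x, features, selection = "remove", ...)
}

tokens <- c("When", "the", "occasion", "proper", "for", "it", "shall", "arrive")
stops <- c("when", "the", "for", "it", "shall")  # small illustrative stop list
result <- remove_features_sketch(tokens, stops)
print(result)
```

Because the wrapper only fixes one argument and forwards the rest, options such as the assumed `case_insensitive` flag can still be supplied by the caller, much as the examples below pass extra arguments to removeFeatures.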
not_run({
    ## for tokenized texts
    txt <- c(wash1 <- "Fellow citizens, I am again called upon by the voice of my country to execute the functions of its Chief Magistrate.",
             wash2 <- "When the occasion proper for it shall arrive, I shall endeavor to express the high sense I entertain of this distinguished honor.")
    removeFeatures(tokenize(txt, remove_punct = TRUE), stopwords("english"))

    itText <- tokenize("Ecco alcuni di testo contenente le parole che vogliamo rimuovere.", remove_punct = TRUE)
    removeFeatures(itText, stopwords("italian"), case_insensitive = TRUE)

    ## example for dfm objects
    mydfm <- dfm(data_char_ukimmig2010, verbose = FALSE)
    removeFeatures(mydfm, stopwords("english"))

    ## example for collocations
    (myCollocs <- collocations(data_corpus_inaugural[1:3], n = 20))
    removeFeatures(myCollocs, stopwords("english"))
    removeFeatures(myCollocs, stopwords("english"), pos = 2)
})