Finds the element indices of strings in a character vector that partially match, or are similar to, a search term. It can be used to find exact or slightly mistyped elements in a string vector.
str_pos(search.string, find.term, maxdist = 2, part.dist.match = 0, show.pbar = FALSE)
Argument | Description |
---|---|
search.string | Character vector with string elements. |
find.term | String that should be matched against the elements of search.string. |
maxdist | Maximum distance between two string elements at which they are still treated as similar or equal. Smaller values mean less tolerance in matching. |
part.dist.match | Activates similar matching (close distance strings) for parts (substrings) of search.string. Default value is 0. See 'Details' for more information. |
show.pbar | Logical; if TRUE, a progress bar is shown. Default is FALSE. |
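
To get a feel for what maxdist means, the edit distance between a search term and a few candidate strings can be computed directly. The sketch below uses base R's adist() (generalized Levenshtein distance) purely as an illustration; the distance measure that str_pos() uses internally is not stated here and may differ.

```r
# Illustration only: adist() from the utils package stands in for whatever
# distance measure str_pos() uses internally (an assumption).
candidates <- c("Hello", "Helo", "Hole", "System")

# Case-insensitive edit distance between each candidate and the search term
d <- drop(adist(candidates, "helo", ignore.case = TRUE))
names(d) <- candidates
print(d)

# With a tolerance of 2, these elements would count as "similar"
similar <- which(d <= 2)
print(similar)
```

Lowering the tolerance to 1 would exclude "Hole" here, which is what "smaller values mean less tolerance" amounts to in practice.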
A numeric vector with the index positions of the elements in search.string that partially match or are similar to find.term. Returns -1 if no match was found.
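
Because -1 is used as the "no match" value, it is worth checking for it before using the result to subset a vector: in R, a negative index drops elements rather than selecting none. A minimal sketch, where idx is a hand-written stand-in for a str_pos() result and not computed by the package:

```r
string <- c("Hello", "Helo", "Hole")

# Stand-in for a str_pos() result when nothing matched (assumed here,
# not computed by the package)
idx <- -1

# Pitfall: a negative index drops elements instead of selecting none
dropped <- string[idx]
print(dropped)

# Guard against the "no match" value before subsetting
matches <- if (any(idx == -1)) character(0) else string[idx]
print(matches)
```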
For part.dist.match = 1, substrings with the same number of characters as find.term are extracted from each element of search.string, starting at the beginning of the element and moving along until its end is reached. Each substring is matched against find.term, and results with a maximum distance of maxdist are considered as "matching". If part.dist.match = 2, the range of the extracted substring is increased by 2, i.e. the extracted substring is two characters longer, and so on.
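
The sliding-window idea described above can be sketched in base R. This is not the package's implementation: window_match() and its extra argument are hypothetical names introduced for illustration, and adist() (Levenshtein distance) is assumed as the distance measure.

```r
# Hypothetical helper, not part of the package: slides a window over one
# string and reports whether any window is within `maxdist` of `term`.
# `extra` widens the window by that many characters (illustrative only).
window_match <- function(x, term, maxdist = 2, extra = 0) {
  width <- nchar(term) + extra          # window length in characters
  n <- nchar(x)
  if (n < width) return(FALSE)          # string too short for a full window
  starts <- seq_len(n - width + 1)      # every possible window start
  windows <- substring(x, starts, starts + width - 1)
  # Case-insensitive edit distance of every window to the search term
  d <- adist(windows, term, ignore.case = TRUE)
  any(d <= maxdist)
}

# One of the windows is "Pistols", which is close to "postils"
window_match("We are Sex Pistols!", "postils")

# No window comes close to this term
window_match("We are Sex Pistols!", "banana")
```

Comparing the whole string against the term, without windows, gives a large distance, which is why substring matching has to be switched on explicitly for cases like this.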
This function does not return the position of a matching string inside another string, but the index of the element of the search.string vector in which a (partial) match with find.term was found. Thus, searching for "abc" in the string "this is abc" will not return 9 (the start position of the substring), but 1 (the element index, which is always 1 if search.string has only one element).
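
The difference between an element index and a character position can be illustrated with base R alone, independent of str_pos(): grep() reports which elements match, while regexpr() reports where the match starts inside each element.

```r
x <- "this is abc"

# Element index within the vector: which elements contain "abc"?
element_index <- grep("abc", x, fixed = TRUE)
print(element_index)

# Character position within the string: where does "abc" start?
char_position <- as.integer(regexpr("abc", x, fixed = TRUE))
print(char_position)
```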
string <- c("Hello", "Helo", "Hole", "Apple", "Ape", "New", "Old", "System", "Systemic")
str_pos(string, "hel")   # partial match
str_pos(string, "stem")  # partial match
str_pos(string, "R")     # no match
str_pos(string, "saste") # similarity to "System"

# finds two indices, because partial matching now
# also applies to "Systemic"
str_pos(string, "sytsme", part.dist.match = 1)

# finds nothing
str_pos("We are Sex Pistols!", "postils")
# finds partial matching of similarity
str_pos("We are Sex Pistols!", "postils", part.dist.match = 1)