An ICD code consists, at a minimum, of a three-character ICD-10 code (i.e. one upper-case letter followed by two digits). This may optionally be followed by a two-digit subcode and by selected punctuation symbols (asterisk "*", dagger "†" (U+2020) or exclamation mark "!"). Both the period separating the three-character code from the subcode and the hyphen indicating an "incomplete" subcode are optional. Finally, in the ambulatory system, an additional letter G, V, Z or A may be appended to signify the status ("security") of the diagnosis.
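The structure described above can be sketched as a regular expression in base R. This is only an illustration of the code layout and not the pattern that icd_parse uses internally; the name icd_pattern and the choice to accept one or two subcode digits are assumptions made for this example.

```r
# Illustrative pattern only; NOT the regular expression used by icd_parse().
icd_pattern <- paste0(
  "\\b[A-Z][0-9]{2}", # three-character code: one upper-case letter, two digits
  "\\.?",             # optional period before the subcode
  "[0-9]{0,2}",       # optional subcode (assumption: one or two digits)
  "-?",               # optional hyphen marking an "incomplete" subcode
  "[*\u2020!]?",      # optional asterisk, dagger (U+2020) or exclamation mark
  "[GVZA]?"           # optional status letter used in the ambulatory system
)

x <- c("E11.90G and I10", "no code here", "M54.5V; K29.-")

# One list entry per input element, holding every code found in that element
found <- regmatches(x, gregexpr(icd_pattern, x, perl = TRUE))
print(found)
```

Because the sketch has no trailing word boundary, a code immediately followed by text starting with G, V, Z or A would absorb that letter; a production pattern would need stricter anchoring.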
icd_parse(str, type = "bounded", bind_rows = TRUE)
str | Character vector from which to extract all ICD codes |
---|---|
type | A character string determining how strictly matching should be performed. This must be one of "strict" ( |
bind_rows | logical. Whether to convert the matrix output of |
data.frame (if bind_rows = TRUE) or matrix
By default, the function returns a data.frame containing the matched codes and the standardised three digit code (icd3), subcode (icd_subcode), normcode (icd_norm) and code without period (icd_sub).

If bind_rows = FALSE, the list output of stringi::stri_match_all_regex is returned. This is particularly useful to retrieve the matches from each element of the str vector separately.
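The difference between per-element output and bound rows can be illustrated with a minimal base R sketch. It does not call icd_parse and does not reproduce its exact output: the matches list is made-up input, and the column names element, code, code3 and code_no_period are hypothetical stand-ins chosen for this example.

```r
# Hypothetical per-element matches: one list entry per input string.
matches <- list(
  c("E11.90", "I10"), # first string contained two codes
  character(0),       # second string contained none
  "M54.5"             # third string contained one
)

# Bind everything into a single data.frame, recording which input
# element each code came from so the grouping is not lost.
bound <- data.frame(
  element = rep(seq_along(matches), lengths(matches)),
  code = unlist(matches),
  stringsAsFactors = FALSE
)

# Derive the three-character code and the code without the period.
bound$code3 <- substr(bound$code, 1, 3)
bound$code_no_period <- gsub(".", "", bound$code, fixed = TRUE)

print(bound)
```

The list form keeps empty results visible (the second element), whereas the bound form simply has no rows for inputs without a match.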