Extract regions from NER annotations (CoNLL format).

conll_get_regions(x)

Arguments

x

A data.frame, a data.table, or any other object that can be coerced to a data.table. The input table is expected to have the columns "token", "ner", and "cpos".

Examples

x <- data.frame(
  token = c(
    "Die", "Bundeskanzlerin", "Angela", "Merkel", "hält",
    "im", "Bundestag", "eine", "Rede", "."
  ),
  ne = c("O", "O", "B-PERS", "I-PERS", "O", "O", "B-ORG", "O", "O", "O"),
  stringsAsFactors = FALSE
)
x[["cpos"]] <- 100L:(100L + nrow(x) - 1L)
tab <- conll_get_regions(x)
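
To illustrate what extracting regions from B-/I-/O tags can look like, here is a minimal base R sketch. It is not the implementation of conll_get_regions(), and its return value is not documented on this page. The helper bio_to_regions(), its tag_col argument, and the output columns cpos_left, cpos_right, ne_type and text are hypothetical names chosen for this example. The sketch assumes the tag column contains no missing values.

```r
# Hypothetical helper (not part of the package): collapse B-/I-/O tags
# into one row per annotated region.
bio_to_regions <- function(x, tag_col = "ne") {
  tag <- x[[tag_col]]
  # Every "B-" tag opens a new region.
  starts <- which(startsWith(tag, "B-"))
  rows <- lapply(starts, function(i) {
    type <- sub("^B-", "", tag[i])
    j <- i
    # Extend the region while the next tag is "I-" of the same type.
    while (j < nrow(x) && tag[j + 1L] == paste0("I-", type)) j <- j + 1L
    data.frame(
      cpos_left = x[["cpos"]][i],
      cpos_right = x[["cpos"]][j],
      ne_type = type,
      text = paste(x[["token"]][i:j], collapse = " "),
      stringsAsFactors = FALSE
    )
  })
  # Returns NULL if the input contains no "B-" tag.
  do.call(rbind, rows)
}

demo <- data.frame(
  token = c("Angela", "Merkel", "spricht", "im", "Bundestag"),
  ne = c("B-PERS", "I-PERS", "O", "O", "B-ORG"),
  cpos = 100L:104L,
  stringsAsFactors = FALSE
)
regions <- bio_to_regions(demo)
```

In this sketch each region is described by the corpus positions of its first and last token, so a single-token entity has identical left and right positions.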