Published May 12, 2020 | Version 1.0.0
Software Open

Predict CRAFT concepts with OGER+BioBERT

  • 1. University of Zurich
  • 2. University of Zurich
  • 3. Istituto dalle Molle di Studi sull'Intelligenza Artificiale

Description

This dataset contains model weights, configuration files and utility scripts to reproduce the results reported in the following publication:

Lenz Furrer, Joseph Cornelius, Fabio Rinaldi (2020). Parallel sequence tagging for concept recognition. ArXiv e-print. arXiv:2003.07424

The code and models in this collection allow you to perform named entity recognition and normalisation for biomedical concepts in scientific literature.

The collection builds on the following resources:

  • The CRAFT corpus was used for training the models.
  • OGER performs dictionary-based matching of terms.
  • BioBERT served as a basis for example-driven prediction.
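
To illustrate what dictionary-based term matching means in general, here is a minimal sketch in Python. It is not OGER's implementation or API; the function name `find_terms` and the `EX:` identifiers are hypothetical placeholders invented for this example, and real term dictionaries would come from the ontologies annotated in CRAFT.

```python
import re

def find_terms(text, terms):
    """Return (start, end, matched_text, concept_id) for every dictionary hit.

    `terms` maps a surface term to a concept identifier (hypothetical IDs here).
    Matching is case-insensitive and restricted to word boundaries.
    """
    if not terms:
        return []
    # Try longer terms first so that "cell death" wins over "cell".
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    lookup = {term.lower(): concept for term, concept in terms.items()}
    return [
        (m.start(), m.end(), m.group(0), lookup[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]

# Toy dictionary with placeholder identifiers (assumption, not real ontology IDs).
TOY_TERMS = {"cell": "EX:0001", "cell death": "EX:0002", "mouse": "EX:0003"}
hits = find_terms("Cell death in the mouse.", TOY_TERMS)
```

Each hit carries both the text span (recognition) and a concept identifier (normalisation). A purely dictionary-based matcher like this cannot resolve ambiguous or unseen terms, which is where an example-driven model can complement it.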

Files

Files (44.1 GB)

Size: 44.1 GB
Checksum: md5:3a390e83a98ba6bcd85e2dbb3ad09500
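
Given the size of the archive, it may be worth checking the download against the MD5 checksum listed above before unpacking it. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `verify_download` are assumptions made for this example, and the local file name is left to the caller because this record does not state it.

```python
import hashlib

# Checksum as listed in this record.
EXPECTED_MD5 = "3a390e83a98ba6bcd85e2dbb3ad09500"

def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that a multi-gigabyte archive never has to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` has the expected MD5 digest."""
    return md5_of_file(path) == expected.lower()
```

Calling `verify_download` with the path of the downloaded file returns `True` only if the file matches the published checksum; a mismatch usually indicates an incomplete or corrupted download.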

Additional details

Funding

Swiss National Science Foundation
MelanoBase CR30I1_162758

References

  • Lenz Furrer, Joseph Cornelius, Fabio Rinaldi (2020). Parallel sequence tagging for concept recognition. ArXiv e-print. arXiv:2003.07424