Conference paper Open Access

Automatic detection of inadequate claims in biomedical articles: first steps

Koroleva Anna; Paroubek Patrick

In this article we present the first steps in developing an NLP algorithm for the automatic detection of inadequate reporting of research results (known as spin) in biomedical articles. Inadequate reporting consists in presenting the experimental treatment as having a greater beneficial effect than the research results show. We propose a scheme for an algorithm that would automatically identify important claims in article abstracts, extract possible supporting information from the article, and check the adequacy of the claims. We present the state of the art and our first experiments for three tasks related to spin detection: classification of articles according to the type of clinical trial reported; classification of sentences in abstracts, aimed at identifying mentions of the Results and Conclusions of the experiment; and extraction of some trial characteristics. For each task, we outline possible directions for further work.
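To make the second task more concrete, the sketch below labels each sentence of an abstract as RESULTS, CONCLUSIONS or OTHER using a handful of cue phrases. It is an illustrative baseline only and not the method used in the paper: the cue lists, the naive sentence splitter, and the names `split_sentences`, `label_sentence` and `label_abstract` are all assumptions introduced here.

```python
import re
from typing import List, Tuple

# Hypothetical cue patterns, chosen for illustration only (not taken from the paper).
RESULTS_CUES = re.compile(
    r"^\s*results?\s*:"          # structured-abstract heading "Results:"
    r"|\bp\s*[<=>]\s*0?\.\d+"    # reported p-value, e.g. "p = 0.04"
    r"|\b\d+(\.\d+)?\s*%",       # percentage, e.g. "12%"
    re.IGNORECASE,
)
CONCLUSION_CUES = re.compile(
    r"^\s*conclusions?\s*:"      # structured-abstract heading "Conclusions:"
    r"|\bwe conclude\b"
    r"|\bin conclusion\b"
    r"|\bthese (findings|results) suggest\b",
    re.IGNORECASE,
)


def split_sentences(abstract: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' when followed by
    whitespace and a capital letter. Real abstracts need a proper tokenizer."""
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z])", abstract.strip())
    return [p for p in parts if p]


def label_sentence(sentence: str) -> str:
    """Return CONCLUSIONS, RESULTS or OTHER; conclusion cues are checked first."""
    if CONCLUSION_CUES.search(sentence):
        return "CONCLUSIONS"
    if RESULTS_CUES.search(sentence):
        return "RESULTS"
    return "OTHER"


def label_abstract(abstract: str) -> List[Tuple[str, str]]:
    """Label every sentence of an abstract, preserving sentence order."""
    return [(label_sentence(s), s) for s in split_sentences(abstract)]


if __name__ == "__main__":
    demo = (
        "We compared drug A with placebo in 120 patients. "
        "Pain decreased by 12% in the drug A group (p = 0.04). "
        "We conclude that drug A reduces pain."
    )
    for label, sentence in label_abstract(demo):
        print(label, "|", sentence)
```

A cue-based labeller like this is transparent and needs no training data, but it will miss sentences that report results without numbers or explicit headings; a trained sentence classifier would be the natural next step under the same interface.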


