Published June 1, 2017 | Version v1
Journal article Open

Rhetorical Sentence Classification for Automatic Title Generation in Scientific Article

  • 1. Institut Teknologi Bandung

Description

In this paper, we propose a rhetorical corpus construction and a sentence classification model that can be incorporated into the automatic title generation task for scientific articles. Rhetorical classification is treated as sequence labeling. A rhetorical sentence classification model is useful in tasks that consider a document’s discourse structure. We performed experiments using datasets from two domains: computer science (CS dataset) and chemistry (GaN dataset). We evaluated the models using 10-fold cross-validation (0.70-0.79 weighted average F-measure) as well as on-the-run testing (0.30-0.36 error rate at best). We argue that our models perform best when the imbalanced data is handled with a SMOTE filter.
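For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a weighted average F-measure is commonly computed: the per-class F-measure is averaged with each class weighted by its share of the true labels. The function name `weighted_f_measure` and the labels `AIM` and `OWN` are hypothetical placeholders for illustration; they are not taken from the paper, and this is not the authors' evaluation code.

```python
from collections import Counter


def weighted_f_measure(y_true, y_pred):
    """Average per-class F1, weighting each class by its support in y_true.

    Illustrative sketch only; the function name and inputs are assumptions.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one labeled sentence is required")

    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += (n / total) * f1  # weight by class frequency
    return score


# Hypothetical rhetorical labels for six sentences (not from the paper).
gold = ["AIM", "AIM", "OWN", "OWN", "OWN", "OWN"]
pred = ["AIM", "OWN", "OWN", "OWN", "OWN", "AIM"]
print(round(weighted_f_measure(gold, pred), 4))
```

Because each class is weighted by its frequency, a majority class can dominate this score, which is one reason class imbalance matters when interpreting it.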

Files

14 4061.pdf (498.1 kB)
md5:a1224471ec033a5fd726a4e5976b3db5