Published December 6, 2017 | Version v1
Conference paper | Open

Using Titles vs. Full-text as Source for Automated Semantic Document Annotation

  • 1. ZBW -- Leibniz Information Centre for Economics
  • 2. Kiel University

Description

We conduct the first systematic comparison of automated semantic
annotation based either on the full-text or only on the title metadata
of documents. Apart from the prominent text classification baselines
kNN and SVM, we also compare recent techniques of Learning
to Rank and neural networks and revisit the traditional methods
logistic regression, Rocchio, and Naive Bayes. Across three of our
four datasets, classification using only titles reaches over 90% of
the quality achieved when using the full-text.
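
The "over 90% of the quality" claim is a ratio of two scores per dataset: the title-only score divided by the full-text score. A minimal sketch of that arithmetic follows. The function name, the dataset names, and the scores are hypothetical placeholders chosen for illustration, not values from the paper, and the quality metric itself is left unspecified here.

```python
def relative_quality(title_score: float, fulltext_score: float) -> float:
    """Return the title-only score as a fraction of the full-text score."""
    if fulltext_score <= 0:
        raise ValueError("full-text score must be positive")
    return title_score / fulltext_score


# Placeholder scores for illustration only; these are NOT results from the paper.
example_scores = {
    "dataset_a": {"title": 0.46, "fulltext": 0.50},
    "dataset_b": {"title": 0.30, "fulltext": 0.40},
}

# Ratio of title-only quality to full-text quality, per dataset.
ratios = {
    name: relative_quality(s["title"], s["fulltext"])
    for name, s in example_scores.items()
}

# Datasets where titles alone retain more than 90% of full-text quality.
above_90 = [name for name, ratio in ratios.items() if ratio > 0.9]

for name, ratio in ratios.items():
    print(f"{name}: titles reach {ratio:.0%} of full-text quality")
```

With these placeholder numbers only the first dataset clears the 90% threshold; the abstract reports that three of the four real datasets do.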

Files

a20-galke.pdf

Files (458.2 kB)

md5:0703000f435a909082d067312c999228

Additional details

Related works

Is new version of
https://arxiv.org/abs/1705.05311 (URL)

Funding

MOVING – Training towards a society of data-savvy information professionals to enable open leadership innovation (grant 693092)
European Commission