Using Titles vs. Full-text as Source for Automated Semantic Document Annotation

Galke, Lukas Paul Achatius; Mai, Florian; Schelten, Alan; Brunsch, Dennis; Scherp, Ansgar

doi:10.1145/3148011.3148039

Published December 6, 2017 | Version v1

Conference paper Open

1. ZBW -- Leibniz Information Centre for Economics
2. Kiel University

We conduct the first systematic comparison of automated semantic
annotation based on either the full-text or only on the title metadata
of documents. Apart from the prominent text classification baselines
kNN and SVM, we also compare recent techniques of Learning
to Rank and neural networks and revisit the traditional methods
logistic regression, Rocchio, and Naive Bayes. On three of our
four datasets, classification using only titles reaches over 90% of
the quality obtained when using the full-text.

Files

Name	Size
a20-galke.pdf (md5:0703000f435a909082d067312c999228)	458.2 kB

Additional details

Is new version of: https://arxiv.org/abs/1705.05311 (URL)

Funding: European Commission, MOVING - Training towards a society of data-savvy information professionals to enable open leadership innovation (grant 693092)

	All versions	This version
Views	273	273
Downloads	341	341
Data volume	158.5 MB	158.5 MB

Resource type

Conference paper

Publisher

Zenodo

Conference

Ninth International Conference on Knowledge Capture (K-CAP 2017), Austin, Texas, 04-06 December 2017

Languages

English

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: January 10, 2018
Modified: August 2, 2024
