
Dataset Open Access

Language Function Analysis 2011 Corpus (LFA-11)

Henning Wachsmuth; Kathrin Bujna

The Language Function Analysis 2011 Corpus (LFA-11) is a German text corpus of promotional texts, reviews, and blog posts on music and smartphones. The texts were manually classified with respect to their topic relevance, language function, and sentiment polarity.

The purpose of the corpus is to provide textual data for the development and evaluation of approaches to language function analysis and sentiment analysis. To this end, each text is classified by language function (personal, commercial, or informational) as well as by sentiment (positive, negative, or neutral).

The corpus consists of two separate collections, which contain the texts about music and smartphones, respectively. The music collection consists of 2,713 promotional texts and reviews from both users and professionals. The smartphone collection contains 2,093 blog posts on smartphones from the Spinn3r corpus.

Files (5.1 MB)

Name: lfa-11-corpus.tar.gz
Size: 5.1 MB
MD5: 02051a782b664938b51c34e193efc343
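
The MD5 checksum listed above can be used to confirm that a downloaded copy of the archive is intact. The following is a minimal sketch in Python, not part of the corpus distribution: the function names `md5_of_file` and `verify_archive` and the local path `lfa-11-corpus.tar.gz` in the working directory are assumptions made for illustration; only the file name and the checksum value come from this page.

```python
import hashlib
from pathlib import Path

# Checksum as listed on this page for lfa-11-corpus.tar.gz.
EXPECTED_MD5 = "02051a782b664938b51c34e193efc343"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical download location; adjust to wherever the file was saved.
    archive = Path("lfa-11-corpus.tar.gz")
    if archive.exists():
        print("checksum ok" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

If the check reports a mismatch, the download was most likely truncated or corrupted and should be repeated before the archive is unpacked.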
  • Henning Wachsmuth and Kathrin Bujna (2011). Back to the Roots of Genres: Text Classification by Language Function

Statistics (all versions / this version)
Views: 282 / 282
Downloads: 29 / 29
Data volume: 148.1 MB / 148.1 MB
Unique views: 274 / 274
Unique downloads: 28 / 28
