Dataset Open Access

Language Function Analysis 2011 Corpus (LFA-11)

Henning Wachsmuth; Kathrin Bujna

The Language Function Analysis 2011 Corpus (LFA-11) is a German text corpus of promotional texts, reviews, and blog posts on music and smartphones. The texts were manually classified with respect to their topic relevance, language function, and sentiment polarity.

The purpose of the corpus is to provide textual data for the development and evaluation of approaches to language function analysis and sentiment analysis. To that end, each text is labeled with a language function (personal, commercial, or informational) and a sentiment polarity (positive, negative, or neutral).
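As a rough illustration of the label scheme described above, the two label sets can be modeled as enumerations. This is only a sketch: the class names, the `LabeledText` record, and the `parse_labels` helper are hypothetical, and the actual file format and label spelling inside the corpus archive may differ.

```python
from dataclasses import dataclass
from enum import Enum


class LanguageFunction(Enum):
    """Language function classes named in the corpus description."""
    PERSONAL = "personal"
    COMMERCIAL = "commercial"
    INFORMATIONAL = "informational"


class Sentiment(Enum):
    """Sentiment polarity classes named in the corpus description."""
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"


@dataclass(frozen=True)
class LabeledText:
    """Hypothetical in-memory record for one annotated text."""
    text: str
    language_function: LanguageFunction
    sentiment: Sentiment


def parse_labels(text, language_function, sentiment):
    """Build a LabeledText from raw label strings (assumed spelling).

    Raises ValueError if a label is not one of the known classes.
    """
    return LabeledText(
        text=text,
        language_function=LanguageFunction(language_function.strip().lower()),
        sentiment=Sentiment(sentiment.strip().lower()),
    )
```

Rejecting unknown labels early makes it easier to notice if the labels stored in the corpus are spelled differently from the names used in this description.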

The corpus consists of two separate collections, one with texts about music and one with texts about smartphones. The music collection consists of 2,713 promotional texts and reviews from both users and professionals. The smartphone collection contains 2,093 blog posts on smartphones from the Spinn3r corpus.

Files (5.1 MB)
Name                   Size
lfa-11-corpus.tar.gz   5.1 MB
md5:02051a782b664938b51c34e193efc343
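The MD5 checksum listed above can be used to check a downloaded copy of the archive before unpacking it. The following is a minimal sketch using only the Python standard library. The local path `lfa-11-corpus.tar.gz` and the helper names `md5_of_file` and `verify_and_list` are assumptions for illustration, and nothing is assumed about the directory layout inside the archive.

```python
import hashlib
import tarfile
from pathlib import Path

# Checksum taken from the file listing of this record.
EXPECTED_MD5 = "02051a782b664938b51c34e193efc343"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_and_list(archive_path, expected_md5=EXPECTED_MD5):
    """Check the archive's MD5, then return its member names without extracting."""
    actual = md5_of_file(archive_path)
    if actual != expected_md5:
        raise ValueError(f"MD5 mismatch: expected {expected_md5}, got {actual}")
    with tarfile.open(archive_path, "r:gz") as archive:
        return archive.getnames()


if __name__ == "__main__":
    archive = Path("lfa-11-corpus.tar.gz")  # assumed local download location
    if archive.exists():
        # Print the first few member names to inspect the archive layout.
        for name in verify_and_list(archive)[:10]:
            print(name)
```

Listing the member names first, rather than extracting right away, shows how the music and smartphone collections are organized before any files are written to disk.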
  • Henning Wachsmuth and Kathrin Bujna (2011). Back to the Roots of Genres: Text Classification by Language Function
