Dataset Open Access

Language Function Analysis 2011 Corpus (LFA-11)

Henning Wachsmuth; Kathrin Bujna

The Language Function Analysis 2011 Corpus (LFA-11) is a German text corpus of promotional texts, reviews, and blog posts on music and smartphones. The texts were manually classified with respect to their topic relevance, language function, and sentiment polarity.

The purpose of the corpus is to provide textual data for the development and evaluation of approaches to language function analysis and sentiment analysis. To this end, each text is classified by language function (personal, commercial, or informational) as well as by sentiment polarity (positive, negative, or neutral).

The corpus consists of two separate collections, which contain the texts about music and smartphones, respectively. The music collection consists of 2,713 promotional texts and reviews from both users and professionals. The smartphone collection contains 2,093 blog posts on smartphones from the Spinn3r corpus.

Files (5.1 MB)
Name: lfa-11-corpus.tar.gz
Size: 5.1 MB
MD5: 02051a782b664938b51c34e193efc343
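
Since an MD5 checksum is listed for lfa-11-corpus.tar.gz, a downloaded copy can be checked against it before unpacking. The following is a minimal sketch in Python, assuming the archive has been saved under its original name in the current directory; the function names `md5_of_file` and `verify_archive` are illustrative and not part of the corpus distribution.

```python
import hashlib
from pathlib import Path

# Checksum as listed in the file entry for lfa-11-corpus.tar.gz.
EXPECTED_MD5 = "02051a782b664938b51c34e193efc343"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the download was saved.
    archive = Path("lfa-11-corpus.tar.gz")
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually points to an incomplete or corrupted download, in which case fetching the archive again is the simplest remedy.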
  • Henning Wachsmuth and Kathrin Bujna (2011). Back to the Roots of Genres: Text Classification by Language Function
