Dataset Open Access

Experimental datasets for sentiment analysis and emotion mining - Emotion Mining Toolkit (EMTk)

Fabio Calefato; Filippo Lanubile; Nicole Novielli

Description

Datasets for sentiment analysis and emotion mining, distributed with the Emotion Mining Toolkit (EMTk) Docker container (see https://collab-uniba.github.io/EMTk for details):

  • Stack Overflow - Two gold standards of 4,000+ posts, manually annotated for emotions and polarity.
  • Jira - A gold standard of ~4,000 issues, manually annotated for emotions.

Citation

Please see the references below for the papers to cite. Do not cite this Zenodo upload directly.

Files (1.8 MB)
Name                                   Size
collab-uniba/EMTK_datasets-v1.0.zip    1.8 MB
md5:51b7fb54a317e454b8854b7b944ec4b0
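
The MD5 checksum listed with the archive can be used to confirm that a download arrived intact. The following is a minimal sketch in Python, not part of EMTk itself. It assumes the archive was saved locally as EMTK_datasets-v1.0.zip in the current directory; that local path and the helper names md5_of_file and verify_archive are illustrative assumptions.

```python
import hashlib
from pathlib import Path

# Checksum as published on the dataset page.
EXPECTED_MD5 = "51b7fb54a317e454b8854b7b944ec4b0"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical local file name; adjust to wherever the archive was saved.
    archive = Path("EMTK_datasets-v1.0.zip")
    if not archive.exists():
        print(f"{archive} not found; download it first.")
    elif verify_archive(archive):
        print("Checksum matches the published value.")
    else:
        print("Checksum mismatch; the download may be corrupted.")
```

A mismatch usually means a truncated or corrupted download, so fetching the archive again is the first thing to try. MD5 is suitable here for detecting accidental corruption only, not deliberate tampering.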

References

  • F. Calefato, F. Lanubile, and N. Novielli (2017) "Sentiment Polarity Detection for Software Development." Empirical Software Engineering Journal, DOI: 10.1007/s10664-017-9546-9.

  • F. Calefato, F. Lanubile, and N. Novielli (2017) "EmoTxt: A Toolkit for Emotion Recognition from Text." In Proc. 7th Affective Computing and Intelligent Interaction (ACII'17), San Antonio, TX, USA, Oct. 23-26, 2017.

  • M. Ortu, A. Murgia, G. Destefanis, P. Tourani, R. Tonelli, M. Marchesi, and B. Adams (2016) "The Emotional Side of Software Developers in JIRA." In Proc. of the 13th Int'l Conf. on Mining Software Repositories (MSR '16). ACM, New York, NY, USA, 480-483.

  • N. Novielli, F. Calefato, and F. Lanubile (2018) "A Gold Standard for Emotions Annotation in Stack Overflow." In Proc. of the 15th International Conference on Mining Software Repositories (MSR 2018), Gothenburg, Sweden, May 28-29, 2018.
