Published August 21, 2020 | Version v2
Conference paper Open

Validation and construction of a lexical dictionary to support sentiment analysis in software project repositories (Supplementary Material)

  • 1. Universidade Federal da Bahia (UFBA)

Description

Sentiment analysis infers the polarity of words that may represent emotions. The accuracy of this classification is important for the reliability of the expected result. For this reason, this work investigates, validates, and builds a lexical dictionary in the context of Software Engineering, based on 560 words, emoticons, and idiomatic expressions from the SentiStrength-SE tool. An experiment with 559 questions, answered by 48 participants from the Computing field, was carried out to validate agreement on the dictionary's lexical terms. Once data collection was complete, the terms were gathered and validated against a Stack Overflow dataset to obtain the accuracy, precision, recall, and F1-score of the new dictionary. The new lexical dictionary achieves 79% accuracy and precision and 78% recall and F1-score, with a narrower polarity range than the original dictionary.

Abstract

Sentiment analysis infers the polarity of words that can represent possible emotions. The accuracy of this classification is important for the reliability of the results. For this reason, this article investigates, validates, and builds a lexical dictionary in the context of Software Engineering, using 560 words, emoticons, and idiomatic expressions from the SentiStrength-SE tool. An online experiment with 559 questions, answered by 48 participants from the Computing field, was performed to validate agreement on the lexical terms of the dictionary. At the end of data collection, the terms were gathered and validated against a Stack Overflow dataset to obtain the accuracy, precision, recall, and F1-score of the new dictionary. The new lexical dictionary achieves 79% accuracy and precision and 78% recall and F1-score, with a smaller polarity interval than the original dictionary.
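As an illustration of the four metrics named above, the following is a minimal sketch of how accuracy, precision, recall, and F1-score can be computed for a polarity classifier. It is not the authors' evaluation script: the function name `classification_metrics`, the three label names, the sample labels, and the choice of macro-averaging across classes are all assumptions made for this example.

```python
from collections import Counter


def classification_metrics(gold, predicted):
    """Return accuracy and macro-averaged precision, recall and F1.

    Hypothetical helper: `gold` holds reference labels, `predicted` holds
    the labels produced by a lexicon-based classifier.
    """
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and of equal length")

    labels = sorted(set(gold) | set(predicted))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # correct prediction for class g
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[g] += 1   # g was expected but missed

    def safe_div(a, b):
        return a / b if b else 0.0

    precisions = [safe_div(tp[l], tp[l] + fp[l]) for l in labels]
    recalls = [safe_div(tp[l], tp[l] + fn[l]) for l in labels]
    f1s = [safe_div(2 * p * r, p + r) for p, r in zip(precisions, recalls)]

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(gold),
        "precision": sum(precisions) / n,  # macro average (assumption)
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Made-up labels, for illustration only (not data from the study).
gold = ["positive", "negative", "neutral", "positive", "negative", "neutral"]
predicted = ["positive", "negative", "neutral", "neutral", "negative", "positive"]
metrics = classification_metrics(gold, predicted)
```

Macro-averaging weights every polarity class equally; a weighted or micro average would give different figures on imbalanced data, and the record does not state which variant was used.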

Files

Apêndice01_Resultados_do_experimento.pdf

Files (12.7 MB)

Checksum  Size
md5:14705cc3ae67a1879378e7be5ade9043  58.2 kB
md5:08a23a50e053189c25f4b7e241d70b0e  55.0 kB
md5:6854772e1fb8a6e2c2020b7455c57869  119.9 kB
md5:f146c5a8efc09fe70e6f186ea78af9da  148.8 kB
md5:56b68e7664cd33432e069c15982ea076  522 Bytes
md5:e05c517f163ce24c0cea816e65e92690  94.8 kB
md5:a4e6715fb5efef573e91839ff83085be  319.7 kB
md5:5e0adf7591edcb6dbfe3722f2b90919c  321.1 kB
md5:7a13c5bb1473f789568ac2b3595b971e  317.6 kB
md5:f70fb00c8f980dc406a4b7ed8cfc9dd3  320.6 kB
md5:1b6415d8f29c5fd0d23030680a2f076f  709.7 kB
md5:e3fff01024ac9eb1935b02fb9bd633c5  709.8 kB
md5:f6d0b56123a61184ccc6468bf3e89ef9  710.4 kB
md5:fdf82f28a2640dc94c01f614d1d9ef8a  708.3 kB
md5:4463d6e324d67fb79fe20e19b2074570  319.9 kB
md5:3bf2ebfed976672068df1acfa9b7756d  320.7 kB
md5:37c3ae61f186d7c84d1c9529f2911ae5  319.0 kB
md5:8cc6ac52719ec51c8cfc6953a43077f7  318.5 kB
md5:53e66368a53982532e97e08a412ad695  707.3 kB
md5:e25859424ebbdf72c397f82b560288b6  712.6 kB
md5:9070f19cc9e33848c8207b7b300f84c9  704.5 kB
md5:dbc9fa54d829ed766e3a753dfb0108c8  713.0 kB
md5:213c1dc2db6215f98386424cde89de27  1.5 kB
md5:56791b7597e3b01ecb7e720e5062767f  1.6 MB
md5:48fcc640077a9290aebfd960e872dce7  1.2 MB
md5:bb0a21786296ab7b052f8e082a548573  1.2 MB
md5:bc7892e0efe00bf0e392aa6a1afc4669  6.3 kB