Sentiment Analysis in Tickets for IT Support
1. Does the paper propose a new opinion mining approach?
Yes
2. Which opinion mining techniques are used (list all of them, clearly stating their name/reference)?
The proposed dictionary-based approach starts with the creation of a domain dictionary. This customized dictionary is built by automatically expanding and pruning a set of seed words, using a thesaurus and a sentiment lexicon. The authors chose the seeds from a list of salutations/closing expressions and from a set of manually inspected sentiment words, drawn from the most frequent words in their Gold Standard of IT tickets. To support the dictionary creation, they leveraged two popular general-purpose resources:
1) SentiWordNet sentiment lexicon: https://github.com/aesuli/sentiwordnet - Papers: https://github.com/aesuli/SentiWordNet/tree/master/papers
2) WordNet thesaurus for English: https://wordnet.princeton.edu/ - Papers: George A. Miller (1995). WordNet: A Lexical Database for English. Communications of the ACM Vol. 38, No. 11: 39-41; Christiane Fellbaum (1998, ed.) WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
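The expansion-and-pruning idea can be illustrated with a minimal sketch. It is not the authors' algorithm: the toy thesaurus and lexicon below stand in for WordNet and SentiWordNet, and every entry, the function name build_domain_dictionary and the pruning threshold min_abs_score are assumptions made for illustration.

```python
# Toy stand-ins for WordNet (synonyms) and SentiWordNet (polarity scores).
# All entries and the pruning threshold are illustrative assumptions.
TOY_THESAURUS = {
    "thanks": ["thank", "gratitude"],
    "urgent": ["pressing", "immediate"],
    "regards": ["greetings"],
}
TOY_LEXICON = {
    "thanks": 0.6, "thank": 0.6, "gratitude": 0.7,
    "urgent": -0.4, "pressing": -0.3, "immediate": 0.0,
    "regards": 0.3, "greetings": 0.2,
}


def build_domain_dictionary(seeds, thesaurus, lexicon, min_abs_score=0.1):
    """Expand seed words with synonyms, then prune weakly polar candidates."""
    # Expansion: add the thesaurus synonyms of every seed word.
    candidates = set(seeds)
    for seed in seeds:
        candidates.update(thesaurus.get(seed, []))
    # Pruning: keep only candidates whose lexicon score is clearly non-neutral.
    return {
        word: lexicon[word]
        for word in sorted(candidates)
        if abs(lexicon.get(word, 0.0)) >= min_abs_score
    }


domain_dictionary = build_domain_dictionary(
    ["thanks", "urgent", "regards"], TOY_THESAURUS, TOY_LEXICON
)
```

In this toy run the synonym "immediate" is added during expansion and dropped again during pruning, because its lexicon score is neutral.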
3. Which opinion mining approaches in the paper are publicly available? Write down their name and links. If no approach is publicly available, leave it blank or None.
None
4. What is the main goal of the whole study?
The paper aims to assess the sentiment contained in tickets for IT support. To this end, the authors propose three methods for calculating the polarity scores of words and expressions contained in tickets: dictionary-based, structure-based (template) and hybrid.
5. What do the researchers want to achieve by applying the technique(s) (e.g., calculate the sentiment polarity of app reviews)?
IT tickets include descriptions of errors, incidents, requests for support, etc. Thus, the main challenge is to automatically distinguish factual information, which is intrinsically negative (e.g. an error description), from the actual sentiment/opinion embedded in the description.
6. Which dataset(s) is the technique applied on?
34,895 tickets from five organizations, from which the authors randomly selected 2,333 tickets to compose a Gold Standard.
7. Is/Are the dataset(s) publicly available online? If yes, please indicate their name and links.
No
8. Is the application context (dataset or application domain) different from that for which the technique was originally designed?
No
9. Is the performance (precision, recall, run-time, etc.) of the technique verified? If yes, how did they verify it and what are the results?
The best results show an average precision and recall of 82.83% and 88.42%, respectively, outperforming the sentiment analysis solutions used for comparison.
10. Does the paper replicate the results of previous work? If yes, leave a summary of the findings (confirm/partially confirms/contradicts).
No
11. What success metrics are used?
Precision and recall
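For reference, the two metrics can be computed per sentiment class and then averaged, as in the sketch below. The labels and predictions are invented, and averaging across classes (macro-averaging) is an assumption; this summary does not state how the paper averages its scores.

```python
def precision_recall(gold, predicted, label):
    """Precision and recall of one class, from paired gold/predicted labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def macro_average(gold, predicted):
    """Unweighted mean of per-class precision and recall (an assumption)."""
    labels = sorted(set(gold))
    scores = [precision_recall(gold, predicted, label) for label in labels]
    mean_precision = sum(p for p, _ in scores) / len(scores)
    mean_recall = sum(r for _, r in scores) / len(scores)
    return mean_precision, mean_recall


# Invented example: six tickets labelled positive, negative or neutral.
gold = ["pos", "neg", "neu", "neg", "pos", "neu"]
predicted = ["pos", "neg", "neg", "neg", "neu", "neu"]
avg_precision, avg_recall = macro_average(gold, predicted)
```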
12. Write down any other comments/notes here.
More details on the three methods proposed:
1) Dictionary Method: for each word in a ticket, the method checks whether it exists in the Domain Dictionary, based on the word's lemma and POS. If it is found, its polarity score is computed using SentiWordNet; otherwise, it is assigned a neutral polarity. Detailed information on the algorithm can be found in the paper.
2) Template Method: this method exploits the structure of the document. Tokens in the same position in the ticket receive a common score, referred to as “category score”. The combination of polarity and position results in 6 categories: Positive Greeting, Negative Greeting, Positive Report, Negative Report, Positive Closure and Negative Closure. Greeting tokens appear in the initial segment of the ticket, closure tokens in the ending segment, and report tokens in between. Details on the algorithm can be found in the paper.
3) Hybrid Method: a combination of the previous two.
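The lookup and the position-based categories described above can be sketched as follows. This is an illustration only, not the paper's algorithm: the dictionary entries, the fixed segment lengths (greeting_len, closure_len) and the use of the dictionary score's sign to pick the polarity of a category are all assumptions, and the function names are hypothetical.

```python
# Hypothetical domain dictionary keyed by (lemma, POS tag); scores are made up.
DOMAIN_DICTIONARY = {
    ("thank", "VERB"): 0.6,
    ("urgent", "ADJ"): -0.4,
    ("regard", "NOUN"): 0.3,
}


def dictionary_score(lemma, pos, dictionary=DOMAIN_DICTIONARY):
    """Dictionary lookup: known entries get their score, others are neutral."""
    return dictionary.get((lemma, pos), 0.0)


def segment_of(index, n_tokens, greeting_len=2, closure_len=2):
    """Map a token index to a ticket segment (fixed lengths are an assumption)."""
    if index < greeting_len:
        return "Greeting"
    if index >= n_tokens - closure_len:
        return "Closure"
    return "Report"


def category_of(index, n_tokens, score):
    """Combine polarity and position into one of the six category names."""
    if score == 0.0:
        return None  # neutral tokens fall outside the six categories
    polarity = "Positive" if score > 0 else "Negative"
    return f"{polarity} {segment_of(index, n_tokens)}"


# Invented ticket, already lemmatized and POS-tagged.
ticket = [("hello", "INTJ"), ("team", "NOUN"), ("printer", "NOUN"),
          ("urgent", "ADJ"), ("fail", "VERB"), ("thank", "VERB"),
          ("regard", "NOUN")]
categories = [
    category_of(i, len(ticket), dictionary_score(lemma, pos))
    for i, (lemma, pos) in enumerate(ticket)
]
```

Here "urgent" falls in the middle of the ticket and is labelled Negative Report, while "thank" and "regard" fall in the last two positions and are labelled Positive Closure.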