Quantitative Analysis of Sentiment Expression Across Large Language Models: A Comparative Study Using Plutchik's Wheel of Emotions
Authors/Creators
- Butler, Raleigh (Project manager)
- Ward, Dylan (Project member)
- Jenkins, Dana (Project member)
- Lantrip, A.R. (Project member)
- Armstrong, Erin (Other)
- Butler, Rory (Other)
- Driza, Paige (Other)
- Fields, Jackson (Other)
- Levario, Ricardo (Other)
- Miller, Kylee (Other)
- Plessala, Bennett (Other)
- Sigman, Nathaniel (Other)
- Slater, Leah (Other)
- Vassallo, Emily (Other)
- Vivekanandan, Avinash (Other)
- Yildirim, Lisa (Other)
Description
Recent advances in Large Language Models (LLMs) have dramatically transformed the landscape of natural language processing, yet our understanding of how these models express and manipulate emotional content remains limited. This study presents a comprehensive analysis of sentiment expression across multiple prominent LLMs, including Llama 8B, Gemini 1.5 Flash, ChatGPT 4, and Claude 3.5 Sonnet. Using Plutchik's Wheel of Emotions as a theoretical framework, we evaluate how different LLMs express and combine emotional states through generated text. Our analysis employs both LIWC (Linguistic Inquiry and Word Count) and SALLEE (Syntax-Aware LexicaL Emotion Engine) to quantify emotional expression across 50 text generations per sentiment per model. Results reveal distinctive patterns in how different LLMs handle emotional intensity and emotional combinations, with significant variations in consistency and accuracy across models. These findings have important implications for both practical applications of LLMs and theoretical understanding of artificial emotional expression.
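The design summarized above (a set of target sentiments drawn from Plutchik's wheel, crossed with the models and 50 generations each) can be sketched as a small data structure. This is only an illustrative sketch: the wheel layout follows Plutchik (1980), but the exact list of sentiments the study prompted for is not given in this description, so passing the eight primary emotions to the hypothetical helper `generation_plan` is an assumption, as are all function and variable names.

```python
from itertools import product

# Plutchik's eight primary emotions, listed in wheel order.
PRIMARY = ["joy", "trust", "fear", "surprise",
           "sadness", "disgust", "anger", "anticipation"]

# Milder and more intense variants of each primary emotion.
INTENSITY = {
    "joy": ("serenity", "ecstasy"),
    "trust": ("acceptance", "admiration"),
    "fear": ("apprehension", "terror"),
    "surprise": ("distraction", "amazement"),
    "sadness": ("pensiveness", "grief"),
    "disgust": ("boredom", "loathing"),
    "anger": ("annoyance", "rage"),
    "anticipation": ("interest", "vigilance"),
}

# Primary dyads: the blend of each emotion with its clockwise neighbor.
DYAD_NAMES = ["love", "submission", "awe", "disapproval",
              "remorse", "contempt", "aggressiveness", "optimism"]

# Models and sample size as named in the description.
MODELS = ["Llama 8B", "Gemini 1.5 Flash", "ChatGPT 4", "Claude 3.5 Sonnet"]
GENERATIONS_PER_CELL = 50


def primary_dyads():
    """Map each pair of adjacent primary emotions to its dyad name."""
    n = len(PRIMARY)
    return {(PRIMARY[i], PRIMARY[(i + 1) % n]): DYAD_NAMES[i] for i in range(n)}


def generation_plan(sentiments):
    """Enumerate (model, sentiment, sample index) for every planned generation.

    Hypothetical helper: the sentiment list is supplied by the caller because
    the study's actual list is not stated in this description.
    """
    return list(product(MODELS, sentiments, range(GENERATIONS_PER_CELL)))


if __name__ == "__main__":
    plan = generation_plan(PRIMARY)
    print(len(plan), "generations planned")
    print(primary_dyads()[("joy", "trust")])
```

Under these assumptions, each (model, sentiment) cell holds 50 texts, which would then be scored with LIWC and SALLEE; the scoring step is omitted here because it depends on the Receptiviti service.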
Files
Quantitative Analysis of Sentiment Expression Across Large Language Models_ A Comparative Study Using Plutchik's Wheel of Emotions-Final-Ready.pdf (2.0 MB)
md5:29fb4b71b47b14d8766cc66053084254
Additional details
References
- Amazon. (n.d.). Sentiment. Retrieved from https://docs.aws.amazon.com/comprehend/latest/dg/how-sentiment.html
- Antoun, W. et al. (2024). From Text to Source: Results in Detecting Large Language Model-Generated Content. Retrieved from https://arxiv.org/html/2309.13322v2
- Antoun, W. et al. (2023). Towards a Robust Detection of Language Model Generated Text: Is ChatGPT that Easy to Detect? Retrieved from https://arxiv.org/abs/2306.05871
- Azwarni, N.S. et al. (2024). Evaluating TextBlob, Lexicon, Support Vector Machine, Naive Bayes, and ChatGPT Approaches for Sentiment Analysis of NASDAQ Listed Companies. Retrieved from http://www.jatit.org/volumes/Vol102No13/20Vol102No13.pdf
- Baccianella, S. et al. (2010). SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining. Retrieved from https://aclanthology.org/L10-1531/
- Google. (n.d.). Natural Language API Basics. Retrieved from https://cloud.google.com/natural-language/docs/basics
- He, Z., Guo, S., Rao, A., & Lerman, K. (2024). Whose emotions and moral sentiments do language models reflect? Retrieved from https://arxiv.org/abs/2402.11114
- Hoang et al. (2019). Aspect-Based Sentiment Analysis using BERT. Retrieved from https://aclanthology.org/W19-6120/
- Huang, D.-M., Van Rijn, P., Sucholutsky, I., Marjieh, R., & Jacoby, N. (2024). Characterizing similarities and divergences in conversational tones in humans and LLMs by sampling with people. Retrieved from https://aclanthology.org/2024.acl-long.565/
- Hutto, C. et al. (2014). VADER: A Parsimonious Rule-Based Model for Sentiment Analysis of Social Media Text. Retrieved from https://ojs.aaai.org/index.php/ICWSM/article/view/14550
- IBM. (n.d.). Class ToneAnalyzerV3. Retrieved from https://watson-developer-cloud.github.io/node-sdk/master/classes/toneanalyzerv3.html
- Kumar, A. et al. (2023). Causal Effect Regularization: Automated Detection and Removal of Spurious Correlations. Retrieved from https://arxiv.org/pdf/2306.11072
- Kumar, A. et al. (2023). Causal Inference Using LLM-Guided Discovery. Retrieved from https://arxiv.org/pdf/2310.15117
- Main, Paul. Structural Learning. (n.d.). The emotion wheel - A tool for developing emotional literacy. Retrieved from https://www.structural-learning.com/post/emotion-wheel
- Nguyen-Son, H. Q. et al. (2024). SimLLM: Detecting Sentences Generated by Large Language Models Using Similarity between the generation and its Re-generation. Retrieved from https://aclanthology.org/2024.emnlp-main.1246/
- Pang, B., Lee, L., & Vaithyanathan, S. (2002). Thumbs up? Sentiment classification using machine learning techniques. Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing (EMNLP), 10, 79–86. Retrieved from https://aclanthology.org/W02-1011/
- Pennebaker, J. W. (2011). The secret life of pronouns: What our words say about us. Bloomsbury Press.
- Plutchik, R. (1980). Emotion: A psychoevolutionary synthesis. Harper & Row.
- Receptiviti. (n.d.). Linguistic Inquiry and Word Count (LIWC). Retrieved from https://www.receptiviti.com/liwc
- Receptiviti. (n.d.). Syntax-Aware LexicaL Emotion Engine (SALLEE). Retrieved from https://docs.receptiviti.com/frameworks/emotions
- Ribeiro, F. N., Araujo, M., Gonçalves, P., Gonçalves, M. A., & Benevenuto, F. (2015). A benchmark comparison of state-of-the-practice sentiment analysis methods. Retrieved from https://www.researchgate.net/publication/286302059_A_Benchmark_Comparison_of_State-of-the-Practice_Sentiment_Analysis_Methods
- Rozado, D., Hughes, R., & Halberstadt, J. (2022). Longitudinal analysis of sentiment and emotion in news media headlines using automated labeling with transformer language models. Retrieved from https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0276367
- Sharma, N. A., et al. (2024). A review of sentiment analysis: Tasks, applications, and deep learning techniques. Retrieved from https://link.springer.com/article/10.1007/s41060-024-00594-x
- Stanford Natural Language Processing Group. (n.d.). CoreNLP. Retrieved from https://stanfordnlp.github.io/CoreNLP/
- Tai, R. H., Bentley, L. R., Xia, X., Sitt, J. M., Fankhauser, S. C., Chicas-Mosier, A. M., & Monteith, B. G. (2024). An examination of the use of large language models to aid analysis of textual data. Retrieved from https://www.researchgate.net/publication/378178132_An_Examination_of_the_Use_of_Large_Language_Models_to_Aid_Analysis_of_Textual_Data
- WhyLabs. (2024). 7 Ways to Evaluate and Monitor LLMs. Retrieved from https://whylabs.ai/blog/posts/7-ways-to-evaluate-and-monitor-llms
- Yang, H. et al. (2024). Large Language Models Meet Text-Centric Multimodal Sentiment Analysis: A Survey. Retrieved from https://arxiv.org/html/2406.08068v2
- Zhang, W. et al. (2024). Sentiment Analysis in the Era of Large Language Models: A Reality Check. Retrieved from https://aclanthology.org/2024.findings-naacl.246.pdf