Published December 30, 2023 | Version v1
Journal article Open

Chunker Based Sentiment Analysis and Tense Classification for Nepali Text

Description

This article presents sentiment analysis (SA) and tense classification for the Nepali language, using a Skip-gram model for word-to-vector encoding. The SA experiment for positive-negative classification is carried out in two ways. In the first experiment, a vector representation of each sentence is generated with the Skip-gram model and classified with a Multi-Layer Perceptron (MLP), giving an F1 score of 0.6486 and an overall accuracy of 68%. In the second experiment, verb chunks are extracted with a Nepali parser and the same procedure is applied to the verb chunks, giving an F1 score of 0.6779 and an overall accuracy of 85%. Chunker-based sentiment analysis therefore performed better than sentence-based sentiment analysis. The paper also proposes using a Skip-gram model to identify the tenses of Nepali sentences and verbs. In the third experiment, the same Skip-gram and MLP pipeline is used for tense classification, and verb chunks yield a very low overall accuracy of 53%. In the fourth experiment, tense classification on full sentences improves the overall accuracy to 89%. Past tenses were identified and classified more accurately than other tenses. Sentence-based tense classification therefore performed better than verb-chunk-based tense classification.
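The pipeline described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. It shows how a skip-gram model forms (center, context) training pairs, one common way of turning word vectors into a sentence vector (mean pooling, which is an assumption here since the description does not state the pooling method), and how accuracy and F1 are computed for a binary task. All tokens, embedding values, labels, and the function names skipgram_pairs, sentence_vector, and accuracy_and_f1 are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return the (center, context) pairs a skip-gram model is trained on."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


def sentence_vector(tokens: List[str],
                    embeddings: Dict[str, List[float]],
                    dim: int) -> List[float]:
    """Mean-pool the vectors of known tokens (assumed pooling method).

    Unknown tokens are skipped; a zero vector is returned if none are known.
    """
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(v[k] for v in known) / len(known) for k in range(dim)]


def accuracy_and_f1(y_true: Sequence[int],
                    y_pred: Sequence[int],
                    positive: int = 1) -> Tuple[float, float]:
    """Compute overall accuracy and the F1 score of the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


if __name__ == "__main__":
    # Hypothetical placeholder tokens standing in for a tokenized sentence.
    print(skipgram_pairs(["w1", "w2", "w3"], window=1))

    # Hypothetical 2-dimensional embeddings; "w9" is out of vocabulary.
    toy_embeddings = {"w1": [1.0, 0.0], "w2": [3.0, 2.0]}
    print(sentence_vector(["w1", "w2", "w9"], toy_embeddings, dim=2))

    # Hypothetical labels: accuracy and F1 can differ on the same predictions.
    y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    print(accuracy_and_f1(y_true, y_pred))
```

In practice the sentence (or verb-chunk) vectors would be passed to an MLP classifier. The metric function is included only to show that accuracy and F1 measure different things, which is why the two figures reported for each experiment need not move together.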

Files

12623ijnlc01.pdf (1.3 MB)
md5:0ce5a98a0475a2f36dea1a7c0217f772

Additional details

Dates

Copyrighted
2023
