Published December 13, 2019 | Version v1
Conference paper Open

Sentiment Classification for Under-Resourced Language Using Word2Vec Neural Network: Amharic Language Social Media Text

  • 1. University of Technology, Taiwan

Description

Sentiment classification has become a popular task for social network texts, which express opinions on different issues that can be analyzed to produce useful knowledge. However, many computational linguistic resources are available only for English. In recent years, with the emergence of social media platforms, opinion-rich resources have become increasingly abundant for under-resourced languages, creating a need to perform sentiment analysis on them. Most existing research focuses on extracting effective lexical and syntactic features, while limited work has been done on semantic features, which can contribute more to both under-resourced and well-resourced languages. In this paper, we propose a sentiment classification approach based on Word2Vec for Amharic text in the political domain. Word2Vec builds neural network models that learn vector representations of words in order to capture deep semantic relationships. First, we cluster similar features together and apply N-gram language modeling to check sentiment-bearing Co-occurring Terms (COT). Word2Vec and TF-IDF are used to learn word representations as candidate feature vectors. Second, Gradient-Boosting Tree (GBT) and Random Forest machine learning classifiers are trained and tested on the Apache Spark platform. In our experiments, we use the Amharic language of Ethiopia and apply standard natural language pre-processing techniques to crawled Facebook datasets in order to categorize posts into positive and negative opinions. Experimental results show that features extracted with Word2Vec perform better with the GBT classifier, achieving an average accuracy of 82.29%. Our proposed approach can therefore successfully discriminate between posts and comments expressing positive and negative opinions.
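The two feature steps named in the description, N-gram checks for co-occurring terms and TF-IDF weighting, can be illustrated with a minimal sketch. This is not the authors' implementation, which runs on Apache Spark with Word2Vec: the function names, the `min_count` threshold, the plain `tf * log(N / df)` weighting, and the English placeholder tokens (standing in for pre-processed Amharic tokens) are all assumptions made for illustration.

```python
import math
from collections import Counter


def bigrams(tokens):
    """Return adjacent token pairs (N-grams with N = 2)."""
    return list(zip(tokens, tokens[1:]))


def cooccurring_terms(docs, min_count=2):
    """Count bigrams over all documents and keep those seen at least min_count times.

    min_count is a hypothetical threshold, not a value taken from the paper.
    """
    counts = Counter(bg for doc in docs for bg in bigrams(doc))
    return {bg: c for bg, c in counts.items() if c >= min_count}


def tf_idf(docs):
    """Compute per-document TF-IDF weights as (count / doc length) * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return weights


# Placeholder tokens (assumption): stand-ins for tokenized posts and comments.
docs = [
    ["good", "policy", "good", "leader"],
    ["bad", "policy", "bad", "leader"],
    ["good", "policy", "strong", "leader"],
]

cot = cooccurring_terms(docs)
weights = tf_idf(docs)
```

In this toy corpus only the pair `("good", "policy")` occurs in more than one document, so it is the single co-occurring term kept, while terms that appear in every document, such as `policy`, receive a TF-IDF weight of zero and contribute nothing to discrimination between classes.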

Files

Sentiment Classification for Under.pdf (88.6 kB)
md5:18902f0941db301223e8d7bc2ab2acfc