Deep Learning-Driven Comparative Study of Word Embedding Techniques: Word2Vec, GloVe, and FastText in Health Condition Reviews
Description
Abstract
Health products, including medications, play a crucial role in public health. Reviews written by people who have experienced an illness give the community valuable insight when selecting appropriate treatments. Such reviews are typically analyzed and classified with Natural Language Processing (NLP), in which text mining is a pivotal processing step.
This study integrates Word Embedding techniques with a Long Short-Term Memory (LSTM) network to classify health-related reviews through sentiment analysis. Three Word Embedding techniques, Word2Vec, GloVe, and FastText, are used to capture the structure and context of words. The dataset consisted of 215,063 reviews covering 13 different health conditions, split into 161,297 training samples and 53,766 test samples. Training and validation were conducted to assess each embedding in combination with LSTM. Training accuracy was 95.09% for Word2Vec, 94.88% for GloVe, and 95.07% for FastText; validation accuracy was 94.47%, 94.17%, and 95.44%, respectively. Test accuracy was lower but preserved the validation ranking: 85.20% for Word2Vec, 84.19% for GloVe, and 86.22% for FastText. FastText therefore achieved the highest validation and test accuracy, while Word2Vec was marginally ahead on training accuracy. The results indicate that combining Word Embedding techniques with LSTM is effective for classifying health-related reviews, with FastText performing best.
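The figures reported above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the study: it only re-tabulates the numbers quoted in the abstract, and the names `ACCURACY`, `train_fraction`, `best_by`, and `validation_test_gap` are hypothetical helpers introduced here for illustration.

```python
# Figures copied from the abstract; nothing here is re-measured.
TRAIN_SAMPLES = 161_297
TEST_SAMPLES = 53_766
TOTAL_REVIEWS = 215_063

# Accuracy (%) per embedding, in the order (training, validation, test).
ACCURACY = {
    "Word2Vec": (95.09, 94.47, 85.20),
    "GloVe": (94.88, 94.17, 84.19),
    "FastText": (95.07, 95.44, 86.22),
}


def train_fraction(train: int, test: int) -> float:
    """Share of all samples that went into the training split."""
    return train / (train + test)


def best_by(split_index: int) -> str:
    """Embedding with the highest accuracy on one split (0=train, 1=val, 2=test)."""
    return max(ACCURACY, key=lambda name: ACCURACY[name][split_index])


def validation_test_gap(name: str) -> float:
    """Percentage-point drop from validation accuracy to test accuracy."""
    _, validation, test = ACCURACY[name]
    return round(validation - test, 2)


if __name__ == "__main__":
    print("splits sum to total:", TRAIN_SAMPLES + TEST_SAMPLES == TOTAL_REVIEWS)
    print("training fraction:", round(train_fraction(TRAIN_SAMPLES, TEST_SAMPLES), 4))
    for label, index in (("training", 0), ("validation", 1), ("test", 2)):
        print(f"best on {label}:", best_by(index))
    for name in ACCURACY:
        print(f"{name} validation-to-test gap:", validation_test_gap(name))
```

Run as a script, this confirms that the two splits add up to the stated total, that the split is close to 75/25, and that each model loses roughly nine to ten percentage points between validation and test.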
Keywords: Classification; Drug Review; Deep Learning; LSTM; Word Embedding
Files
| Name | Size | MD5 |
|---|---|---|
| ISRGJMS1822025.pdf | 1.1 MB | 12bf2cf563c7db1320e4d6797e81423c |