Published August 30, 2020 | Version v1
Journal article Open

Comprehensive Analysis of Variants of TF-IDF Applied on LDA and LSA Topic Modelling

  • 1. M.Sc in Applied Mathematics, M S Ramaiah University of Applied Sciences, Bangalore, India.
  • 2. Department of Computer Science and Engineering, M S Ramaiah University of Applied Sciences, Bangalore, India.
  • 1. Publisher


The present generation is virtually connected through many social media platforms. On social media, people express their opinions on posts, news, or products through comments or emoticons that indicate their level of satisfaction. Market standards improve on this basis: online marketplaces such as Amazon, Flipkart, and Myntra use these reviews to improve their businesses. Analyzing large-scale opinions or feedback from individuals helps to identify hidden insights and to work towards customer satisfaction. This paper proposes applying different weighting schemes of TF-IDF (Term Frequency-Inverse Document Frequency) to the topic modeling methods LSA and LDA in order to cluster the topics of discussion in large-scale reviews from the online marketplace Amazon. The main focus of the paper is to observe how topic modeling changes under different TF-IDF weighting schemes. In this work, the topic models LDA (Latent Dirichlet Allocation) and LSA (Latent Semantic Analysis) are applied to various TF-IDF weighting schemes, and it is observed that changing the weights leads to variation in the term frequencies of different topics with respect to their documents. The results also show that variation in term weights changes the topic modeling output. Visualizations of the topic modeling clusters under the different TF-IDF weighting schemes are presented.
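To make the idea of a "TF-IDF weighting scheme" concrete, the following is a minimal sketch of how the same review corpus can yield different term weights under different schemes. The abstract does not list which variants the paper uses, so the scheme names here (`raw`, `log`, `binary` for term frequency; `standard`, `smooth` for inverse document frequency), the function `tfidf`, and the sample reviews are all assumptions for illustration only. The resulting matrix is the kind of input that would then be passed to LSA or LDA; that step is not shown.

```python
import math
from collections import Counter


def tfidf(docs, tf_scheme="raw", idf_scheme="standard"):
    """Return (vocabulary, matrix) with one row of term weights per document.

    Hypothetical helper: the scheme names are illustrative assumptions,
    not the variants evaluated in the paper.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    def tf_weight(count):
        if count == 0:
            return 0.0
        if tf_scheme == "raw":      # plain term count
            return float(count)
        if tf_scheme == "log":      # sublinear scaling: 1 + ln(count)
            return 1.0 + math.log(count)
        if tf_scheme == "binary":   # presence or absence only
            return 1.0
        raise ValueError(f"unknown tf_scheme: {tf_scheme}")

    def idf_weight(term):
        if idf_scheme == "standard":  # ln(N / df)
            return math.log(n_docs / df[term])
        if idf_scheme == "smooth":    # ln((1 + N) / (1 + df)) + 1
            return math.log((1 + n_docs) / (1 + df[term])) + 1.0
        raise ValueError(f"unknown idf_scheme: {idf_scheme}")

    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([tf_weight(counts[t]) * idf_weight(t) for t in vocab])
    return vocab, matrix


# Made-up sample reviews, for illustration only.
reviews = ["good good phone", "bad phone", "good battery"]
for tf_scheme in ("raw", "log", "binary"):
    vocab, weights = tfidf(reviews, tf_scheme=tf_scheme)
    print(tf_scheme, dict(zip(vocab, weights[0])))
```

In this sketch the repeated word "good" in the first review receives a different weight under each term-frequency scheme, which illustrates why a topic model fitted on the weighted matrix can produce different topics depending on the scheme.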




Additional details

Related works

Is cited by
Journal article: 2249-8958 (ISSN)


Retrieval Number