Online news popularity prediction before publication: effect of readability, emotion, psycholinguistics features
Creators
- 1. University of Calicut
Description
The development of the World Wide Web, with easy access to massive information sources anywhere and anytime, has led more people to rely on online news media rather than print media. This has driven rapid growth of the online news industry and created substantial competitive pressure. In this work, we propose a set of hybrid features for online news popularity prediction before publication. Two categories of features are extracted from news articles: conventional features, comprising metadata, temporal, contextual, and embedding vector features, and enhanced features, comprising readability, emotion, and psycholinguistic features. Apart from analyzing the effectiveness of conventional and enhanced features, we combine them into a set of hybrid features. We curate an Indian news dataset (IND) consisting of news articles from the top-rated Indian news websites for the study and also contribute the dataset for future research. Evaluations are performed on the IND dataset and compared with performance on the benchmark Mashable dataset using various supervised machine learning models. Our results indicate that the proposed hybrid of enhanced and conventional features is highly effective for online news popularity prediction before publication.
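As a rough illustration of combining conventional and enhanced features into one hybrid set, the following minimal sketch may help. It is not the authors' implementation: the function names, the toy emotion lexicon, the two readability proxies (average sentence and word length), and the example values are all assumptions made for illustration. The readability measures, emotion resources, and psycholinguistic features actually used in the work are not specified in this record.

```python
import re

# Hypothetical toy emotion lexicon (an assumption for illustration only).
EMOTION_WORDS = {
    "joy": {"happy", "win", "celebrate"},
    "fear": {"threat", "panic", "crisis"},
}


def enhanced_features(text):
    """Return simple readability and emotion features for an article body."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    n_words = max(len(words), 1)  # guard against empty input
    feats = {
        # Crude readability proxies: longer sentences and words read harder.
        "avg_sentence_len": len(words) / max(len(sentences), 1),
        "avg_word_len": sum(len(w) for w in words) / n_words,
    }
    # Share of words that fall in each emotion category.
    for emotion, lexicon in EMOTION_WORDS.items():
        feats["emo_" + emotion] = sum(w in lexicon for w in words) / n_words
    return feats


def hybrid_features(conventional, text):
    """Merge conventional (e.g. metadata, temporal) features with enhanced ones."""
    combined = dict(conventional)
    combined.update(enhanced_features(text))
    return combined


article = "Markets celebrate a record win. Analysts warn of a looming crisis!"
conventional = {"title_len": 6, "publish_hour": 9}  # illustrative values only
features = hybrid_features(conventional, article)
```

The resulting dictionary could then be turned into a numeric vector and passed to any supervised learning model. Keeping the two feature groups in separate functions makes it straightforward to evaluate conventional, enhanced, and hybrid sets independently.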
Files
Name | Size | Checksum |
---|---|---|
15 21095 1570720425.pdf | 298.7 kB | md5:b4f4ac0015d8f658691a35d0b6531745 |