Short‐text feature expansion and classification based on nonnegative matrix factorization
Creators
- 1. Guangdong University of Technology
- 2. University of Amsterdam
Description
In this paper, a non‐negative matrix factorization feature expansion (NMFFE) approach was proposed to overcome the feature‐sparsity issue when expanding the features of short texts. First, we took the internal relationships of short texts and words into account when segmenting words from texts and constructing their relationship matrix. Second, we used the dual regularization non‐negative matrix tri‐factorization (DNMTF) algorithm to obtain the word clustering indicator matrix, from which the feature space was derived by dimensionality reduction. Third, closely related words were selected from the feature space and added to the short text to address the sparsity issue. The experimental results showed that the short‐text classification accuracy of the NMFFE algorithm increased by 25.77%, 10.89%, and 1.79% on three data sets (Web snippets, Twitter sports, and AGnews, respectively) compared with the Word2Vec and Char‐CNN algorithms. This indicated that the NMFFE algorithm was better than the BOW algorithm and the Char‐CNN algorithm in terms of classification accuracy and robustness.
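
The three steps in the description (relationship matrix, tri‐factorization, expansion with related words) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: it uses plain nonnegative matrix tri‐factorization with multiplicative updates and omits the dual regularization terms of DNMTF. The toy corpus, the helper names (`nmtf`, `expand`), the cluster count `k=2`, and the rule of adding every word from a matching cluster are all assumptions made for the example.

```python
import random

def matmul(A, B):
    """Product of two list-of-lists matrices."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def scale(M, num, den, eps=1e-9):
    """Multiplicative update M * num / den, which keeps entries nonnegative."""
    return [[m * n / (d + eps) for m, n, d in zip(mr, nr, dr)]
            for mr, nr, dr in zip(M, num, den)]

def frob_error(X, F, S, G):
    """Squared Frobenius norm of X - F S G^T."""
    R = matmul(matmul(F, S), transpose(G))
    return sum((x - r) ** 2 for xr, rr in zip(X, R) for x, r in zip(xr, rr))

def nmtf(X, k, iters=200, seed=0):
    """Factor X (words x texts) as F S G^T with nonnegative factors.
    Assumption: plain tri-factorization, no graph regularization terms."""
    rng = random.Random(seed)
    def rand(r, c):
        return [[rng.random() + 0.1 for _ in range(c)] for _ in range(r)]
    F, S, G = rand(len(X), k), rand(k, k), rand(len(X[0]), k)
    Xt = transpose(X)
    for _ in range(iters):
        GS = matmul(G, transpose(S))   # texts x k
        F = scale(F, matmul(X, GS), matmul(F, matmul(transpose(GS), GS)))
        FS = matmul(F, S)              # words x k
        G = scale(G, matmul(Xt, FS), matmul(G, matmul(transpose(FS), FS)))
        Ft, Gt = transpose(F), transpose(G)
        S = scale(S, matmul(matmul(Ft, X), G),
                  matmul(matmul(matmul(Ft, F), S), matmul(Gt, G)))
    return F, S, G

def expand(tokens, vocab, F):
    """Add to each text every vocabulary word that shares a cluster
    (largest entry in its row of F) with one of the text's own words."""
    cluster = {w: max(range(len(F[0])), key=lambda c: F[i][c])
               for i, w in enumerate(vocab)}
    out = []
    for t in tokens:
        wanted = {cluster[w] for w in t}
        out.append(sorted(set(t) | {w for w in vocab if cluster[w] in wanted}))
    return out

# Hypothetical toy corpus: two sports snippets and two finance snippets.
texts = ["goal match team", "team win match",
         "stock market price", "market price fall"]
tokens = [t.split() for t in texts]
vocab = sorted({w for t in tokens for w in t})
# Step 1: word-by-text relationship matrix (raw counts here).
X = [[float(t.count(w)) for t in tokens] for w in vocab]
# Step 2: the factor F plays the role of a word clustering indicator matrix.
F, S, G = nmtf(X, k=2)
# Step 3: expand each short text with words from the same cluster.
expanded = expand(tokens, vocab, F)
for original, longer in zip(texts, expanded):
    print(original, "->", " ".join(longer))
```

Reading the cluster of a word off the largest entry in its row of `F` is the simplest possible choice; a closer rendering of the approach would derive a reduced feature space from the indicator matrix and rank candidate words by their closeness in that space.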
Files
Name | Size | md5 |
---|---|---|
2020.jounal.intelligentsystems-proof.pdf | 985.9 kB | a24372cb312d85cc874d32455e059aee |
Additional details
Funding
- ARTICONF – smART socIal media eCOsytstem in a blockchaiN Federated environment (grant 825134), European Commission
- Blue Cloud – Blue-Cloud: Piloting innovative services for Marine Research & the Blue Economy (grant 862409), European Commission
- ENVRI-FAIR – ENVironmental Research Infrastructures building Fair services Accessible for society, Innovation and Research (grant 824068), European Commission