Journal article Open Access
Zhang, Ling;
Jiang, Wenchao;
Zhao, Zhiming
In this paper, a non‐negative matrix factorization feature
expansion (NMFFE) approach was proposed to
overcome the feature‐sparsity issue when expanding
features of short‐text. First, we took the internal relationships
of short texts and words into account when
segmenting words from texts and constructing their
relationship matrix. Second, we utilized the Dual
regularization non‐negative matrix tri‐factorization
(DNMTF) algorithm to obtain the words clustering
indicator matrix, which was used to get the feature
space by dimensionality reduction methods. Thirdly,
words with close relationship were selected out from
the feature space and added into the short‐text to solve
the sparsity issue. The experimental results showed
that the accuracy of short text classification of our
NMFFE algorithm increased 25.77%, 10.89%, and 1.79%
on three data sets: Web snippets, Twitter sports, and
AGnews, respectively compared with the Word2Vec
algorithm and Char‐CNN algorithm. It indicated that
the NMFFE algorithm was better than the BOW algorithm
and the Char‐CNN algorithm in terms of classification
accuracy and algorithm robustness.
Name | Size | |
---|---|---|
2020.jounal.intelligentsystems-proof.pdf
md5:a24372cb312d85cc874d32455e059aee |
985.9 kB | Download |
Views | 369 |
Downloads | 171 |
Data volume | 168.6 MB |
Unique views | 362 |
Unique downloads | 170 |