Journal article Open Access

Short‐text feature expansion and classification based on nonnegative matrix factorization

Zhang, Ling; Jiang, Wenchao; Zhao, Zhiming

In this paper, a non‐negative matrix factorization feature expansion (NMFFE) approach is proposed to overcome the feature‐sparsity issue when expanding the features of short texts. First, the internal relationships between short texts and words are taken into account when segmenting the texts into words and constructing their relationship matrix. Second, the dual regularization non‐negative matrix tri‐factorization (DNMTF) algorithm is used to obtain the word clustering indicator matrix, from which the feature space is derived by dimensionality reduction. Third, closely related words are selected from the feature space and added to the short text to alleviate the sparsity issue. The experimental results show that the short‐text classification accuracy of the NMFFE algorithm increased by 25.77%, 10.89%, and 1.79% on three data sets (Web snippets, Twitter sports, and AGnews, respectively) compared with the Word2Vec algorithm and the Char‐CNN algorithm. This indicates that the NMFFE algorithm is better than the BOW algorithm and the Char‐CNN algorithm in terms of classification accuracy and algorithm robustness.
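
The three steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it swaps DNMTF for plain two-factor NMF with multiplicative updates, and the toy corpus, the function names `nmf` and `expand`, and the parameters `k` and `top_n` are all assumptions made for illustration.

```python
import numpy as np

def nmf(X, k, iters=200, seed=0, eps=1e-9):
    """Plain NMF, X ~ W @ H, via multiplicative updates.
    Assumption: a stand-in for DNMTF, which adds a third factor and
    dual graph regularization that are not reproduced here."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def expand(X, H, vocab, top_n=1):
    """For each text, pick up to top_n absent words that lie closest
    (cosine similarity in the latent space) to the words it contains."""
    E = H.T                                   # one k-dimensional row per word
    E = E / (np.linalg.norm(E, axis=1, keepdims=True) + 1e-9)
    sim = E @ E.T                             # word-to-word similarity
    added = []
    for row in X:
        present = row > 0
        score = sim[:, present].sum(axis=1)   # closeness to the text's own words
        score[present] = -np.inf              # never re-add an existing word
        order = np.argsort(-score)[:top_n]
        added.append([vocab[j] for j in order if np.isfinite(score[j])])
    return added

# Step 1: text-word relationship matrix for a hypothetical toy corpus.
vocab = ["goal", "match", "team", "stock", "market", "price"]
X = np.array([
    [1, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
], dtype=float)

# Step 2: factorize to obtain a low-dimensional word representation.
W, H = nmf(X, k=2)

# Step 3: expand each short text with closely related words.
added = expand(X, H, vocab, top_n=1)
for row, extra in zip(X, added):
    words = [vocab[j] for j in np.flatnonzero(row)]
    print(words, "+", extra)
```

The expanded texts would then be passed to an ordinary classifier. Because the factorization starts from a random initialization, the selected words can vary with `seed` and `k`.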
