Journal article Open Access

Short‐text feature expansion and classification based on nonnegative matrix factorization

Zhang, Ling; Jiang, Wenchao; Zhao, Zhiming

In this paper, a non‐negative matrix factorization feature expansion (NMFFE) approach is proposed to overcome the feature‐sparsity problem by expanding the features of short texts. First, the internal relationships between short texts and words are taken into account when segmenting the texts into words and constructing their relationship matrix. Second, the dual regularization non‐negative matrix tri‐factorization (DNMTF) algorithm is used to obtain the word clustering indicator matrix, from which the feature space is derived by dimensionality reduction. Third, closely related words are selected from the feature space and added to each short text to address the sparsity problem. Experimental results show that the short‐text classification accuracy of the NMFFE algorithm increased by 25.77%, 10.89%, and 1.79% on three data sets (Web snippets, Twitter sports, and AGnews, respectively) compared with the Word2Vec algorithm and the Char‐CNN algorithm. This indicates that the NMFFE algorithm is better than the BOW algorithm and the Char‐CNN algorithm in terms of classification accuracy and robustness.
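The three steps described above can be illustrated with a minimal sketch. It is not the authors' DNMTF implementation. It assumes plain two‐factor NMF with multiplicative updates in place of dual regularization tri‐factorization, whitespace tokenization in place of real word segmentation, and cosine similarity between word factors as the measure of "close relationship". The function names build_term_matrix, nmf, and expand are hypothetical and chosen only for illustration.

```python
import numpy as np


def build_term_matrix(texts):
    """Return (vocab, X), where X[i, j] counts word j in text i."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    index = {w: j for j, w in enumerate(vocab)}
    X = np.zeros((len(texts), len(vocab)))
    for i, t in enumerate(texts):
        for w in t.lower().split():
            X[i, index[w]] += 1.0
    return vocab, X


def nmf(X, k, iters=200, seed=0, eps=1e-9):
    """Approximate X by W @ H with non-negative W, H (multiplicative updates)."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) + eps
    H = rng.random((k, X.shape[1])) + eps
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def expand(texts, k=2, n_extra=2):
    """Append to each text the n_extra absent words closest to its own words."""
    vocab, X = build_term_matrix(texts)
    _, H = nmf(X, k)
    V = H.T  # one k-dimensional latent vector per word
    V = V / (np.linalg.norm(V, axis=1, keepdims=True) + 1e-9)
    sim = V @ V.T  # cosine similarity between words
    expanded = []
    for t in texts:
        words = t.lower().split()
        present = set(words)
        idx = [vocab.index(w) for w in present]
        scores = sim[idx].sum(axis=0)  # total similarity to the text's words
        order = np.argsort(-scores)
        extra = [vocab[j] for j in order if vocab[j] not in present][:n_extra]
        expanded.append(words + extra)
    return expanded


if __name__ == "__main__":
    # Toy corpus, invented for illustration only.
    demo = ["stock market rises", "market shares fall",
            "team wins match", "match ends draw"]
    for original, longer in zip(demo, expand(demo)):
        print(original, "->", " ".join(longer))
```

The expanded token lists could then be passed to any standard text classifier. Which words are added depends on the random initialization and the number of latent factors k, so the output of this sketch should not be read as a reproduction of the reported results.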
