Published September 22, 2020 | Version camera ready
Journal article Open

Short‐text feature expansion and classification based on nonnegative matrix factorization

  • 1. Guangdong University of Technology
  • 2. University of Amsterdam

Description

In this paper, a non‐negative matrix factorization feature expansion (NMFFE) approach was proposed to overcome the feature‐sparsity issue when expanding the features of short texts. First, we took the internal relationships of short texts and words into account when segmenting words from texts and constructing their relationship matrix. Second, we utilized the dual regularization non‐negative matrix tri‐factorization (DNMTF) algorithm to obtain the word clustering indicator matrix, which was used to derive the feature space by dimensionality reduction. Third, closely related words were selected from the feature space and added to the short text to address the sparsity issue. The experimental results showed that the short‐text classification accuracy of our NMFFE algorithm increased by 25.77%, 10.89%, and 1.79% on three data sets (Web snippets, Twitter sports, and AGnews, respectively) compared with the Word2Vec algorithm and the Char‐CNN algorithm. This indicated that the NMFFE algorithm was better than the BOW algorithm and the Char‐CNN algorithm in terms of classification accuracy and algorithm robustness.
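
The three steps described above (build a text-word matrix, factorize it, add closely related words to each text) can be illustrated with a minimal sketch. It is not the paper's DNMTF algorithm: it swaps in plain two-factor NMF with Lee-Seung multiplicative updates and has no dual regularization. The function names (`build_matrix`, `nmf`, `expand`), the whitespace tokenization, the cosine-similarity selection rule, and the parameters `k` and `top_n` are all assumptions made for illustration.

```python
import math
import random


def build_matrix(texts):
    """Return (vocab, X), where X[i][j] counts word j in text i."""
    vocab = sorted({w for t in texts for w in t.split()})
    index = {w: j for j, w in enumerate(vocab)}
    X = [[0.0] * len(vocab) for _ in texts]
    for i, t in enumerate(texts):
        for w in t.split():
            X[i][index[w]] += 1.0
    return vocab, X


def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    return [list(r) for r in zip(*A)]


def nmf(X, k, iters=200, seed=0, eps=1e-9):
    """Plain NMF (X ~ W H) with multiplicative updates; a stand-in for DNMTF."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H


def cosine(u, v):
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / (nu * nv) if nu and nv else 0.0


def expand(texts, k=2, top_n=2):
    """Append to each text the top_n absent words closest to it in latent space."""
    vocab, X = build_matrix(texts)
    _, H = nmf(X, k)
    # Column j of H is the k-dimensional latent vector of word j.
    vecs = {w: [H[a][j] for a in range(k)] for j, w in enumerate(vocab)}
    expanded = []
    for t in texts:
        words = t.split()
        present = set(words)
        # Score each absent word by its best similarity to any word in the text.
        scores = {c: max(cosine(vecs[c], vecs[w]) for w in present)
                  for c in vocab if c not in present}
        extra = sorted(scores, key=scores.get, reverse=True)[:top_n]
        expanded.append(words + extra)
    return expanded


if __name__ == "__main__":
    corpus = ["football match goal", "goal striker football",
              "stock market shares", "shares market investor"]
    for original, longer in zip(corpus, expand(corpus)):
        print(original, "->", " ".join(longer))
```

In this sketch the expanded texts would then be passed to an ordinary classifier. Which words get added depends on the random initialization and on `k`, so the toy corpus only shows the mechanics, not the accuracy figures reported above.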

Files

2020.jounal.intelligentsystems-proof.pdf

md5:a24372cb312d85cc874d32455e059aee
985.9 kB

Additional details

Funding

ARTICONF – smART socIal media eCOsytstem in a blockchaiN Federated environment 825134
European Commission
Blue Cloud – Blue-Cloud: Piloting innovative services for Marine Research & the Blue Economy 862409
European Commission
ENVRI-FAIR – ENVironmental Research Infrastructures building Fair services Accessible for society, Innovation and Research 824068
European Commission