Journal article Open Access

Short‐text feature expansion and classification based on nonnegative matrix factorization

Zhang, Ling; Jiang, Wenchao; Zhao, Zhiming

Citation Style Language JSON Export

  "DOI": "10.1002/int.22290", 
  "container_title": "Int Journal Intelligent Systems", 
  "title": "Short\u2010text feature expansion and classification based on nonnegative matrix factorization", 
  "issued": {
    "date-parts": [
  "abstract": "<p>In this paper, a non\u2010negative matrix factorization feature</p>\n\n<p>expansion (NMFFE) approach was proposed to</p>\n\n<p>overcome the feature\u2010sparsity issue when expanding</p>\n\n<p>features of short\u2010text. First, we took the internal relationships</p>\n\n<p>of short texts and words into account when</p>\n\n<p>segmenting words from texts and constructing their</p>\n\n<p>relationship matrix. Second, we utilized the Dual</p>\n\n<p>regularization non\u2010negative matrix tri\u2010factorization</p>\n\n<p>(DNMTF) algorithm to obtain the words clustering</p>\n\n<p>indicator matrix, which was used to get the feature</p>\n\n<p>space by dimensionality reduction methods. Thirdly,</p>\n\n<p>words with close relationship were selected out from</p>\n\n<p>the feature space and added into the short\u2010text to solve</p>\n\n<p>the sparsity issue. The experimental results showed</p>\n\n<p>that the accuracy of short text classification of our</p>\n\n<p>NMFFE algorithm increased 25.77%, 10.89%, and 1.79%</p>\n\n<p>on three data sets: Web snippets, Twitter sports, and</p>\n\n<p>AGnews, respectively compared with the Word2Vec</p>\n\n<p>algorithm and Char\u2010CNN algorithm. It indicated that</p>\n\n<p>the NMFFE algorithm was better than the BOW algorithm</p>\n\n<p>and the Char\u2010CNN algorithm in terms of classification</p>\n\n<p>accuracy and algorithm robustness.</p>", 
  "author": [
      "family": "Zhang, Ling"
      "family": "Jiang, Wenchao"
      "family": "Zhao, Zhiming"
  "page": "1-15", 
  "version": "camera ready", 
  "type": "article-journal", 
  "id": "4042991"

