Journal article Open Access

Short‐text feature expansion and classification based on nonnegative matrix factorization

Zhang, Ling; Jiang, Wenchao; Zhao, Zhiming

MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="">
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">correlation</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">feature extension</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">nonnegative matrix factorization</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">short text classification</subfield>
  </datafield>
  <controlfield tag="005">20210304220035.0</controlfield>
  <controlfield tag="001">4042991</controlfield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Guangdong University of Technology</subfield>
    <subfield code="a">Jiang, Wenchao</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">University of Amsterdam</subfield>
    <subfield code="0">(orcid)0000-0002-6717-9418</subfield>
    <subfield code="a">Zhao, Zhiming</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">985866</subfield>
    <subfield code="z">md5:a24372cb312d85cc874d32455e059aee</subfield>
    <subfield code="u"></subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2020-09-22</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire</subfield>
    <subfield code="o"></subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="4">
    <subfield code="c">1-15</subfield>
    <subfield code="p">Int Journal Intelligent Systems</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">Guangdong University of Technology</subfield>
    <subfield code="a">Zhang, Ling</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Short‐text feature expansion and classification based on nonnegative matrix factorization</subfield>
  </datafield>
  <datafield tag="536" ind1=" " ind2=" ">
    <subfield code="c">862409</subfield>
    <subfield code="a">Blue-Cloud: Piloting innovative services for Marine Research &amp; the Blue Economy</subfield>
  </datafield>
  <datafield tag="536" ind1=" " ind2=" ">
    <subfield code="c">825134</subfield>
    <subfield code="a">smART socIal media eCOsytstem in a blockchaiN Federated environment</subfield>
  </datafield>
  <datafield tag="536" ind1=" " ind2=" ">
    <subfield code="c">824068</subfield>
    <subfield code="a">ENVironmental Research Infrastructures building Fair services Accessible for society, Innovation and Research</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u"></subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2"></subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;In this paper, a non‐negative matrix factorization feature expansion (NMFFE) approach was proposed to overcome the feature‐sparsity issue when expanding features of short‐text. First, we took the internal relationships of short texts and words into account when segmenting words from texts and constructing their relationship matrix. Second, we utilized the Dual regularization non‐negative matrix tri‐factorization (DNMTF) algorithm to obtain the words clustering indicator matrix, which was used to get the feature space by dimensionality reduction methods. Thirdly, words with close relationship were selected out from the feature space and added into the short‐text to solve the sparsity issue. The experimental results showed that the accuracy of short text classification of our NMFFE algorithm increased 25.77%, 10.89%, and 1.79% on three data sets: Web snippets, Twitter sports, and AGnews, respectively compared with the Word2Vec algorithm and Char‐CNN algorithm. It indicated that the NMFFE algorithm was better than the BOW algorithm and the Char‐CNN algorithm in terms of classification accuracy and algorithm robustness.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.1002/int.22290</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
    <subfield code="b">article</subfield>
  </datafield>
</record>