Conference paper · Open Access

Multi-Task Learning of Graph-based Inductive Representations of Music Content

Antonia Saravanou; Federico Tomasi; Rishabh Mehrotra; Mounia Lalmas


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam##2200000uu#4500</leader>
  <controlfield tag="005">20211103165701.0</controlfield>
  <controlfield tag="001">5624379</controlfield>
  <datafield tag="711" ind1=" " ind2=" ">
    <subfield code="d">November 7-12, 2021</subfield>
    <subfield code="g">ISMIR 2021</subfield>
    <subfield code="a">International Society for Music Information Retrieval Conference</subfield>
    <subfield code="c">Online</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Federico Tomasi</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Rishabh Mehrotra</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Mounia Lalmas</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">604454</subfield>
    <subfield code="z">md5:7aafcbd422dbe6cfcfee95d3d151dfd8</subfield>
    <subfield code="u">https://zenodo.org/record/5624379/files/000075.pdf</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="y">Conference website</subfield>
    <subfield code="u">https://ismir2021.ismir.net</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2021-11-07</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire</subfield>
    <subfield code="p">user-ismir</subfield>
    <subfield code="o">oai:zenodo.org:5624379</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">Antonia Saravanou</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Multi-Task Learning of Graph-based Inductive Representations of Music Content</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-ismir</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">Music streaming platforms rely heavily on learning meaningful representations of tracks to surface apt recommendations to users in a number of different use cases. In this work, we consider the task of learning music track representations by leveraging three rich heterogeneous sources of information: (i) organizational information (e.g., playlist co-occurrence), (ii) content information (e.g., audio &amp;amp; acoustics), and (iii) music stylistics (e.g., genre). We advocate for a multi-task formulation of graph representation learning, and propose MUSIG: Multi-task Sampling and Inductive learning on Graphs. MUSIG allows us to derive generalized track representations that combine the benefits offered by (i) the inductive graph based framework, which generates embeddings by sampling and aggregating features from a node's local neighborhood, as well as, (ii) multi-task training of aggregation functions, which ensures the learnt functions perform well on a number of important tasks. We present large scale empirical results for track recommendation for the playlist completion task, and compare different classes of representation learning approaches, including collaborative filtering, word2vec and node embeddings as well as, graph embedding approaches. Our results demonstrate that considering content information (i.e.,audio and acoustic features) is useful and that multi-task supervision helps learn better representations.</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.5624378</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="g">602-609</subfield>
    <subfield code="b">ISMIR</subfield>
    <subfield code="a">Online</subfield>
    <subfield code="t">Proceedings of the 22nd International Society for Music Information Retrieval Conference</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.5624379</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
    <subfield code="b">conferencepaper</subfield>
  </datafield>
</record>
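
The abstract above describes an inductive, sampling-and-aggregation approach to node representation learning with multi-task supervision of the aggregation functions. The paper's actual MUSIG architecture is not reproduced in this record; the following is a minimal, hypothetical PyTorch sketch of that general idea. The class name MusigSketch, the single-hop mean aggregator, the layer sizes, and the two task heads (genre prediction and co-occurrence/link scoring) are illustrative assumptions, not the authors' implementation.

# Hypothetical sketch of sampling-and-aggregating node embeddings with
# multi-task heads, in the spirit described by the abstract above.
# NOT the authors' MUSIG implementation; all sizes and heads are assumptions.
import torch
import torch.nn as nn

class MusigSketch(nn.Module):
    def __init__(self, in_dim, hid_dim, n_genres):
        super().__init__()
        # One-hop mean aggregator: combine a track's own features with the
        # mean of its sampled neighbours (e.g., playlist co-occurrence graph).
        self.agg = nn.Linear(2 * in_dim, hid_dim)
        # Task heads sharing the aggregated representation (multi-task training).
        self.genre_head = nn.Linear(hid_dim, n_genres)     # genre prediction
        self.link_head = nn.Bilinear(hid_dim, hid_dim, 1)  # co-occurrence scoring

    def embed(self, x_self, x_neigh):
        # x_self:  (batch, in_dim) features of the target tracks
        # x_neigh: (batch, k, in_dim) features of k sampled neighbours per track
        neigh_mean = x_neigh.mean(dim=1)
        h = torch.relu(self.agg(torch.cat([x_self, neigh_mean], dim=-1)))
        return nn.functional.normalize(h, dim=-1)

    def forward(self, x_self, x_neigh, x_other_self, x_other_neigh):
        h = self.embed(x_self, x_neigh)
        h_other = self.embed(x_other_self, x_other_neigh)
        return self.genre_head(h), self.link_head(h, h_other)

# Toy usage: 8 tracks, 5 sampled neighbours each, 32-dim audio/acoustic features.
model = MusigSketch(in_dim=32, hid_dim=64, n_genres=10)
x, neigh = torch.randn(8, 32), torch.randn(8, 5, 32)
genre_logits, link_scores = model(x, neigh, torch.randn(8, 32), torch.randn(8, 5, 32))
# A combined training loss would sum a genre cross-entropy term and a
# link-prediction term over positive/negative track pairs.
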
                   All versions   This version
Views                       304            304
Downloads                   282            282
Data volume            170.5 MB       170.5 MB
Unique views                286            286
Unique downloads            254            254
