Journal article Open Access

Benchmarking of cell type deconvolution pipelines for transcriptomics data

Cobos, F.; Alquicira-Hernandez, J.; Powell, J.; Mestdagh, P.; Peter, K.


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam##2200000uu#4500</leader>
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">eng</subfield>
  </datafield>
  <controlfield tag="005">20201209122717.0</controlfield>
  <controlfield tag="001">4312852</controlfield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Garvan Institute of Medical Research</subfield>
    <subfield code="0">(orcid)0000-0002-9049-7780</subfield>
    <subfield code="a">Alquicira-Hernandez, J.</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Garvan Institute of Medical Research</subfield>
    <subfield code="0">(orcid)0000-0002-5070-4124</subfield>
    <subfield code="a">Powell, J.</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Cancer Research Institute Ghent (CRIG)</subfield>
    <subfield code="0">(orcid)0000-0001-7821-9684</subfield>
    <subfield code="a">Mestdagh, P.</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Ghent University</subfield>
    <subfield code="0">(orcid)0000-0002-7726-5096</subfield>
    <subfield code="a">Peter, K.</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">1720878</subfield>
    <subfield code="z">md5:fab3e70298694bd83fe4d60202263507</subfield>
    <subfield code="u">https://zenodo.org/record/4312852/files/s41467-020-19015-1.pdf</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2020-12-02</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire</subfield>
    <subfield code="p">user-ipc</subfield>
    <subfield code="o">oai:zenodo.org:4312852</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="4">
    <subfield code="v">Article number: 5650 (2020)</subfield>
    <subfield code="p">Nature Communications</subfield>
    <subfield code="n">11</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">Ghent University</subfield>
    <subfield code="0">(orcid)0000-0002-8816-9243</subfield>
    <subfield code="a">Cobos, F.</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Benchmarking of cell type deconvolution pipelines for transcriptomics data</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-ipc</subfield>
  </datafield>
  <datafield tag="536" ind1=" " ind2=" ">
    <subfield code="c">826121</subfield>
    <subfield code="a">individualizedPaediatricCure: Cloud-based virtual-patient models for precision paediatric oncology</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;Many computational methods have been developed to infer cell type proportions from bulk transcriptomics data. However, an evaluation of the impact of data transformation, preprocessing, marker selection, cell type composition and choice of methodology on the deconvolution results is still lacking. Using five single-cell RNA-sequencing (scRNA-seq) datasets, we generate pseudo-bulk mixtures to evaluate the combined impact of these factors. Both bulk deconvolution methodologies and those that use scRNA-seq data as reference perform best when applied to data in linear scale and the choice of normalization has a dramatic impact on some, but not all methods. Overall, methods that use scRNA-seq data have comparable performance to the best performing bulk methods whereas semisupervised approaches show higher error values. Moreover, failure to include cell types in the reference that are present in a mixture leads to substantially worse results, regardless of the previous choices. Altogether, we evaluate the combined impact of factors affecting the deconvolution task across different datasets and propose general guidelines to maximize its performance.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.1038/s41467-020-20288-9</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
    <subfield code="b">article</subfield>
  </datafield>
</record>
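
The export above can be read programmatically. Below is a minimal sketch, assuming Python's standard-library `xml.etree.ElementTree`. The helper name `subfields` and the abbreviated `SAMPLE` string are illustrative assumptions and not part of the export itself. It shows how the namespaced `datafield`/`subfield` elements might be queried by tag and code.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the <record> element of the export.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Abbreviated copy of the record above (only a few fields kept, for illustration).
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Alquicira-Hernandez, J.</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Powell, J.</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">Cobos, F.</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Benchmarking of cell type deconvolution pipelines for transcriptomics data</subfield>
  </datafield>
</record>"""


def subfields(record, tag, code):
    """Return the text of every subfield `code` under datafields with `tag`.

    Hypothetical helper; results follow document order.
    """
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [el.text for el in record.findall(path, MARC_NS)]


record = ET.fromstring(SAMPLE)
title = subfields(record, "245", "a")[0]
# Tag 100 holds the first author; tag 700 holds the additional authors.
authors = subfields(record, "100", "a") + subfields(record, "700", "a")
```

Querying by tag rather than by position matters here because the export does not list datafields in numeric tag order (the `700` entries precede `100`).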
Views 37
Downloads 33
Data volume 56.8 MB
Unique views 35
Unique downloads 31