Journal article Open Access

Gene and pathway mutation scores for 5,805 primary tumors from TCGA

Kuijjer, Marieke Lydia


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam##2200000uu#4500</leader>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Kuijjer, Marieke Lydia, et al. Br J Cancer. 2018 May;118(11):1492-1501</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">somatic mutations</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">mutations</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">SAMBAR</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">de-sparsification</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">cancer</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">TCGA</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">mutation data</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">mutation scores</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">pathway mutation scores</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">biological pathways</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">gene mutation scores</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">subtypes</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">cancer subtypes</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">pan-cancer</subfield>
  </datafield>
  <controlfield tag="005">20200120173118.0</controlfield>
  <datafield tag="500" ind1=" " ind2=" ">
    <subfield code="a">This work was funded through a grant from the NVIDIA Foundation (grant no. 2014-133322 (3953)). It was additionally supported by a Postdoctoral Fellowship Program from the Charles A. King Trust Fund, Sara Elizabeth O'Brien Trust, Bank of America, N.A., co-Trustees.</subfield>
  </datafield>
  <controlfield tag="001">1494861</controlfield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Genentech Inc.</subfield>
    <subfield code="0">(orcid)0000-0001-8221-7139</subfield>
    <subfield code="4">res</subfield>
    <subfield code="a">Paulson, Joseph Nathaniel</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Bristol-Myers Squibb</subfield>
    <subfield code="4">res</subfield>
    <subfield code="a">Salzman, Peter</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">University of Massachusetts Boston</subfield>
    <subfield code="4">res</subfield>
    <subfield code="a">Ding, Wei</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Harvard TH Chan School of Public Health</subfield>
    <subfield code="0">(orcid)0000-0002-2702-5879</subfield>
    <subfield code="4">res</subfield>
    <subfield code="a">Quackenbush, John</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">35476</subfield>
    <subfield code="z">md5:ead3ae779654613b66e694966e5be294</subfield>
    <subfield code="u">https://zenodo.org/record/1494861/files/sample_tumor_annotation.RData</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">5907025</subfield>
    <subfield code="z">md5:cb555d2e45f3b7c34919f0d3f31e796d</subfield>
    <subfield code="u">https://zenodo.org/record/1494861/files/TCGA_SAMBAR.RData</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2018-11-23</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire</subfield>
    <subfield code="p">user-mkuijjer</subfield>
    <subfield code="o">oai:zenodo.org:1494861</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="4">
    <subfield code="c">1492-1501</subfield>
    <subfield code="n">11</subfield>
    <subfield code="p">British Journal of Cancer</subfield>
    <subfield code="v">118</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">Centre for Molecular Medicine Norway, University of Oslo</subfield>
    <subfield code="0">(orcid)0000-0001-6280-3130</subfield>
    <subfield code="a">Kuijjer, Marieke Lydia</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Gene and pathway mutation scores for 5,805 primary tumors from TCGA</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-mkuijjer</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by/3.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution 3.0 Unported</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;This dataset contains gene and pathway mutation scores for 5,805 primary tumors from 23 different cancer types from The Cancer Genome Atlas (TCGA).&lt;br&gt;
&lt;br&gt;
Gene mutation scores for 2,219 cancer-associated genes were calculated by normalizing the number of non-silent mutations in each gene (obtained from TCGA .maf files) by the gene&amp;#39;s length. We used SAMBAR (Subtyping Agglomerated Mutations By Annotation Relations) to calculate pathway mutation scores. In short, SAMBAR sums the mutation scores of all genes belonging to a biological pathway and then corrects this sum for the pathway&amp;#39;s gene set size and for the number of times each gene is represented in the complete set of pathways. Please see &lt;a href="https://www.nature.com/articles/s41416-018-0109-7"&gt;our publication&lt;/a&gt; in the &lt;em&gt;British Journal of Cancer&lt;/em&gt; for methodological details.&lt;/p&gt;

&lt;p&gt;In the RData file &amp;quot;TCGA_SAMBAR.RData&amp;quot;, we share the following objects:&lt;/p&gt;

&lt;p&gt;- &lt;strong&gt;gene_scores&lt;/strong&gt;: a 2,219 by 5,805 numeric matrix of gene mutation scores, with genes in rows and samples in columns.&lt;/p&gt;

&lt;p&gt;- &lt;strong&gt;pathway_scores&lt;/strong&gt;: a 1,066 by 5,805 numeric matrix of pathway mutation scores, with pathways in rows and samples in columns.&lt;br&gt;
&lt;br&gt;
The file &amp;quot;sample_tumor_annotation.RData&amp;quot; contains the object:&lt;br&gt;
&lt;br&gt;
- &lt;strong&gt;sample_annotation&lt;/strong&gt;: a 5,805 by 2 character matrix containing sample names (first column) and the tumor type each sample belongs to (second column, given as &lt;a href="https://gdc.cancer.gov/resources-tcga-users/tcga-code-tables/tcga-study-abbreviations"&gt;TCGA Study Abbreviations&lt;/a&gt;).&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">references</subfield>
    <subfield code="a">10.1038/s41416-018-0109-7</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.1494839</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.1494861</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
    <subfield code="b">article</subfield>
  </datafield>
</record>
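
The scoring steps summarized in the record's description (gene scores normalized by gene length, then summed per pathway and corrected for pathway size and for how often a gene recurs across pathways) can be illustrated with a minimal Python sketch for a single sample. This is not the SAMBAR implementation: the function names, the toy inputs, and the exact form of the two corrections (dividing each gene score by the number of pathways containing the gene, then dividing the pathway sum by the number of genes in the pathway) are assumptions made for illustration. The publication in the British Journal of Cancer remains the reference for the actual method.

```python
from collections import Counter


def gene_mutation_scores(mutation_counts, gene_lengths):
    """Divide each gene's non-silent mutation count by the gene's length."""
    return {
        gene: count / gene_lengths[gene]
        for gene, count in mutation_counts.items()
    }


def pathway_mutation_scores(gene_scores, pathways):
    """Sum gene scores per pathway with two illustrative corrections.

    Assumed corrections (hypothetical, for illustration only):
      1. each gene score is divided by the number of pathways the gene is in;
      2. the pathway sum is divided by the number of genes in the pathway.
    """
    # Count how many pathways each gene appears in.
    membership = Counter(
        gene for genes in pathways.values() for gene in set(genes)
    )
    scores = {}
    for name, genes in pathways.items():
        unique_genes = set(genes)
        if not unique_genes:
            continue  # skip empty gene sets to avoid dividing by zero
        total = sum(
            gene_scores.get(gene, 0.0) / membership[gene]
            for gene in unique_genes
        )
        scores[name] = total / len(unique_genes)
    return scores


if __name__ == "__main__":
    # Toy data for one sample; gene and pathway names are made up.
    counts = {"GENE_A": 2, "GENE_B": 1}
    lengths = {"GENE_A": 1000, "GENE_B": 500}
    toy_pathways = {"PATHWAY_1": ["GENE_A", "GENE_B"], "PATHWAY_2": ["GENE_A"]}

    per_gene = gene_mutation_scores(counts, lengths)
    print(pathway_mutation_scores(per_gene, toy_pathways))
```

Applying the same two functions to every sample in turn would give matrices shaped like the gene_scores and pathway_scores objects described in the record, with one column per sample.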