Dataset Open Access

Data sets for modeling double strand break susceptibility and interrogating structural variation in cancer

Tracy Ballinger; Britta Bouwman; Reza Mirzazadeh; Silvano Garnerone; Nicola Crosetto; Colin Semple


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nmm##2200000uu#4500</leader>
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">eng</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">double strand break, cancer, structural variation, chromatin, random forest modeling</subfield>
  </datafield>
  <controlfield tag="005">20190410032249.0</controlfield>
  <controlfield tag="001">2537101</controlfield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Karolinska Institutet</subfield>
    <subfield code="0">(orcid)0000-0002-9827-9497</subfield>
    <subfield code="a">Britta Bouwman</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Karolinska Institutet</subfield>
    <subfield code="a">Reza Mirzazadeh</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Karolinska Institutet</subfield>
    <subfield code="0">(orcid)0000-0002-0252-6108</subfield>
    <subfield code="a">Silvano Garnerone</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Karolinska Institutet</subfield>
    <subfield code="0">(orcid)0000-0002-3019-6978</subfield>
    <subfield code="a">Nicola Crosetto</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">University of Edinburgh</subfield>
    <subfield code="0">(orcid)0000-0003-1765-4118</subfield>
    <subfield code="a">Colin Semple</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">1641648161</subfield>
    <subfield code="z">md5:3dda42c853aa889359c8a47fe697084a</subfield>
    <subfield code="u">https://zenodo.org/record/2537101/files/supp_files_v1.tar.gz</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2019-01-10</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire_data</subfield>
    <subfield code="o">oai:zenodo.org:2537101</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">University of Edinburgh</subfield>
    <subfield code="0">(orcid)0000-0002-7689-0009</subfield>
    <subfield code="a">Tracy Ballinger</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Data sets for modeling double strand break susceptibility and interrogating structural variation in cancer</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">http://creativecommons.org/licenses/by/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;This is data used and produced for the study of &amp;quot;Modeling double strand break susceptibility to interrogate structural variation in cancer&amp;quot;.&amp;nbsp;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Background: &lt;/strong&gt;Structural variants (SVs) are known to play important roles in a variety of cancers, but their origins and functional consequences are still poorly understood. Many SVs are thought to emerge from errors in the repair processes following DNA double strand breaks (DSBs).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Results:&lt;/strong&gt; We used experimentally quantified DSB frequencies in cell lines with matched chromatin and sequence features to derive the first quantitative genome-wide models of DSB susceptibility. These models are accurate and provide novel insights into the mutational mechanisms generating DSBs. Models trained in one cell type can be successfully applied to others, but a substantial proportion of DSBs appear to reflect cell type specific processes. Using model predictions as a proxy for susceptibility to DSBs in tumours, many SV-enriched regions appear to be poorly explained by selectively neutral mutational bias alone. A substantial number of these regions show unexpectedly high SV breakpoint frequencies given their predicted susceptibility to mutation and are therefore credible targets of positive selection in tumours. These putatively positively selected SV hotspots are enriched for genes previously shown to be oncogenic. In contrast, several hundred regions across the genome show unexpectedly low levels of SVs, given their relatively high susceptibility to mutation. These novel coldspot regions appear to be subject to purifying selection in tumours and are enriched for active promoters and enhancers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conclusions:&lt;/strong&gt; We conclude that models of DSB susceptibility offer a rigorous approach to the inference of SVs putatively subject to selection in tumours.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.2537100</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.2537101</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">dataset</subfield>
  </datafield>
</record>
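
The 856 datafield in the record above carries the archive URL (subfield `u`), its size in bytes (subfield `s`), and a checksum in `algorithm:hexdigest` form (subfield `z`). As a minimal sketch, assuming the export has been saved as a string and the archive has already been downloaded to a local path, the following Python reads those subfields and checks a file against the recorded checksum. The function names `file_entries` and `checksum_matches` are hypothetical helpers introduced for illustration; they are not part of Zenodo or of any MARC library.

```python
import hashlib
import xml.etree.ElementTree as ET

# Namespace declared on the <record> element of the MARC21 export.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def file_entries(marc_xml):
    """Return (url, size_in_bytes, checksum) for each 856 datafield.

    Hypothetical helper: assumes the subfield codes used in this record
    (u = URL, s = size, z = checksum).
    """
    root = ET.fromstring(marc_xml)
    entries = []
    for field in root.findall("marc:datafield[@tag='856']", MARC_NS):
        sub = {
            s.get("code"): (s.text or "")
            for s in field.findall("marc:subfield", MARC_NS)
        }
        entries.append((sub.get("u"), int(sub.get("s", "0")), sub.get("z")))
    return entries


def checksum_matches(path, checksum, chunk_size=1 << 20):
    """Compare a local file against a checksum such as 'md5:3dda42...'.

    The file is read in chunks so that a multi-gigabyte archive does not
    have to fit in memory.
    """
    algorithm, _, expected = checksum.partition(":")
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected
```

For this record, `file_entries` would yield a single entry for `supp_files_v1.tar.gz`, and `checksum_matches` could then be called with the local copy of that archive and the `md5:` value from subfield `z`.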
