Published December 23, 2024 | Version v1
Dataset Open

Processed data for "CNSistent integration and feature extraction from somatic copy number profiles".

  • 1. University Hospital Cologne

Contributors

Supervisor:

  • 1. University Hospital Cologne

Description

Data processed by the CNSistent tool, available at https://bitbucket.org/schwarzlab/cnsistent. This dataset is the result of running the data processing script on the source data available in the repository or at https://zenodo.org/records/14677713. To use this data through the CNSistent library, extract it into CNSistent/out/.

Repository Structure

For each of the TRACERx, TCGA, and PCAWG datasets, the repository contains the following:

  • filled copy number segments (*_fill.tsv) using `cns fill`
  • imputed copy number segments (*_imp.tsv) using `cns imp`
  • aggregated copy number segments (*_[segmentation_type].tsv) using `cns aggregate`
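The naming patterns above can be used to sort files by processing step. The following is a minimal sketch, not part of CNSistent: it assumes the text after the last underscore identifies the step or segmentation type, and the file names in the usage note are hypothetical.

```python
from typing import NamedTuple, Optional


class ProcessedFile(NamedTuple):
    prefix: str                       # everything before the last underscore
    step: str                         # "fill", "imp", or "aggregate"
    segmentation_type: Optional[str]  # set only for aggregated files


def classify_processed_file(filename: str) -> ProcessedFile:
    """Classify a file name of the form <prefix>_<suffix>.tsv.

    Assumption: the suffix after the LAST underscore is either "fill",
    "imp", or a segmentation type. Segmentation types that themselves
    contain underscores would need a different rule.
    """
    if not filename.endswith(".tsv"):
        raise ValueError(f"not a .tsv file: {filename}")
    stem = filename[: -len(".tsv")]
    prefix, sep, suffix = stem.rpartition("_")
    if not sep or not prefix or not suffix:
        raise ValueError(f"unexpected file name: {filename}")
    if suffix in ("fill", "imp"):
        return ProcessedFile(prefix, suffix, None)
    return ProcessedFile(prefix, "aggregate", suffix)
```

For example, a hypothetical `TCGA_fill.tsv` would be classified as the output of `cns fill`, while a hypothetical `TCGA_arms.tsv` would be treated as an aggregated file with segmentation type `arms`.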

The aggregation methods are the following:

  • Fixed-size segments of 10 Mb, 5 Mb, 3 Mb, 2 Mb, 1 Mb, 500 Kb, and 250 Kb;
  • Whole chromosome, chromosome arm, and cytoband-level CN segments;
  • Gene-level CN values based on the ENSEMBL and COSMIC gene sets; and
  • Breakpoint clustering using distance thresholds of 1 Mb, 500 Kb, and 250 Kb.
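To illustrate what a fixed-size segmentation looks like, the sketch below splits a chromosome into consecutive bins. It is an illustration only and not CNSistent's implementation: the function name is hypothetical, 1 Mb is assumed to mean 1,000,000 bp, and the shorter final bin is one possible convention that the tool may handle differently.

```python
def fixed_size_segments(chrom: str, length: int, size: int) -> list[tuple[str, int, int]]:
    """Split [0, length) into half-open bins of `size` bp.

    The last bin is truncated at the chromosome end (an assumption;
    other tools merge or rebalance the remainder instead).
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return [
        (chrom, start, min(start + size, length))
        for start in range(0, length, size)
    ]


# Bin sizes listed above, assuming 1 Mb = 1,000,000 bp and 1 Kb = 1,000 bp.
BIN_SIZES_BP = {
    "10 Mb": 10_000_000,
    "5 Mb": 5_000_000,
    "3 Mb": 3_000_000,
    "2 Mb": 2_000_000,
    "1 Mb": 1_000_000,
    "500 Kb": 500_000,
    "250 Kb": 250_000,
}
```

Calling `fixed_size_segments("chr1", 25, 10)` on a toy 25 bp chromosome returns three bins, the last of which covers only positions 20 to 25.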

Each aggregated file follows the segmentation scheme defined in the corresponding file named `segs_[segmentation_type].bed`.
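A segmentation file can be inspected independently of the CNSistent library. The sketch below reads the first three standard BED columns (chromosome, start, end) plus an optional name column. It assumes tab-separated content and skips any row whose start or end is not an integer, since whether these files carry a header line is not stated here; `read_bed` is a hypothetical helper, not a CNSistent function.

```python
import csv
from typing import Optional

Segment = tuple[str, int, int, Optional[str]]


def read_bed(path: str) -> list[Segment]:
    """Read (chrom, start, end, name) records from a tab-separated BED file."""
    segments: list[Segment] = []
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) < 3 or row[0].startswith("#"):
                continue  # blank, comment, or incomplete line
            try:
                start, end = int(row[1]), int(row[2])
            except ValueError:
                continue  # likely a header or track line
            name = row[3] if len(row) > 3 else None
            segments.append((row[0], start, end, name))
    return segments
```

BED coordinates are conventionally 0-based and half-open, so a segment's length is `end - start`.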

Files (11.6 GB)

md5:eeab5d05b6df25db665fbeae02f59781
11.6 GB
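Given the size of the download, it may be worth checking the file against the MD5 checksum listed above before extracting it. The following is a minimal sketch using Python's standard `hashlib`; the function names are illustrative and the path passed in is whatever name the downloaded file was saved under.

```python
import hashlib

# Checksum as listed on this record.
EXPECTED_MD5 = "eeab5d05b6df25db665fbeae02f59781"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str = EXPECTED_MD5) -> bool:
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()
```

Reading in chunks keeps memory use constant, which matters for an 11.6 GB file.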