There is a newer version of the record available.

Published September 15, 2019 | Version v2
Dataset Open

Clonal decomposition and DNA replication states defined by scaled single cell genome sequencing

  • 1. Memorial Sloan Kettering Cancer Center

Description

OV2295 Tables

ov2295_breakpoint_counts.csv.gz: Table of breakpoint counts per cell

  • prediction_id: identifier for the breakpoint
  • cell_id: identifier for the cell
  • read_count: number of reads
  • library_id: identifier for the DNA library
  • sample_id: identifier for the sequenced sample
  • chromosome_1: chromosome of break end 1
  • strand_1: orientation of break end 1
  • position_1: position of break end 1
  • chromosome_2: chromosome of break end 2
  • strand_2: orientation of break end 2
  • position_2: position of break end 2
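
As a minimal sketch of how this table might be consumed, the following reads a gzip-compressed CSV with the Python standard library and counts how many distinct cells support each breakpoint. The helper names `read_table` and `cells_per_breakpoint` and the `min_reads` parameter are illustrative assumptions, not part of the dataset.

```python
import csv
import gzip
from collections import defaultdict


def read_table(path):
    """Yield one dict per row of a gzip-compressed CSV file."""
    with gzip.open(path, "rt", newline="") as handle:
        yield from csv.DictReader(handle)


def cells_per_breakpoint(rows, min_reads=1):
    """Map prediction_id to the number of distinct cells supporting it."""
    cells = defaultdict(set)
    for row in rows:
        # read_count is parsed via float() in case it is stored as "2.0".
        if float(row["read_count"]) >= min_reads:
            cells[row["prediction_id"]].add(row["cell_id"])
    return {pid: len(ids) for pid, ids in cells.items()}
```

A typical call would be `cells_per_breakpoint(read_table("ov2295_breakpoint_counts.csv.gz"))`, assuming the file has been downloaded to the working directory.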

ov2295_cell_cn.csv.gz: Table of cell specific copy number

  • cell_id: identifier for the cell
  • sample_id: identifier for the sequenced sample
  • library_id: identifier for the DNA library
  • chr: chromosome of bin
  • start: start of bin
  • end: end of bin
  • reads: number of reads
  • copy: raw normalized copy number
  • state: copy number state
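
One possible use of the per-bin `state` column is a rough per-cell ploidy estimate. The sketch below computes a bin-length-weighted mean state per cell from rows such as those produced by `csv.DictReader`. It assumes that `start` and `end` are 1-based inclusive coordinates and that bins without a called state have an empty `state` field; neither is stated in the record.

```python
from collections import defaultdict


def mean_state_per_cell(rows):
    """Return cell_id -> bin-length-weighted mean copy number state."""
    weighted = defaultdict(float)
    length = defaultdict(int)
    for row in rows:
        if row["state"] == "":
            continue  # assumption: an empty field means no state was called
        # Assumption: coordinates are 1-based and inclusive.
        width = int(row["end"]) - int(row["start"]) + 1
        weighted[row["cell_id"]] += float(row["state"]) * width
        length[row["cell_id"]] += width
    return {cell: weighted[cell] / length[cell] for cell in weighted}
```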

ov2295_cell_metrics.csv.gz: Table of cell metrics

  • cell_id: identifier of the cell
  • unpaired_mapped_reads: number of unpaired mapped reads
  • paired_mapped_reads: number of mapped reads that were properly paired
  • unpaired_duplicate_reads: number of unpaired duplicated reads
  • paired_duplicate_reads: number of paired reads that were also marked as duplicate
  • unmapped_reads: number of unmapped reads
  • percent_duplicate_reads: percentage of duplicate reads
  • estimated_library_size: scaled total number of mapped reads
  • total_reads: total number of reads, regardless of mapping status
  • total_mapped_reads: total number of mapped reads
  • total_duplicate_reads: number of duplicate reads
  • total_properly_paired: number of properly paired reads
  • coverage_breadth: percentage of the genome covered by at least one read
  • coverage_depth: average reads per nucleotide position in the genome
  • median_insert_size: median insert size between paired reads
  • mean_insert_size: mean insert size between paired reads
  • standard_deviation_insert_size: standard deviation of the insert size between paired reads
  • index_sequence: index sequence of the adaptor sequence
  • column: column of the cell on the nanowell chip
  • img_col: column of the cell from the perspective of the microscope
  • index_i5: id of the i5 index adapter sequence
  • sample_type: type of the sample
  • primer_i7: id of the i7 index primer sequence
  • experimental_condition: experimental treatment of the cell, includes controls
  • index_i7: id of the i7 index adapter sequence
  • cell_call: living/dead classification of the cell, usually based on staining (C1 == living, C2 == dead)
  • sample_id: name of the sample
  • primer_i5: id of the i5 index primer sequence
  • row: row of the cell on the nanowell chip
  • library_id: identifier for the DNA library
  • index: ignored
  • multiplier: the value from the set [1..6] that was chosen during parameter searching
  • MSRSI_non_integerness: median of segment residuals from segment integer copy number states
  • MBRSI_dispersion_non_integerness: median of bin residuals from segment integer copy number states
  • MBRSM_dispersion: median of bin residuals from segment median copy number values
  • autocorrelation_hmmcopy: hmmcopy copy autocorrelation
  • cv_hmmcopy: ignored
  • empty_bins_hmmcopy: number of empty bins in hmmcopy
  • mad_hmmcopy: median absolute deviation of hmmcopy copy
  • mean_hmmcopy_reads_per_bin: mean reads per hmmcopy bin
  • median_hmmcopy_reads_per_bin: median reads per hmmcopy bin
  • std_hmmcopy_reads_per_bin: standard deviation value of reads in hmmcopy bins
  • total_halfiness: summed halfiness penalty score of the cell
  • total_mapped_reads_hmmcopy: total mapped reads in all hmmcopy bins
  • scaled_halfiness: summed scaled halfiness penalty score of the cell
  • mean_state_mads: mean value for all median absolute deviation scores for each state
  • mean_state_vars: variance value for all median absolute deviation scores for each state
  • mad_neutral_state: median absolute deviation score of the neutral 2 copy state
  • breakpoints: number of breakpoints, as indicated by state changes not at the ends of chromosomes
  • mean_copy: mean hmmcopy copy value
  • state_mode: the most commonly occurring state
  • log_likelihood: hmmcopy log likelihood for the cell
  • true_multiplier: the exact decimal value used to scale the copy number for segmentation
  • order: order of the cell in the hierarchical clustering tree
  • quality: random forest classifier probability score that the cell is good
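
The `quality` score lends itself to filtering cells before downstream analysis. The following is a sketch only: the function name and the default cutoff of 0.75 are hypothetical choices for illustration and are not specified by this record.

```python
def high_quality_cells(rows, min_quality=0.75):
    """Return the set of cell_id values whose quality score passes the cutoff.

    min_quality=0.75 is a hypothetical default, not a value from the dataset.
    """
    keep = set()
    for row in rows:
        if row["quality"] == "":
            continue  # skip cells without a score
        if float(row["quality"]) >= min_quality:
            keep.add(row["cell_id"])
    return keep
```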

ov2295_clone_alleles.csv.gz: Table of clone specific allele data

  • chr: chromosome of bin
  • start: start of bin
  • end: end of bin
  • hap_label: haplotype block identifier
  • clone_id: clone identifier
  • allele_1_sum: number of reads for allele 1 of the haplotype block
  • allele_2_sum: number of reads for allele 2 of the haplotype block
  • total_counts_sum: total reads for the haplotype block

ov2295_clone_breakpoints.csv.gz: Table of breakpoints per clone for OV2295 samples. Columns:

  • prediction_id: identifier for the breakpoint
  • chromosome_1: chromosome of break end 1
  • strand_1: orientation of break end 1
  • position_1: position of break end 1
  • chromosome_2: chromosome of break end 2
  • strand_2: orientation of break end 2
  • position_2: position of break end 2
  • clone_id: clone identifier
  • read_count: number of reads
  • is_present: presence=1, absent=0

ov2295_clone_clusters.csv.gz: Table of cell clusters as putative clones

  • cell_id: identifier for the cell
  • clone_id: clone identifier

ov2295_clone_cn.csv.gz: Table of allele specific copy number per clone for OV2295 samples. Columns:

  • chr: chromosome of bin
  • start: start of bin
  • end: end of bin
  • total_cn: HMMCopy predicted total copy number
  • minor_cn: HMM predicted minor copy number
  • major_cn: HMM predicted major copy number
  • clone_id: clone identifier

ov2295_clone_snvs.csv.gz: Table of SNVs per clone for OV2295 samples. Columns:

  • chrom: chromosome
  • coord: genome position
  • ref: reference nucleotide
  • alt: alternate nucleotide
  • clone_id: clone identifier
  • ref_counts: number of reads at this position matching the reference nucleotide
  • alt_counts: number of reads at this position matching the alternate nucleotide
  • total_counts: total number of reads at this position
  • is_present: presence=1, absent=0
  • is_het: is heterozygous
  • is_hom: is homozygous for the alternate

ov2295_nodes.csv.gz: Table of phylogenetic information for SNV evolution

  • variant_id: identifier for the SNV as chrom:coord:ref:alt
  • node: node in the phylogenetic tree
  • loss: probability the SNV was lost at this node
  • origin: probability the SNV originated at this node
  • presence: probability the SNV is present at this node
  • ml_origin: binary indicator the SNV originated at this node
  • ml_presence: binary indicator the SNV is present at this node
  • ml_loss: binary indicator the SNV was lost at this node
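
Because `variant_id` is defined as chrom:coord:ref:alt, it can serve as a join key between this table and the SNV tables. The sketch below builds and splits such identifiers and collects the node flagged by `ml_origin` for each variant. The function names are illustrative, and the code assumes the binary indicators are stored numerically (0/1 or 0.0/1.0).

```python
def make_variant_id(chrom, coord, ref, alt):
    """Build an identifier in the chrom:coord:ref:alt form used by the nodes table."""
    return f"{chrom}:{coord}:{ref}:{alt}"


def split_variant_id(variant_id):
    """Split chrom:coord:ref:alt back into its parts, with coord as an int."""
    chrom, coord, ref, alt = variant_id.split(":")
    return chrom, int(coord), ref, alt


def ml_origin_nodes(rows):
    """Map variant_id to the node where ml_origin is set.

    Assumption: ml_origin is encoded numerically (0/1).
    """
    origins = {}
    for row in rows:
        if float(row["ml_origin"]) == 1:
            origins[row["variant_id"]] = row["node"]
    return origins
```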

ov2295_snv_counts.csv.gz: Table of SNV counts

  • chrom: chromosome
  • coord: genome position
  • ref: reference nucleotide
  • alt: alternate nucleotide
  • ref_counts: number of reads at this position matching the reference nucleotide
  • alt_counts: number of reads at this position matching the alternate nucleotide
  • cell_id: identifier for the cell
  • total_counts: total number of reads at this position
  • sample_id: identifier for the sequenced sample
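
Per-cell SNV counts can be pooled into per-clone counts by joining on `cell_id` with the rows of ov2295_clone_clusters.csv.gz. This is a sketch of one way to do that, not the procedure used to produce ov2295_clone_snvs.csv.gz; the function name is illustrative, and cells without a clone assignment are simply skipped.

```python
from collections import defaultdict


def clone_snv_counts(snv_rows, cluster_rows):
    """Sum ref and alt counts per (clone_id, chrom, coord, ref, alt)."""
    clone_of = {row["cell_id"]: row["clone_id"] for row in cluster_rows}
    totals = defaultdict(lambda: [0, 0])
    for row in snv_rows:
        clone = clone_of.get(row["cell_id"])
        if clone is None:
            continue  # cell was not assigned to any clone
        key = (clone, row["chrom"], int(row["coord"]), row["ref"], row["alt"])
        totals[key][0] += int(float(row["ref_counts"]))
        totals[key][1] += int(float(row["alt_counts"]))
    return {key: (ref, alt) for key, (ref, alt) in totals.items()}
```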

ov2295_tree.pickle: Phylogenetic tree in Python pickle format. Loading it requires the stochastic Dollo code, version 0.4.2, available at https://bitbucket.org/dranew/dollo.
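
Loading the tree might look like the sketch below. It uses only the standard `pickle` module; unpickling will only succeed if the dollo package is importable, since the pickle refers to classes defined there. The `encoding` parameter is an assumption: it is only relevant if the file was written under Python 2, which this record does not state. As with any pickle, the file should only be loaded from a trusted source.

```python
import pickle


def load_tree(path="ov2295_tree.pickle", encoding="ASCII"):
    """Unpickle the tree object; the dollo package must be installed first.

    If the file was written by Python 2 (an assumption, not confirmed here),
    encoding="latin1" may be needed instead of the default.
    """
    with open(path, "rb") as handle:
        return pickle.load(handle, encoding=encoding)
```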

Note the following sample mapping: ‘SA922’: ‘OV2295(R2)’, ‘SA921’: ‘TOV2295(R)’, ‘SA1090’: ‘OV2295’.
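
For convenience, the mapping above can be written as a lookup table for relabelling the `sample_id` column. The names `SAMPLE_NAMES` and `sample_name` are illustrative.

```python
# Sample identifiers and their names, as given in the record description.
SAMPLE_NAMES = {
    "SA922": "OV2295(R2)",
    "SA921": "TOV2295(R)",
    "SA1090": "OV2295",
}


def sample_name(sample_id):
    """Return the sample name for an identifier, or the identifier itself if unknown."""
    return SAMPLE_NAMES.get(sample_id, sample_id)
```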

Plots

ov_supp_clone_allele_cn.png: Clone allele ratios for each OV2295 sample.

ov_supp_clone_total_cn.png: Clone copy number for each OV2295 sample.

ov_supp_sample_total_cn.png: Bulk copy number for each OV2295 sample.

ov_supp_sample_allele_cn.png: Bulk allele ratios for each OV2295 sample.

Files (204.0 MB)

  • md5:bb6d40b02dc36c5f0a2c0f81d9e70388 (94.0 kB)
  • md5:e8d0a089e264d676f2b8aca62cc5382c (171.5 MB)
  • md5:04c3c529a21ada1be2df08c3e037357b (491.9 kB)
  • md5:9e8569f804bdbddb6ce4ef95b2c1c3a0 (6.8 MB)
  • md5:6b0f97be49fea56b344b5ea2021a1b4a (21.2 kB)
  • md5:e37fa250d769c4d194af372624aba615 (2.5 kB)
  • md5:aa20d8db8d3529a5264cf705248bfd93 (705.2 kB)
  • md5:4de3a2da63ec4d7150e4282264fa326e (640.6 kB)
  • md5:ce27682c3a1ddfb8caf525467ede8b25 (6.9 MB)
  • md5:3d1ac0ab42cb8e84caabd9e20b356027 (15.3 MB)
  • md5:1ad7887a247243cf235011700e69e449 (1.1 kB)
  • md5:9cba732f0609c5e0d35dd9ad8c99cf9c (621.0 kB)
  • md5:c071b873ff0eab17f8ac15178814058f (244.1 kB)
  • md5:124247c49f9eb0c12c68dc1b65e5b463 (551.7 kB)
  • md5:97f4403008fc60439345383b8ea85920 (148.6 kB)
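
A downloaded file can be checked against the md5 values listed above with the standard `hashlib` module. This is a minimal sketch; the function names are illustrative, and which checksum belongs to which file has to be taken from the record's file listing.

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a listed value such as "md5:bb6d40...".

    The optional "md5:" prefix is stripped before comparison.
    """
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5sum(path) == expected
```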

Additional details

Related works

Is supplement to
10.1101/411058 (DOI)