Published June 10, 2022 | Version 1.1
Dataset Restricted

cfDNA methylome profiling for detection and subtyping of Small Cell Lung Cancers

  • 1. Cancer Research UK Manchester Institute

Description

Methyl-Binding Domain protein sequencing (MBD-Seq) was applied to samples derived from patients with small cell lung cancer (SCLC), as well as to non-cancerous controls. The samples comprised circulating tumour cell-derived explant (CDX) or patient-derived xenograft (PDX) preclinical models derived from 33 patients with SCLC, circulating cell-free DNA (cfDNA) from 78 patients with SCLC, cfDNA from 79 non-cancer controls (NCC) and 13 non-cancerous lung tissue samples.

The objects deposited here include R data files containing qseaSets from the R package qsea, which include the read counts per sample per 300 base pair window across the genome as well as information on copy number variation, together with metadata tables and the scripts used to generate and analyse them.

Details of files:

DX_All_min50_max1000_w300_q10.rds

  A qseaSet containing all 97 CDX/PDX samples (including replicates) and the 13 normal lung tissue samples. Note that min50_max1000_w300_q10 refers to including paired reads with between 50 and 1000 base pairs (bp), a window size of 300bp and a minimum MAPQ score of 10.
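
  The naming convention described above can be read programmatically. The following is a minimal sketch in Python; the helper name parse_qsea_filename, the dictionary keys and the regular expression are illustrative assumptions and are not part of the deposited scripts.

```python
import re

# Assumed pattern for the "min50_max1000_w300_q10" suffix described above:
# min/max fragment length in bp, window size in bp, minimum MAPQ score.
SUFFIX_PATTERN = re.compile(
    r"min(?P<min_bp>\d+)_max(?P<max_bp>\d+)_w(?P<window_bp>\d+)_q(?P<min_mapq>\d+)"
)


def parse_qsea_filename(name):
    """Return the processing parameters encoded in a file name, or None."""
    match = SUFFIX_PATTERN.search(name)
    if match is None:
        return None  # e.g. DX_merged.rds carries no parameter suffix
    return {key: int(value) for key, value in match.groupdict().items()}


params = parse_qsea_filename("DX_All_min50_max1000_w300_q10.rds")
print(params)
# {'min_bp': 50, 'max_bp': 1000, 'window_bp': 300, 'min_mapq': 10}
```

  Files without the suffix, such as DX_merged.rds, return None under this sketch.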

DX_merged.rds

  A qseaSet with the biological replicates of each CDX merged together (formed from the above dataset).

cfDNA_All_min50_max1000_w300_q10.rds

  A qseaSet containing 157 cfDNA samples used in the main body of the paper.

NCCsForTrain_All_min50_max1000_w300_q10.rds

  A qseaSet containing the 38 NCC cfDNA samples used in the mixture sets for training the tumour/normal classifier. These samples are a subset of those in the cfDNA_All_min50_max1000_w300_q10.rds object.

ValidationSet_All_min50_max1000_w300_q10.rds

  A qseaSet containing the 41 NCC cfDNA samples and 78 SCLC cfDNA samples used to validate the tumour/normal classifier. These samples are a subset of those in the cfDNA_All_min50_max1000_w300_q10.rds object.
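
  The sample counts quoted for the three cfDNA objects above are mutually consistent if the training and validation non-cancer control (NCC) samples do not overlap. That non-overlap is an assumption made here for illustration; it is not stated explicitly in the descriptions. A short arithmetic check in Python:

```python
# Counts as quoted in the file descriptions above.
main_cfdna_total = 157   # cfDNA_All object
sclc_cfdna = 78          # SCLC cfDNA samples in the validation set
ncc_training = 38        # NCCsForTrain object
ncc_validation = 41      # NCC samples in the validation set

# Assumption: training and validation NCC samples are disjoint.
ncc_total = ncc_training + ncc_validation
validation_total = ncc_validation + sclc_cfdna

print(ncc_total)               # 79, matching the 79 non-cancer controls
print(validation_total)        # 119 samples in the validation set
print(ncc_total + sclc_cfdna)  # 157, matching the main cfDNA object
```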

cfDNApostTreatment_All_min50_max1000_w300_q10.rds

  A qseaSet containing 7 cfDNA samples that were collected at a later, post-treatment timepoint (mostly at disease progression) from the same patients as in the main cfDNA object. Used only for Extended Figure 5, not any other part of the manuscript.

varyDNAinput_All_min50_max1000_w300_q10.rds

  A qseaSet containing independent replicates of the cell line H1975, with varying amounts of starting DNA (1-75 ng). Used only for Figure 1B.

TNmixSets_combined_regionsFiltered_redo2.rds

  A qseaSet containing the synthetic mixture sets generated by mixing either a CDX/PDX sample and an NCC sample or two NCC samples. Used to train the tumour/normal classifier. To limit file size, this object is restricted to the windows used in the classifier, although the original mixtures were generated across the whole genome.
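
  The description above does not specify how the two samples were combined. As a purely illustrative sketch, a synthetic mixture can be thought of as a weighted combination of per-window read counts. The function name mix_counts, the tumour_fraction parameter and the toy counts below are hypothetical and do not reproduce the deposited scripts.

```python
def mix_counts(tumour_counts, control_counts, tumour_fraction):
    """Combine two per-window count vectors at a given tumour fraction.

    Illustrative assumption: mixing is a linear blend of counts per window.
    """
    if len(tumour_counts) != len(control_counts):
        raise ValueError("count vectors must cover the same windows")
    if not 0.0 <= tumour_fraction <= 1.0:
        raise ValueError("tumour_fraction must be between 0 and 1")
    return [
        tumour_fraction * t + (1.0 - tumour_fraction) * c
        for t, c in zip(tumour_counts, control_counts)
    ]


# Toy counts for three 300 bp windows (made-up numbers).
mixture = mix_counts([10, 0, 4], [0, 2, 4], tumour_fraction=0.25)
print(mixture)  # [2.5, 1.5, 4.0]
```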

CDXarrayWide.csv

  Pre-processed 450k Infinium Methylation array beta values for 8 CDX samples which were previously sequenced. Used for Supplementary Table 7 only.

ArrayCDXs_percent100.rds

  A qseaSet containing the 8 CDX samples processed on 450k Infinium Methylation arrays (CDXarrayWide.csv) converted to estimated reads.  

KeyTFsincYAP.csv

  Variance-stabilising transformation (vst) values generated from RNA-Seq for the CDX/PDX samples, for the key genes involved in the subtype classifications.

infinium-methylationepic-v-1-0-b5-manifest-file.csv

  A lookup file for the Infinium EPIC arrays, as downloaded from https://emea.support.illumina.com/downloads/infinium-methylationepic-v1-0-product-files.html.

SCLC_transcript_expression_adjusted_for_batch_effects.csv

SCLC_methylation_beta_values_of_individual_probes_after_QC_and_filtering_out_SNVs.csv

  Pre-processed transcript and methylation beta values from Infinium EPIC arrays for SCLC cell lines, as downloaded from sclccelllines.cancer.gov/sclc/downloads.xhtml (data timestamped as December 2019).

CellLine_100percent.rds

  A qseaSet containing the SCLC cell lines converted to estimated MBD-Seq reads.

Cellline_mixtureSets.rds

  A qseaSet containing the synthetic mixture sets generated by mixing converted cell lines with a NCC sample. Used to train the subtype classifier. 

DilutionSeries_CDX13_min50_max1000_w300_q10.rds

DilutionSeries_CDX29_min50_max1000_w300_q10.rds

DilutionSeries_CDX32_min50_max1000_w300_q10.rds

  Three qseaSets containing the results of an in silico dilution of a CDX (CDX13 = POU2F3, CDX29 = NEUROD1, CDX32 = ASCL1) with a single NCC, used to test the limit of detection of the subtype classifier. Reads were mixed at the fastq level, before the Nextflow pipeline was run.

DilutionSeries_H446_Rep*_min50_max1000_w300_q10.rds

  Eleven qseaSets containing the results of an in silico dilution of the SCLC cell line H446 with a single NCC (not used to build the classifier), used to determine the limit of detection of the tumour/normal classifier. Reads were mixed at the fastq level with different random seeds, before the Nextflow pipeline was run.
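
  The dilution series described above mix reads before alignment, with a different random seed per replicate. The tool used for this is not named here, so the following Python sketch only illustrates the general idea on in-memory lists of read names. The function dilute_reads and all of its parameters are hypothetical.

```python
import random


def dilute_reads(tumour_reads, control_reads, tumour_fraction, total_reads, seed):
    """Draw a seeded random mixture of reads at a target tumour fraction."""
    rng = random.Random(seed)  # a separate seed gives a separate replicate
    n_tumour = round(total_reads * tumour_fraction)
    n_control = total_reads - n_tumour
    # Sample without replacement from each source, then interleave randomly.
    mixed = rng.sample(tumour_reads, n_tumour) + rng.sample(control_reads, n_control)
    rng.shuffle(mixed)
    return mixed


# Toy read names standing in for fastq records (made-up data).
tumour = [f"tumour_read_{i}" for i in range(1000)]
control = [f"control_read_{i}" for i in range(1000)]

rep1 = dilute_reads(tumour, control, tumour_fraction=0.01, total_reads=500, seed=1)
rep2 = dilute_reads(tumour, control, tumour_fraction=0.01, total_reads=500, seed=2)

print(len(rep1))                                     # 500
print(sum(r.startswith("tumour_") for r in rep1))    # 5
```

  Re-running with the same seed reproduces the same mixture, which is what makes seeded replicates comparable across runs.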

poirier_oncogene.rda
PoirierEtAl_Oncogene2015_SuppTab1.csv
PoirierEtAl_Oncogene2015_SuppTab2.csv

  Processed data object from the 2015 Oncogene paper by Poirier et al. (PMID:25746006), with 450k array data for SCLC tumours and normal lungs, along with two of the supplementary tables from that paper.

 

Notes

The work was funded by NIH grants R01 CA197936, R35 CA263816 and U24 CA213274, and by Cancer Research UK (CRUK) via core funding to the CRUK Manchester Institute (grant no. C5759/A27412) and the CRUK Manchester Centre (grant no. A25254). It was supported by the CRUK Manchester Experimental Cancer Medicines Centre (grant no. A20465), the CRUK Lung Cancer Centre of Excellence (grant no. A25146), the Manchester Experimental Cancer Medicine Centre and the NIHR Manchester Biomedical Research Centre.

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.

Request access

You need to satisfy these conditions in order for this request to be accepted:

These data may be used only for academic purposes.

Please email DAC@cruk.manchester.ac.uk to request a Data Access Request form, which will need to be signed by an institutional representative. An International Data Transfer agreement may also be required.

Requests sent only via the webform will not be granted.

If you get no response from the email address above, please follow up, as Zenodo does not send reminders about pending requests.
