Published April 28, 2022 | Version v1
Dataset Open

Data from: Benchmarking ultra-high molecular weight DNA preservation methods for long-read and long-range sequencing

  • 1. University of Toronto
  • 2. Rockefeller University
  • 3. Max Planck Institute of Molecular Cell Biology and Genetics
  • 4. Wellcome Sanger Institute
  • 5. Arima Genomics (United States)
  • 6. Uppsala University
  • 7. University of Massachusetts Amherst
  • 8. Universidad de Los Andes
  • 9. Johns Hopkins University
  • 10. National Marine Fisheries Service

Description

Studies in vertebrate genomics require sampling from a broad range of tissue types, taxa, and localities. Recent advancements in long-read and long-range genome sequencing have made it possible to produce high-quality chromosome-level genome assemblies for almost any organism. However, adequate tissue preservation for the requisite ultra-high molecular weight DNA (uHMW DNA) remains a major challenge. Here we present a comparative study of preservation methods for field and laboratory tissue sampling, across vertebrate classes and different tissue types. We find that no single method is best for all cases. Instead, the optimal storage and extraction methods vary by taxon, by tissue, and by downstream application. Therefore, we provide sample preservation guidelines that ensure sufficient DNA integrity and quantity for use with long-read and long-range sequencing technologies across vertebrates. Our best practices generate the uHMW DNA needed for the high-quality reference genomes for Phase 1 of the Vertebrate Genomes Project (VGP), whose ultimate mission is to generate chromosome-level reference genome assemblies of all ~70,000 extant vertebrate species.

Notes

FEMTO: A directory containing subfolders with FEMTO Pulse outputs. Files are grouped by run and by species. See "sample info.txt" within each taxon folder for a list of samples and treatments. These files can be opened with Fragment Analyzer software such as ProSize.

PFGE: A folder containing images from Pulsed Field Gel Electrophoresis. Each gel has two associated images: one with no text overlay and one with labeling.

Scripts: A folder containing two scripts written to analyze data for this study. 

Funding provided by: Howard Hughes Medical Institute*
Crossref Funder Registry ID:
Award Number:

Funding provided by: Rockefeller University
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100012007
Award Number:

Funding provided by: Max Planck Institute of Molecular Cell Biology and Genetics*
Crossref Funder Registry ID:
Award Number:

Funding provided by: Wellcome Trust
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100010269
Award Number: WT207492

Funding provided by: Wellcome Trust
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100010269
Award Number: 104640/Z/14/Z

Funding provided by: Wellcome Trust
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100010269
Award Number: 092096/Z/10/Z

Files

uHMW_sample_prep_raw_data.zip

Files (138.1 MB)

uHMW_sample_prep_raw_data.zip (138.1 MB)
md5:4e79ad55cbf893686f0116e91eb15076
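
The md5 value listed for the archive can be used to check that a download completed without corruption. The following is a minimal sketch in Python, assuming the archive was saved under its listed name in the current working directory. The function names `md5_of_file` and `verify_download` are illustrative and not part of the dataset or its Scripts folder.

```python
import hashlib
from pathlib import Path

# Checksum as listed on this record for uHMW_sample_prep_raw_data.zip.
EXPECTED_MD5 = "4e79ad55cbf893686f0116e91eb15076"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's md5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the archive was saved.
    archive = Path("uHMW_sample_prep_raw_data.zip")
    if archive.exists():
        print("checksum OK" if verify_download(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

Reading in chunks keeps memory use constant for a 138.1 MB file. A mismatch usually indicates an incomplete or corrupted download, and fetching the archive again is the simplest remedy.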