Published August 11, 2022 | Version v.1.2
Dataset Open

cGTEx_dataset: A multi-tissue atlas of regulatory variants in cattle

  • 1. Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, MD, USA; National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China; School of Life Sciences, Westlake University, Hangzhou, China
  • 2. Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, MD, USA; Department of Animal and Avian Sciences, University of Maryland, College Park, MD, USA
  • 3. MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh, UK
  • 4. State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
  • 5. National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China
  • 6. Institute of Animal Science, Chinese Academy of Agricultural Science, Beijing, China
  • 7. Scotland's Rural College (SRUC), Roslin Institute Building, Midlothian, UK
  • 8. Faculty of Veterinary & Agricultural Science, The University of Melbourne, Parkville, Victoria, Australia; Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, Victoria, Australia
  • 9. Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, Victoria, Australia
  • 10. MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh, UK; The Roslin Institute, Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, UK
  • 11. The Roslin Institute, Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, UK
  • 12. INRAE, AgroParisTech, GABI, Université Paris-Saclay, Jouy-en-Josas, France
  • 13. Guangdong Provincial Key Laboratory of Waterfowl Healthy Breeding, College of Animal Science & Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, China
  • 14. Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, MD, USA
  • 15. Department of Animal and Avian Sciences, University of Maryland, College Park, MD, USA
  • 16. Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, MD, USA; MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh, UK

Description

These files are the raw data of the cGTEx dataset used in the publication https://doi.org/10.1038/s41588-022-01153-5. For details, see the Methods section of that publication.

1. cGTEx_meta_data_8646sample.xlsx

The metadata list each sample name with its sample accession, along with information such as data size, cleaned reads, mapping rate, and age. The data were extracted from SRA (https://www.ncbi.nlm.nih.gov/sra/) and BIGD (https://bigd.big.ac.cn/bioproject/) (samples starting with CRS).

2. cGTEx_count_8646sample_27607gene.txt.gz

Raw RNA-seq read counts of 27607 genes (column names are Ensembl gene IDs) for 8646 samples (row names).

3. cGTEx_TPM_8646sample_27607gene.txt.gz

TPM values of 27607 genes (column names are Ensembl gene IDs) for 8646 samples (row names).

4. cGTEx_imputed_vcf.tar.gz

Imputed genotypes (SNPs) of 7297 RNA-seq samples on the 29 autosomes.

5. cGTEx_exon_junction_8646sample.tar.gz

Exon junction files for the 8646 samples.

Note: Small discrepancies in some sample names, and headers missing from some data sets, relative to https://cgtex.roslin.ed.ac.uk/ have been corrected in this upload.
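
Files 2 and 3 are described above as gzipped text matrices with samples as rows and Ensembl gene IDs as columns. The following is a minimal Python sketch for streaming such a matrix one sample at a time, so the whole table never has to sit in memory. It rests on assumptions this record does not state: the delimiter is taken to be a tab, and the header row may or may not carry a label for the sample column. The function name iter_expression_rows is hypothetical.

```python
import csv
import gzip
from typing import Dict, Iterator, Tuple


def iter_expression_rows(
    path: str, delimiter: str = "\t"
) -> Iterator[Tuple[str, Dict[str, float]]]:
    """Yield (sample_id, {gene_id: value}) for each row of a gzipped matrix.

    Assumption: the file is delimited text with one header row of gene IDs
    and one row per sample whose first field is the sample name.
    """
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader)
        for row in reader:
            if not row:
                continue  # skip blank lines
            # Some exports omit the corner cell, leaving the header one
            # field shorter than the data rows; handle both layouts.
            genes = header if len(header) == len(row) - 1 else header[1:]
            values = [float(cell) for cell in row[1:]]
            yield row[0], dict(zip(genes, values))
```

With a local copy of the count or TPM file, iterating over iter_expression_rows(path) gives one sample per step. If the delimiter assumption turns out to be wrong, each row will parse as a single field, which shows up as an empty dictionary for every sample.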

 

Notes

This work was supported in part by Agriculture and Food Research Initiative (AFRI) grant numbers 2016-67015-24886, 2019-67015-29321, and 2021-67015-33409 from the United States Department of Agriculture (USDA) National Institute of Food and Agriculture (NIFA) Animal Genome and Reproduction Programs, and US–Israel Binational Agricultural Research and Development (BARD) grant number US-4997-17 from the BARD Fund. L.F. was partially funded through Health Data Research UK (HDRUK) award HDR-9004 and the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 801215. A.T. acknowledged funding from the Biotechnology and Biological Sciences Research Council through program grants BBS/E/D/10002070 and BBS/E/D/30002275, Medical Research Council research grant MR/P015514/1 and HDRUK award HDR-9004. O.C.-X. was supported by MR/R025851/1. R.X. was supported by Australian Research Council's Discovery Projects (DP200100499). Y. Yu. was supported by the National Science Foundation of China-Pakistan Science Foundation Joint Project (31961143009) and National Key R&D Program of China (2021YFD1200900 and 2021YFD1200903). L.M. was supported in part by AFRI grant numbers 2020-67015-31398 and 2021-67015-33409 from the NIFA. G.E.L., B.D.R. and C.P.V.T. were supported by appropriated project 8042-31000-001-00-D, 'Enhancing Genetic Merit of Ruminants Through Improved Genome Assembly, Annotation, and Selection' of the Agricultural Research Service (ARS) of the USDA. C.-J.L. was supported by appropriated project 8042-31310-078-00-D, 'Improving Feed Efficiency and Environmental Sustainability of Dairy Cattle through Genomics and Novel Technologies' of ARS-USDA. J.B.C. was supported by appropriated project 8042-31000-002-00-D, 'Improving Dairy Animals by Increasing Accuracy of Genomic Prediction, Evaluating New Traits, and Redefining Selection Goals' of ARS-USDA. 
This research used resources provided by the SCINet project of the ARS-USDA, project number 0500-00093-001-00-D. Mention of trade names or commercial products in this article is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the USDA. The USDA is an equal opportunity provider and employer. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript. We thank US dairy producers for providing phenotypic, genomic and pedigree data through the Council on Dairy Cattle Breeding under ARS-USDA Material Transfer Research Agreement 58-8042-8-007. Access to 1000 Bull Genomes Project data was provided under ARS-USDA Data Transfer Agreement 15443. International genetic evaluations were calculated by the International Bull Evaluation Service (Interbull; Uppsala, Sweden).

Files

Files (17.9 GB)

Checksum | Size
md5:c12001193882626a23b4e304bd0b25ff | 211.3 MB
md5:b12008ed6210b848ee51d430c9176fa7 | 2.7 GB
md5:8618cfd580f070733f9c3e9e04ec2d4d | 14.4 GB
md5:90625600c68513a4703e14671a5f846b | 866.3 kB
md5:f1a61af00f65b9f392c14c5b3263e814 | 538.4 MB
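
The MD5 checksums listed above can be used to confirm that a download completed intact. Below is a minimal sketch using Python's standard hashlib module. The helper names md5_of_file and matches_listed_checksum are hypothetical. Because this listing does not show which file name belongs to which checksum, pairing a local file with its checksum is left to the reader.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for multi-gigabyte archives.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path: str, listed: str) -> bool:
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, matches_listed_checksum(local_path, "md5:90625600c68513a4703e14671a5f846b") returns True only if the local file's digest equals that listed value; local_path here is a placeholder for wherever the file was saved.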

Additional details

Related works

Compiles
Journal article: 10.1038/s41588-022-01153-5 (DOI)
Is identical to
Dataset: https://cgtex.roslin.ed.ac.uk/ (URL)
Is supplemented by
Workflow: https://zenodo.org/record/6510550 (URL)

Funding

  • TRAINEd – TRAIN@Ed, grant 801215 (European Commission)
  • Prediction of genes and regulatory elements in farm animal genomes, grant BBS/E/D/10002070 (UK Research and Innovation)
  • Genetic improvement of farmed animals, grant BBS/E/D/30002275 (UK Research and Innovation)
  • Vast-scale linear mixed modelling genetic discovery approaches for genome- and exome-wide association analyses to enable therapeutic target validation, grant MR/R025851/1 (UK Research and Innovation)
  • Understanding disease through environment-wide association studies, grant MR/P015514/1 (UK Research and Innovation)