Dataset Open Access

Relate-estimated coalescence rates, allele ages, and selection p-values for the 1000 Genomes Project

Speidel, Leo; Forest, Marie; Shi, Sinan; Myers, Simon R.


JSON Export

{
  "files": [
    {
      "links": {
        "self": "https://zenodo.org/api/files/00eb7c26-b189-45be-af7b-ec9b8d2ab4d7/allele_ages_AFR.zip"
      }, 
      "checksum": "md5:c3d94e2084205dcb7101bd51d3660409", 
      "bucket": "00eb7c26-b189-45be-af7b-ec9b8d2ab4d7", 
      "key": "allele_ages_AFR.zip", 
      "type": "zip", 
      "size": 2830727172
    }, 
    {
      "links": {
        "self": "https://zenodo.org/api/files/00eb7c26-b189-45be-af7b-ec9b8d2ab4d7/allele_ages_AMR.zip"
      }, 
      "checksum": "md5:f7da0238962e45a2446fbe9d88b6fef8", 
      "bucket": "00eb7c26-b189-45be-af7b-ec9b8d2ab4d7", 
      "key": "allele_ages_AMR.zip", 
      "type": "zip", 
      "size": 1127375764
    }, 
    {
      "links": {
        "self": "https://zenodo.org/api/files/00eb7c26-b189-45be-af7b-ec9b8d2ab4d7/allele_ages_EAS.zip"
      }, 
      "checksum": "md5:f1e5241291fe674fb5c3d4390c9d6663", 
      "bucket": "00eb7c26-b189-45be-af7b-ec9b8d2ab4d7", 
      "key": "allele_ages_EAS.zip", 
      "type": "zip", 
      "size": 1040290936
    }, 
    {
      "links": {
        "self": "https://zenodo.org/api/files/00eb7c26-b189-45be-af7b-ec9b8d2ab4d7/allele_ages_EUR.zip"
      }, 
      "checksum": "md5:c637c78c2248ca46144eb40d2c4f6c4d", 
      "bucket": "00eb7c26-b189-45be-af7b-ec9b8d2ab4d7", 
      "key": "allele_ages_EUR.zip", 
      "type": "zip", 
      "size": 1177791492
    }, 
    {
      "links": {
        "self": "https://zenodo.org/api/files/00eb7c26-b189-45be-af7b-ec9b8d2ab4d7/allele_ages_SAS.zip"
      }, 
      "checksum": "md5:9ee3fada4ae59271992d8288ee8a82d1", 
      "bucket": "00eb7c26-b189-45be-af7b-ec9b8d2ab4d7", 
      "key": "allele_ages_SAS.zip", 
      "type": "zip", 
      "size": 1223898401
    }, 
    {
      "links": {
        "self": "https://zenodo.org/api/files/00eb7c26-b189-45be-af7b-ec9b8d2ab4d7/coalescence_rates.zip"
      }, 
      "checksum": "md5:b7b247836e1d078de755824fcecfc75b", 
      "bucket": "00eb7c26-b189-45be-af7b-ec9b8d2ab4d7", 
      "key": "coalescence_rates.zip", 
      "type": "zip", 
      "size": 19658
    }
  ], 
  "owners": [
    68481
  ], 
  "doi": "10.5281/zenodo.3234689", 
  "stats": {
    "version_unique_downloads": 172.0, 
    "unique_views": 610.0, 
    "views": 669.0, 
    "version_views": 669.0, 
    "unique_downloads": 172.0, 
    "version_unique_views": 610.0, 
    "volume": 493504343733.0, 
    "version_downloads": 366.0, 
    "downloads": 366.0, 
    "version_volume": 493504343733.0
  }, 
  "links": {
    "doi": "https://doi.org/10.5281/zenodo.3234689", 
    "conceptdoi": "https://doi.org/10.5281/zenodo.3234688", 
    "bucket": "https://zenodo.org/api/files/00eb7c26-b189-45be-af7b-ec9b8d2ab4d7", 
    "conceptbadge": "https://zenodo.org/badge/doi/10.5281/zenodo.3234688.svg", 
    "html": "https://zenodo.org/record/3234689", 
    "latest_html": "https://zenodo.org/record/3234689", 
    "badge": "https://zenodo.org/badge/doi/10.5281/zenodo.3234689.svg", 
    "latest": "https://zenodo.org/api/records/3234689"
  }, 
  "conceptdoi": "10.5281/zenodo.3234688", 
  "created": "2019-05-30T08:33:15.744564+00:00", 
  "updated": "2020-01-24T19:26:19.225876+00:00", 
  "conceptrecid": "3234688", 
  "revision": 10, 
  "id": 3234689, 
  "metadata": {
    "access_right_category": "success", 
    "doi": "10.5281/zenodo.3234689", 
    "description": "<p><strong>Overview</strong></p>\n\n<p>Coalescence rates, allele ages, and p-values for evidence of positive selection calculated for 2478&nbsp;samples of the&nbsp;1000 Genomes Project&nbsp;using Relate.</p>\n\n<p>We estimated the joint genealogy of all 1000 GP populations and then extracted the embedded genealogy for each population.<br>\nFor the genealogy of each population, we jointly estimated the population size history and branch lengths.&nbsp;<br>\nVariants segregating in more than one&nbsp;population&nbsp;therefore have&nbsp;correlated but different allele ages in each population.</p>\n\n<p>Please refer to&nbsp;<a href=\"https://www.nature.com/articles/s41588-019-0484-x\">Speidel et al.&nbsp;Nature Genetics (2019)</a>&nbsp;for more details or email leo.speidel@outlook.com for any queries.</p>\n\n<p><strong>Coalescence rates</strong></p>\n\n<p>The zipped directory&nbsp;coalescence_rates.zip&nbsp;contains coalescence rates for 26 populations in the 1000 Genomes Project data set.</p>\n\n<ul>\n\t<li>The .coal files show the haploid coalescence rates, please refer to the&nbsp;<a href=\"https://myersgroup.github.io/relate/modules.html#PopulationSizeScript_FileFormats\">Relate documentation</a>&nbsp;for the file format.</li>\n\t<li>The popsize.RData file is an R data frame storing the diploid population sizes (0.5/coalescence rate) calculated using the .coal files. The columns of this data frame, named &quot;pop_size&quot;,&nbsp;are\n\t<ul>\n\t\t<li>gens_ago: Time in generations at which epoch starts. (To get years from generations, we multiply by 28.)</li>\n\t\t<li>population_size: Diploid population size in this epoch.</li>\n\t\t<li>population: Name of population&nbsp;</li>\n\t\t<li>region: Name of region (AFR, AMR, EAS, EUR, SAS)</li>\n\t</ul>\n\t</li>\n</ul>\n\n<p><strong>Allele ages and selection p-values</strong></p>\n\n<p>The zipped directories&nbsp;allele_ages_*.zip&nbsp;contain&nbsp;R&nbsp;data frames for each 1000GP population storing allele ages and selection p-values.<br>\nPlease note that only mutations that segregate in the population and map to a unique branch in the Relate-estimated marginal trees are included. Selection p-values are only provided for mutations of DAF &gt; 2 that pass quality filters (see Speidel et al., 2019).&nbsp;</p>\n\n<p>To get an age estimate for a neutral mutation, use&nbsp;0.5*(lower_age + upper_age). To get years from generations, we multiply by 28.</p>\n\n<p>The columns of these&nbsp;data frames, named &quot;allele_ages&quot;,&nbsp;are</p>\n\n<ul>\n\t<li>CHR: chromosome index</li>\n\t<li>BP: base-pair position (GRCh37)</li>\n\t<li>ID: id of SNP</li>\n\t<li>lower_age: Age in generations of coalescence event at the lower end of the branch onto which the mutation maps</li>\n\t<li>upper_age: Age in generations of coalescence event at the upper end of the branch onto which the mutation maps</li>\n\t<li>ancestral/derived: Ancestral/derived allele</li>\n\t<li>upstream: Upstream (5&#39;) allele</li>\n\t<li>downstream: Downstream (3&#39;) allele</li>\n\t<li>DAF: Derived-allele frequency</li>\n\t<li>pvalue: log10 p-value for selection evidence</li>\n</ul>", 
    "language": "eng", 
    "title": "Relate-estimated coalescence rates, allele ages, and selection p-values for the 1000 Genomes Project", 
    "license": {
      "id": "CC-BY-4.0"
    }, 
    "notes": "For R object files, use load() to load data frames into R.", 
    "relations": {
      "version": [
        {
          "count": 1, 
          "index": 0, 
          "parent": {
            "pid_type": "recid", 
            "pid_value": "3234688"
          }, 
          "is_last": true, 
          "last_child": {
            "pid_type": "recid", 
            "pid_value": "3234689"
          }
        }
      ]
    }, 
    "version": "v1.0.0", 
    "references": [
      "Speidel et al., Nature Genetics 2019, A method for genome-wide genealogy estimation for thousands of samples. https://doi.org/10.1038/s41588-019-0484-x"
    ], 
    "keywords": [
      "Genetics", 
      "Genealogy", 
      "Population size", 
      "Allele age", 
      "Positive selection", 
      "1000 Genomes Project"
    ], 
    "publication_date": "2019-05-29", 
    "creators": [
      {
        "orcid": "0000-0002-4644-8033", 
        "affiliation": "Department of Statistics, University of Oxford", 
        "name": "Speidel, Leo"
      }, 
      {
        "affiliation": "Universit\u00e9 du Qu\u00e9bec \u00e0 Montr\u00e9al, Montr\u00e9al, Canada", 
        "name": "Forest, Marie"
      }, 
      {
        "affiliation": "Department of Statistics, University of Oxford", 
        "name": "Shi, Sinan"
      }, 
      {
        "orcid": "0000-0002-2585-9626", 
        "affiliation": "Department of Statistics, University of Oxford", 
        "name": "Myers, Simon R."
      }
    ], 
    "access_right": "open", 
    "resource_type": {
      "type": "dataset", 
      "title": "Dataset"
    }, 
    "related_identifiers": [
      {
        "scheme": "doi", 
        "identifier": "10.5281/zenodo.3234688", 
        "relation": "isVersionOf"
      }
    ]
  }
}
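
The description field above gives three small conversions: diploid population size as 0.5 divided by the haploid coalescence rate, a neutral allele age as the midpoint of lower_age and upper_age, and years as generations multiplied by 28. A minimal Python sketch of that arithmetic follows. The function names and the example numbers are illustrative assumptions, not part of the dataset or of Relate; the data themselves ship as R data frames, so this only mirrors the formulas.

```python
# Sketch of the unit conversions stated in the record description.
# Function names and the example inputs are hypothetical, for illustration only.

GENERATION_YEARS = 28  # years per generation, as stated in the description


def diploid_population_size(haploid_coalescence_rate):
    """Diploid population size = 0.5 / haploid coalescence rate."""
    return 0.5 / haploid_coalescence_rate


def neutral_allele_age(lower_age, upper_age):
    """Age estimate (in generations) for a neutral mutation: branch midpoint."""
    return 0.5 * (lower_age + upper_age)


def generations_to_years(generations):
    """Convert generations to years using 28 years per generation."""
    return generations * GENERATION_YEARS


def pvalue_from_log10(log10_p):
    """The pvalue column stores log10(p); recover p on the natural scale."""
    return 10.0 ** log10_p


# Made-up inputs, not values taken from the dataset.
age_gens = neutral_allele_age(1000.0, 3000.0)
print("age in generations:", age_gens)
print("age in years:", generations_to_years(age_gens))
print("population size:", diploid_population_size(2.5e-5))
print("p-value:", pvalue_from_log10(-3.0))
```

The midpoint is described in the record only as an estimate for neutral mutations, so it should not be read as a general-purpose age estimate for variants under selection.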