1000 Genomes Project Cleaned Dataset
Authors/Creators
Description
The first four authors performed standard quality control analysis on the 1000 Genomes (1KG) Project genotypes that were generated on the Illumina Omni2.5M chip at the Broad and Sanger Institutes. The resulting datasets were then posted on the website of The Centre for Applied Genomics at Sick Kids Hospital at https://www.tcag.ca/tools/1000genomes.html. The last two authors then looked for overlap between those datasets and the HapMap3 datasets that had gene expression data for Endoplasmic Reticulum Aminopeptidase 2 (ERAP2), and chose the Yoruba in Ibadan, Nigeria (YRI) and Utah residents with Northern and Western European ancestry (CEU) subpopulations. These two subpopulations had the largest overlap between the 1KG and HapMap3 datasets, with 91 YRI and 104 CEU samples. The text files provided in this repository contain the IDs of all individuals and the phenotypes for the labelled populations; for example, ERAP2_CEU_YRI_phenotypes.txt has phenotypes for both populations. The two *_pc_outliers.txt files contain the IDs of the individuals excluded from analysis because they were identified as principal component outliers. In summary, 88 YRI and 102 CEU individuals were included in the analysis.
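As a rough illustration of how the outlier lists might be applied, the sketch below removes listed IDs from a set of sample IDs. It is not the authors' pipeline. The function names `read_ids` and `drop_outliers` are hypothetical, and the file layout (whitespace-delimited text, with the individual ID in a chosen column) is an assumption, since the column structure of the text files is not described here.

```python
from pathlib import Path


def read_ids(path, column=0):
    """Read IDs from a whitespace-delimited text file, skipping blank lines.

    ASSUMPTION: the ID sits in `column` (0-based); the real layout of the
    repository's text files is not documented in this record.
    """
    ids = []
    for line in Path(path).read_text().splitlines():
        fields = line.split()
        if len(fields) > column:
            ids.append(fields[column])
    return ids


def drop_outliers(sample_ids, outlier_ids):
    """Return sample_ids without any ID listed in outlier_ids, keeping order."""
    excluded = set(outlier_ids)
    return [s for s in sample_ids if s not in excluded]
```

The counts in the description imply three YRI exclusions (91 to 88) and two CEU exclusions (104 to 102), so the length of the filtered list gives a quick sanity check once the correct ID column has been identified.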
Files
(935.0 MB)
| MD5 checksum | Size |
|---|---|
| fc630aac6d4531f548b41928307453ef | 25 Bytes |
| 9f75eb9d0e5e7c07f4d9639da2b633cd | 42.1 kB |
| 76bc779772f847e3445f1df815f49d64 | 43.4 kB |
| f13e64bb2c6668126ed88bb46aebc7a3 | 42.0 kB |
| 641ffa37909a33fffef0e2469b96a2a0 | 873.3 MB |
| 3f33c1e34d794d1ce50cd9425fe35b27 | 61.6 MB |
| e7fe24cfaf57c641e28849ea986d1774 | 41.7 kB |
| bd8510bec6c4c8a77a9d5364a25d26a6 | 38 Bytes |
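The md5 values listed for each file can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `verify` are hypothetical, and the path passed in would be wherever the file was saved locally.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True if the file's MD5 matches the expected hex string."""
    return md5_of_file(path) == expected_md5.strip().lower()


# Example call (the path is a placeholder, not a real file name):
# verify("path/to/downloaded_file", "641ffa37909a33fffef0e2469b96a2a0")
```

Reading in chunks matters mainly for the largest file (873.3 MB), which would otherwise be loaded into memory in one piece.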
Additional details
References
- Nicole M. Roslin, Li Weili, Andrew D. Paterson, Lisa J. Strug. bioRxiv 078600. https://doi.org/10.1101/078600
- The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015). https://doi.org/10.1038/nature15393