Published July 22, 2020 | Version v1
Dataset Open

MyND RNAseq and eQTL results

  • 1. Icahn School of Medicine at Mount Sinai

Description

This dataset is part of the manuscript: "Dysregulation of mitochondrial and proteo-lysosomal genes in Parkinson's disease myeloid cells", by Navarro E, Udine E, et al.

Description of files:

  • MyND_monocyte.cis_eqtl_nominal.txt.gz - Full nominal eQTL summary statistics (gzip-compressed)
  • MyND_monocyte.cis_eqtl_permuted.txt.gz - Full permuted eQTL summary statistics (gzip-compressed)
  • MyND_monocyte.cis_sqtl_nominal.txt.gz - Full nominal sQTL summary statistics (gzip-compressed)
  • MyND_monocyte.cis_sqtl_permuted.txt.gz - Full permuted sQTL summary statistics (gzip-compressed)
  • gencode.v30.primary_assembly.annotation.txt.gz - Gencode (v30) gene annotations used in the analysis (gzip-compressed)
  • monocyte_counts_matrix.txt.gz - RSEM counts from monocyte samples (230 samples) (gzip-compressed)
  • monocyte_tpms_matrix.txt.gz - RSEM TPMs from monocyte samples (230 samples) (gzip-compressed)
  • microglia_counts_matrix.txt.gz - RSEM counts from microglia samples (128 samples - 55 donors) (gzip-compressed)
  • microglia_tpms_matrix.txt.gz - RSEM TPMs from microglia samples (128 samples - 55 donors) (gzip-compressed)
  • processed_seurat_obj.RDS - Seurat R data object file containing single-cell RNA-seq results (14,827 features, 19,144 cells, 10 donors)

Nominal eQTL results include all SNP-gene pairs tested (using a 1 Mb window on each side of the transcription start site (TSS) of each gene). Table columns are formatted as follows:

  1. "pheno_id" - The phenotype ID
  2. "pheno_chr" - The chromosome ID of the phenotype
  3. "pheno_start" - The start position of the phenotype
  4. "pheno_end" - The end position of the phenotype
  5. "pheno_strand" - The strand orientation of the phenotype
  6. "num_var" - The total number of variants tested in cis
  7. "distance" - The distance between the phenotype and the tested variant (accounting for strand orientation)
  8. "snp_id" - The ID of the tested variant
  9. "snp_chr" - The chromosome ID of the variant
  10. "snp_start" - The start position of the variant
  11. "snp_end" - The end position of the variant
  12. "nominal_pval" - The nominal P-value of association between the variant and the phenotype
  13. "slope" - The corresponding regression slope
  14. "lead_snp" - A binary flag equal to 1 if the variant is the top variant in cis
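
As a minimal sketch of how the nominal table could be read, the snippet below maps each row onto the 14 column names listed above and keeps the row flagged as the lead variant for each phenotype. It assumes the file is whitespace-delimited and that a header row, if present, starts with "pheno_id"; neither is stated in this record, so check the file first. The two example rows are invented for illustration and are not taken from the dataset.

```python
import io

# Column order as listed above for the nominal results.
NOMINAL_COLUMNS = [
    "pheno_id", "pheno_chr", "pheno_start", "pheno_end", "pheno_strand",
    "num_var", "distance", "snp_id", "snp_chr", "snp_start", "snp_end",
    "nominal_pval", "slope", "lead_snp",
]
INT_FIELDS = {"pheno_start", "pheno_end", "num_var", "distance",
              "snp_start", "snp_end", "lead_snp"}
FLOAT_FIELDS = {"nominal_pval", "slope"}


def parse_nominal(handle):
    """Yield one dict per SNP-phenotype pair from an open text handle."""
    for line in handle:
        fields = line.split()
        # Assumption: skip blank lines and an optional header row.
        if not fields or fields[0] == "pheno_id":
            continue
        row = dict(zip(NOMINAL_COLUMNS, fields))
        for name in INT_FIELDS:
            row[name] = int(row[name])
        for name in FLOAT_FIELDS:
            row[name] = float(row[name])
        yield row


def lead_variants(rows):
    """Map each phenotype ID to the row whose lead_snp flag equals 1."""
    return {r["pheno_id"]: r for r in rows if r["lead_snp"] == 1}


# Invented example rows (hypothetical IDs and values).
example = io.StringIO(
    "GENE_A chr1 1000 2000 + 2 -500 snp1 chr1 500 500 0.04 0.10 0\n"
    "GENE_A chr1 1000 2000 + 2 300 snp2 chr1 1300 1300 0.001 -0.35 1\n"
)
leads = lead_variants(parse_nominal(example))
print(leads["GENE_A"]["snp_id"])
```

For the real file, the same functions could be fed a handle from `gzip.open("MyND_monocyte.cis_eqtl_nominal.txt.gz", "rt")`, which streams the table line by line without decompressing it to disk.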

Permuted eQTL results include only the top SNP-gene association for each gene (1000 permutations). Table columns are formatted as follows:

  1. "gene_id" - The phenotype ID
  2. "gene_chr" - The chromosome ID of the phenotype
  3. "gene_start" - The start position of the phenotype
  4. "gene_end" - The end position of the phenotype
  5. "gene_strand" - The strand orientation of the phenotype
  6. "num_var" - The total number of variants tested in cis
  7. "distance" - The distance between the phenotype and the tested variant (accounting for strand orientation)
  8. "snp_id" - The ID of the top variant
  9. "snp_chr" - The chromosome ID of the top variant
  10. "snp_start" - The start position of the top variant
  11. "snp_end" - The end position of the top variant
  12. "degree_of_freedom" - The number of degrees of freedom used to compute the P-values
  13. "dummy" - Dummy
  14. "bval1" - The first parameter value of the fitted beta distribution
  15. "bval2" - The second parameter value of the fitted beta distribution (it also gives the effective number of independent tests in the region)
  16. "nominal_pval" - The nominal P-value of association between the phenotype and the top variant in cis
  17. "slope" - The corresponding regression slope
  18. "empirical_pval" - The P-value of association adjusted for the number of variants tested in cis given by the direct method (i.e. empirical P-value)
  19. "beta_dist_pval" - The P-value of association adjusted for the number of variants tested in cis given by the fitted beta distribution. We strongly recommend using this adjusted P-value in any downstream analysis
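
One common downstream step is to correct the per-gene "beta_dist_pval" values for the number of genes tested. The sketch below uses the Benjamini-Hochberg procedure as an illustration; this record does not state which multiple-testing method the authors applied, so the choice of method and the 0.05 threshold are assumptions. The gene IDs and P-values are invented.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented beta_dist_pval values keyed by gene_id (not real data).
beta_dist_pval = {"GENE_A": 0.001, "GENE_B": 0.04, "GENE_C": 0.03, "GENE_D": 0.8}
genes = list(beta_dist_pval)
qvals = benjamini_hochberg([beta_dist_pval[g] for g in genes])

# Genes passing an assumed 5% false discovery rate threshold.
egenes = [g for g, q in zip(genes, qvals) if q <= 0.05]
print(egenes)
```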

Nominal sQTL results include all SNP-junction pairs tested (using a 100kb window from the center of each intron cluster). Table columns are formatted as follows:

  1. "pheno_id" - The phenotype ID
  2. "pheno_chr" - The chromosome ID of the phenotype
  3. "pheno_start" - The start position of the phenotype
  4. "pheno_end" - The end position of the phenotype
  5. "pheno_strand" - The strand orientation of the phenotype
  6. "num_var" - The total number of variants tested in cis
  7. "distance" - The distance between the phenotype and the tested variant (accounting for strand orientation)
  8. "snp_id" - The ID of the tested variant
  9. "snp_chr" - The chromosome ID of the variant
  10. "snp_start" - The start position of the variant
  11. "snp_end" - The end position of the variant
  12. "nominal_pval" - The nominal P-value of association between the variant and the phenotype
  13. "slope" - The corresponding regression slope
  14. "lead_snp" - A binary flag equal to 1 if the variant is the top variant in cis

Permuted sQTL results include only the top SNP-junction association by gene (1000 permutations). Table columns are formatted as follows:

  1. "pheno_id" - The phenotype group ID (here a gene ID)
  2. "pheno_chr" - The chromosome ID of the phenotype group
  3. "pheno_start" - The start position of the phenotype group
  4. "pheno_end" - The end position of the phenotype group
  5. "pheno_strand" - The strand orientation of the phenotype group
  6. "pheno_id" - The top phenotype in the group (here an exon ID)
  7. "num_pheno" - The total number of phenotypes in the group (i.e. #exons)
  8. "num_var" - The total number of variants tested in cis
  9. "distance" - The distance between the phenotype group and the tested variant (accounting for strand orientation)
  10. "snp_id" - The ID of the top variant
  11. "snp_chr" - The chromosome ID of the top variant
  12. "snp_start" - The start position of the top variant
  13. "snp_end" - The end position of the top variant
  14. "degree_of_freedom" - The number of degrees of freedom used to compute the P-values
  15. "dummy" - Dummy
  16. "bval1" - The first parameter value of the fitted beta distribution
  17. "bval2" - The second parameter value of the fitted beta distribution (it also gives the effective number of independent tests in the region)
  18. "nominal_pval" - The nominal P-value of association between the top phenotype and the top variant in cis
  19. "slope" - The corresponding regression slope
  20. "empirical_pval" - The P-value of association adjusted for the number of variants and phenotypes tested in cis given by the direct method (i.e. empirical P-value)
  21. "beta_dist_pval" - The P-value of association adjusted for the number of variants and phenotypes tested in cis given by the fitted beta distribution. We strongly recommend using this adjusted P-value in any downstream analysis
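
Because the name "pheno_id" appears twice in the list above (columns 1 and 6), loading a row into a name-keyed structure would silently drop one of the two values. The sketch below assigns fields by position and relabels the two columns as "group_id" and "top_pheno_id"; these two labels are hypothetical and introduced here only to keep the columns apart. It also assumes whitespace-delimited rows, and the example line (including the phenotype ID format) is invented.

```python
# The 21 columns in listed order; "group_id" and "top_pheno_id" are
# hypothetical labels replacing the two columns both named "pheno_id".
SQTL_PERMUTED_COLUMNS = [
    "group_id", "pheno_chr", "pheno_start", "pheno_end", "pheno_strand",
    "top_pheno_id", "num_pheno", "num_var", "distance",
    "snp_id", "snp_chr", "snp_start", "snp_end",
    "degree_of_freedom", "dummy", "bval1", "bval2",
    "nominal_pval", "slope", "empirical_pval", "beta_dist_pval",
]


def parse_permuted_sqtl_line(line):
    """Split one row and label its fields by position."""
    fields = line.split()
    if len(fields) != len(SQTL_PERMUTED_COLUMNS):
        raise ValueError(
            f"expected {len(SQTL_PERMUTED_COLUMNS)} fields, got {len(fields)}"
        )
    return dict(zip(SQTL_PERMUTED_COLUMNS, fields))


# Invented example row (hypothetical IDs and values).
line = (
    "GENE_A chr1 1000 2000 + chr1:1200:1800:clu_1 3 250 -150 "
    "snp7 chr1 850 850 200 1 1.02 45.3 1e-6 0.42 0.000999 2.1e-4"
)
row = parse_permuted_sqtl_line(line)
print(row["group_id"], row["top_pheno_id"], float(row["beta_dist_pval"]))
```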

NOTE: The effect sizes of eQTLs and sQTLs are defined as the effect of the alternative allele (ALT) relative to the reference allele (REF) in the human genome reference (GRCh38).
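
When comparing these slopes with another study that reports effects for a different allele, the sign may need to be flipped. The helper below is a minimal sketch of that conversion under the ALT-relative-to-REF convention stated in the note. The function name and its arguments are hypothetical, and the REF and ALT alleles are not columns in the tables above, so they would have to come from another source such as the genotype file.

```python
def slope_for_allele(slope, ref, alt, effect_allele):
    """Express an ALT-vs-REF slope relative to the requested allele."""
    if effect_allele == alt:
        # Already in the dataset's convention.
        return slope
    if effect_allele == ref:
        # Effect of REF relative to ALT is the negated slope.
        return -slope
    raise ValueError("effect_allele matches neither REF nor ALT")


# Invented example: slope 0.25 for ALT allele G relative to REF allele A.
print(slope_for_allele(0.25, ref="A", alt="G", effect_allele="G"))
print(slope_for_allele(0.25, ref="A", alt="G", effect_allele="A"))
```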

 

Files (7.4 GB)

  • md5:25de740377d384bad292b90196eac490 - 1.2 MB
  • md5:023e7c02b977e9280b028311c8a67f93 - 7.2 MB
  • md5:d94e70def73144102df98173332c1751 - 9.1 MB
  • md5:e3a9418cd4bae5e5bec165a5de1bebb8 - 12.0 MB
  • md5:5c99e2d6e5ad908921d68283bbfaa3a6 - 10.9 MB
  • md5:0d6496a584b7943d68c28d261bab86ef - 1.7 GB
  • md5:7f9ddae2ad3f831f93731cabc9307d2e - 1.1 MB
  • md5:8668e9d2278a2d1b0183be7cb6f2f818 - 5.3 GB
  • md5:720d6bcb20bfc315e9acecdc09648c5a - 745.2 kB
  • md5:d18a367e2de5d40594f01a1ba569d738 - 355.7 MB
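
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The sketch below computes the digest of a local file in chunks, so the multi-gigabyte files need not fit in memory. The function name and the chunk size are illustrative choices, and the pairing of each checksum with a file name has to be taken from the repository page, since it is not preserved in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at path."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's digest with an expected value such as 'md5:...'."""
    expected = expected_md5.lower().removeprefix("md5:")
    return md5_of_file(path) == expected
```

A call such as `matches_checksum(local_path, "md5:...")`, with the checksum copied from the entry for that file, returns True when the local copy is intact. Note that `str.removeprefix` requires Python 3.9 or later.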