Dataset for "KnowYourCG: Facilitating base-level sparse methylome interpretation" --- MSA
Description
Knowledgebases for the Methylation Screening Array (MSA)
This repository hosts the knowledgebases for the Methylation Screening Array (MSA), a high-throughput platform optimized for screening trait-associated DNA methylation across large populations. These datasets are formatted for the knowYourCG Bioconductor package to facilitate functional enrichment analysis and technical quality control. This video provides an interview with the experts who designed the MSA, explaining its vision and practical applications in epigenetics. The MSA platform uniquely integrates insights from previous EWAS and single-cell methylome profiles. The following knowledgebases are organized by their biological and technical roles.
This repository replaces the legacy repository on GitHub.
1. Genomic Context & Sequence Features
These knowledgebases define the fundamental genomic locations and sequence-level characteristics of the MSA probes.
- AllTelomeres - Telomeric and subtelomeric regions.
- Centromere - Centromeric regions.
- Chromosome - Chromosomal assignments.
- CGI - CpG Islands.
- IntergenicCpGs - Probes located in intergenic regions.
- EvoCons - Phylogenetic evolutionary conservation.
- G4peaks - High-confidence G-quadruplex structures.
- nFlankCG - Nucleotide composition of flanking sequences.
- Tetranuc2 - Tetranucleotide frequency.
2. Epigenomic States & Regulatory Elements
Datasets linking methylation sites to chromatin states and transcriptional regulation.
- ChromHMM, ChromHMMfullStack, & REMCChromHMM - Chromatin state models and Roadmap Epigenomics transitions.
- HM - Histone Modification peaks.
- TFBSrm - Transcription Factor Binding Sites.
- CTCFbind - CTCF insulator binding sites.
- ABCompartment - High-level chromatin A/B compartments.
- PMD - Partially Methylated Domains.
- RoadMapNegGeneExpCpG & RoadMapPosGeneExpCpG - Probes correlated with gene expression levels.
3. Tissue Specificity & Biological Signatures
Knowledgebases focused on cell-type identity and specialized epigenetic phenomena.
- TiSigBLUEPRINT, TiSigBrain, & TiSigLoyfer - Tissue-specific methylation signatures.
- ImprintingDMR - Differentially Methylated Regions in imprinted genes.
- XCILinkedWGBSSorted - X-Chromosome Inactivation signatures.
- CoRSIV - Correlated Regions of Systemic Interindividual Variation.
- IntermediateMeth - Sites of intermediate methylation levels.
- MetagenePC - Gene-level methylation principal components.
4. Technical Metadata & Quality Control
Datasets describing the physical architecture of the MSA array and recommended filtering.
- ProbeType - Infinium Type I vs Type II probe design.
- InfiniumChemistry - Technical chemistry metadata.
- ProbeCGnum - Number of CpGs per probe.
- Mask - Probes flagged for potential technical artifacts.
- Blacklist - Genomic regions known to produce poor quality data.
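To make the idea of functional enrichment against these knowledgebases concrete, the sketch below treats a knowledgebase as a plain set of probe IDs and asks whether a query set overlaps it more than expected by chance, using a one-sided hypergeometric test. This is a generic illustration and not the knowYourCG implementation. The function name `hypergeom_enrichment`, the probe IDs, and the set sizes are all hypothetical.

```python
from math import comb


def hypergeom_enrichment(query, knowledgebase, universe):
    """Return (overlap, p_value) for a one-sided hypergeometric test.

    p_value is the probability of seeing at least `overlap` knowledgebase
    probes when drawing len(query) probes at random from the universe.
    """
    universe = set(universe)
    query = set(query) & universe          # ignore probes not on the array
    kb = set(knowledgebase) & universe
    n_universe, n_kb, n_query = len(universe), len(kb), len(query)
    overlap = len(query & kb)

    total = comb(n_universe, n_query)
    tail = sum(
        comb(n_kb, i) * comb(n_universe - n_kb, n_query - i)
        for i in range(overlap, min(n_kb, n_query) + 1)
    )
    return overlap, tail / total


# Hypothetical toy data: 20 probes on the array, 5 of them in a knowledgebase.
universe = [f"cg{i:08d}" for i in range(20)]
kb_set = universe[:5]
# Query of 4 probes, 3 of which fall in the knowledgebase set.
query_set = [universe[0], universe[1], universe[2], universe[10]]

overlap, p_value = hypergeom_enrichment(query_set, kb_set, universe)
print(overlap, p_value)
```

In practice the universe would be the full set of probes on the array, and the same test would be repeated for every knowledgebase set, followed by a multiple-testing correction.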
External Resources
- Software: knowYourCG (Bioconductor)
- Documentation: Usage Manual for MSA Knowledgebases
- Background Paper: Goldberg and Fu, et al., Science Advances (2025)
Files (72.8 MB)
| MD5 checksum | Size |
|---|---|
| md5:0c8e233b6d984d9db8b47d29fc8d3236 | 1.1 MB |
| md5:79cd7367f201bdc344339c244c9f5b84 | 70 Bytes |
| md5:35c9942b87c79faef9168c5a03aa25be | 2.2 kB |
| md5:5aee91b5122e3f0e63ada78b6ee47551 | 6.9 kB |
| md5:894c6c3b0d27843303d87dedc3da188e | 1.1 MB |
| md5:9e5b3060401aec93c99c1c9ab58f6686 | 1.2 MB |
| md5:8bb7b13869aaf7b25a61b6962dc90a2c | 1.3 MB |
| md5:754ce6850727e43cd59add560233622e | 1.2 MB |
| md5:625b58e8a845f373200b43585f511133 | 15.4 kB |
| md5:4e6ad0c231dc0b00962a1ce28c6151e4 | 33.0 kB |
| md5:814c73932a07446a6f3e74a4fbe47be5 | 142.9 kB |
| md5:17509722a4cb69303271572920c24ffb | 16.1 kB |
| md5:327ba1cd7f7543f8e7db912e687f96c7 | 2.6 MB |
| md5:91318b273efb486c938a4b31a9d9979d | 6.3 kB |
| md5:052f9733f060674d1ff51b4ef4bdaa8d | 1.1 MB |
| md5:ebd32841c071df0a5b648c086f6359ab | 209.2 kB |
| md5:e54b81d1f3cb7774cd566e044870974a | 22.3 kB |
| md5:725d020f931d0b0198de4329ed35eff1 | 620.5 kB |
| md5:31c2899ff0dba60b14ecd73310e63806 | 4.6 MB |
| md5:2b8664da8cca7c4f6ee095d73b54a132 | 1.2 MB |
| md5:654cab8628e756ca74a42519feeab271 | 718.9 kB |
| md5:911426809233729109da833bdefbe9d5 | 1.2 MB |
| md5:764f82c94c7eac419a04b0bbe7a12936 | 1.1 MB |
| md5:2b372ccefac0d74fc44495def1646b6f | 1.2 MB |
| md5:3781d98275776de06a76d38e36668c68 | 281.8 kB |
| md5:833df6ad32bf675afba432afac4717b5 | 294.7 kB |
| md5:854830c3d4f614c201a6289a10508e33 | 68.5 kB |
| md5:243d1d77ade2c10112301c0e55a9cc83 | 45.8 kB |
| md5:8d6ca7df779c43e45adbf3e3ea04b143 | 1.1 MB |
| md5:a80bfdf1e1299eb9e7c241af76d5ad1b | 40.7 MB |
| md5:23bad15eb2651f61828791b80d174ae9 | 2.9 MB |
| md5:56de6584ccf08f5f85d7ff0b450cde7f | 4.5 MB |
| md5:8a6e7bfe93178b734b6648dc029ae8df | 2.3 MB |
| md5:712f1fbcd032af7935f38cae56fc803c | 11.7 kB |
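The md5 values listed for each file can be used to confirm that a download completed without corruption. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `verify_md5` and the file path in the usage comment are hypothetical, since the file names are not shown in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Return True if the file's MD5 matches `expected`.

    `expected` may be given with or without the "md5:" prefix
    used in the file listing.
    """
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Usage (hypothetical local path; pair each file with its own listed checksum):
# ok = verify_md5("downloads/some_knowledgebase_file", "md5:<checksum from the listing>")
```

A mismatch usually indicates a truncated or corrupted download, in which case the file should be fetched again before use.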
Additional details
Related works
- Is supplement to: Dataset 10.1126/sciadv.adw3027 (DOI)