Published January 20, 2026 | Version v2
Dataset Open

Dataset for "KnowYourCG: Facilitating base-level sparse methylome interpretation" --- MSA

  • 1. Children's Hospital of Philadelphia

Description

Knowledgebases for the Methylation Screening Array (MSA)

This repository hosts the knowledgebases for the Methylation Screening Array (MSA), a high-throughput platform optimized for screening trait-associated DNA methylation across large populations. The MSA platform uniquely integrates insights from previous epigenome-wide association studies (EWAS) and single-cell methylome profiles. The datasets are formatted for the knowYourCG Bioconductor package to facilitate functional enrichment analysis and technical quality control. An accompanying video interview with the experts who designed the MSA explains its vision and practical applications in epigenetics. The knowledgebases are organized below by their biological and technical roles.

This repository replaces the legacy repository on GitHub.
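To make the idea of functional enrichment analysis concrete, the sketch below tests whether a query set of probes overlaps a knowledgebase category more than expected by chance, using a one-sided hypergeometric test. This is only an illustration under stated assumptions: it is not the knowYourCG API, the function names are hypothetical, the probe IDs are invented, and the statistic used by the package itself may differ.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, set_size, query_size, overlap):
    """One-sided P(X >= overlap) when query_size probes are drawn without
    replacement from a universe that contains set_size annotated probes."""
    max_overlap = min(set_size, query_size)
    total = comb(universe_size, query_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, query_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def overlap_enrichment(query, knowledgebase_set, universe):
    """Return (overlap count, p-value) for a query against one category.
    Probes outside the universe are ignored."""
    universe = set(universe)
    query = set(query) & universe
    category = set(knowledgebase_set) & universe
    overlap = len(query & category)
    p_value = hypergeom_enrichment_p(
        len(universe), len(category), len(query), overlap
    )
    return overlap, p_value


# Hypothetical probe IDs, invented for this example only.
universe = [f"probe{i}" for i in range(10)]
category = ["probe0", "probe1", "probe2", "probe3"]
query = ["probe0", "probe1", "probe2"]

overlap, p_value = overlap_enrichment(query, category, universe)
print(overlap, p_value)
```

In practice the universe would be the full set of MSA probes and each knowledgebase category would be tested in turn, with correction for multiple testing.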

1. Genomic Context & Sequence Features

These knowledgebases define the fundamental genomic locations and sequence-level characteristics of the MSA probes.

2. Epigenomic States & Regulatory Elements

Datasets linking methylation sites to chromatin states and transcriptional regulation.

3. Tissue Specificity & Biological Signatures

Knowledgebases focused on cell-type identity and specialized epigenetic phenomena.

4. Technical Metadata & Quality Control

Datasets describing the physical architecture of the MSA array and recommended filtering.

  • ProbeType - Infinium Type I vs Type II probe design.

  • InfiniumChemistry - Technical chemistry metadata.

  • ProbeCGnum - Number of CpGs per probe.

  • Mask - Probes flagged for potential technical artifacts.

  • Blacklist - Genomic regions known to produce poor quality data.
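As a rough illustration of how a Mask-style knowledgebase might be applied, the sketch below drops flagged probes from a table of methylation values before downstream analysis. The data layout (a dict of probe ID to beta value), the function name, and the probe IDs are all assumptions made for this example; the actual file format of the Mask knowledgebase is not described here.

```python
def drop_masked(beta_values, masked_probes):
    """Return a copy of beta_values without probes listed in masked_probes."""
    masked = set(masked_probes)
    return {
        probe: beta
        for probe, beta in beta_values.items()
        if probe not in masked
    }


# Hypothetical probe IDs and beta values, invented for this example only.
beta_values = {"probeA": 0.82, "probeB": 0.10, "probeC": 0.47}
masked_probes = ["probeB", "probeZ"]  # probeZ is absent from the data

filtered = drop_masked(beta_values, masked_probes)
print(sorted(filtered))
```

Building the set once keeps each membership check constant-time, which matters when the mask and the data each cover a full array of probes.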

Files (72.8 MB)

Checksum (MD5) Size
md5:0c8e233b6d984d9db8b47d29fc8d3236 1.1 MB
md5:79cd7367f201bdc344339c244c9f5b84 70 Bytes
md5:35c9942b87c79faef9168c5a03aa25be 2.2 kB
md5:5aee91b5122e3f0e63ada78b6ee47551 6.9 kB
md5:894c6c3b0d27843303d87dedc3da188e 1.1 MB
md5:9e5b3060401aec93c99c1c9ab58f6686 1.2 MB
md5:8bb7b13869aaf7b25a61b6962dc90a2c 1.3 MB
md5:754ce6850727e43cd59add560233622e 1.2 MB
md5:625b58e8a845f373200b43585f511133 15.4 kB
md5:4e6ad0c231dc0b00962a1ce28c6151e4 33.0 kB
md5:814c73932a07446a6f3e74a4fbe47be5 142.9 kB
md5:17509722a4cb69303271572920c24ffb 16.1 kB
md5:327ba1cd7f7543f8e7db912e687f96c7 2.6 MB
md5:91318b273efb486c938a4b31a9d9979d 6.3 kB
md5:052f9733f060674d1ff51b4ef4bdaa8d 1.1 MB
md5:ebd32841c071df0a5b648c086f6359ab 209.2 kB
md5:e54b81d1f3cb7774cd566e044870974a 22.3 kB
md5:725d020f931d0b0198de4329ed35eff1 620.5 kB
md5:31c2899ff0dba60b14ecd73310e63806 4.6 MB
md5:2b8664da8cca7c4f6ee095d73b54a132 1.2 MB
md5:654cab8628e756ca74a42519feeab271 718.9 kB
md5:911426809233729109da833bdefbe9d5 1.2 MB
md5:764f82c94c7eac419a04b0bbe7a12936 1.1 MB
md5:2b372ccefac0d74fc44495def1646b6f 1.2 MB
md5:3781d98275776de06a76d38e36668c68 281.8 kB
md5:833df6ad32bf675afba432afac4717b5 294.7 kB
md5:854830c3d4f614c201a6289a10508e33 68.5 kB
md5:243d1d77ade2c10112301c0e55a9cc83 45.8 kB
md5:8d6ca7df779c43e45adbf3e3ea04b143 1.1 MB
md5:a80bfdf1e1299eb9e7c241af76d5ad1b 40.7 MB
md5:23bad15eb2651f61828791b80d174ae9 2.9 MB
md5:56de6584ccf08f5f85d7ff0b450cde7f 4.5 MB
md5:8a6e7bfe93178b734b6648dc029ae8df 2.3 MB
md5:712f1fbcd032af7935f38cae56fc803c 11.7 kB
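
The MD5 checksums listed for the files can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard hashlib module; the helper names are hypothetical, and the demonstration runs on a throwaway temporary file, not on a file from this record.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Demonstration on a temporary file created only for this example.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example bytes")
    demo_path = tmp.name

# A file always matches its own freshly computed checksum ...
demo_ok = matches_listing(demo_path, "md5:" + md5_of_file(demo_path))
# ... and does not match a checksum belonging to a different file.
demo_mismatch = matches_listing(
    demo_path, "md5:0c8e233b6d984d9db8b47d29fc8d3236"
)
os.remove(demo_path)

print(demo_ok, demo_mismatch)
```

For a real download, pass the local file path and the checksum shown for that file to `matches_listing`; a `False` result suggests the file is incomplete or corrupted and should be downloaded again.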

Additional details

Related works

Is supplement to
Dataset: 10.1126/sciadv.adw3027 (DOI)