Published January 22, 2026 | Version v2
Dataset Open

Dataset for "KnowYourCG: Facilitating base-level sparse methylome interpretation" --- EPIC

  • 1. Children's Hospital of Philadelphia

Description

Knowledgebases for the MethylationEPIC Array (EPIC v1.0)

This repository hosts curated knowledgebases for the Infinium MethylationEPIC (v1.0) array. Each knowledgebase is stored as an RDS file for use with the knowYourCG Bioconductor package. Together they supply the biological and technical annotations needed for functional enrichment analysis of EPIC array data.
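As a rough illustration of what enrichment against a knowledgebase means, the sketch below tests whether a query probe set overlaps a knowledgebase probe set more than expected by chance, using a one-sided hypergeometric test. It is not the knowYourCG implementation or API. The function name `overlap_enrichment` and the probe IDs are hypothetical, and Python is used only for illustration (the package itself is an R package). The exact binomial arithmetic is practical only for small toy sets like this one.

```python
from math import comb


def overlap_enrichment(query, knowledgebase, universe):
    """Return (overlap, p_value) for a one-sided hypergeometric test.

    p_value is the probability of seeing at least `overlap` knowledgebase
    probes when drawing len(query) probes at random from the universe.
    """
    universe = set(universe)
    query = set(query) & universe                  # ignore probes outside the universe
    knowledgebase = set(knowledgebase) & universe
    n_universe, n_query, n_kb = len(universe), len(query), len(knowledgebase)
    overlap = len(query & knowledgebase)

    total = comb(n_universe, n_query)              # all possible query sets of this size
    tail = sum(
        comb(n_kb, k) * comb(n_universe - n_kb, n_query - k)
        for k in range(overlap, min(n_query, n_kb) + 1)
    )
    return overlap, tail / total


# Toy data: 20 hypothetical probes, 5 of them annotated in the knowledgebase.
universe = [f"probe{i:02d}" for i in range(1, 21)]
knowledgebase = universe[:5]
query = ["probe01", "probe02", "probe03", "probe10"]

overlap, p_value = overlap_enrichment(query, knowledgebase, universe)
print(overlap, round(p_value, 4))
```

The choice of universe (here, all probes in the toy array) strongly affects the result, so in practice it should match the set of probes that could have appeared in the query.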

1. Technical Metadata & Quality Control

Datasets describing the physical properties of the EPIC array and recommended filtering parameters.

  • ProbeType — Infinium Type I vs Type II probe design.

  • InfiniumChemistry — Technical chemistry metadata for the EPIC platform.

  • Mask — Probes flagged for potential technical artifacts, SNPs, or cross-reactivity.

  • Blacklist — Genomic regions known to produce anomalous or unreliable signals.
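A mask of this kind is typically applied by dropping flagged probes before any downstream analysis. The sketch below shows that step on toy data, assuming the mask has already been reduced to a set of probe IDs. The function name `drop_masked`, the probe IDs, and the beta values are hypothetical and are not taken from the dataset.

```python
def drop_masked(betas: dict[str, float], masked: set[str]) -> dict[str, float]:
    """Return a copy of `betas` without probes listed in `masked`."""
    return {probe: beta for probe, beta in betas.items() if probe not in masked}


# Toy methylation values keyed by hypothetical probe IDs.
betas = {"probeA": 0.12, "probeB": 0.87, "probeC": 0.45}
masked = {"probeB"}  # hypothetical probe flagged by a mask

kept = drop_masked(betas, masked)
print(sorted(kept))
```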

2. Genomic Context & Sequence Features

Knowledgebases defining the DNA sequence characteristics and genomic landmarks of the EPIC probes.

  • CGI — CpG Island (CGI) associations.

  • nFlankCG — Nucleotide composition of sequences flanking the targeted CpGs.

  • Tetranuc2 — Tetranucleotide frequency signatures.

  • rmsk1 & rmsk2 — RepeatMasker repetitive elements (SINEs, LINEs, LTRs, etc.).
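Sequence features like these are derived from the bases around each targeted CpG. The sketch below shows one plausible way to compute two such features from a sequence: the number of CpG dinucleotides in the flanking window, and the tetranucleotide formed by the CpG plus one base on each side. How the nFlankCG and Tetranuc2 knowledgebases were actually built is not described here, so the window size, the function name `cpg_context`, and the example sequence are all assumptions.

```python
def cpg_context(seq: str, c_index: int, flank: int = 10):
    """Return (n_flank_cg, tetranucleotide) for the CpG whose C is at c_index."""
    seq = seq.upper()
    if seq[c_index:c_index + 2] != "CG":
        raise ValueError("c_index does not point at the C of a CpG")

    # Count CpGs on each side separately so the target CpG is excluded.
    left = seq[max(0, c_index - flank):c_index]
    right = seq[c_index + 2:c_index + 2 + flank]
    n_flank_cg = left.count("CG") + right.count("CG")

    # One base upstream + CG + one base downstream, if both neighbours exist.
    if c_index >= 1 and c_index + 3 <= len(seq):
        tetranuc = seq[c_index - 1:c_index + 3]
    else:
        tetranuc = None
    return n_flank_cg, tetranuc


# Hypothetical sequence; the target CpG starts at index 6.
result = cpg_context("TTCGAACGTTCGAA", c_index=6, flank=4)
print(result)
```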

3. Epigenomic States & Regulatory Elements

Functional annotations relating methylation sites to chromatin accessibility and gene regulation.

  • ChromHMM & REMCChromHMM — Multi-tissue chromatin state models and Roadmap Epigenomics data.

  • HM — Histone modification peak overlaps (e.g., H3K4me3, H3K27ac).

  • TFBSrm — Transcription Factor Binding Sites.

  • CTCFbind — CTCF binding and genomic insulator sites.

  • ABCompartment — Higher-order chromatin A/B compartments.

  • PMD — Partially Methylated Domains.

4. Biological Signatures & Special Features

Knowledgebases focused on specific biological phenomena and gene-level summaries.

  • ImprintingDMR — Differentially Methylated Regions associated with genomic imprinting.

  • MetagenePC — Principal components summarizing gene-level methylation profiles.

Files (187.2 MB)

MD5 checksum                      Size
bd23eab9c6ceb9c5a0871c9bcda0d816  2.7 MB
da92581eacb76f698bb64f01dcacf63a  1.8 kB
3257c335e875c9195e545d70e733d916  2.8 MB
ff26bd6a89597812746cb94f82d518b8  2.9 MB
8f65a4622c687644a1fcd87e0d589f53  112.6 kB
a6751d35bdb8c9cff0c63aa79e9849a9  8.7 MB
72799ed72990d354389cc8279b5a75fb  2.7 MB
2a47f3c9890e6611b8c1c1ffadc385b9  12.2 MB
20664e7f4f7117f7944102c947097841  2.9 MB
be1c0c281ed97755c394a2e5d3aca7ff  705.5 kB
35f28257e2ddc9e6458a7c74b0e9c709  756.1 kB
ad87e917361fd2530a9209f87b2d3495  2.7 MB
1b6ea06d7092f967276b3ccc08d8df24  148.1 MB
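The MD5 checksums listed above can be used to confirm that a download is complete and uncorrupted. A minimal sketch using Python's standard `hashlib` module follows. The helper names `md5_of` and `verify` are hypothetical, and the demonstration runs on a small temporary file rather than on a real file from this record.

```python
import hashlib
import os
import tempfile


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True if the file's MD5 matches the expected hex digest."""
    return md5_of(path) == expected_md5.strip().lower()


# Demonstration on a temporary file with known content.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
demo_path = tmp.name

demo_ok = verify(demo_path, "5d41402abc4b2a76b9719d911017c592")
os.remove(demo_path)
print(demo_ok)
```

For a real download, call `verify` with the local path of the file and the checksum shown next to it in the listing.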

Additional details

Related works

Is supplement to
Dataset: 10.1126/sciadv.adw3027 (DOI)