Published December 22, 2025
| Version v1
Dataset
Open
CryoFM EMDB Data Lists
Authors/Creators
- 1. ByteDance Seed
Description
This dataset provides CSV lists containing structured metadata for cryo-electron microscopy (cryo-EM) map processing tasks used in the CryoFM research papers. The data lists are curated from entries in the Electron Microscopy Data Bank (EMDB) and organized into CSV files with detailed metadata for training, validation, and testing of deep learning models.
Dataset Contents
This repository contains CSV data lists for two main research projects:
1. CryoFM1 (ICLR 2025): CSV lists for cryo-EM map processing at two different resolutions
- cryofm1_1-5apix_dataset: High-resolution dataset (~1.5 Å/pixel)
- cryofm1_3apix_dataset: Standard-resolution dataset (~3.0 Å/pixel)
2. CryoFM2: CSV lists for foundation model pre-training and fine-tuning
- cryofm2_pretrain_dataset: Pre-training dataset with half-map pairs
- cryofm2_emhancer_dataset: Enhancement dataset with half-map pairs and model-based LocScale maps
- cryofm2_emready_dataset: EMReady dataset with deposited and simulated maps
CSV List Structure
CryoFM1 CSV lists contain: EMDB ID, relative path to the map file, voxel dimensions (nz, ny, nx), and pixel size (apix). Maps are resampled to the pixel size of the corresponding dataset (1.5 or 3.0 Å/pixel).
CryoFM2 CSV lists contain: map paths, statistical features (mean, std, quantile_max_value), pixel size (apix), and EMDB/PDB IDs. All maps are resampled to 1.5 Å/pixel.
Detailed schema descriptions are provided in `schema.md` files within each dataset directory.
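As a rough illustration of how a CryoFM1-style list could be consumed, the sketch below parses one CSV with Python's standard `csv` module and derives the physical box size of each map from its voxel dimensions and pixel size. The column names (`emdb_id`, `path`, `nz`, `ny`, `nx`, `apix`) and the sample row are assumptions made for this example; the actual headers are defined in each dataset's `schema.md` and may differ.

```python
import csv
import io


def read_map_list(csv_file):
    """Yield one dict per map, with numeric fields parsed.

    Column names are hypothetical; check schema.md for the real ones.
    """
    for row in csv.DictReader(csv_file):
        apix = float(row["apix"])
        shape = tuple(int(row[key]) for key in ("nz", "ny", "nx"))
        yield {
            "emdb_id": row["emdb_id"],
            "path": row["path"],
            "shape": shape,
            "apix": apix,
            # Physical extent of the box along z, y, x in angstroms.
            "box_angstrom": tuple(n * apix for n in shape),
        }


# Made-up sample row, used only to exercise the parser.
sample = io.StringIO(
    "emdb_id,path,nz,ny,nx,apix\n"
    "EMD-0000,maps/emd_0000.mrc,64,64,64,1.5\n"
)
entries = list(read_map_list(sample))
print(entries[0]["shape"], entries[0]["box_angstrom"])
```

For a real list, the same function can be given an open file handle, for example `open(list_path, newline="")`, where `list_path` points to one of the CSV files in the archive.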
Note
This dataset contains CSV metadata lists only; the actual map files are not included. Map files should be downloaded from EMDB using the provided EMDB IDs.
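One possible way to turn an EMDB ID into a download location is sketched below. It assumes the EBI archive layout `structures/EMD-<id>/map/emd_<id>.map.gz` for the primary deposited map; the helper name `emdb_map_url` is hypothetical and not part of the CryoFM code base. Half-maps and other additional maps are stored elsewhere in each entry and are not covered by this sketch.

```python
import re

# Assumed base URL of the EMDB archive at EBI.
EMDB_BASE = "https://ftp.ebi.ac.uk/pub/databases/emdb/structures"


def emdb_map_url(emdb_id):
    """Build the assumed URL of the primary map for an EMDB accession.

    Accepts "EMD-1234", "emd-1234", or "1234". Pass IDs as strings so
    that leading zeros in the accession number are preserved.
    """
    match = re.fullmatch(r"(?:EMD-?)?(\d+)", str(emdb_id).strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"Unrecognized EMDB ID: {emdb_id!r}")
    number = match.group(1)
    return f"{EMDB_BASE}/EMD-{number}/map/emd_{number}.map.gz"


print(emdb_map_url("EMD-1234"))
```

The resulting URL can be passed to any HTTP client; the downloaded file is a gzip-compressed map that still has to be resampled to the pixel size stated in the corresponding CSV list.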
Files
| Name | Size | Checksum |
|---|---|---|
| cryoFM-emdb-lists.zip | 358.3 kB | md5:dede72b9ecd120b8f6cb5dd9b3a94b59 |
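After downloading the archive, its integrity can be checked against the MD5 checksum listed for the file. The following is a minimal sketch using only the standard library; the helper name `file_md5` and the local file path in the usage note are assumptions.

```python
import hashlib

# Checksum listed for cryoFM-emdb-lists.zip in this record.
EXPECTED_MD5 = "dede72b9ecd120b8f6cb5dd9b3a94b59"


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

With the archive saved in the current directory, `file_md5("cryoFM-emdb-lists.zip") == EXPECTED_MD5` should evaluate to `True` for an intact download.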
Additional details
Related works
- Is supplement to
- Conference paper: arXiv:2410.08631 (arXiv)
- Preprint: 10.64898/2025.12.29.696802 (DOI)
Software
- Repository URL: https://github.com/ByteDance-Seed/cryofm
- Programming language: Python
- Development Status: Active