Published June 19, 2024 | Version v2
Dataset Open

DiffModeler: Large Macromolecular Structure Modeling in Low-Resolution Cryo-EM Maps Using Diffusion Model

Description

Here, we store the modeled structures generated by DiffModeler for its four benchmark datasets: the CryoREAD dataset (0-5 Å resolution, protein-DNA/RNA complexes), the ModelAngelo dataset (0-5 Å resolution, mostly protein complexes, with a few protein-RNA complexes), the intermediate resolution dataset (5-10 Å resolution, protein complexes), and the low resolution dataset (10-20 Å resolution, protein complexes). For all protein-DNA/RNA complexes, the maps are modeled by CryoREAD+DiffModeler.

For each dataset, the structures modeled by DiffModeler are named [EMD-ID]_DiffModeler.cif, and their corresponding native structures from RCSB are saved as [EMD-ID]_[PDB_ID]_native.cif.
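The naming scheme above makes it possible to pair each modeled structure with its native structure by file name alone. The following is a minimal sketch of that pairing, not part of the DiffModeler code base: the function name `pair_models_with_natives` is hypothetical, and the exact form of the [EMD-ID] and [PDB_ID] parts is an assumption, so the patterns match them loosely.

```python
import re
from pathlib import Path

# Patterns follow the naming scheme in the description. How EMD IDs and
# PDB IDs are written inside the archive is an assumption, so the ID parts
# are matched loosely (PDB ID = last underscore-free token before "_native").
MODEL_RE = re.compile(r"^(?P<emd>.+)_DiffModeler\.cif$")
NATIVE_RE = re.compile(r"^(?P<emd>.+)_(?P<pdb>[^_]+)_native\.cif$")


def pair_models_with_natives(filenames):
    """Return {emd_id: {"model", "pdb_id", "native"}} for IDs that have both files."""
    models, natives = {}, {}
    for name in filenames:
        base = Path(name).name
        model_match = MODEL_RE.match(base)
        if model_match:
            models[model_match.group("emd")] = name
            continue
        native_match = NATIVE_RE.match(base)
        if native_match:
            natives[native_match.group("emd")] = (native_match.group("pdb"), name)

    pairs = {}
    for emd_id, model in models.items():
        if emd_id in natives:
            pdb_id, native = natives[emd_id]
            pairs[emd_id] = {"model": model, "pdb_id": pdb_id, "native": native}
    return pairs
```

Calling the function on the file names found in one dataset folder returns only the targets for which both the model and the native structure are present, which is a convenient starting point for any per-target comparison.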

The CryoREAD dataset includes 61 targets, the ModelAngelo dataset includes 28 targets, the intermediate resolution dataset includes 71 targets, and the low resolution dataset includes 6 targets.

For the intermediate resolution dataset, many maps were run with inaccurate AF2-predicted single-chain structures. We also benchmarked DiffModeler's performance using native single-chain structures as input; these results are saved under the "dataset_5_10A_nativechain" folder.

Additionally, we have stored the traced backbone maps of the intermediate resolution dataset in the "dataset_5_10A_diffusion_traced_backbone_map" folder. The traced maps are saved as [EMD-ID]_diffusion.mrc. The intermediate reverse diffusion maps of the intermediate resolution dataset are saved in the "dataset_5_10A_reverse_diffusion_maps" folder. Each sub-folder is named according to the corresponding map's [EMD-ID] and contains three intermediate reverse diffusion maps: 20percentile_reverse_diffusion.mrc, 50percentile_reverse_diffusion.mrc, and 80percentile_reverse_diffusion.mrc. A higher percentile indicates a map closer to the end of the reverse diffusion steps.
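The folder layout described above can be turned into file paths programmatically. The sketch below only builds the expected paths and does not read any map data. It assumes that both folders sit directly under the directory where the archive was extracted (the `root` argument), which the description does not state explicitly, and the helper names are hypothetical.

```python
from pathlib import Path

# Percentiles named in the description, ordered from earlier to later
# reverse diffusion steps (a higher percentile is closer to the end).
PERCENTILES = (20, 50, 80)


def reverse_diffusion_maps(root, emd_id):
    """Expected paths of the three intermediate maps for one target, in step order."""
    # Assumption: the folder sits directly under the extraction root.
    folder = Path(root) / "dataset_5_10A_reverse_diffusion_maps" / emd_id
    return [folder / f"{p}percentile_reverse_diffusion.mrc" for p in PERCENTILES]


def traced_backbone_map(root, emd_id):
    """Expected path of the traced backbone map for one target."""
    folder = Path(root) / "dataset_5_10A_diffusion_traced_backbone_map"
    return folder / f"{emd_id}_diffusion.mrc"
```

Checking `path.exists()` on the returned paths before opening them is advisable, since the actual nesting inside the archive may differ from this assumption.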

If you used DiffModeler, please cite: Wang, Xiao, Han Zhu, Genki Terashi, Manav Taluja, and Daisuke Kihara. "DiffModeler: Large Macromolecular Structure Modeling in Low-Resolution Cryo-EM Maps Using Diffusion Model." bioRxiv (2024): 2024-01.

If you used CryoREAD, please cite: Xiao Wang, Genki Terashi & Daisuke Kihara. "De novo structure modeling for nucleic acids in cryo-EM maps using deep learning." Nature Methods, 2023.

Files

Files (1.2 GB)

md5:70b2998b0864c7905db6e6d1ad4d890c (1.2 GB)
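The MD5 checksum listed above can be used to confirm that the 1.2 GB download is complete and uncorrupted. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the path of the downloaded file is left to the caller because the archive name is not given here.

```python
import hashlib

# Checksum taken from the file listing on this record.
EXPECTED_MD5 = "70b2998b0864c7905db6e6d1ad4d890c"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

Reading in chunks keeps memory use constant regardless of archive size, which matters for a file of this size.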

Additional details

Related works

Is published in
Dataset: 10.1101/2024.01.20.576370 (DOI)

Dates

Submitted
2024-06-19
DiffModeler: Large Macromolecular Structure Modeling in Low-Resolution Cryo-EM Maps Using Diffusion Model

Software

Repository URL
https://github.com/kiharalab/DiffModeler
Programming language
Python
Development Status
Active

References

  • Wang, X., Zhu, H., Terashi, G., Taluja, M., & Kihara, D. (2024). DiffModeler: Large Macromolecular Structure Modeling in Low-Resolution Cryo-EM Maps Using Diffusion Model. bioRxiv, 2024-01.