Published November 23, 2025
| Version v2
Journal
Open
PropMolFlow: Property-Guided Molecule Generation with Geometry-Complete Flow Matching
Description
GitHub:
https://github.com/Liu-Group-UF/PropMolFlow
arXiv preprint:
PropMolFlow: Property-guided Molecule Generation with Geometry-Complete Flow Matching
- "qm9-sdf-data.zip" contains two versions of revised QM9 SDF data: rQM9_v0.sdf is an earlier version used in the paper, and rQM9_v1.sdf is the latest one with fewer problematic molecules that cannot be fixed.
- "baseline-models.zip" contains the structures sampled from the baseline models in xyz format, together with SDF files converted from them using distance-based methods to assign bond orders. It also includes the full CSV results of the PoseBusters analysis. Note that JODO outputs SDF files directly, so no xyz files are provided for it. The conditional checkpoint models for GeoLDM are also included.
- "dft-xyz-files.zip" includes the filtered structures and their DFT-calculated properties (where available), with and without structural relaxation, for both the in-distribution (ID) and out-of-distribution (OOD) tasks.
- "propmolflow-full-results.zip" contains the MAE results for both ID and OOD tasks for all checkpoint models; the structural validity metrics (atom stability, molecule stability, RDKit validity, uniqueness and validity, closed-shell ratio, and PoseBusters ratio) in the ID tasks for all checkpoint models; and the full PoseBusters results for all checkpoint models.
- "paper-figure-source-data.zip" contains the raw data for main text figures and extended data figures.
- "sampling-property-values-and-atom-counts.zip" includes the NumPy ".npy" files for the property values and the corresponding atom counts used to generate structures in the ID and OOD tasks.
- "generated-molecules-in-sdf.zip" contains the sampled structures in SDF format for both the ID (10000 molecules) and OOD (1000 molecules) tasks.
- "egnn-classifier.zip" contains the pretrained model and property normalizer file required to use an EGNN property predictor based on the Equivariant Diffusion Model work. Two notebooks are included for demonstration: xyz_to_sdf.ipynb shows how to convert generated xyz files to SDF files based on bond distances, and egnn_property_predictor.ipynb shows how to load the pretrained models and run property predictions. Note that bond orders are not actually used by the EGNN property predictors; we use SDF files as inputs because our PropMolFlow models output SDF files. The pretrained models are from https://github.com/gracezhao1997/EEGSDE
- "checkpoints.tar" includes the best-performing model checkpoints, selected based on the lowest MAE for both in-distribution and out-of-distribution tasks (see version 1).
- "with_gaussian.tar" and "without_gaussian.tar" contain 180 model checkpoints (the top-3 checkpoints with the lowest validation losses for each embedding method), corresponding to different property handling methods and the inclusion or exclusion of Gaussian expansion (see version 1).
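
As a quick sanity check after unpacking the SDF archives described above, the number of molecule records in a file can be compared with the stated sample sizes (10000 for ID, 1000 for OOD). The following is a minimal sketch using only the Python standard library. It relies on the SDF convention that each record ends with a line containing `$$$$`. The function name and any file path passed to it are hypothetical, since the internal layout of the archives is not described here.

```python
def count_sdf_records(path):
    """Count molecule records in an SDF file.

    Each record in an SDF file is terminated by a line containing only
    '$$$$', so counting those lines gives the number of records.
    """
    count = 0
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.strip() == "$$$$":
                count += 1
    return count
```

For example, `count_sdf_records("some_id_task_samples.sdf")` (a placeholder file name) would be expected to return 10000 if the file holds one complete set of ID samples. This only counts records and does not check chemical validity.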
Files
Total size: 3.7 GB

| Checksum | Size |
|---|---|
| md5:c67407f035b420821e1c33108e76e13e | 3.0 GB |
| md5:aeb996533fa206c78368a495a4072421 | 13.1 MB |
| md5:434e4c6c78c1b131cf235240726399cb | 242.7 MB |
| md5:77730210c74ac47e4a34f40d0d36661f | 17.5 MB |
| md5:418e35d9525c5fc6143fd3027bb959a1 | 723.7 kB |
| md5:4a5e33262ba8687246c8bc0fb3a75ee7 | 302.2 MB |
| md5:acaae713ee4e6ee9c3ed3dc18ce6264e | 106.8 MB |
| md5:314731a7a0803e80d8c5a6065225ed18 | 269.6 kB |
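
The md5 values listed above can be used to confirm that a download is intact. Below is a minimal sketch using Python's standard `hashlib` module. The helper names and the file path in the usage note are hypothetical, and which checksum belongs to which archive has to be taken from the record page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that multi-gigabyte archives do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a listed checksum such as 'md5:abc123...'.

    The optional 'md5:' prefix is stripped before comparing.
    """
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For instance, `matches_listed_md5("downloaded-archive.zip", "md5:aeb996533fa206c78368a495a4072421")` returns `True` only if the local file (a placeholder name here) has exactly that digest.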