Published November 26, 2025 | Version v3
Journal Open

PropMolFlow: Property-Guided Molecule Generation with Geometry-Complete Flow Matching

  • 1. University of Florida

Description

GitHub:

https://github.com/Liu-Group-UF/PropMolFlow

Paper in Nature Computational Science:

PropMolFlow: Property-guided Molecule Generation with Geometry-Complete Flow Matching

  • "qm9-sdf-data.zip" contains two revised versions of the QM9 SDF data: rQM9_v0.sdf is the earlier version used in the paper, and rQM9_v1.sdf is the latest version, with fewer unfixable problematic molecules. The original QM9 SDF file and the property value CSV file are also provided. (In Zenodo version 2, the problematic indices were the same for rQM9_v0.sdf and rQM9_v1.sdf; this issue is fixed in Zenodo version 3.)
  • "propmolflow-full-results.zip" contains the MAE results for both in-distribution (ID) and out-of-distribution (OOD) tasks for all checkpoint models, along with the structural validity metrics (atom stability, molecule stability, RDKit validity, uniqueness and validity, closed-shell ratio, and PoseBusters validity) in the ID tasks for all checkpoint models. The full PoseBusters results in CSV format are also included.
  • "generated-molecules-in-sdf.zip" comprises the sampled structures in SDF format for both ID (10000 molecules) and OOD (1000 molecules) tasks.
  • "baseline-model-results.zip" contains the sampled structures in XYZ format and the corresponding SDF files, converted using distance-based methods to add bond orders. It also includes the full PoseBusters results in CSV format. Note that JODO outputs SDF files directly, so it has no XYZ files. The conditional checkpoint models for GeoLDM are also included.
  • "sampling-property-values-and-atom-counts.zip" includes the NumPy ".npy" files for the property values and the corresponding atom counts used to sample molecules in the ID and OOD tasks.
  • "dft-xyz-files.zip" includes the filtered structures and their DFT-calculated properties (where available), with and without structural relaxation, in both in-distribution (ID) and out-of-distribution (OOD) tasks.
  • "paper-figure-source-data.zip" contains the raw data for main text figures and extended data figures. 
  • "egnn-classifier.zip" contains the pretrained model and the property normalizer file required to use an EGNN property predictor based on the Equivariant Diffusion Model work. Two notebooks are included for demonstration: xyz_to_sdf.ipynb shows how to convert generated XYZ files to SDF files based on bond distances, and egnn_property_predictor.ipynb shows how to load the pretrained models and make predictions with them. Note that bond orders are not actually used by the EGNN property predictors; we use SDF files as inputs because our PropMolFlow models output SDF files and we compared the performance of the EGNN and GVP predictors. Pretrained models are from https://github.com/gracezhao1997/EEGSDE
  • "checkpoints.tar" includes the best-performing model checkpoints, selected based on the lowest MAE for both in-distribution and out-of-distribution tasks. (see version 1)
  • "with_gaussian.tar" and "without_gaussian.tar" contain 180 model checkpoints (top-3 model checkpoints for each embedding method with lowest validation losses), corresponding to different property handling methods and the inclusion or exclusion of Gaussian expansion. (see version 1)
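
The description above mentions converting XYZ files to SDF files using bond distances. As a rough illustration of the general idea only, the sketch below infers connectivity from interatomic distances: two atoms are treated as bonded when their distance is at most the sum of their covalent radii plus a tolerance. This is not the code in xyz_to_sdf.ipynb or the method used for the baselines; the function names, the radii table, and the 0.4 Å tolerance are assumptions chosen for illustration, and the sketch detects only which atoms are bonded, not bond orders.

```python
import math

# Approximate single-bond covalent radii in angstroms for the QM9 elements.
# These values and the tolerance below are illustrative assumptions.
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66, "F": 0.57}


def parse_xyz(text):
    """Parse a single-molecule XYZ string into (symbols, coords)."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])          # line 1: atom count
    symbols, coords = [], []
    for line in lines[2:2 + n_atoms]:  # line 2 is a comment; atoms follow
        parts = line.split()
        symbols.append(parts[0])
        coords.append(tuple(float(x) for x in parts[1:4]))
    return symbols, coords


def detect_bonds(symbols, coords, tolerance=0.4):
    """Return index pairs (i, j), i < j, whose distance suggests a bond."""
    bonds = []
    for i in range(len(symbols)):
        for j in range(i + 1, len(symbols)):
            cutoff = COVALENT_RADII[symbols[i]] + COVALENT_RADII[symbols[j]] + tolerance
            if math.dist(coords[i], coords[j]) <= cutoff:
                bonds.append((i, j))
    return bonds


# A hand-written water geometry used only as a small example input.
WATER_XYZ = """3
water
O  0.000000  0.000000  0.117300
H  0.000000  0.757200 -0.469200
H  0.000000 -0.757200 -0.469200
"""

symbols, coords = parse_xyz(WATER_XYZ)
print(detect_bonds(symbols, coords))
```

Assigning bond orders, which an SDF file requires, needs an additional step (for example valence-based rules) that this sketch does not attempt.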

Files

Files (3.4 GB)

Checksum Size
md5:dee7d542fa13417bd5b62822124c0778 2.7 GB
md5:aeb996533fa206c78368a495a4072421 13.1 MB
md5:e22d990e22d74b413f2b4ff73d36efeb 242.7 MB
md5:77730210c74ac47e4a34f40d0d36661f 17.5 MB
md5:35a6c4b8a77f0597ff56965cf5ecf421 605.7 kB
md5:4a5e33262ba8687246c8bc0fb3a75ee7 302.2 MB
md5:b8146a429edaac13ab82d12570181d25 106.8 MB
md5:314731a7a0803e80d8c5a6065225ed18 269.6 kB

Additional details

Dates

Updated
2025-11-26