MolCore: A Rust-Accelerated AI-Native Cheminformatics Substrate for Scalable Molecular Intelligence
Authors/Creators
Description
Modern cheminformatics systems are increasingly constrained by the architectural mismatch between legacy molecular toolchains and the computational demands of contemporary artificial intelligence. This paper introduces MolCore, a Rust-accelerated, AI-native cheminformatics substrate designed to unify high-performance molecular computation, graph representation learning, and scalable interoperability with modern deep learning frameworks. MolCore integrates immutable graph-based molecular structures, parallelized Extended-Connectivity Fingerprint generation, host-side zero-copy tensor interoperability, and deterministic preprocessing into a cohesive execution engine.
By replacing Python-bound graph conversion loops with native Rust multithreading and memory-contiguous tensor arrays, MolCore reduces data-loading bottlenecks in molecular machine learning workflows. We evaluate MolCore against standard cheminformatics baselines on one million molecules sampled from PCQM4Mv2, using ablation studies that isolate the effects of Rust execution, multiprocessing, memory-copy elimination, and PyO3-backed tensor transfer. In our benchmark configuration, MolCore achieves 14.8× higher throughput than optimized Python multiprocessing for parallelized ECFP4 generation, with a substantially lower peak memory footprint, and a 68× speedup over serial PyTorch Geometric graph extraction.
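To make the fingerprinting step concrete, the following is a minimal pure-Python sketch of a Morgan-style (ECFP-like) fingerprint written into a contiguous byte buffer. It is not MolCore's implementation and is not bit-compatible with RDKit's ECFP4: the function names (`toy_ecfp`, `_hash32`), the use of atomic numbers as the only atom invariant, the BLAKE2 hash, and the omission of duplicate-substructure removal are all simplifying assumptions made for illustration.

```python
import hashlib
from array import array


def _hash32(values):
    """Deterministic 32-bit hash of a sequence of ints (stable across runs)."""
    data = ",".join(str(v) for v in values).encode()
    digest = hashlib.blake2b(data, digest_size=4).digest()
    return int.from_bytes(digest, "little")


def toy_ecfp(atom_invariants, bonds, radius=2, n_bits=2048):
    """Simplified Morgan-style fingerprint (hypothetical helper, not ECFP4-exact).

    atom_invariants: one int per atom (here: atomic number).
    bonds: iterable of (atom_a, atom_b, bond_order) tuples.
    """
    n_atoms = len(atom_invariants)
    neighbors = [[] for _ in range(n_atoms)]
    for a, b, order in bonds:
        neighbors[a].append((order, b))
        neighbors[b].append((order, a))

    # Layer 0: each atom's identifier depends only on its own invariant.
    ids = [_hash32((inv,)) for inv in atom_invariants]
    features = set(ids)

    # Layers 1..radius: fold in the sorted identifiers of bonded neighbors.
    for layer in range(1, radius + 1):
        new_ids = []
        for atom in range(n_atoms):
            env = sorted((order, ids[nb]) for order, nb in neighbors[atom])
            flat = [layer, ids[atom]] + [x for pair in env for x in pair]
            new_ids.append(_hash32(flat))
        ids = new_ids
        features.update(ids)

    # Fold the identifiers into a fixed-length, memory-contiguous bit vector.
    bits = array("B", bytes(n_bits))
    for feature in features:
        bits[feature % n_bits] = 1
    return bits


# Ethanol (C-C-O) with heavy atoms only: atomic numbers and single bonds.
ethanol = toy_ecfp([6, 6, 8], [(0, 1, 1), (1, 2, 1)])

# The buffer protocol exposes the same memory without copying it.
view = memoryview(ethanol)
```

Because the neighbor environments are sorted before hashing, the result does not depend on atom ordering, which is one ingredient of deterministic preprocessing. The `memoryview` at the end shares the fingerprint's underlying buffer rather than copying it; array libraries that consume the buffer protocol can wrap such memory in the same way, which is the general idea behind host-side zero-copy transfer.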
Files
| Name | Size |
|---|---|
| molcore_chem.pdf (md5:ce0e76365a223e83021744e6d93caf5a) | 235.2 kB |
Additional details
Software
- Repository URL: https://github.com/Anteneh-T-Tessema/molcore
- Programming language: Python
- Development Status: Active