Published June 2, 2026 | Version v1.0.0
Software | Restricted

PHegde62/HRMS-Predict: HRMS-Predict v1.0.0

  • 1. Genesis Molecular AI

Description

First public release of HRMS-Predict: an open-source ensemble platform for in silico Phase I and Phase II metabolite prediction with HRMS target list generation.

Features:

  • SyGMa Phase I/II rule-based metabolite generation
  • SMARTCyp soft-spot profiling: 69 SMARTS rules across 11 enzyme classes (CYP3A4, CYP2D6, CYP2C9, UGT, SULT, FMO, AO, MAO, NAT, COMT, GST)
  • Exact [M+H]+ and [M-H]- adduct masses to 4 decimal places
  • Enzyme-coloured SVG soft-spot atom map
  • One-click LC-MS target list export (.xlsx)
  • FastAPI backend + Streamlit frontend, runs fully locally
  • Validated against MetXBioDB: 67% recall on drug-like compounds
  • HRMS mass accuracy: 0.0000 ppm error across 10 reference compounds (129-558 Da)
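
The adduct-mass and ppm-error features listed above can be illustrated with a minimal sketch. It is not the HRMS-Predict implementation: the function names (monoisotopic_mass, adduct_masses, ppm_error), the small element table, and the flat-formula parser are assumptions made for illustration only. The sketch assumes singly charged adducts formed by gaining or losing one proton.

```python
import re

# Monoisotopic masses (u) of the lightest stable isotopes. Illustrative subset only.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "S": 31.97207100,
    "F": 18.99840322,
    "Cl": 34.96885268,
}
ELECTRON = 0.000548579909
PROTON = MONOISOTOPIC["H"] - ELECTRON  # hydrogen atom minus its electron


def monoisotopic_mass(formula):
    """Neutral monoisotopic mass of a flat formula such as 'C8H10N4O2'.

    Hypothetical helper: no parentheses, isotope labels, or charges are supported.
    """
    tokens = re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    if not tokens or "".join(sym + num for sym, num in tokens) != formula:
        raise ValueError(f"unsupported formula: {formula!r}")
    return sum(MONOISOTOPIC[sym] * int(num or 1) for sym, num in tokens)


def adduct_masses(formula):
    """m/z of the [M+H]+ and [M-H]- ions, rounded to four decimal places."""
    m = monoisotopic_mass(formula)
    return {"[M+H]+": round(m + PROTON, 4), "[M-H]-": round(m - PROTON, 4)}


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


# Caffeine, C8H10N4O2
print(adduct_masses("C8H10N4O2"))  # {'[M+H]+': 195.0877, '[M-H]-': 193.0731}
```

Adding or subtracting the proton mass, rather than the hydrogen atom mass, accounts for the electron that is lost or gained on ionisation. The difference is about 0.0005 Da, which is visible at four decimal places.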

Abstract

Background: Early identification of metabolic soft spots and metabolite structures is a critical but time-consuming step in drug discovery. Current in silico tools are either limited to cytochrome P450 (CYP)-mediated metabolism, require expensive commercial licences, rely on external servers that compromise compound confidentiality, or lack integration with high-resolution mass spectrometry (HRMS) workflows. No freely available tool combines multi-engine metabolite structure generation, non-CYP enzyme coverage, and HRMS adduct mass annotation in a single locally deployable platform.
Results: We present HRMS·Predict, an open-source ensemble platform that integrates five independent prediction engines — SyGMa (rule-based Phase I/II), BioTransformer (mammalian and gut microbial), MetaTrans (deep learning sequence-to-sequence), Meta-Predictor (site-of-metabolism deep learning), and SMARTCyp (DFT-derived activation energies) — into a unified FastAPI/Streamlit interface. Uniquely, the platform extends beyond CYP metabolism with 69 SMARTS-based soft-spot rules covering eight non-CYP enzyme classes: UGT, SULT, FMO, AO, MAO, NAT, COMT, and GST. For each predicted metabolite, exact [M+H]+ and [M−H]− adduct masses are calculated to four decimal places for direct import as LC-MS target lists. Validated against the MetXBioDB benchmark dataset, HRMS·Predict achieves 63.0% pooled metabolite recall across 2,102 known metabolites, with 53.1% recall and F1 = 0.305 within the top-10 predictions, competitive with GLORYx (F1 ≈0.28) and MetaPredictor (F1 ≈0.29) evaluated on smaller curated subsets. Application to a ten-compound reference panel — including CYP1A2 (caffeine), CYP2D6 (dextromethorphan), CYP2C9 (diclofenac), UGT/SULT (acetaminophen), NAT2 (isoniazid), and AO (carbazeran) substrates — correctly identifies [X]/10 major known metabolites at the correct transformation type and enzyme class.
Conclusions: HRMS·Predict achieves F1 = 0.305 at rank 10 on the full MetXBioDB benchmark (1,245 substrates), competitive with GLORYx (F1 ≈0.28) and MetaPredictor (F1 ≈0.29) evaluated on smaller curated datasets, while being the only open-source tool with local HRMS target list export and enzyme-annotated soft-spot visualisation. HRMS·Predict provides a free, locally deployable complement to in vitro metabolic stability assays, enabling DMPK scientists and medicinal chemists to generate HRMS target lists, identify metabolic liabilities, and rank discovery compounds before committing experimental resources. The platform runs entirely on local hardware, ensuring that pre-patent compound structures never leave the user's network. Source code, documentation, and a versioned release with Zenodo DOI are available at https://github.com/PHegde62/HRMS-Predict.
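
The reported top-10 figures (53.1% recall, F1 = 0.305) can be related to each other with the standard F1 definition, F1 = 2PR / (P + R). The sketch below is illustrative only and is not taken from the HRMS-Predict code. It assumes both values were computed on the same prediction set and that F1 is the plain harmonic mean of precision and recall. If the benchmark averages per substrate, the implied precision is only indicative. The function names are hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def implied_precision(f1, recall):
    """Solve F1 = 2PR / (P + R) for P, given F1 and R."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("no valid precision: F1 must be below 2 * recall")
    return f1 * recall / denom


# Values quoted in the abstract for the top-10 predictions.
p = implied_precision(f1=0.305, recall=0.531)
print(f"implied precision at rank 10: {p:.3f}")
print(f"round-trip F1: {f1_score(p, 0.531):.3f}")
```

Under these assumptions the implied precision is a little over 0.2, meaning roughly one in five of the top-10 predicted structures matches a known metabolite.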

Files

Restricted

The record is publicly accessible, but files are restricted.
