CHAMP: A Coupled Hierarchical Atom-Motif Predictor

Southwest Petroleum University

doi:10.5281/zenodo.19044801

Published March 19, 2026 | Version v2

Journal article Open

This repository provides the public PyTorch implementation of CHAMP (Coupled Hierarchical Atom-Motif Predictor) for molecular property prediction.

The code released here serves as the main maintained implementation accompanying the manuscript. It contains the core CHAMP model components, motif-construction modules, configuration utilities, and the public training entry point currently documented for the released pipeline.

Overview

CHAMP is a hierarchical graph neural network framework designed to combine:

fine-grained atomic structure,
coarse-grained motif semantics,
and motif-guided cross-scale fusion

within a unified molecular representation learning pipeline.

The framework is organized around three conceptual stages:

Motif construction and structural encoding: CHAMP builds motif-level representations on top of atom-level molecular graphs and models internal motif topology to preserve structural information.
Function-aware motif refinement: CHAMP refines motif embeddings through supervised contrastive constraints so that structurally similar motifs with different functional roles can be distinguished more effectively.
Hierarchical atom-motif fusion: CHAMP uses motif-level semantics to guide atom-level aggregation and performs cross-scale fusion through gating and inter-head interaction mechanisms.
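
The function-aware refinement stage can be illustrated with a generic supervised contrastive loss. The sketch below is not taken from contrastive_learning.py; it follows the widely used SupCon-style formulation and assumes that each motif embedding comes with an integer functional-role label. The function name, the temperature value, and the toy inputs are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss over a batch of embeddings.

    embeddings: [N, D] float tensor (e.g. motif embeddings; an assumption here)
    labels:     [N] integer tensor (e.g. functional-role ids; an assumption here)
    """
    z = F.normalize(embeddings, dim=1)              # compare by cosine similarity
    sim = z @ z.t() / temperature                   # [N, N] scaled similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)

    # Exclude each sample's similarity with itself from the softmax denominator.
    sim = sim.masked_fill(self_mask, float("-inf"))
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)

    # Positives are other samples that share the anchor's label.
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    pos_count = pos_mask.sum(dim=1)
    valid = pos_count > 0                           # anchors with at least one positive
    if not valid.any():
        return embeddings.new_zeros(())

    # Average log-probability of the positives for each valid anchor.
    pos_log_prob = torch.where(pos_mask, log_prob, torch.zeros_like(log_prob)).sum(dim=1)
    return -(pos_log_prob[valid] / pos_count[valid]).mean()


# Toy check: two tight clusters, labelled consistently and inconsistently.
emb = torch.tensor([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
aligned = supervised_contrastive_loss(emb, torch.tensor([0, 0, 1, 1]))
mismatched = supervised_contrastive_loss(emb, torch.tensor([0, 1, 0, 1]))
print(f"aligned labels: {aligned.item():.4f}, mismatched labels: {mismatched.item():.4f}")
```

In this toy case, labels that agree with the embedding geometry give a much smaller loss than labels that contradict it. That difference is the pressure that pulls same-role motifs together and pushes different-role motifs apart.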

The current public release focuses on the core modules and the main training workflow implemented in this repository.

Repository Scope

The released codebase includes:

the core model components in Model/,
motif extraction and motif-graph construction in motif_extract/,
shared helper utilities in utils/,
argument configuration in Args.py,
motif-aware dataset preparation in motif_spilit.py,
the main public training script in main_classification.py,
the dependency specification in requirements.txt.

Local folders such as dataset/, best_model/, .idea/, and __pycache__/ may appear in the working directory; they are local resources or development artifacts, not part of the released source implementation.

Repository Structure

The current directory structure of the released code is:

Code/
├── Args.py
├── main_classification.py
├── motif_spilit.py
├── overview.png
├── README.md
├── requirements.txt
├── Model/
│   ├── HMSAF.py
│   ├── atom_motif_attention.py
│   ├── contrastive_learning.py
│   └── motif_embedding.py
└── motif_extract/
    ├── mol_motif.py
    └── motif_graph.py

For readers who only want to understand or reuse the main implementation, the primary source files are:

main_classification.py
Args.py
motif_spilit.py
Model/*.py
motif_extract/*.py

Installation

pip install -r requirements.txt

Main dependencies:

PyTorch (1.12.0+cu113)
PyTorch Geometric (2.6.1)
RDKit (2024.9.3)
scikit-learn (1.7.2)
UMAP-learn (0.5.7)

Usage

Parameter Configuration

Training parameters can be configured via Args.py:

--dataset: dataset name
--data_dir: dataset directory
--node_feature_dim: atom feature dimension
--edge_feature_dim: edge feature dimension
--hidden_dim: hidden representation dimension
--batch_size: batch size
--epochs: number of epochs
--lr: learning rate
--weight_decay: optimizer weight decay
--patience: scheduler patience
--factor: scheduler decay factor
--loss_fn: loss function option
--alpha: ring-level contrastive loss weight
--beta: non-ring contrastive loss weight
--Pair_MLP: whether to enable the pairwise motif encoder option
--is_contrastive: whether to enable contrastive learning
--use_Guide: whether to enable motif guidance
--use_gating: whether to enable contextual gating
--use_head_interaction: whether to enable inter-head interaction
--label_thresh_ratio: threshold ratio used in motif comparison
--save_dir: checkpoint directory
--log_dir: log directory
--device: execution device

Running Experiments

# Example for a classification task
python main_classification.py --dataset MUTAG --use_head_interaction True --use_gating True
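
The command above passes booleans as the strings True and False. How Args.py parses them is not shown in this README. If argparse is used, one common pitfall is type=bool, which treats any non-empty string (including "False") as true. The following is a minimal sketch of a safer pattern; str2bool, build_parser, and the default values are hypothetical and only a few of the listed flags are included.

```python
import argparse


def str2bool(value):
    """Convert textual booleans such as "True", "false", or "1" to bool."""
    if isinstance(value, bool):
        return value
    text = value.strip().lower()
    if text in {"true", "t", "yes", "y", "1"}:
        return True
    if text in {"false", "f", "no", "n", "0"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Declare a small subset of the documented flags (defaults are placeholders)."""
    parser = argparse.ArgumentParser(description="Illustrative CHAMP-style options")
    parser.add_argument("--dataset", type=str, default="MUTAG")
    parser.add_argument("--hidden_dim", type=int, default=128)
    parser.add_argument("--lr", type=float, default=1e-3)
    parser.add_argument("--use_gating", type=str2bool, default=True)
    parser.add_argument("--use_head_interaction", type=str2bool, default=True)
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(
    ["--dataset", "MUTAG", "--use_head_interaction", "True", "--use_gating", "False"]
)
print(args.dataset, args.use_head_interaction, args.use_gating)
```

With a converter like this, passing --use_gating False actually disables the option, which matters when running ablations over the guidance, gating, and head-interaction switches.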

Supported Datasets

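The framework supports common molecular property prediction benchmarks, including:

Regression Tasks: ESOL, FreeSolv, Lipophilicity (MoleculeNet).
Classification Tasks: HIV, BACE, Tox21 (MoleculeNet), and MUTAG (distributed with the TUDataset collection rather than MoleculeNet).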

Datasets are expected in a standard graph format, containing node features, edge connectivity, and molecular labels.
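
A molecule in this kind of graph format can be pictured as a handful of tensors. The sketch below uses plain PyTorch tensors and borrows the PyTorch Geometric field names (x, edge_index, edge_attr, y). The helper make_graph, the toy feature encoding, and the label are hypothetical, and the repository's own loader may store data differently.

```python
import torch


def make_graph(node_features, edge_index, edge_features, label):
    """Bundle one molecule into tensors and check that the shapes agree."""
    x = torch.as_tensor(node_features, dtype=torch.float)          # [num_atoms, node_feature_dim]
    edge_index = torch.as_tensor(edge_index, dtype=torch.long)     # [2, num_edges]
    edge_attr = torch.as_tensor(edge_features, dtype=torch.float)  # [num_edges, edge_feature_dim]
    y = torch.as_tensor(label, dtype=torch.float).view(-1)         # molecular label(s)

    if edge_index.dim() != 2 or edge_index.size(0) != 2:
        raise ValueError("edge_index must have shape [2, num_edges]")
    if edge_index.numel() > 0 and (edge_index.min() < 0 or edge_index.max() >= x.size(0)):
        raise ValueError("edge_index refers to an atom that does not exist")
    if edge_attr.size(0) != edge_index.size(1):
        raise ValueError("need exactly one edge feature row per edge")
    return {"x": x, "edge_index": edge_index, "edge_attr": edge_attr, "y": y}


# Toy example: the three heavy atoms of ethanol (C-C-O).
# Node features are a one-hot over (C, O); each bond is stored in both directions.
graph = make_graph(
    node_features=[[1, 0], [1, 0], [0, 1]],
    edge_index=[[0, 1, 1, 2], [1, 0, 2, 1]],
    edge_features=[[1.0], [1.0], [1.0], [1.0]],
    label=[1.0],
)
print({name: tuple(tensor.shape) for name, tensor in graph.items()})
```

In this layout, the second dimension of x and edge_attr would correspond to the --node_feature_dim and --edge_feature_dim options.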

Files (6.2 MB)

Name	MD5	Size
Args.py	8588115f1a06bdef9be672386be4322f	2.7 kB
atom_motif_attention.py	2afc08dfb7344e50d151f2a943bc50d3	10.6 kB
contrastive_learning.py	f2ac09a1d1a933d6c94b7c49833b0609	15.7 kB
HMSAF.py	0e54f281969e6e7339fba9c20d488043	7.5 kB
main_classification.py	f55df2e67fe985660725fbbbb774af1d	34.0 kB
mol_motif.py	16fed858697871992112765414377533	26.9 kB
motif_embedding.py	fceaadda80e4ef0b47d2a38070e9c639	13.4 kB
motif_graph.py	098878c2e5af8066b402389a943ebc1c	17.0 kB
motif_spilit.py	2bf397dd7b210ae9c5380a9542cd3f40	9.0 kB
overview.png	145dac42c0182df74e57b731a48d70e0	6.1 MB
README.md	07425c6a9deb1715a9410313c789c570	7.4 kB
requirements.txt	cc65cc0a54bccb92b567e6f01a6293f9	189 Bytes
utils.py	e8647e53263935a3407b7c4074ba868a	8.8 kB
