CHAMP: A Coupled Hierarchical Atom-Motif Predictor
Description
This repository provides the public PyTorch implementation of CHAMP (Coupled Hierarchical Atom-Motif Predictor) for molecular property prediction.
The code released here serves as the main maintained implementation accompanying the manuscript. It contains the core CHAMP model components, motif-construction modules, configuration utilities, and the public training entry point currently documented for the released pipeline.
Overview
CHAMP is a hierarchical graph neural network framework designed to combine:
- fine-grained atomic structure,
- coarse-grained motif semantics,
- and motif-guided cross-scale fusion
within a unified molecular representation learning pipeline.
The framework is organized around three conceptual stages:
- Motif construction and structural encoding: CHAMP builds motif-level representations on top of atom-level molecular graphs and models internal motif topology to preserve structural information.
- Function-aware motif refinement: CHAMP refines motif embeddings through supervised contrastive constraints so that structurally similar motifs with different functional roles can be distinguished more effectively.
- Hierarchical atom-motif fusion: CHAMP uses motif-level semantics to guide atom-level aggregation and performs cross-scale fusion through gating and inter-head interaction mechanisms.
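To make the second stage more concrete, the following is a minimal sketch of a generic supervised contrastive loss over motif embeddings, assuming each motif carries an integer functional label. The function name `supervised_contrastive_loss`, the `temperature` value, and the toy data are illustrative assumptions; the loss actually used in Model/contrastive_learning.py may differ.

```python
import torch
import torch.nn.functional as F


def supervised_contrastive_loss(embeddings: torch.Tensor,
                                labels: torch.Tensor,
                                temperature: float = 0.1) -> torch.Tensor:
    """Generic supervised contrastive loss (illustrative, not the repository's code).

    embeddings: (N, D) float tensor of motif embeddings.
    labels:     (N,) integer tensor; motifs with equal labels are positives.
    """
    z = F.normalize(embeddings, dim=1)            # compare by cosine similarity
    sim = z @ z.t() / temperature                 # (N, N) scaled similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    # Exclude each motif's similarity to itself from the softmax denominator.
    sim = sim.masked_fill(self_mask, float("-inf"))
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)

    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    pos_count = pos_mask.sum(dim=1)
    has_pos = pos_count > 0
    if not has_pos.any():
        # No motif has a same-label partner in this batch.
        return embeddings.new_zeros(())

    # Average log-probability of the positives for each anchor that has any.
    pos_log_prob = log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1)
    return -(pos_log_prob[has_pos] / pos_count[has_pos]).mean()


if __name__ == "__main__":
    # Toy embeddings: two tight clusters of two motifs each.
    emb = torch.tensor([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]])
    aligned = supervised_contrastive_loss(emb, torch.tensor([0, 0, 1, 1]))
    crossed = supervised_contrastive_loss(emb, torch.tensor([0, 1, 0, 1]))
    print(f"labels match clusters: {aligned.item():.4f}")
    print(f"labels cut across clusters: {crossed.item():.4f}")
```

The loss is small when motifs with the same label already sit close together and large when same-label motifs are far apart, which is the pressure that separates structurally similar motifs with different functional roles. If ring and non-ring motifs are handled as separate groups, as the --alpha and --beta options suggest, one plausible combination is task_loss + alpha * ring_loss + beta * non_ring_loss; whether the released code combines the terms this way is an assumption.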
The current public release focuses on the core modules and the main training workflow implemented in this repository.
Repository Scope
The released codebase includes:
- the core model components in Model/,
- motif extraction and motif-graph construction in motif_extract/,
- shared helper utilities in utils/,
- argument configuration in Args.py,
- motif-aware dataset preparation in motif_spilit.py,
- the main public training script in main_classification.py,
- the dependency specification in requirements.txt.
Local folders such as dataset/, best_model/, .idea/, and __pycache__/ may appear in the working directory, but they should be interpreted as local resources or development artifacts rather than as the conceptual core of the released source implementation.
Repository Structure
The current directory structure of the released code is:
Code/
├── Args.py
├── main_classification.py
├── motif_spilit.py
├── overview.png
├── README.md
├── requirements.txt
├── Model/
│   ├── HMSAF.py
│   ├── atom_motif_attention.py
│   ├── contrastive_learning.py
│   └── motif_embedding.py
└── motif_extract/
    ├── mol_motif.py
    └── motif_graph.py
For readers who only want to understand or reuse the main implementation, the primary source files are:
- main_classification.py
- Args.py
- motif_spilit.py
- Model/*.py
- motif_extract/*.py
Installation
pip install -r requirements.txt
Main dependencies:
- PyTorch (1.12.0+cu113)
- PyTorch Geometric (2.6.1)
- RDKit (2024.9.3)
- scikit-learn (1.7.2)
- UMAP-learn (0.5.7)
Usage
Parameter Configuration
Training parameters can be configured via Args.py:
- --dataset: dataset name
- --data_dir: dataset directory
- --node_feature_dim: atom feature dimension
- --edge_feature_dim: edge feature dimension
- --hidden_dim: hidden representation dimension
- --batch_size: batch size
- --epochs: number of epochs
- --lr: learning rate
- --weight_decay: optimizer weight decay
- --patience: scheduler patience
- --factor: scheduler decay factor
- --loss_fn: loss function option
- --alpha: ring-level contrastive loss weight
- --beta: non-ring contrastive loss weight
- --Pair_MLP: whether to enable the pairwise motif encoder option
- --is_contrastive: whether to enable contrastive learning
- --use_Guide: whether to enable motif guidance
- --use_gating: whether to enable contextual gating
- --use_head_interaction: whether to enable inter-head interaction
- --label_thresh_ratio: threshold ratio used in motif comparison
- --save_dir: checkpoint directory
- --log_dir: log directory
- --device: execution device
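As an illustration of how options like these are commonly declared, here is a minimal argparse sketch covering a subset of the flags. It is not the content of Args.py: the default values and the `str2bool` helper are assumptions made for this example. The helper matters because boolean switches are passed as explicit `True`/`False` strings on the command line, and argparse's `type=bool` would treat the string "False" as true.

```python
import argparse


def str2bool(value: str) -> bool:
    """Parse strings such as 'True' or 'false' into real booleans."""
    if isinstance(value, bool):
        return value
    lowered = value.strip().lower()
    if lowered in {"true", "t", "yes", "y", "1"}:
        return True
    if lowered in {"false", "f", "no", "n", "0"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for an illustrative subset of the documented options."""
    parser = argparse.ArgumentParser(description="CHAMP options (illustrative subset)")
    # All defaults below are placeholders, not the values shipped in Args.py.
    parser.add_argument("--dataset", type=str, default="MUTAG")
    parser.add_argument("--hidden_dim", type=int, default=64)
    parser.add_argument("--lr", type=float, default=1e-3)
    parser.add_argument("--alpha", type=float, default=0.1)
    parser.add_argument("--beta", type=float, default=0.1)
    parser.add_argument("--is_contrastive", type=str2bool, default=True)
    parser.add_argument("--use_gating", type=str2bool, default=True)
    parser.add_argument("--use_head_interaction", type=str2bool, default=True)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--dataset", "MUTAG", "--use_gating", "False", "--lr", "0.0005"]
    )
    print(args.dataset, args.use_gating, args.lr)
```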
Running Experiments
# Example for a classification task
python main_classification.py --dataset MUTAG --use_head_interaction True --use_gating True
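Because the fusion components are exposed as switches, ablation runs can be scripted. The sketch below only builds the command lines for every on/off combination of three switches; it assumes that --use_Guide accepts `True`/`False` in the same way as the two flags in the command above, and the helper name `ablation_commands` is hypothetical.

```python
import itertools
import shlex
import sys


def ablation_commands(dataset: str = "MUTAG") -> list[list[str]]:
    """Return one command per on/off combination of the fusion switches."""
    flags = ["--use_Guide", "--use_gating", "--use_head_interaction"]
    commands = []
    for values in itertools.product(["True", "False"], repeat=len(flags)):
        cmd = [sys.executable, "main_classification.py", "--dataset", dataset]
        for flag, value in zip(flags, values):
            cmd += [flag, value]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    for cmd in ablation_commands():
        # Printing only; pass cmd to subprocess.run(cmd, check=True) to launch a run.
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to review the grid before committing compute to eight training runs.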
Supported Datasets
The framework supports a range of molecular benchmark datasets, drawn mainly from MoleculeNet:
- Regression Tasks: ESOL, FreeSolv, Lipophilicity.
- Classification Tasks: MUTAG (from the TUDataset collection), HIV, BACE, Tox21.
Datasets are expected in a standard graph format, containing node features, edge connectivity, and molecular labels.
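As a rough picture of such a record, the sketch below models one molecule with plain Python containers and checks basic consistency. The class `MoleculeGraph` and its field names are hypothetical and are not the repository's loader format; in PyTorch Geometric the same three pieces would typically be stored as `x`, `edge_index`, and `y` on a `torch_geometric.data.Data` object.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MoleculeGraph:
    """Illustrative container for one molecular graph (hypothetical names)."""
    node_features: List[List[float]]    # one feature vector per atom
    edge_index: List[Tuple[int, int]]   # directed (source, target) atom pairs
    label: float                        # molecular property or class label

    def validate(self) -> None:
        """Raise ValueError if the graph is structurally inconsistent."""
        num_nodes = len(self.node_features)
        if num_nodes == 0:
            raise ValueError("graph has no atoms")
        if len({len(f) for f in self.node_features}) != 1:
            raise ValueError("atoms have feature vectors of different lengths")
        for src, dst in self.edge_index:
            if not (0 <= src < num_nodes and 0 <= dst < num_nodes):
                raise ValueError(f"edge ({src}, {dst}) refers to a missing atom")


if __name__ == "__main__":
    # Heavy atoms of ethanol (C-C-O) with a toy one-hot feature [is_C, is_O].
    # Each bond is stored in both directions, as is usual for undirected graphs.
    ethanol = MoleculeGraph(
        node_features=[[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
        edge_index=[(0, 1), (1, 0), (1, 2), (2, 1)],
        label=1.0,
    )
    ethanol.validate()
    print(len(ethanol.node_features), "atoms,", len(ethanol.edge_index), "directed edges")
```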