Unveiling Molecular Moieties through Hierarchical Graph Explainability
Description
The dataset consists of 89373 small molecules and has a 1:100 active/inactive ratio for the smallest class, CLK2.
This ratio makes it possible to assess model generalization under conditions that resemble a real-world screening task.
The file df_Smiles_SDF_preparations.sdf contains the complete set of 89373 molecules (training set and validation set), while the remaining files are the individual SDF files of the drugs active on CDK1 that were used as tests in the paper.
The graphs used for training, divided into train_set and validation_set, are saved as pickle files (train_set.pkl, val_set.pkl) and can be opened and used to train the architecture.
The implementation code used with the dataset is available at the following link: https://github.com/CHILab1/HGE.git
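The pickle files mentioned above can be opened with Python's standard `pickle` module. The snippet below is a minimal sketch, not the authors' loading code: `load_graphs` is a hypothetical helper name, and the type and layout of the stored objects are not documented here, so the example only reports the type of whatever was loaded. Unpickling may also require the libraries used to build the graphs (see the repository) to be installed, and pickle files should only be loaded from trusted sources.

```python
import pickle
from pathlib import Path


def load_graphs(path):
    """Load a pickled object (assumed to be a collection of graphs) from path.

    Hypothetical helper for illustration; the structure of the stored
    object is an assumption and is not checked here.
    """
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


if __name__ == "__main__":
    # File names as given in the dataset description.
    for name in ("train_set.pkl", "val_set.pkl"):
        if Path(name).exists():
            graphs = load_graphs(name)
            # Only report the container type; its contents are not assumed.
            print(name, type(graphs).__name__)
        else:
            print(name, "not found in the current directory")
```

Once loaded, the objects can be passed to the training code from the repository, provided they match the format that code expects.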
Please cite our work as follows:
@misc{sortino2024unveiling,
      title={Unveiling Molecular Moieties through Hierarchical Graph Explainability}, 
      author={Paolo Sortino and Salvatore Contino and Ugo Perricone and Roberto Pirrone},
      year={2024},
      eprint={2402.01744},
      archivePrefix={arXiv},
      primaryClass={q-bio.QM}
}
Files (450.2 MB)
| MD5 checksum | Size |
|---|---|
| md5:91fbff127ad2384680a8198fc254e6cb | 9.5 kB |
| md5:37741c13c34d65a7a63593104c51bac8 | 8.6 kB |
| md5:b6e6d6970047e157a91d00f8bd9bd2f5 | 8.3 kB |
| md5:010b465e64cf89acbded540132c7104d | 7.4 kB |
| md5:8fdf930e0a816b4a8d62fbe0d13df022 | 8.6 kB |
| md5:05ebc4346077191be21084ad0127a02b | 9.4 kB |
| md5:77557b3bec5e4f86b630c938cd1a8910 | 7.3 kB |
| md5:18d8fb8c632d5e97204def835fba7bd3 | 5.2 kB |
| md5:ee5323d72a71c2c9591daf430ef9db5a | 9.0 kB |
| md5:e54b51ceec0aa54497336e788c77dc02 | 7.3 kB |
| md5:ec7a3871d1b09d080f29f63074f32079 | 8.9 kB |
| md5:790d7acaa564665666262bbaf054f6f5 | 8.4 kB |
| md5:b99086837ad0c2146d18a9d171ca921c | 7.8 kB |
| md5:bb3d77993c236d75be7685af37cb8916 | 8.6 kB |
| md5:7dad271d1114004de41f5d381ff40610 | 6.4 kB |
| md5:af2de5b49fa4a9f5d5672e8b7ce6e42b | 9.2 kB |
| md5:3ba5ac875087a6de9b044e2b8c9c08b3 | 8.6 kB |
| md5:d6b0f18c2ff05fb8233599e93d854df7 | 6.3 kB |
| md5:0b04efd8aea24a0bafd1d427ef9eae9c | 5.7 kB |
| md5:dbc2bd3c326538c3dfe395fefc646c9d | 450.1 MB |
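The MD5 checksums listed above can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module; `md5_of_file` and `matches_checksum` are hypothetical helper names, and the file path in the usage example is a placeholder, since the listing does not say which checksum belongs to which file.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at path.

    The file is read in chunks so that large files (such as the
    450.1 MB entry) do not need to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 digest with an expected value.

    Accepts the expected value with or without the "md5:" prefix
    used in the file listing.
    """
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_checksum("downloaded_file", "md5:dbc2bd3c326538c3dfe395fefc646c9d")` returns `True` only if the local file (placeholder name) has the digest listed for the 450.1 MB entry.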
Additional details
Identifiers
- arXiv: arXiv:2402.01744
- DOI: 10.21203/rs.3.rs-4206999/v1

Related works
- Is original form of: arXiv:2402.01744 (arXiv)

Dates
- Submitted: 2024-05-07

Software
- Repository URL: https://github.com/CHILab1/HGE.git