Published February 12, 2026 | Version 0.1
Dataset Open

ScrambleBench: A Workflow for Comparative Assessment of Structure-based de novo Generative Models

Description

Data Availability for ScrambleBench

The zipped file contains all of the information needed to reproduce the main and supplementary figures of the manuscript "ScrambleBench: A Workflow for Comparative Assessment of Structure-based de novo Generative Models" (updated as of 12 Feb 2026).

Explore the folders, which contain README.md files with further information.

Main Figure

Figure 1

directory: output_analysis_supporting/diversity_result
file: SixProtein_Benchmark_DiversityAnalysis_HamDivMCES.csv

Figure 2-7

(Distribution figures, shown as raincloud plots)

directory: output_scramblebench_data_warehouse
file: data_warehouse_without_molblock.csv
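
Before plotting, it can help to check which columns the data warehouse CSV provides. The sketch below is a minimal, standard-library way to list the header and count rows, optionally grouped by one column. The column name `model` in the usage note is a hypothetical placeholder and is not taken from the dataset; consult the README.md for the real column names.

```python
import csv
from collections import Counter


def summarize_csv(path, group_column=None):
    """Return (column names, row count, per-group row counts) for a CSV file.

    `group_column` is optional; when given, rows are counted per distinct
    value of that column (rows lacking the column are counted under "").
    """
    counts = Counter()
    n_rows = 0
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        columns = list(reader.fieldnames or [])  # reads the header row
        for row in reader:
            n_rows += 1
            if group_column is not None:
                counts[row.get(group_column, "")] += 1
    return columns, n_rows, dict(counts)
```

For example, `summarize_csv("output_scramblebench_data_warehouse/data_warehouse_without_molblock.csv", group_column="model")` would return the header, the total row count, and the number of rows per value of the assumed `model` column.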

(Docking Figure of Generated Ligand)

directory: output_virtual_hits
file: best_compound_23dec_*.sdf

(Docking Figure of Native Complex)

directory: input_protein
file: [protein_name]_complex_autoprepared.pdb

Main Table

Table 3

directory: output_generation/Docking/phase_score/hypo
file: all files inside

Table 4-5, 8-9, 12-13

(Time)

directory: output_generation/generation_log
file: all files inside

(Generated Ligand)

directory_1: output_generation/*nov_*/summary
file_1: all files inside

directory_2: output_generation/*nov_*/cheminformatics_input_prepared
file_2: all files inside
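
Several entries in this listing use shell-style wildcards (for example `output_generation/*nov_*/summary`) together with "all files inside". A minimal sketch of resolving such entries with Python's `pathlib` follows. The helper names `resolve_pattern` and `files_inside` are illustrative, and the root folder is assumed to be wherever the archive was extracted.

```python
from pathlib import Path


def resolve_pattern(root, pattern):
    """Return the sorted paths under `root` that match a glob pattern."""
    return sorted(Path(root).glob(pattern))


def files_inside(root, dir_pattern):
    """Collect the regular files directly inside each matched directory."""
    found = []
    for directory in resolve_pattern(root, dir_pattern):
        if directory.is_dir():
            found.extend(sorted(p for p in directory.iterdir() if p.is_file()))
    return found


if __name__ == "__main__":
    # "scramblebench_data" is an assumed extraction folder name.
    for path in files_inside("scramblebench_data", "output_generation/*nov_*/summary"):
        print(path)
```

The same `resolve_pattern` call also handles file-level wildcards such as `output_virtual_hits/virtual_hits*.csv`.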

Table 6-7, 10-11, 14-15

directory: output_virtual_hits
file: virtual_hits*.csv

Supplementary Figure

The native complexes shown in all docking figures are available here:
directory: input_protein
file: [protein_name]_complex_autoprepared.pdb

Figure 1

directory: input_external_data/SuppFig1
file: SupplementaryFigure1.xlsx

Figure 2

directory: output_analysis_supporting/diversity_result
file: ThreeProtein_Benchmark_DiversityAnalysis_RASCALConfig_HamDivMCES.csv

Figure 3

directory: output_analysis_supporting/GenBench3D_finetune
file: data.yaml

Figure 4

directory: output_generation/Docking/Glide/generated_ligand
file: kinase_gsk3b_glide-dock_SP_nonforce_planar_lib.sdfgz (ligand ID: control_36)
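
Some figures point to a single ligand inside a multi-record docking file. The sketch below shows one way to pull a record out by its ID using only the standard library. It rests on two assumptions that should be checked against the actual files: that `.sdfgz` is a gzip-compressed SDF file, and that the ligand ID is stored in the title line (the first line) of each record rather than in a data field.

```python
import gzip


def iter_sdf_records(path):
    """Yield each SDF record as a string; a record ends at a '$$$$' line."""
    # Assumption: names ending in .gz or .sdfgz are gzip-compressed SDF.
    opener = gzip.open if str(path).endswith((".gz", ".sdfgz")) else open
    with opener(path, "rt", encoding="utf-8", errors="replace") as handle:
        lines = []
        for line in handle:
            if line.strip() == "$$$$":
                yield "".join(lines)
                lines = []
            else:
                lines.append(line)
        if any(item.strip() for item in lines):
            yield "".join(lines)  # trailing record without a terminator


def find_by_title(path, ligand_id):
    """Return the first record whose title line equals `ligand_id`, else None."""
    for record in iter_sdf_records(path):
        title = record.split("\n", 1)[0].strip()
        if title == ligand_id:
            return record
    return None
```

For instance, `find_by_title(path, "control_36")` would return the matching record if the ID is indeed the title. Where the listing refers to a pose by position instead (such as a "4th pose"), `list(iter_sdf_records(path))[3]` selects the fourth record.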

Figure 5

directory: output_generation/Docking/phase_score/hypo
file: all files inside

Figure 6

directory: output_analysis_supporting/diversity_result
file: SixProteinBenchmark_DiversityAnalysis_HamDivECFP.csv

Figure 7

directory: output_analysis_supporting/diversity_result
file: SixProteinBenchmark_DiversityAnalysis_Average_Tanimoto.csv

Figure 8

directory: output_scramblebench_data_warehouse
file: data_warehouse_without_molblock.csv

Figure 9

directory: output_generation/*nov_*/genbench_analysis/output_json
file: all json files

Figure 10

directory: output_generation/Docking/Vina/output_generated_ligand
file: gpcr_5ht2c_input_docking_vina.sdf (ligand ID: Pocket2Mol_num500_444)

Figure 11

directory: input_external_data/SuppFig11
file: 1Q3D_prepared.pdb

directory: output_virtual_hits
file: best_compound_23dec_DiffSBDD.sdf

Figure 12

directory: input_external_data/SuppFig12
file: crossdocked_set_ids.txt and uniprot_protein_family_list.txt

Figure 13

directory: output_generation/Docking/Glide/generated_ligand
file: kinase_cdk2_glide-dock_SP_nonforce_planar_lib.sdfgz (ligand ID: PMDM_num100_10)

Figure 14

directory: input_external_data/SuppFig12
file: wu-04_docking_pose.sdf (4th pose)

Figure 15-20

directory: output_virtual_hits
file: best_compound_23dec_*.sdf

Technical info

Inputs, Configs, Outputs

The following locations within the zipped file document the technical implementation:

Docking parameters and config files

directory: output_generation/Docking

Protein Target used as benchmark

directory: input_protein

Generated Ligand Datasets

directory: output_generation/*nov_*/summary

Model Configs

Pocket2Mol: output_generation/*nov_*/Pocket2Mol/sample_for_pdb.yaml

(The other models are described in the GitHub repository: https://github.com/Feriolet/ScrambleBench)

Files

scramblebench_data.zip

size: 2.2 GB
md5: 139521cae5ffb4d0c7ac955292cff854
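
Because the archive is large, it may be worth confirming that the download is intact before extracting it. The following is a minimal sketch that compares the file's MD5 digest with the checksum listed for scramblebench_data.zip. The local path is an assumption about where the file was saved, and the helper names are illustrative.

```python
import hashlib
from pathlib import Path

# Checksum as listed for scramblebench_data.zip on this record.
EXPECTED_MD5 = "139521cae5ffb4d0c7ac955292cff854"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True when the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("scramblebench_data.zip")  # assumed download location
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
```

MD5 is used here only because it is the checksum the record provides; it detects corrupted downloads but is not a safeguard against deliberate tampering.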
