Published February 1, 2026 | Version v4
Dataset Open

RareFold

Authors/Creators

Description

All archives here are compressed with zstd. You can get it here (it's amazing, thank me later): https://github.com/facebook/zstd
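As a rough sketch of how one of the .tar.zst archives listed below could be unpacked: the snippet assumes the zstd command-line tool is installed and on PATH, and uses Python's standard tarfile module for the tar step. The helper names build_zstd_command and extract_tar_zst are made up for illustration and are not part of RareFold.

```python
import shutil
import subprocess
import tarfile
from pathlib import Path


def build_zstd_command(archive, tar_path):
    """Return the zstd CLI call that decompresses `archive` into `tar_path`."""
    # -d: decompress, -f: overwrite an existing output file, -o: output path
    return ["zstd", "-d", "-f", str(archive), "-o", str(tar_path)]


def extract_tar_zst(archive, dest="."):
    """Decompress a .tar.zst archive and unpack it into `dest` (hypothetical helper)."""
    if shutil.which("zstd") is None:
        raise RuntimeError("zstd CLI not found on PATH")
    archive_path = Path(archive)
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    # val.tar.zst -> <dest>/val.tar (intermediate file, removed afterwards)
    tar_path = dest_path / archive_path.with_suffix("").name
    subprocess.run(build_zstd_command(archive_path, tar_path), check=True)
    with tarfile.open(tar_path) as tar:
        tar.extractall(dest_path)
    tar_path.unlink()
    return dest_path
```

For example, extract_tar_zst("val.tar.zst", "val_results") would unpack the validation results into a val_results directory. The intermediate .tar file needs temporary disk space roughly equal to the uncompressed archive.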

params50000.npy - parameters for RareFold single-chain protein structure prediction and design

finetuned_params25000.npy - parameters for RareFold peptide binder design (less tuned)

1ssc.tar.zst - design results for PDB ID 1ssc

val.tar.zst - validation results

test.tar.zst - RareFold test set predictions using the baseline model (the same data that AF3 is, or can be, evaluated on)

test_LC.tar.zst - RareFold test set predictions using the large-crop fine-tuned weights

af3.tar.zst - AF3 predictions

native_valid_structures.tar.zst - native validation structures from the PDB

train_meta_20.csv - metadata for the train/val/test splits, with cluster annotations at 20% sequence identity

aa_count_per_chain.csv - number of amino acids (AAs) of each type per chain ID

aa_hist.csv - total occurrences of each AA type in the dataset

immunogenicity_analysis.xlsx - results from the immunogenicity analysis of the peptide binders compared to the wild-type binder

Files

Files (11.7 GB)

Checksum                              Size
md5:7f25f9eb6efe302a6c9b2fa968f4bf52  9.2 GB
md5:305e20715aa52deb30b6eb1bbe0c3cc3  100.5 MB
md5:677bd8a7448b04ef7480a93329101f8d  2.2 kB
md5:3da6b1c79f487be4dfd4b9bdc39c036b  1.2 GB
md5:c2ff48594dbbbac2986e11905e808eea  371.6 MB
md5:6a12945258e72e4f1869bb01b1f3f181  29.7 kB
md5:4866c200dfba93be01df73db14af3d84  44.9 MB
md5:f78c384b7aa11a8cd8c5aee10c8c003a  371.6 MB
md5:89b0a48a4fe1d26fc86c623f8b072589  10.6 MB
md5:ec55f4a36c32c6274564abee2bfb299c  10.7 MB
md5:a52b719fef4ac0fff22916d05e6d24fa  6.0 MB
md5:8eba99c4d79a21b09357e5d976aefce2  453.7 MB
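
The MD5 values above can be used to check a download for corruption. A minimal sketch, using only Python's standard hashlib module: the function names md5_of_file and matches_listing are hypothetical, and since this listing does not show which checksum belongs to which file, the expected value has to be taken from the entry shown for that file on the record page.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Chunked reading keeps memory use flat even for multi-GB archives.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed checksum such as 'md5:7f25...'."""
    # Accept the value with or without the 'md5:' prefix, in any letter case.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A call such as matches_listing("val.tar.zst", "md5:...") returns True only if the downloaded file is byte-identical to the one the checksum was computed from. MD5 is adequate for detecting transfer errors, though it is not a safeguard against deliberate tampering.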