Published October 24, 2024 | Version 2024.2
Dataset Open

Dataset - Papyrus 2024 - A large scale curated dataset aimed at bioactivity predictions

Description

This update of release 2024.1 fixes the following:

  • The metadata columns type_IC50, type_EC50, type_KD, type_Ki, and type_other reported only a single value even when multiple pChEMBL values were available. This fix ensures that all values are reported.
  • Molecules were incorrectly standardized and mixtures were included in the dataset. Standardization (using the papyrus_structure_pipeline) is now correctly enforced and mixtures have been removed.
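
As a rough illustration of the first fix, the sketch below checks that a type_* column carries one entry per pChEMBL value. It rests on assumptions not stated in this record: that multiple values are joined with a semicolon, and that the activity column is named pchembl_value. The row shown is invented for illustration and is not taken from the dataset.

```python
def split_multi(field, sep=";"):
    """Split a multi-valued cell into a list of stripped strings.

    The ';' separator is an assumption, not something this record states.
    """
    if field is None or field == "":
        return []
    return [part.strip() for part in str(field).split(sep)]


# Hypothetical row: two pChEMBL values with per-value type flags.
# The column name 'pchembl_value' and the flag encoding are assumptions.
row = {"pchembl_value": "6.5;7.1", "type_IC50": "1;0", "type_Ki": "0;1"}

values = [float(v) for v in split_multi(row["pchembl_value"])]
ic50_flags = split_multi(row["type_IC50"])
ki_flags = split_multi(row["type_Ki"])

# After the fix, each type_* column should hold one entry per pChEMBL value.
consistent = len(values) == len(ic50_flags) == len(ki_flags)
```

A row where a type_* column holds fewer entries than there are pChEMBL values would indicate the behaviour that this release corrects.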

Changes since version 05.6

Papyrus++:

Previous versions mistakenly allowed a deviation of 2 log units around compound-target pairs when determining the reproducibility of assays (see the published article for more details). The deviation is now set to 0.5 log units, so that data points fall within a maximum range of 1 log unit. As a result, the Papyrus++ set in this release contains drastically fewer entries than in previous releases.
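
The effect of the tighter threshold can be sketched as follows. This is only an illustration of the idea, not the pipeline's actual code: the function name is hypothetical, and using the mean as the centre of the measurements for a compound-target pair is an assumption (the published article describes the real procedure).

```python
from statistics import mean


def within_tolerance(pchembl_values, max_deviation=0.5):
    """Return True if every value lies within max_deviation of the mean.

    Hypothetical helper: centring on the mean is an assumption.
    """
    center = mean(pchembl_values)
    return all(abs(v - center) <= max_deviation for v in pchembl_values)


# Invented measurements for one compound-target pair.
tight = [6.0, 6.4, 6.9]   # spread of 0.9 log units
loose = [6.0, 7.2]        # spread of 1.2 log units

tight_ok = within_tolerance(tight)             # kept at 0.5 log units
loose_ok = within_tolerance(loose)             # rejected at 0.5 log units
loose_ok_old = within_tolerance(loose, 2.0)    # would pass at 2 log units
```

With a deviation of 0.5 log units on either side of the centre, accepted values can span at most 1 log unit, which is why pairs that passed under the old 2 log unit setting may now be excluded.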

Files

05.7_additional_files.zip

Files (13.6 GB)

Checksum Size
md5:9a325b965fb2ab043679d974d1f79549 56.8 MB
md5:76bfd8a276734dd40b2ea60f807536ed 113.9 kB
md5:5f9e75e69e7a142f98bd7aacb142a8dd 2.2 GB
md5:ad9be73b4a0ff0afb4790f30c6104bbc 99.9 MB
md5:72c2ad53d71bf3b63b567e9c065c6b2f 1.5 GB
md5:63cf3d52855a8e216794d5b91e98478c 3.1 GB
md5:0cb3c70501d45b8a4ec5fafbfb8dc0e1 453.7 MB
md5:c87b0fdd25214c51107944b4fd2e7fc0 120.0 MB
md5:d35ca2162d2a0625b7ea364c419753a4 3.4 GB
md5:2b0dc25af55c169654cb62da0046cf17 507.9 MB
md5:3dfc294d351343c1a4855078b4fd051b 435.0 MB
md5:c269800f4c00f79d198aad5b9758329f 210.0 MB
md5:672f03fd78b73c044ff2eb7bd8373869 1.8 MB
md5:4b87a61774ddce549106f5e3823a94cb 715.7 MB
md5:f6dd07da52ca92029cc4cdde47c3c5cf 751.5 MB
md5:5976fad868762597f103ece517e198fe 12.8 kB
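
Because several of these files are multiple gigabytes, it can be worth verifying a download against its listed MD5 checksum. A minimal sketch using only the Python standard library follows; the helper names are illustrative, and the file path in the usage comment is a placeholder, since the listing above does not give file names.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so multi-gigabyte files fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Usage (placeholder path; substitute the file you downloaded):
# matches_checksum("path/to/downloaded_file", "md5:9a325b965fb2ab043679d974d1f79549")
```

A mismatch usually points to an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.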

Additional details

Related works

Is described by
Journal article: 10.1186/s13321-022-00672-x (DOI)
Is new version of
Dataset: 10.5281/zenodo.7821775 (DOI)

Funding

European Commission
eTRANSAFE - Enhancing TRANslational SAFEty Assessment through Integrative Knowledge Management 777365