Published October 24, 2024 | Version 2024.2
Dataset
Open
Dataset - Papyrus 2024 - A large scale curated dataset aimed at bioactivity predictions
Contributors
Researchers:
Supervisors:
Description
This update to release 2024.1 fixes the following issues:
- The metadata columns type_IC50, type_EC50, type_KD, type_Ki, and type_other reported only a single value even when multiple pChEMBL values were available. This fix ensures that all values are reported.
- Molecules were incorrectly standardized and mixtures were included in the dataset. Standardization (using the papyrus_structure_pipeline) is now correctly enforced and mixtures have been removed.
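To check the first fix in a local copy of the data, one option is to compare the number of entries in each type_* field with the number of activity values in the same row. The sketch below is illustrative only: the `;` separator, the `pchembl_value` column name, and the sample rows are assumptions and are not stated in this record.

```python
# Assumption: multi-value fields are ';'-separated and activity values live in
# a column named 'pchembl_value'. Neither detail is confirmed by this record.
TYPE_COLUMNS = ("type_IC50", "type_EC50", "type_KD", "type_Ki", "type_other")


def split_values(field, sep=";"):
    """Split a multi-value field into a list of non-empty, stripped strings."""
    return [part.strip() for part in str(field).split(sep) if part.strip()]


def counts_match(row, value_col="pchembl_value", type_cols=TYPE_COLUMNS):
    """Return True if every type_* field has one entry per activity value."""
    n_values = len(split_values(row[value_col]))
    return all(
        len(split_values(row[col])) == n_values
        for col in type_cols
        if col in row
    )


# Hypothetical rows, for illustration only.
fixed_row = {"pchembl_value": "6.5;7.1", "type_IC50": "1;0", "type_Ki": "0;1"}
broken_row = {"pchembl_value": "6.5;7.1", "type_IC50": "1", "type_Ki": "0"}

print(counts_match(fixed_row))
print(counts_match(broken_row))
```

Rows for which `counts_match` returns False would show the pattern described in the first fix, a single reported value despite several activity values.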
Changes since version 05.6
- ChEMBL data was updated to ChEMBL version 34
- Data from the IUPHAR/BPS Guide to PHARMACOLOGY has been included
- Data from Pickett et al.'s publication on MMP-12 has been included (ACS Med Chem Lett. 2011 Jan 13; 2(1): 28–33. DOI: 10.1021/ml100191f)
Papyrus++:
Previous versions mistakenly used a deviation of 2 log units around compound-target pairs to determine the reproducibility of assays (see the published article for more details). This has been corrected to 0.5 log units, so that the data points for a pair fall within a maximum range of 1 log unit. As a result, the Papyrus++ set in this release contains drastically fewer entries than in previous releases.
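The effect of the tighter threshold can be illustrated with a minimal sketch. It is not the Papyrus pipeline: the function name and sample values are hypothetical, and because the reference point of the deviation is not specified here, the check is expressed as the total range (maximum minus minimum) being at most twice the allowed deviation.

```python
def is_reproducible(values, max_deviation=0.5):
    """Return True if all values span at most 2 * max_deviation log units.

    With max_deviation=0.5 this enforces a maximum range of 1 log unit.
    """
    return max(values) - min(values) <= 2 * max_deviation


# Hypothetical pChEMBL measurements for two compound-target pairs.
tight_pair = [6.1, 6.5, 6.9]   # spans about 0.8 log units
loose_pair = [5.0, 7.2]        # spans about 2.2 log units

print(is_reproducible(tight_pair))                     # kept under 0.5
print(is_reproducible(loose_pair))                     # rejected under 0.5
print(is_reproducible(loose_pair, max_deviation=2.0))  # kept under the old 2.0
```

A pair such as `loose_pair` passes with the earlier 2 log unit deviation but fails with 0.5, which is consistent with the smaller Papyrus++ set in this release.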
Files
05.7_additional_files.zip
Files (13.6 GB)
Checksum | Size |
---|---|
md5:9a325b965fb2ab043679d974d1f79549 | 56.8 MB |
md5:76bfd8a276734dd40b2ea60f807536ed | 113.9 kB |
md5:5f9e75e69e7a142f98bd7aacb142a8dd | 2.2 GB |
md5:ad9be73b4a0ff0afb4790f30c6104bbc | 99.9 MB |
md5:72c2ad53d71bf3b63b567e9c065c6b2f | 1.5 GB |
md5:63cf3d52855a8e216794d5b91e98478c | 3.1 GB |
md5:0cb3c70501d45b8a4ec5fafbfb8dc0e1 | 453.7 MB |
md5:c87b0fdd25214c51107944b4fd2e7fc0 | 120.0 MB |
md5:d35ca2162d2a0625b7ea364c419753a4 | 3.4 GB |
md5:2b0dc25af55c169654cb62da0046cf17 | 507.9 MB |
md5:3dfc294d351343c1a4855078b4fd051b | 435.0 MB |
md5:c269800f4c00f79d198aad5b9758329f | 210.0 MB |
md5:672f03fd78b73c044ff2eb7bd8373869 | 1.8 MB |
md5:4b87a61774ddce549106f5e3823a94cb | 715.7 MB |
md5:f6dd07da52ca92029cc4cdde47c3c5cf | 751.5 MB |
md5:5976fad868762597f103ece517e198fe | 12.8 kB |
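The MD5 checksums listed for the files can be used to confirm that a download completed intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names and the file path in the usage comment are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or plain hex."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Usage (the path is a placeholder for wherever the file was saved):
# matches_checksum("downloads/some_file", "md5:9a325b965fb2ab043679d974d1f79549")
```

Reading in chunks keeps memory use flat, which matters for the multi-gigabyte files in this record.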
Additional details
Related works
- Is described by
- Journal article: 10.1186/s13321-022-00672-x (DOI)
- Is new version of
- Dataset: 10.5281/zenodo.7821775 (DOI)