Published April 27, 2018 | Version v2
Dataset Open

Compound profiling matrices extracted from screening data

  • 1. Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany.

Description

Compound profiling matrices record assay results for compound libraries tested against panels of targets. In addition to their relevance for exploring structure-activity relationships, such matrices are of considerable interest for chemoinformatic and chemogenomic applications. For example, profiling matrices provide a valuable data resource for the development and evaluation of machine learning approaches for multi-task activity prediction. However, experimental compound profiling matrices are rare in the public domain. Although they are generated in pharmaceutical settings, they are typically not disclosed. Herein, we present an algorithm for the generation of large profiling matrices, for example, matrices containing more than 100,000 compounds exhaustively tested against 50 to 100 targets. The new methodology is a variant of bi-clustering algorithms originally introduced for large-scale analysis of genomics data. Our approach is applied here to assays from the PubChem BioAssay database and generates profiling matrices of increasing assay or compound coverage by iterative removal of the entities (assays or compounds) that limit coverage. Weight settings control final matrix size by preferentially retaining assays or compounds. The methodology can also be applied to generate matrices enriched with active entries, that is, matrices representing above-average assay hit rates.
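
The iterative removal idea described above can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it is a simple greedy stand-in that assumes the input is a mapping from compound identifiers to the set of assays in which each compound was tested. The function name `extract_complete_matrix`, the parameters `compound_weight` and `assay_weight`, and the scoring rule (fraction of untested cells divided by the weight) are all hypothetical choices made for illustration.

```python
def extract_complete_matrix(tested, compound_weight=1.0, assay_weight=1.0):
    """Greedy sketch (an assumption, not the published method).

    tested: dict mapping a compound id to the set of assay ids in which
            that compound was tested.
    Weights must be positive; a larger weight makes that entity type
    less likely to be removed, so it is preferentially retained.
    Returns (compounds, assays) such that every returned compound was
    tested in every returned assay.
    """
    compounds = set(tested)
    assays = set().union(*tested.values()) if tested else set()

    while compounds and assays:
        # Fraction of untested cells for each compound (row) and assay (column)
        # within the current sub-matrix.
        comp_missing = {
            c: len(assays - tested[c]) / len(assays) for c in compounds
        }
        assay_missing = {
            a: sum(1 for c in compounds if a not in tested[c]) / len(compounds)
            for a in assays
        }

        # Sorting first gives a deterministic tie-break.
        worst_compound = max(sorted(compounds), key=comp_missing.get)
        worst_assay = max(sorted(assays), key=assay_missing.get)

        # No untested cells left: the sub-matrix is complete.
        if comp_missing[worst_compound] == 0:
            break

        # Remove whichever entity limits coverage most after weighting.
        if comp_missing[worst_compound] / compound_weight >= (
            assay_missing[worst_assay] / assay_weight
        ):
            compounds.remove(worst_compound)
        else:
            assays.remove(worst_assay)

    return compounds, assays


# Toy input (made-up identifiers, not taken from the dataset).
example = {
    "c1": {"a1", "a2", "a3"},
    "c2": {"a1", "a2", "a3"},
    "c3": {"a1", "a2"},
    "c4": {"a1"},
}
print(extract_complete_matrix(example))
print(extract_complete_matrix(example, compound_weight=10.0))
```

With equal weights the sketch drops the sparsely tested compounds and keeps all assays; raising `compound_weight` protects compounds, so assays are dropped instead. An exhaustive recomputation of coverage at every step, as done here, would be too slow for matrices with more than 100,000 compounds and is used only to keep the example short.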

Files

Matrix1.csv

Files (110.3 MB)

Checksum                                  Size
md5:d81e26145133c6c96e82d04eaa186178      13.1 MB
md5:6f01f876e37023cf8c57eabeb4ae7baf      17.3 MB
md5:ee7c22d0e8fc4bf9b10ecce454b87ab3      79.6 MB
md5:9a73823f8f9c0376b2fb5313daac1971      961 Bytes
md5:1f6412684da5ece127c3ee8b138e315a      114.5 kB
md5:fefe8eb2078b1f81d3d022086bb8525e      177.5 kB
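
The MD5 checksums listed above can be used to confirm that a downloaded file is intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_listed_checksum` are hypothetical, and the listing does not state which checksum belongs to which file, so the pairing of a local file with a checksum is left to the reader.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks so that large files do not have to fit
    in memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file's MD5 digest with a listed value.

    `listed` may carry the "md5:" prefix used in the file listing and
    is compared case-insensitively.
    """
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A call would take the local path of a downloaded file and the checksum shown for that file on the record page; a result of `False` suggests an incomplete or corrupted download.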