AUTOMATIC WORKFLOW FOR IN VITRO HIGH-THROUGHPUT SCREENING DATA FAIRIFICATION, PREPROCESSING AND SCORING: A CASE STUDY ON NANOMATERIALS
Creators
- 1. Ideaconsult Ltd., University of Plovdiv, Faculty of Chemistry, Department of Analytical Chemistry and Computer Chemistry, Bulgaria
- 2. Karolinska Institutet, Institute of Environmental Medicine, Sweden
- 3. Misvik Biology, Division of Toxicology, Finland
- 4. Ideaconsult Ltd.
Description
Toxicological research relies heavily on high-throughput screening (HTS) to evaluate the potential hazards of chemical substances, and advances in technology have enabled HTS to generate vast amounts of data. In chemical safety, the ToxPi software has gained popularity as a means of conveying risk prioritization and profile information to scientists, regulators and stakeholders. Tox5-Score is a novel concept for evaluating and prioritizing in vitro toxicity and is applied in two stages: (i) normalization of the HTS metrics for each time point and endpoint; (ii) combination of the normalized metric values to obtain the final Tox5 endpoint scores.
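The two stages can be illustrated with a minimal sketch. The abstract does not specify the normalization or the combination rule, so min-max scaling and a plain mean are assumptions made here for illustration only; the function names, material labels and values are hypothetical and are not taken from ToxFAIRy.

```python
from statistics import mean


def min_max_normalize(values):
    """Scale numbers to [0, 1]; a constant list maps to all zeros.

    Assumption: min-max scaling stands in for the normalization step.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def tox5_like_scores(metrics):
    """Compute illustrative endpoint scores.

    metrics: {(endpoint, time_point): {material: metric_value}}
    returns: {endpoint: {material: score}}
    """
    # Stage (i): normalize each (endpoint, time point) slice across materials.
    normalized = {}
    for key, by_material in metrics.items():
        materials = list(by_material)
        scaled = min_max_normalize([by_material[m] for m in materials])
        normalized[key] = dict(zip(materials, scaled))

    # Stage (ii): combine the normalized values per endpoint.
    # Assumption: an unweighted mean over time points.
    collected = {}
    for (endpoint, _time_point), by_material in normalized.items():
        for material, value in by_material.items():
            collected.setdefault(endpoint, {}).setdefault(material, []).append(value)
    return {
        endpoint: {m: mean(vals) for m, vals in by_material.items()}
        for endpoint, by_material in collected.items()
    }


# Hypothetical metric values for three materials at two time points.
example = {
    ("cell_viability", "24h"): {"NM-A": 10.0, "NM-B": 30.0, "NM-C": 20.0},
    ("cell_viability", "72h"): {"NM-A": 0.0, "NM-B": 8.0, "NM-C": 2.0},
}
scores = tox5_like_scores(example)
```

In this toy input, the material with the highest metric at both time points receives the highest endpoint score, which is the ranking behavior a prioritization score is meant to convey.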
Data management based on the FAIR (Findability, Accessibility, Interoperability, and Reusability) guiding principles supports consistent machine-driven curation and reuse of the accumulated data by the nanosafety, cheminformatics and bioinformatics communities. We address two HTS FAIRification challenges, namely data pre-processing reproducibility and efficient data storage, by (1) automating the ToxPi score calculation as an add-on for the user-friendly Orange Data Mining software and (2) introducing a well-known, reusable binary format optimized for data matrices.
The Tox5-scoring approach is automated in the Python package ToxFAIRy and follows exactly its original implementation in Excel. It takes as input the raw data files (containing the results associated with each 384-well plate) and a metadata file; the metadata file is integrated into the eNanoMapper Template Wizard for future reuse. To enhance accessibility for non-programmers, we have created an Orange Data Mining add-on, Orange3-ToxFAIRy. Orange Data Mining (https://orangedatamining.com/) is an open-source visual programming tool designed primarily for data analysis and data mining. We demonstrate the workflow with two HTS datasets: existing data from the caLIBRAte project and a new HTS dataset from the HARMLESS project.
The ToxFAIRy package and the Orange3-ToxFAIRy add-on, available at https://github.com/ideaconsult/orange3-toxfairy, extend a previously developed FAIRification workflow (the eNanoMapper workflow) to HTS data. The HTS data are parsed and pre-processed with ToxFAIRy. The data structures are then converted into the eNanoMapper data model using the pynanomapper library and stored as an HDF5 file. The file has a hierarchical structure with rich metadata and includes raw, normalized and interpreted data (scores) in a machine-readable format; it can be distributed as a database-independent archive and/or integrated into the eNanoMapper database and the Nanosafety Data Interface.
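To make the idea of a hierarchical HDF5 file with attached metadata concrete, here is a minimal sketch using the h5py library. The group names, attribute keys and values are hypothetical; this is not the eNanoMapper data model and not the layout produced by pynanomapper, only an illustration of how raw, normalized and score data can sit side by side in one file.

```python
import os
import tempfile

import h5py

# Write to a temporary location so the sketch leaves no files behind in the
# working directory.
path = os.path.join(tempfile.mkdtemp(), "hts_example.h5")

with h5py.File(path, "w") as f:
    # File-level metadata (hypothetical keys).
    f.attrs["description"] = "illustrative HTS archive"

    # One group per plate, with metadata stored as attributes.
    plate = f.create_group("plate_001")
    plate.attrs["plate_format"] = 384
    plate.attrs["endpoint"] = "cell_viability"

    # Raw and normalized readings as matrices (tiny toy values).
    plate.create_dataset("raw", data=[[100.0, 80.0], [60.0, 40.0]])
    plate.create_dataset("normalized", data=[[1.0, 0.8], [0.6, 0.4]])

    # Interpreted data (scores) in a separate group.
    scores = f.create_group("scores")
    scores.create_dataset("tox5_like", data=[0.0, 0.375, 1.0])

# Read the file back to show that structure and metadata round-trip.
with h5py.File(path, "r") as f:
    group_names = sorted(f.keys())
    raw_shape = f["plate_001/raw"].shape
    endpoint = f["plate_001"].attrs["endpoint"]
```

Because the metadata travels inside the file as attributes, such an archive can be shared without access to the database it was exported from.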
Files
poster_nanotox_2024.pdf (1.9 MB, md5:2ba4567caec18d184cd6f5561c226688)
Additional details
Software
- Repository URL: https://github.com/ideaconsult/orange3-toxfairy
- Programming language: Python