Published February 6, 2018 | Version v1
Dataset
Open
Chemical outlier dataset
Creators
Description
The objects are numbered. The Y variable is the boiling point. The other features are structural features of the molecules. In the outlier column, outliers are marked with a value of 1.
The data are derived from a published chemical dataset of boiling point measurements [1] and from public data [2]. Features were generated with the RDKit Python library [3]. Known outliers (~5%) were added to the dataset based on significant structural differences, namely the difference between polar and non-polar molecules.
- [1] Cherqaoui D., Villemin D. Use of a Neural Network to Determine the Boiling Point of Alkanes. J. Chem. Soc., Faraday Trans. 1994;90(1):97–102.
- [2] PubChem: https://pubchem.ncbi.nlm.nih.gov/
- [3] RDKit: Open-source cheminformatics; http://www.rdkit.org
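The column layout described above (an object number, a boiling point as the Y variable, structural features, and an outlier flag equal to 1) can be read along the following lines. This is a minimal sketch and not the creators' code: the column names `id`, `y`, `n_atoms` and `outlier`, and the sample rows themselves, are placeholders invented for illustration, since the record does not state the file's actual header or name.

```python
import csv
import io

# Hypothetical stand-in for the downloaded file. The column names and the
# rows are placeholders for illustration, not values from the dataset.
SAMPLE = """id,y,n_atoms,outlier
1,36.1,5,0
2,68.7,6,0
3,98.4,7,0
4,78.4,3,1
"""


def split_outliers(handle, flag_column="outlier"):
    """Return (inliers, outliers) as two lists of row dictionaries."""
    inliers, outliers = [], []
    for row in csv.DictReader(handle):
        # A flag value of 1 marks an outlier, as stated in the description.
        if float(row[flag_column]) == 1:
            outliers.append(row)
        else:
            inliers.append(row)
    return inliers, outliers


inliers, outliers = split_outliers(io.StringIO(SAMPLE))
outlier_fraction = len(outliers) / (len(inliers) + len(outliers))
print(f"{len(inliers)} inliers, {len(outliers)} outliers")
```

For the real file, replace `io.StringIO(SAMPLE)` with `open(path, newline="")` and adjust `flag_column` to the actual header; the computed fraction can then be compared with the roughly 5% stated in the description.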
Files
Name | Size |
---|---|
md5:9561cb5d5ec9c677eee2a159200db3e3 | 46.7 kB |
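The MD5 checksum listed for the file can be used to confirm that a download is intact. The sketch below uses only Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_published` are introduced here for illustration, and the local file path is left to the caller because the record does not show the file name.

```python
import hashlib

# Checksum as published in the Files listing of this record.
PUBLISHED_MD5 = "9561cb5d5ec9c677eee2a159200db3e3"


def md5_of_file(path, chunk_size=65536):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected=PUBLISHED_MD5):
    """Return True if the file's MD5 digest equals the published checksum."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_published(path)` on the downloaded file should return `True` for an unmodified copy. MD5 is adequate for detecting accidental corruption but is not a safeguard against deliberate tampering.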