Presentation Open Access

Data Curation: The Forgotten Practice in the Era of AI

Daga, Pankaj R.

Pankaj R. Daga of Simulations Plus visited the Mobley group at UC Irvine on Sep 13, 2019 and gave a talk, as part of the OFF seminar series, on the hazards that can arise when automating the mining of chemical and chemistry-related databases.


Abstract: The availability of large databases of chemical structures with associated experimental data provides a great opportunity to build predictive, robust QSAR/QSPR models for application in various fields. The most common concern when using these databases is the quality of the chemical structures and the associated biological data. Working from correct chemical structures is essential, since an incorrect structure leads to errors in the calculated molecular descriptors, and incorrect biological data ultimately leads to meaningless results. This seminar discusses experiences in curating such bioactivity databases, with a focus on ADMET properties in drug discovery, along with the various sources of these errors and measures to find and correct them.
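As a small illustration of how a structural error propagates into a descriptor, the sketch below computes molecular weight (one of the simplest molecular descriptors) from a molecular formula. It is not taken from the talk: the helper `molecular_weight`, the restricted atomic-mass table, and the "erroneous" record with one oxygen missing are all assumptions made for the example.

```python
import re

# Rounded standard atomic masses in g/mol; only the elements needed here.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def molecular_weight(formula: str) -> float:
    """Sum atomic masses for a simple formula such as 'C9H8O4'.

    Hypothetical helper: no brackets, charges, or isotopes are handled.
    """
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return total


# Aspirin as correctly recorded, and a made-up database entry that lost one oxygen.
curated = molecular_weight("C9H8O4")
erroneous = molecular_weight("C9H8O3")

# The descriptor error equals the mass of the missing atom.
error = curated - erroneous
print(f"curated={curated:.3f}  erroneous={erroneous:.3f}  error={error:.3f}")
```

A model trained on the erroneous value would be fit to a descriptor that does not describe the measured compound, which is why structure checks come before descriptor calculation.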

Files (105.5 MB)
Name Size
PankajDaga-UCIrvine-DataCuration.mp4 98.7 MB
md5:bdbc5a493e0ef4816a6c10e075a326c2
PankajDaga-UCIrvine-DataCuration.pdf 6.8 MB
md5:2fca143dda590e3f45a8c904cc3d7e7c
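The md5 values listed with the files can be used to confirm that a download arrived intact. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `matches_listed_md5` are illustrative, and the script assumes the files were saved under their listed names in the current directory.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a listed value such as 'md5:bdbc...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Checksums exactly as listed on this record.
LISTED = {
    "PankajDaga-UCIrvine-DataCuration.mp4": "md5:bdbc5a493e0ef4816a6c10e075a326c2",
    "PankajDaga-UCIrvine-DataCuration.pdf": "md5:2fca143dda590e3f45a8c904cc3d7e7c",
}

for name, listed in LISTED.items():
    if os.path.exists(name):  # assumed download location: current directory
        status = "OK" if matches_listed_md5(name, listed) else "MISMATCH"
        print(f"{name}: {status}")
    else:
        print(f"{name}: not found, skipped")
```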
