Presentation Open Access

Mt Sinai Special Seminar: Finding Small Molecules in Big Data

Schymanski, Emma


For more information please visit:

Title: Finding Small Molecules in Big Data

Speaker: Dr. Emma Schymanski

Bio: Dr. Emma Schymanski is an Associate Professor and Head of the Environmental Cheminformatics group at the Luxembourg Centre for Systems Biomedicine, University of Luxembourg, and a Luxembourg National Research Fund (FNR) ATTRACT Fellowship awardee. Her research combines open science, cheminformatics, and computational mass spectrometry approaches to elucidate the unknowns in complex samples and relate these to environmental causes of disease. She also supports several European and worldwide activities to improve the exchange of data, information, and ideas between scientists.


Abstract: Metabolomics and exposomics are amongst the youngest and most dynamic of the omics disciplines. While the molecules involved are smaller than those in proteomics and the other, larger “omics”, the challenges are in many ways greater. Elements are less constrained, there are no given “puzzle pieces”, and the result is an explosion of potential chemical space. It is impossible even to enumerate all chemically possible small molecules. The challenges and complexity of identifying small molecules, even using the most advanced analytical technologies available today, are immense. Current “big data” methods for small molecules rely heavily on chemical databases, the largest of which currently contain ~100 million chemicals. Despite this large number, high resolution mass spectrometry (HR-MS) measurements contain tens of thousands of features, of which only a few percent can be annotated as “known” and confirmed as metabolites or chemicals of interest using these chemical databases. How can we find relevant small molecules in the ever-increasing data loads? How can we annotate more of the unknown features in HR-MS experiments? This talk will present European, US and worldwide initiatives to help find small molecules in big data - from chemical databases to spectral libraries, real-time monitoring to retrospective screening. It will touch on the challenges of standardized structure representations, data curation and deposition. Finally, it will show how interdisciplinary communication, data sharing and pushing the boundaries of current capabilities can facilitate research efforts in metabolomics, exposomics and beyond.


Date: Tuesday, July 23rd, 2019

Time: 12:00pm-1:00pm

Location: CAM Building, 17 East 102nd Street, West Tower Elevator, 5th Floor, D5-122
