Published November 14, 2016 | Version v1
Poster Open

An Infrastructure for Reproducible Exposomic Research

  • 1. Department of Biomedical Informatics, and Center for Clinical and Translational Science, University of Utah, Salt Lake City, Utah, USA
  • 2. Center for Clinical and Translational Science, University of Utah, Salt Lake City, Utah, USA
  • 3. Department of Biomedical Informatics, Department of Chemical Engineering, University of Utah, Salt Lake City, Utah, USA
  • 4. Department of Biomedical Informatics, University of Utah, Salt Lake City, Utah, USA
  • 5. Department of Biomedical Informatics, Center for Clinical and Translational Science, College of Nursing, University of Utah, Salt Lake City, Utah, USA

Description

Understanding the effects of the modern environment on human health requires a complete picture of environmental exposures, behaviors and socio-economic factors. The concept of an exposome encompasses the life-course of environmental exposures (including lifestyle factors) from the prenatal period onward and complements the genome by providing a comprehensive description of lifelong exposure history [1]. Exposomic research requires the integration of diverse data types to support different research use-cases. Although the data points needed to generate sufficiently complete exposomes are sparse and contain gaps, using available data with an understanding of their limitations could enable reproducible research.

To systematically generate air quality exposomes for the Pediatric Research using Integrated Sensor Monitoring Systems (PRISMS) grant, we are developing a scalable computational infrastructure. After eliciting use-cases, we designed a conceptual data model to integrate different types of data as they relate to individuals and populations. Supporting proper use of such heterogeneous data requires the discovery, storage and presentation of metadata about these data. We use a graph database implementation of OpenFurther’s metadata repository [2] for authoring and storing these metadata. Using the OpenFurther platform, we are developing a metadata-driven big data infrastructure that generates an event-document store (EDS) of integrated data as needed for different use-cases. The EDS captures the spatio-temporal variations of various events (e.g. air pollutant concentrations, occurrence of conditions) and the locations of individuals and populations. In addition, to fill gaps in measurements and combine different data sources, we use mathematical models with characterized uncertainties. Our metadata-driven approach ensures reproducibility because it informs the end-user not only about the specifics of the data but also about their limitations (including reducible and exposure uncertainties) when using the data in different use-cases. It is generalizable to integrating multi-scale and multi-omics data and provides a robust pipeline for reproducible research data delivery.
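As a rough illustration of the kind of record an event-document store might hold, the sketch below pairs an event with its time, location, provenance and uncertainty, and fills a measurement gap by interpolating between two neighbouring events. Every name in it (`EventDocument`, `interpolate_event`, the field names, the `"linear_interpolation"` label) is hypothetical and not taken from OpenFurther or the PRISMS infrastructure. Linear interpolation and the half-spread uncertainty are stand-ins for the mathematical models with characterized uncertainties described above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class EventDocument:
    """One hypothetical EDS record: an event tied to a time and a place."""
    subject_id: str            # individual or population identifier
    event_type: str            # e.g. "pm25_concentration"
    value: float
    unit: str
    time: datetime
    latitude: float
    longitude: float
    source: str                # "measured", or the method that produced the value
    uncertainty: Optional[float] = None  # same unit as value; None if unknown


def interpolate_event(before: EventDocument, after: EventDocument,
                      when: datetime) -> EventDocument:
    """Fill a gap between two events by linear interpolation in time.

    The uncertainty is a crude placeholder (half the spread between the
    neighbours), not a characterized model uncertainty.
    """
    if not before.time < when < after.time:
        raise ValueError("'when' must fall strictly between the two events")
    span = (after.time - before.time).total_seconds()
    weight = (when - before.time).total_seconds() / span
    return EventDocument(
        subject_id=before.subject_id,
        event_type=before.event_type,
        value=before.value + weight * (after.value - before.value),
        unit=before.unit,
        time=when,
        latitude=before.latitude,   # assumes the subject did not move
        longitude=before.longitude,
        source="linear_interpolation",  # provenance travels with the value
        uncertainty=abs(after.value - before.value) / 2,
    )
```

Keeping `source` and `uncertainty` on each record is one way to let an end-user distinguish measured from modeled values when deciding whether the data fit a given use-case.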

References

  1. C. P. Wild, “The exposome: from concept to utility,” Int. J. Epidemiol., vol. 41, no. 1, pp. 24–32, Feb. 2012.
  2. R. Gouripeddi, “An informatics architecture for an exposome,” Session II06 – Secondary Use of Data for Research (Interactive Learning), AMIA 2016 Joint Summits on Translational Science, San Francisco, Mar. 22, 2016. https://www.amia.org/sites/default/files/2016-joint-summits-program-book.pdf

Notes

PRISMS is supported by NIBIB, NIH U54EB021973. OpenFurther is supported by NCRR/NCATS UL1RR025764, 3UL1RR025764-02S2, AHRQ R01 HS019862, DHHS 1D1BRH20425, U54EB021973, and the UU Research Foundation.

Files

OF2.0_Ks.pdf (2.0 MB)
md5:805dfbeb5c684c0733d52c207df0e2f4