Published January 11, 2020 | Version 1
Presentation Open

Predicting Phenotype from Multi-Scale Genomic and Environment Data using Neural Networks and Knowledge Graphs: An Introduction to the NSF GenoPhenoEnvo Project

  • 1. Oregon State University
  • 2. Utrecht University
  • 3. University of Arizona
  • 4. Tufts University
  • 5. Oregon State University
  • 6. Michigan State University

Description

To mitigate the effects of climate change on public health and conservation, we need to better understand the dynamic interplay between biological processes and environmental effects. Machine learning (ML) methods in general, and deep learning (DL) methods in particular, are a potential way forward because they can cope with the nonlinearity of natural systems. However, several barriers remain, including the opaque nature of algorithm output and the absence of ML-ready data. We propose to develop a machine learning framework capable of predicting phenotypes from multi-scale data about genes and environments. A critical part of this framework is a visualization system to contextualize the results of an ML model, that is, to examine model decisions, connect decisions to input samples, and test alternative decisions. Further, we will develop data transformation methods that map heterogeneous input data, ranging from simple vectors to complex images, into formats that ML techniques can consume. The central hypothesis of this research is that deep learning algorithms and biological knowledge graphs will predict phenotypes more accurately, across more taxa and more ecosystems, than current numerical and traditional statistical modeling methods do. Our long-term goal is to develop predictive analytics for organismal response to environmental perturbations using innovative data science approaches. This pilot project on predicting emergent properties of complex systems and multidimensional interactions is funded by the NSF (Award #1940330).

Files

Thessen PAG 2020.pdf (4.3 MB)
md5:3e0ebeb9dee3818e5c05b299bc124cb2