Poster Open Access

Linked Data Platform for Plant Breeding & Genomics

Kuzniar, Arnold; Singh, Gurnoor; Ortiz, Carlos Martinez; Visser, Richard GF; Finkers, Richard

Genetics research increasingly focuses on mining fully sequenced genomes and their annotations to identify the causal genes associated with specific traits (phenotypes) of interest. However, a complex trait is typically associated with multiple quantitative trait loci (QTLs), each with hundreds of genes that positively or negatively affect the desired trait(s). To help breeders rank candidate genes, we developed an analytics platform that provides semantically integrated genotypic and phenotypic data on Solanaceae species. This Linked Data platform combines unstructured data from the scientific literature with structured data from publicly available biological databases. In particular, QTLs were extracted from tables of open access articles using our recently developed tool, QTLTableMiner++, while the genomic annotations were obtained from the Sol Genomics Network (SGN), UniProt, and Ensembl Plants databases. The resulting RDF graphs include cross-references to many other relevant databases, such as Gramene, Plant Reactome, InterPro, and KEGG (KO). Users can query or analyse the linked datasets through a web interface, SPARQL, and RESTful services (APIs).

Our aim is to provide a plant-oriented resource that follows the FAIR principles and helps breeders predict candidate genes for complex traits using the knowledge available in databases and the literature.
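
The SPARQL access mentioned in the abstract could be exercised roughly as in the sketch below. It is only an illustration under stated assumptions: the endpoint URL is a placeholder, the helper names (`build_request`, `rows_from_results`, `run_query`) are hypothetical, and the query uses only the generic `rdfs:label` property because the abstract does not describe the platform's own vocabulary for QTLs and genes. The request shape follows the standard SPARQL 1.1 Protocol (a `query` parameter over HTTP GET) and the standard SPARQL JSON results format.

```python
import json
import urllib.parse
import urllib.request

# Placeholder endpoint (assumption): the abstract does not give the address.
ENDPOINT = "http://example.org/sparql"

# Generic query using only rdfs:label; no platform-specific predicates
# are assumed, since the abstract does not name any.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?resource ?label
WHERE {
  ?resource rdfs:label ?label .
  FILTER(CONTAINS(LCASE(STR(?label)), "fruit"))
}
LIMIT 10
"""


def build_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def rows_from_results(payload):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    data = json.loads(payload)
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in data["results"]["bindings"]
    ]


def run_query(endpoint, query):
    """Send the query and return flattened rows (requires network access)."""
    with urllib.request.urlopen(build_request(endpoint, query)) as resp:
        return rows_from_results(resp.read().decode("utf-8"))
```

Calling `run_query(ENDPOINT, QUERY)` against a real endpoint would return one dictionary per result row. Keeping request construction and result parsing in separate functions lets each be checked without a live service.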
