Report Open Access
Juty, Nick ; McMurry, Julie ; Jupp, Simon ; Burdett, Tony ; Jenkinson, Andy ; Parkinson, Helen ; Chambers, Jon ; Morris, Chris ; Winn, Martyn ; Gormanns, Philipp ; Schneltzer, Elida ; Bild, Raffael ; Krauth, Christian ; Kuchinke, Wolfgang ; de Bruijn, Freek ; Blondé, Ward ; Beliën, Jeroen ; Klein, Stefan ; Vast, Erwin ; Hendriksen, Dennis ; Charbon, Bart ; van Enckevort, David ; Swertz, Morris
European e-Infrastructure projects are increasingly turning to Semantic Web (SemWeb) technologies to address data integration challenges. This approach is proving to be a solution to some of the emerging challenges in the life sciences. The BioMedBridges semantic web pilot spans deliverables 4.4, 4.6, 4.7, and 4.8; its goal is to test the suitability of a semantic web approach to the task of integrating research data and to report on our experience of running an RDF-based platform integrating multiple data resources.
In order to leverage experience where it exists and minimise the risks inherent in novel technology projects, a three-stage delivery was chosen for the pilot. As summarised below, these three stages are reported in separate deliverables. This deliverable, D4.8, reports on the strategy, implementation and lessons learned for the semantic web pilots for BioMedBridges.
By following this schedule and aligning partner-specific roadmaps to the blueprint delivered in Phase 1 (D4.4), the pilot projects were developed synchronously, allowing knowledge to be shared efficiently (Phases 2-3: D4.6-4.8). This enabled infrastructures to collaborate effectively and address common issues as they arose. In support of this effort, a knowledge-exchange workshop was held on 29-30 April 2014 at TMF, Berlin, Germany. The programme and course materials are available on the BioMedBridges website and can be found in D4.7 Appendix 2 (‘Resources’), item 3 (‘SWAT4LS SemWeb training materials’). Tutorials demonstrating the queries and analyses the RDF platform makes possible were delivered at SWAT4LS in December 2013 and December 2014, and at an industry workshop in May 2015. A training course at the School of Computer Science, University of Manchester, in December 2014 summarised the practicalities of working with and running an RDF platform, covering the technological approach and the experience gained with the technology.
All of the SemWeb pilot work was informed by the work of WP3, with respect to the choice and use of ontologies, as well as provision and re-use of identifiers, reflecting the application of standards derived from the use case work packages (e.g. WP7 and WP10).
BioMedBridges played a major role at the CRI (Clinical Research Informatics) Day organised by ECRIN: "First CRI Solutions Day" (26-27 May 2014 at Heinrich-Heine University, Düsseldorf, Germany). The presentation about BioMedBridges (S. Suhr) was accompanied by presentations of software tools and hands-on sessions with their developers. Tools developed or employed in BioMedBridges (BBMRI Catalogue, CTIM, tranSMART, MOLGENIS / BiobankConnect, XNAT) and the approaches for data sharing of BioMedBridges could be compared with those of many other Research Infrastructures and EU projects, such as EATRIS, BBMRI, ECRIN, BioSHaRE, TRANSFoRm, EHR4CR, and p-medicine.
The first impression of participants at the CRI Solutions Day was that EU projects often face similar problems and have developed similar solutions. For example, all domains struggle with the same rigid conditions for data protection and with the challenge of semantic interoperability. It was suggested that it might be better to bring projects together and to work jointly on software solutions to common problems. Especially in the areas of semantic interoperability and legally compliant data sharing, BioMedBridges has developed generic solutions that may be of help to other projects.
A workshop on translational research infrastructure, covering tools such as XNAT, tranSMART, MOLGENIS, OpenClinica and Galaxy and the interfaces between them, was co-organised by Dutch representatives of BBMRI, EuroBioImaging, and EATRIS and held at the OpenBridges symposium (see section 126.96.36.199, Pilots 1 through 3). Materials such as presentations and documentation of the discussions on the OpenClinica-tranSMART and tranSMART-Galaxy connections, as well as on XNAT at the Dutch DTL Programmers Meeting, are available elsewhere.
Various other outreach and dissemination activities have taken place during the course of BioMedBridges WP4, and are detailed within this report, in the relevant sections.
While many of the most refined pilots (below) are centred on the use of RDF, it is not the only appropriate solution for data integration; our experience indicated that certain kinds of data are better suited to different distribution and integration mechanisms. Therefore, this report covers an array of solutions that achieve integration, including RDF-based solutions as well as those that provide integration points for future efforts. D4.6 included both RDF-based pilots and pilots using alternative methodologies, whereas D4.7 was targeted specifically at judging the suitability and scalability of a purely RDF-centric data integration solution, with other deliverables providing various interfaces to data (REST, widgets, GUIs). Here we summarise our processes, products, and lessons learnt during the execution of this work package. We also describe the further work conducted in the period between D4.6 and this report (D4.8), and detail the sustainability of these pilot activities and the lessons learnt throughout the entirety of WP4.