Published April 28, 2017 | Version v1
Poster Open

The Swiss Data Science Center on a mission to empower reproducible, traceable and reusable science

  • 1. Swiss Data Science Center (SDSC), Swiss Federal Institute of Technology (ETH), Zurich, Switzerland
  • 2. Swiss Data Science Center (SDSC), École polytechnique fédérale de Lausanne (EPFL), Switzerland

Description

Our ability to collect, store and analyse scientific data has skyrocketed in recent decades, but at the same time a disconnect between data scientists, domain experts and data providers has begun to emerge. Data scientists are developing ever more powerful algorithms for data mining and analysis, while data providers are making more and more data publicly available, and yet many, if not most, discoveries are based on specific data and/or algorithms that "are available from the authors upon request".
In the strong belief that scientific progress would be much faster if reproduction and re-use of such data and algorithms were made easier, the Swiss Data Science Center (SDSC) has committed to providing an open framework for the handling and tracking of scientific data and algorithms, from raw data and first-principles equations to final data products and visualisations, modular simulation models and benchmark evaluation algorithms. Led jointly by EPFL and ETH Zurich, the SDSC is composed of a distributed, multi-disciplinary team of data scientists and experts in select domains. The center aims to federate data providers, data and computer scientists, and subject-matter experts around a cutting-edge analytics platform offering user-friendly tooling and services to help with the adoption of Open Science, fostering research productivity and excellence.
In this presentation, we will discuss our vision of a highly scalable, open but secure, community-based platform for sharing, accessing, exploring and analysing scientific data in easily reproducible workflows, augmented by automated provenance and impact tracking, knowledge graphs, fine-grained access rights and digital rights management, and a variety of domain-specific software tools. For maximum interoperability, transparency and ease of use, we plan to use notebook interfaces such as Apache Zeppelin and Jupyter wherever possible.
Feedback and suggestions from the audience will be gratefully considered.

Notes

Poster presented at the European Geosciences Union General Assembly 2017, id: EGU2017-12179.

Files

SDSC_Poster3a.pdf (3.2 MB)
md5:ba914ff03484f06c97c5d94389aa2cda