"RESPONSES TO ONLINE QUESTIONNAIRE TITLED ""RDA Publishing Workflows: Research Workflows"" ",,,,,,,, A BRIEF ONLINE SURVEY CONDUCTED BY THE RESEARCH DATA ALLIANCE WORKING GROUP ON RESEARCH DATA PUBLISHING WORKFLOWS FROM DEC 16 2015 TO JAN 11 2016,,,,,,,, Timestamp,Contact author,Other authors' names,Workflow name,Motivation,Workflow description,Achieved results,Results yet to be achieved,Comments / Other details 12/17/2015 6:30:57,"Sarah Callaghan, STFC",CEDA Team,CEDA data ingestion and publication workflows,To capture the steps required to ingest and publish data submitted to CEDA (Centre for Environmental Data Analysis),Workflow diagrams and discussion can be found at http://proj.badc.rl.ac.uk/preparde/attachment/wiki/DeliverablesList/D2_1_D2_2_PREPARDE_Workflows_combined_draft1.pdf,A better understanding of the processes that have been built up over many years of operations.,Simplification!, 01/06/2016 11:45,"Pauline Ward, University of Edinburgh",Rory MacNeil of ResearchSpace; Angus Whyte of the Digital Curation Centre,Electronic lab notebook to data repository,"To be able to deposit research data and associated metadata, from life sciences experiments into an institutional data repository, retaining the structure of the data at the time of entry and organisation. The electronic lab notebook [ELN] _repository combination is suitable because: (1) the ease of use and flexibility of the ELN stimulates takeup by researchers, leading to increased data capture (2) Structure is added to the data naturally in the course of documenting the experiments (3) The structure and related metadata from the experimental process can be retained, without additional or repetitive effort on the part of the researchers, when the data is deposited into the repository.","Please see diagram at https://www.wiki.ed.ac.uk/display/datashare/RSpace+-+DataShare+integrated+workflow . Tools used: RSpace electronic lab notebook and Edinburgh DataShare (DSpace) repository Roles: RSpace: PI, Lab Admin, Researcher DataShare: Curator Before beginning the process, the PI liaises with the DataShare curation team to agree on a title for a Collection which the curators will create in DataShare, and into which the data will be deposited. Step 1. (Optional) Lab Admin, Researcher or PI 1.1 Create workflow form to add structure to research data Forms can be created by any user without the need for special skills or training, and can have as many fields as needed. Different kinds of fields, e.g. text, number, radio button, etc., can be created. A simple example would be an experiment form with fields for the date created, objective, method, results and discussion. Step 2. (Optional) Lab Admin 2.1 Create templates, e.g. protocol(s) and experimental workflow, using form(s) 2.2 Share with lab members Step 3. Document experiments (Researcher) 3.1 Use template to write up experiments In many cases, for a particular project the lab will be running multiple experiments of a similar kind, following a particular protocol. The protocol and the experimental template will be prepared at the beginning of the project, usually by the Lab Admin or a postdoc. As the project is being carried out, researchers use the template to carry out individual experiments. The Method and Objective fields can be saved with common content, so that the researcher only needs to complete the Date Created, Results and Discussion each time the experiment is re-run. 
The structure/workflow captured through the use of forms and templates is automatically maintained when documents are deposited in DataShare.
3.1.1 Enter data; link to other RSpace documents, to institutional file stores, to files on Box, Dropbox, OneDrive or Google Drive, to web URLs, and to Mendeley. These links can point to other relevant data as well as grant information, data management plans, data papers, reviews and journal articles (including via Mendeley).
3.2 Each document automatically receives a unique id.
3.3 Documents can be organised in folders or Notebooks, which, as well as individual documents, can be deposited into DataShare.
3.4 (Optional) Add tags to, e.g., identify grant information, data management plans, data papers, reviews and journal articles. N.B. none of this data is exchanged with the institutional system in advance of deposit.
Step 4. (Optional) Review by PI and/or Lab Admin (PI, Lab Admin)
4.1 PI and/or Lab Admin can review documents and make Comments in documents.
Step 5. (Optional) Sign (Researcher) and Witness (Lab Admin, PI) documents
5.1 Document owner can sign documents.
5.2 Document owner can request PI or Lab Admin to witness documents.
Step 6. Deposit documents directly from RSpace to DataShare (Researcher/Curator); the normal DataShare deposit workflow is described at https://zenodo.org/record/33899#.VnqHAfFULAo. (A sketch of what such a programmatic deposit might look like follows at the end of this description.)
6.1 Select documents to be deposited.
6.2 Complete form with required metadata about the deposit.
6.3 Sign the DataShare submission agreement.
6.4 Receive submission approval email.
6.5 Completed DataShare deposit screen appears.
6.6 (Optional) Submit additional metadata to DataShare (Curator, upon request of Researcher).
Step 7. Review by curator
7.1 The submission is checked for conformity with DataShare policies, including compliance with the Dublin Core metadata standard; see the eight-step checklist: https://www.wiki.ed.ac.uk/display/datashare/Checklist%3A+Checking+a+new+Item+Submission+to+DataShare .
7.2 (Optional) Depending on the result of the checking process (7.1), the curator may request additions or amendments, usually to the metadata.
7.3 The curator approves the Item.
7.4 The dataset (the 'Item') is automatically given a handle persistent identifier. The metadata becomes publicly visible online. Metadata is available for harvesting by Google and other services (OAI-PMH compliant). If no embargo has been specified, the data files become accessible online.
7.5 The Item is automatically given a DOI persistent identifier.
7.6 If an embargo date has been specified, the files become accessible online when that date is reached. Until then, a button allows users to send a request for the data to the depositor.
ALTERNATIVE WORKFLOW - Expert user
The DataShare curation team may identify an expert user who plans a large number of data deposits over a period of time and who might be trained to apply the curation standards themselves (as per step 7.1 and the eight-step checklist). Such an expert user would be given administrative rights over the relevant Collection in the database, and the curation step would be entrusted to them for that Collection. This would mean that deposits would not need to be approved: each deposit would automatically be assigned a handle, the metadata would become publicly visible (and the files too, unless an embargo date had been specified), and the Item would then automatically be assigned a DOI.
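The sketch below is a minimal illustration of the programmatic deposit referenced in Step 6. The survey response does not say which mechanism the RSpace integration actually uses; DSpace repositories such as DataShare commonly expose a SWORD v2 endpoint, so that protocol is assumed here. All URLs, credentials and filenames are hypothetical placeholders, not the documented RSpace-DataShare integration.

# Minimal sketch of a programmatic deposit into a DSpace repository such as
# DataShare, assuming a SWORD v2 endpoint. All URLs, credentials and
# filenames are hypothetical placeholders.
import requests

SWORD_COLLECTION = "https://datashare.example.ac.uk/swordv2/collection/10283/999"  # hypothetical
AUTH = ("depositor@example.ac.uk", "app-password")  # hypothetical credentials

# The deposit package (Step 6.1) is a zip export of the selected RSpace
# documents; the metadata entered in the deposit form (Step 6.2) travels
# inside the package, e.g. as a Dublin Core / METS file.
with open("rspace_export.zip", "rb") as payload:
    response = requests.post(
        SWORD_COLLECTION,
        auth=AUTH,
        data=payload,
        headers={
            "Content-Type": "application/zip",
            "Content-Disposition": "attachment; filename=rspace_export.zip",
            "Packaging": "http://purl.org/net/sword/package/SimpleZip",
            "In-Progress": "true",  # leaves the item in the curator's queue (Step 7)
        },
    )
response.raise_for_status()
print("Deposit receipt at:", response.headers.get("Location"))

Using "In-Progress: true" keeps the submission in the review queue, which matches the curator approval step (7.3) before the handle and DOI are assigned.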
","The integration of RSpace with DataShare (http://datashare.is.ed.ac.uk/ ) to enable automatic deposit has been implemented, fully tested, and made available to several research groups. Currently they are evaluating its suitability and early indications are that this route to depositing data in DataShare will be adopted in the near future . Use of RSpace, and the integration of RSpace with DataShare, should benefit labs, and individual researchers, in a variety of ways: 1. Use of forms and templates allow the lab to easily establish standard workflows that enables common approaches to carrying out and documenting experiments, which can be implemented more quickly and efficiently. 2. Using these structuring tools as a natural part of the research process, at the time when the research is being carried out -- as opposed to post hoc structuring when a deposit is about to be made _ results in capture of more and higher quality research data. 3. The structure achieved through use of forms, templates, folders and Notebooks can be maintained in deposits to DataShare, saving time at the time of deposit.","The DataShare curation team and RSpace support team are working with users to facilitate the production of this automatic data deposit, which we expect to document in the coming weeks and add to this record. It is hoped that at least one PI will be able to contribute to a further version of this workflow document before the end of January 2016. ", 01/06/2016 13:12,"JoÜo Aguiar Castro, University of Oporto","Cristina, Ribeira, Faculdade de Engenharia Universidade do Porto JoÜo, Rocha da Silva, Faculdade de Engenharia da Universidade do Porto Ricardo, Carvalho Amorim, Faculdade de Engenharia da Universidade do Porto",Ontologies for research data description and publication,"The scientific goal of this workflow is the creation of metadata for datasets that can easily carry the associated models, i.e. the aggregation of descriptor definitions and descriptor values. The technical motivation is the existence of mature semantic web technologies that can be integrated into operational environments suited for researchers as well as curators. The opportunity for implementing the workflow comes from the mismatch between the requirements for data management in research groups and the availability of the corresponding tools.","Data description is a very important step in any research data management workflow. Without proper metadata data reuse is limited, since researcherÍs wonÇt be able to understand, or even access data. In fact, despite all the technological achievements data loss is growing over the years. Our workflow aims to integrate metadata production at the beginning of the research process, together with data creation. It is recognized that data description at the end of the research cycle leads to poorer metadata. Having detailed metadata records as predicted outputs from this workflow, we believe that researchers themselves should be active participants in the overall process. Being the ones with more knowledge about the data they create, researchers should contribute in the definition of the metadata models for their domains, and also be major stakeholders in the description task. To do so, as data curators we invite researchers to an interview about their data activities, requirements and their expectations regarding data sharing. This interview is based on the Data Curation Profile Toolkit. 
Our process is complemented by content analysis of the researchers' publications, and by discussing with the researchers the fragments of information that should be provided along with the dataset to help others interpret it. At the end of this cycle we are able to design domain-specific metadata models that use concepts familiar to the researchers. In this context we are working with researchers from eleven different domains, representing experimental, simulation-based and observational research configurations. These metadata models are formalized as lightweight ontologies and incorporated into Dendro as a source of descriptors that researchers can use to create their metadata records. For a more generic representation of the datasets, Dendro also includes elements from well-recognized standards, like Dublin Core. This workflow also includes LabTablet, a mobile application integrated with Dendro and designed to allow researchers to capture metadata during field work. At the end of this cycle it is expected that the datasets, associated with good-quality metadata, are deposited in external data repositories for long-term preservation, dissemination and citation opportunities. Altogether, this workflow aims to engage researchers in data management by showing them the benefits of sharing the data they are creating. Diagram here: https://drive.google.com/file/d/0B9NCKJDWYmjOMzhJek1SNEc1NG8/view?usp=sharing","Collection of datasets from multiple domains, and requirements for data description in the same domains; domain-specific metadata models, formalized as ontologies; preliminary results on data description experiments with the researchers.","The work involved in this workflow is incremental, thus future work includes:
- Evaluation of the current tools, such as the data management platforms and the ontologies, by running data description experiments with the researchers
- Development of new ontologies and reuse of existing standards
- Definition of quantitative and qualitative metrics to verify and improve the quality of the metadata records",
01/07/2016 10:17,"Angelina Kraft, TIB, German National Library for Science and Technology","Matthias Razum, FIZ Karlsruhe - Leibniz Institute for Information Infrastructure, Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen; E-Mail: matthias.razum@fiz-karlsruhe.de
Jan Potthoff, Karlsruhe Institute of Technology (KIT), Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen; E-Mail: jan.potthoff@kit.edu
Andrea Porzel, Leibniz Institute of Plant Biochemistry (IPB), Weinberg 3, D-06120 Halle (Saale); E-Mail: aporzel@ipb-halle.de
Thomas Engel, Chemistry Department, Ludwig-Maximilians-Universität München, Butenandtstr. 5-13, D-81377 Munich; E-Mail: thomas.engel@cup.uni-muenchen.de
Frank Lange, Leibniz Institute of Plant Biochemistry (IPB), Weinberg 3, D-06120 Halle (Saale); E-Mail: flange@ipb-halle.de
Karina Van den Broek, Chemistry Department, Ludwig-Maximilians-Universität München, Butenandtstr. 5-13, D-81377 Munich; E-Mail: karina.vandenbroek@cup.uni-muenchen.de",RADAR services,"The goal of RADAR is to establish an interdisciplinary research data repository which is sustained by research communities and supported by a stable business model.
The data management processes and tools needed include:
- guidelines for researchers, to introduce and facilitate research data management in general and to store and/or publish their research data
- a secure data preservation service with adequate storage periods (5, 10 and 15 years as well as permanent storage), using distributed data storage mechanisms
- (optional) data publication with Digital Object Identifier (DOI) assignment to secure traceability, access and citability
- technical implementation support for research institutions (e.g. an open API, the possibility of front-end branding, and the option of data peer review)
The heterogeneity of research data is a serious issue for many research data repositories. RADAR faces this problem by focusing on real scientific workflows and elaborating a generic best-practice approach that is evaluated and tested with data provided by scientific partners from different research areas.","Homepage: http://www.radar-project.org
For the general workflow, please see the RADAR flyer: http://www.radar-projekt.org/download/attachments/1212839/Flyer_RADAR_ENG.pdf
Roles: http://www.radar-projekt.org/download/attachments/1212837/RADAR_Datamangement_Roles_ENG.JPG
What research data management services does RADAR provide?
For research projects and institutions: RADAR offers a two-stage service, with a starter package for preserving research data and an advanced package for data publication with integrated data preservation.
Starter package 'Preservation': In the starter package, RADAR offers format-independent data preservation with a minimum metadata set. Data providers are given the opportunity to store their data in compliance with specified long-term storage periods (e.g. 10 years, according to DFG recommendations). This service ensures secure storage of research data (without publication). By default, the metadata is not published, unless specified otherwise by the data provider. For further information on the starter package, please see Data Preservation in the RADAR Glossary.
Advanced package 'Data publication with integrated data preservation': In the advanced package, RADAR offers a combined service of research data publication and preservation. For each published dataset, RADAR provides a Digital Object Identifier (DOI) to enable researchers to clearly reference data and to guarantee data accessibility. Additionally, datasets can be enriched with discipline-specific metadata. For further information on the advanced package, please see Data Publication in the RADAR Glossary.
For publishers: Publishing companies may also benefit from the RADAR services by archiving or publishing the research data supporting published manuscripts. This increases both the visibility and the transparency of the scientific work. One aim is to integrate research data into the manuscript peer-review process. An interface for the joint submission of a manuscript and the corresponding data is currently being developed, and RADAR is actively seeking cooperation with publishing companies.
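RADAR's internal DOI workflow is not detailed in this response. As a rough illustration of what DOI assignment for a published dataset involves, the sketch below registers a DOI through the DataCite REST API, which DOI-minting repositories commonly use; the prefix, credentials, landing-page URL and metadata are hypothetical placeholders, not RADAR's actual implementation.

# Sketch: registering a DOI for a published dataset via the DataCite REST API.
# Hypothetical prefix, credentials and metadata; not RADAR's actual implementation.
import requests

payload = {
    "data": {
        "type": "dois",
        "attributes": {
            "doi": "10.5072/example-dataset-001",  # 10.5072 is a DataCite test prefix
            "titles": [{"title": "Plant biochemistry NMR spectra, series A"}],
            "creators": [{"name": "Example Researcher"}],
            "publisher": "RADAR",
            "publicationYear": 2016,
            "types": {"resourceTypeGeneral": "Dataset"},
            "url": "https://radar.example.org/dataset/example-dataset-001",  # landing page
            "event": "publish",  # makes the DOI findable
        },
    }
}

resp = requests.post(
    "https://api.test.datacite.org/dois",  # DataCite test endpoint
    json=payload,
    auth=("REPO.ACCOUNT", "password"),  # hypothetical repository credentials
    headers={"Content-Type": "application/vnd.api+json"},
)
resp.raise_for_status()
print("Registered DOI:", resp.json()["data"]["id"])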
For details on tools, access rights, interoperability and technical review (API, ORCID implementation, OpenAIRE, ...), please see the FAQ: http://www.radar-projekt.org/display/RE/FAQ","In the RADAR test system (http://www.radar-projekt.org/display/RE/Test+System) we mainly provided dataset creators with an option to give reviewers access to not-yet-published datasets, and created a first version of an API for integrating with manuscript submission systems like ScholarOne.","To develop a process for integration with publishers downstream in the research workflow, together with international initiatives like your WG (no stand-alone RADAR solution, but a synchronised effort)",Sorry for the late submission - please also see Angus Whyte's email (7th Jan)
01/11/2016 02:56,"Wouter Haak, Elsevier","Wouter Haak, Anita de Waard, Elena Zudilova-Seinstra, Joe Shell, Mike Jones, Helena Cousijn",Elsevier RDM solutions workflow,A view on how a publisher could help connect the various pieces so that data use and re-use are better enabled,https://www.dropbox.com/s/k6r3y1wn6frbper/151201%20RDM%20-%20workflow%20connectivity.pdf?dl=0,We have already achieved results with the data journals and data linking program (results can be shared), but the integrated vision is something we are developing right now - seeking partnerships,An integrated workflow solution requires longitudinal research with institutions,Submission upon request of Amy Nurnberger
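The RADAR response above mentions a first API version for giving reviewers access to unpublished datasets from manuscript submission systems, but cites no public specification. The sketch below is therefore purely hypothetical: it shows one plausible shape such an integration could take, and every endpoint, parameter, field and credential in it is invented for illustration only.

# Purely hypothetical sketch of a reviewer-access integration between a
# manuscript submission system and a data repository; every endpoint,
# parameter and field below is invented for illustration only.
import requests

REPO_API = "https://repository.example.org/api/v1"  # hypothetical repository API
API_KEY = "publisher-integration-key"               # hypothetical credential

def request_reviewer_access(dataset_id: str, manuscript_id: str) -> str:
    """Ask the repository for a time-limited, anonymous review link
    to an unpublished dataset tied to a manuscript under review."""
    resp = requests.post(
        f"{REPO_API}/datasets/{dataset_id}/review-links",
        json={"manuscript_id": manuscript_id, "expires_in_days": 90},
        headers={"Authorization": f"Bearer {API_KEY}"},
    )
    resp.raise_for_status()
    return resp.json()["review_url"]  # passed to reviewers by the submission system

if __name__ == "__main__":
    print(request_reviewer_access("radar-ds-001", "MS-2016-0042"))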