RefID,Outcome 01,Results (RDM),Outcome 02,Results (RDM),Outcome 03,Results (RDM),Outcome 04,Results (RDM),Outcome 05,Results (RDM),Outcome 06,Results (RDM),Outcome 07,Results (RDM),Outcome 08,Results (RDM),Outcome 09,Results (RDM),Outcome 10,Results (RDM),Outcome 11,Results (RDM),Outcome 12,Results (RDM),Outcome 13,Results (RDM),Outcome 14,Results (RDM),Outcome 15,Results (RDM),,,,,,,,,,,,,,,,,,,,, Akers & Green 2014,Few U Mich researchers were depositing data into Dryad,"• 6/91 publications had associated Dryad deposits • A further 2 datasets not associated with Dryad-integrated journals were identified",Significant Dryad use potential exists,"Because a large number of papers are published in Dryad-integrated journals, the library could encourage Dryad use by becoming a member to take advantage of the discounted pricing",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Akers 2012,Data management planning - storage,"• No significant variations among different ranks of faculty members in the amount of research data they were storing or their methods for data storage and back-up ",Data management planning - familiarity with funding agency requirements,"• Variations among faculty ranks in their familiarity with federal funding agency requirements (e.g., National Science Foundation (NSF), National Institutes of Health (NIH), National Endowment for the Humanities (NEH)) for data management or data sharing plans as components of grant applications (χ² (3, n = 210) = 13.5, p = 0.004) • majority of full (70%) and assistant professors (75%) stated that they were either somewhat or very familiar with data management plans, • over half of associate professors (60%) also expressed familiarity with these requirements. • most non-tenure track faculty members (45%) were not familiar with data management plan requirements",Data sharing - willingness to share,"• Faculty rank did not predict faculty members' 
willingness to share their research data with other people ",Data sharing - reasons for not sharing,"• However, different ranks of faculty members expressed different opinions on why they might not share their data. • Specifically, full (50%) and associate professors (55%) were more likely than assistant professors and non-tenure track faculty members to state that it takes too much time or effort to share their research data (χ² (3, n = 199) = 10.1, p = 0.018)",Data sharing - other reasons that might prevent sharing,"• There were no differences among faculty ranks in other reasons that might prevent data sharing, including having data that contain private or patentable information, having data that require restricted access, fear of not getting credit for their data, fear of possible misinterpretation or misuse of their data, or belief that their data are of little use to others. ",Data sharing - data deposit,"• Different ranks of faculty were also equally likely to deposit their data in data repositories or ",Data sharing - metadata,Different ranks of faculty were also equally likely to express familiarity with data documentation and metadata.,Interest in Data Services,"Faculty were shown a list of ten potential research data services and asked to select which services they would use if available • faculty workshops on general data management. 
This service was desired by non-tenure track faculty members (75%) more than by assistant (50%), associate (40%), or full professors (45%) (χ² (3, n = 191) = 11.6, p = 0.009) • There were no rank-related differences in interest for the other potential services including assistance preparing data management plans, consultation on data confidentiality and/or legal issues, personalized consultation on research data management for specific researchers or research groups, an institutional repository for research data, assistance with data documentation or metadata creation, research data management workshops for trainees (i.e., graduate students or postdocs), digitization of physical research materials, assistance identifying appropriate disciplinary data repositories, or methods for data citation.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Akers 2013,Data Storage and Back-Up - size of storage,"• The amount of digital research data currently stored by individual faculty researchers at Emory University mostly falls within the gigabyte range • gigabyte range - A&H 45%, SocSci 55%, Med 40%, Basic 60% • However, compared to researchers in other fields of study • basic science researchers are more likely to have larger quantities (i.e., terabytes) of data than researchers (A&H 0%, SocSci 10%, Med 10%, Basic 30%) • arts and humanities researchers are more likely to state that they do not know how much data they are storing.",Data Storage and Back-Up - type of storage,"• 
The most common methods of storing or backing-up data are via desktop or laptop computer hard drives, external hard drives (including USB drives), and university- or department-based servers • Basic science researchers are more likely to rely on external hard drives, university-based servers, the hard drives of the instruments used to collect data, and lab books, field notes, or other printed/handwritten materials • By contrast, arts and humanities researchers are more likely to rely on computer hard drives and internet-based storage services, such as Dropbox and Google Drive. • There were no significant differences among fields of study in use of CDs, DVDs, tapes or “other” methods of data storage and back-up.",Data Management Planning,"• Overall, most (~82%) faculty researchers are only somewhat or not at all familiar with requirements for data management or data sharing plans as components of grant applications for federal funding agencies, such as the National Science Foundation (NSF), National Institutes of Health (NIH) and the National Endowment for the Humanities (NEH) • Furthermore, arts and humanities researchers are most likely to be completely unfamiliar with these funding agency requirements for data management plans. (80%)",Data sharing - outside of research group,"• Most faculty researchers at Emory University do not currently share their research data with people outside of their research group, although researchers in basic sciences were more likely to share their data than researchers in other fields of study. 
(A&H 35%, SocSci 35%, Med 25%, Basic 55%)",Data sharing - methods of sharing,"Of those researchers who do share their data: • emailing data upon request is the most common method of sharing data (75%), followed by supp info linked to journal (40%), data repo (25%), university website (25%), personal website (20%), other (15%) • Basic science researchers are most likely to share data via supplementary material linked to journal articles (50%) or posted on department or university websites (30%) • No arts and humanities researchers share their data via data repositories or databanks. • There were no significant differences among researchers in “other” ways of sharing data, the most frequently noted of which were internet storage services and sponsored accounts or restricted access to university-based servers.",Data sharing - willingness to share,"Of those willing to share: • the vast majority are willing to share their data with other researchers (e.g., principal investigators, students, staff) (85%) working on the same projects (Figure 3C), although arts and humanities researchers are least willing to do so. • fewer faculty researchers are willing to share their data with a wider audience. • Most medical science researchers are not willing to share data with researchers outside of their projects or with instructors interested in using the data as a teaching tool. (30%) • Arts and humanities researchers, however, are more willing to share their data with the general public than researchers in other fields of study (40%) • nearly half of all faculty members are not willing to share their data with project funders • no significant differences among fields of study in the proportion of researchers willing to share data with “other” individuals or not willing to share data with anyone • other included IRM, university admin",Data sharing - prevent sharing,"• 
Top three reasons: 1. The data contain personal or sensitive information; 2. Researchers might not get credit for their data in terms of acknowledgement, citation, or authorship; 3. The data might be misinterpreted or misused. • Medical and social science researchers are most likely to not share data because they contain personal or sensitive information, or require secure and/or restricted access, • Researchers in basic sciences and arts and humanities are more concerned that they might not get credit for their data than researchers in medical and social sciences. • Basic and medical science researchers are more likely to withhold data because the outputs of their research could be patentable or commercialized. • There were no differences among fields of study in concerns that data might be misinterpreted or misused, sharing would require too much time and effort, data may be of little value to others, researchers are not licensed to share their data, or “other” reasons preventing data sharing. • other included data sharing is prevented by the IRB or the Health Insurance Portability and Accountability Act (HIPAA), that data would not be shared if still in the collection or analysis phases or if manuscripts related to the data had not yet been published, or that data stored on university servers are not easily accessible by others.",Data Preservation,"• Vast majority of faculty researchers do not deposit their data in data repositories or databanks • However, researchers in basic sciences are more likely to do so than researchers in other fields of study. • those who answered “no” to data depositing - slightly over half (~55%) stated they are somewhat or very interested in starting to deposit their data. In particular, medical science researchers are most interested in starting to deposit their data in a data repository/databank ",Data Preservation - most commonly used data repos,"National Center for Biotechnology Information, 
including GenBank, Sequence Read Archive (SRA), Gene Expression Omnibus (GEO), and Database of Genotypes and Phenotypes (dbGaP), and Protein Data Bank (PDB). Less commonly used data repositories/databanks include Yale University's NeuronDB and ModelDB, the Cambridge Crystallographic Data Centre (CCDC), Mouse Genome Informatics (MGI), Alzheimer's Disease Neuroimaging Initiative (ADNI), HIV Drug Resistance Database (HIVdb), Rutgers University's Cell and DNA Repository, Collaborative Initiative on Fetal Alcohol Spectrum Disorders (CIFASD), the Dataverse Network, and the Inter-university Consortium for Political and Social Research (ICPSR).",Data Documentation,"• Most faculty researchers are not at all familiar with documenting and/or creating metadata for their data so the contents of their datasets can be understood by others and/or computer-readable - not at all 60%, somewhat 35%, 10% • no significant differences among fields of study",Interest in Data Management-Related Services,"• Two services that garnered the most interest are: 1. Faculty workshops on data management practices, 2. Assistance preparing data management plans for grant applications. • Other services in which faculty expressed interest are: 3. Assistance with data-related confidentiality, privacy, legal, or intellectual property-related issues; 4. Personalized consultation on data management for specific labs or research groups; 5. An institutional data repository; 6. Assistance with data documentation/metadata creation; 7. Workshops on data management practices for students, technicians, administrative assistants, or postdoctoral fellows; 8. Digitization of print or other types of physical records, and 9. Assistance with identifying or using appropriate data repositories/databanks. • Faculty expressed less interest in data citation services (e.g., assignment of permanent digital object identifiers). • “other” 
included support for setting up their own servers, reliably storing terabytes of data, creating and managing databases, designing data collection tools, and more easily sharing data with other researchers or research groups. • A couple of faculty members explicitly stated they are not interested in any services. • Compared with researchers in other fields of study, researchers in medical science are more interested in faculty workshops on data management, assistance with data-related confidentiality/legal issues, and identifying appropriate data repositories. • Also, arts and humanities researchers are most interested in digitization of research materials in physical formats.",,,,,,,,,,,,,,,,,,,,,,,,,,,,, Akers 2014,Institutional Timelines of Building RDM Support,"• timelines show that the launch of services supporting the acquisition and analysis of data (“data services”) and infrastructure supporting the preservation and sharing of scholarly manuscripts (“IR”, institutional repository) typically precedes the planning (“assessment”) and implementation of services specifically designed to support the management of data around the research lifecycle (“RDM services”), including data preservation and dissemination (“data IR”). • This provision of assistance with the discovery, use, preservation, and dissemination of digital scholarly materials set the stage for academic libraries to begin providing support for RDM",Motivation for Providing RDM Support,"• NSF's announcement of DMP requirements in 2010 and implementation of those requirements in 2011 • Attendance at the ARL/DLF/Duraspace E-Science Institute • Action of forward-thinking university or library administrators • The recognition of opportunities for libraries to provide more comprehensive research support with the growth of e-science and other forms of technologically intensive data-driven investigation • What could be considered “peer pressure” 
or a desire to follow the lead of other academic libraries with burgeoning programs of RDM support.",Campus Partnerships and Administrative Backing for RDM Support,"• In many of the profiled universities, programs of RDM support either grew out of or currently depend on campus-wide collaborations and initiatives, with university research offices, advanced research computing facilities, and campus information technology departments being prominent library partners. • In other cases, the library plays more of a single-handed role in building RDM support. • Institutional approaches to developing RDM support could also be categorized as top-down (i.e., propelled by administrative decisions) versus bottom-up (i.e., “grassroots”, propelled by staff interests)",Multidisciplinary Nature of RDM Support,"• Libraries are now broadening their support to encompass not only science data curation but also curation of social science and arts and humanities data. ",Assessment of RDM Needs and Services,"• Formal assessments of researchers' RDM needs have been conducted by many academic libraries • Fewer assessments have focused on measuring the perceived or actual effectiveness of library-provided RDM services and infrastructure in terms of researcher uptake, satisfaction, or funding proposal success.",Changes in Staffing and Job Responsibilities,"• 
The recent or planned hiring of new staff members was often noted among the eight profiled universities (these new positions include postdoctoral fellows (Emory University, University of Michigan, and Penn State University), digital humanities librarians (Penn State University and University of Illinois), life sciences and engineering data services librarians (University of Illinois), data management specialists (Emory University), and research data services managers (University of Michigan) or directors (University of Illinois)) • some universities discussed ways in which existing librarians received RDM-related training and/or guidance on how to integrate RDM support into their job responsibilities • In addition to enhancing skill sets, efforts to more fully support RDM may also involve re-defining the job responsibilities of librarians, such as less time spent on collection development and basic library instruction and more time spent directly engaging with graduate student and faculty researchers • our findings indicate that libraries often adopt more than one approach to changes in staffing",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Aleixandre-Benavent 2014,Journal allowed storage in thematic or institutional repositories,"• 67.6% of the journals specified that it was possible • 32.4% did not specify such a possibility",Journal had a re-use policy,"• 64.7% of the journals support this possibility • 35.3% did not specify",Repository suggested for data deposit by journal ,"• In 4 publishing entities that publish 8 journals (23.5%), PubMed Central is specified as the repository • In 8 publishing entities that publish 14 journals (41.2%), no repository was advised • In one journal, the recommended repositories were related to clinical trial registries • 3 journals indicated the possibility to deposit manuscripts or data into repositories, but without specifying any locations • 
BioMed Central Ltd., with 2 journals, is considered an open access repository",There is no relationship between openness policies and the impact of the journals according to their quartile or position ranking by impact factor in the JCR.,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Allard 2012,Theme 1 - data,Data is most important for publication,Theme 2 - storage,"Data storage activities exist but do not address sustainability, Most researchers keep their data on personal computers",Theme 3 - data-intensive research,Data-intensive science is not yet a regular part of the research environment,Theme 4 - data sharing,"Data sharing is only engaged in on a limited basis, reason extends beyond technological limitations to several socio-cultural reasons, lack of institutional support becomes an important barrier",Theme 5 - data collaboration,Collaboration with the broader scientific community is limited.,Theme 6- academic vs government,Attitudes towards the use and storage of data vary with the research environment,Scholars of information science,"• Very few felt they could discuss data practices, particularly in reference to science information",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Allen 2011,Sharing of samples and data,"• 11 stated data might be shared with researchers in some way affiliated with the project • 16 stated data could be shared with anyone • 3 did not reference data sharing ",Consent forms different in terms of the options for participant to withdraw,"20/30 stated that upon withdrawal samples would be destroyed, but that information already incorporated into the research would not be removed. 3/30 said that samples would be destroyed and information removed from research. 1/30 said no identifying info would be collected so no option to withdraw. 2/30 did not mention withdrawal at all.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, AlsheikhAli 2011,Policies,"• 
44 (88%) out of 50 journals had a statement in their instructions to authors related to public availability and sharing of data from submitted manuscripts • wide variation in journal requirements, ranging from requiring the sharing of all primary data related to the research to just including a statement in the published manuscript that data can be available on request • journals used different phrases to indicate how strict these requirements were (“with minimal restriction . . . in a timely manner”, “non-compliance ... may result in denial of future rights to publish”, “a condition of publication . . . is . . to make materials, data and associated protocols promptly available . . . without preconditions”) • Some specific types of data had very high frequency of requirement for public deposition. This included public deposition of primary microarray, nucleic acid and protein sequencing data, and macromolecular structures",Policy and impact factor,"• Journals where provision of materials and protocols was a condition of publication had higher impact factors compared to journals with non-binding instructions or no instructions at all (median [25th, 75th percentiles]: 15.14 [11.09, 19.78] versus 12.68 [9.72, 15.98] versus 9.83 [9.13, 11.05], P=0.04 by Kruskal-Wallis analysis of variance).",Articles,"• 
149/500 (30%) were not subject to any data availability policy • of the 351 that were covered by some data availability policy 208/351 papers (59%) did not fully adhere to the data availability instructions of the journals they were published in • most commonly (73%) by not publicly depositing microarray data • The other 143 papers that adhered to the data availability instructions of the journals did so by publicly depositing only the specific data type as required, making a statement of willingness to share, or actually sharing all the primary data • only 47/500 papers (9%) did deposit full primary raw data online; • None of the 149 papers not subject to data availability policies made their full primary data publicly available",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Alvaro 2011,Number of job postings (break-down),"• e-science librarian = 3 • subject-specific librarian (w/data-related responsibilities) = 15 • data/digital librarian = 9 • 25 (89%) University positions • 2 (11%) Other",Personal skills,The personal skills appearing most frequently in the job advertisements (Table 1) were communication (86%) and collaboration (82%),Technical skills,The technical skill appearing most frequently was science subject knowledge (50%),Data responsibilities,The data responsibilities most frequently observed in our sample were managing (50%) and storing (46%) data,People responsibilities,"The most prevalent duties associated with working with people (Table 4) were collaboration (75%) and outreach (68%), ",Technical responsibilities,Reference (64%) and collection development (54%) were the most frequently mentioned technical responsibilities,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Amorim 2015,Data disclosure,"• Some projects are not expected to disclose data • some projects' data need to be anonymized before their disclosure • 
with data disclosure they would be able to cite datasets in publications and their peers could access and reuse such data in subsequent analysis",LabTablet for data disclosure,"LabTablet proved to be capable of handling all needs in terms of data production, as well as helping researchers identify some equally important descriptors that could be added to provide extra context",Data description,"DDI based, included • Data Collection Methodology • Data Source • Sample Size • External Aid • Kind of data • Universe ",Labtablet/Dendro for data description,"Ontology can be loaded at any time into the LabTablet application and be used to describe data in this area. The same is true for Dendro, our staging repository",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Amorim 2015b,Architecture of software,"• Either the institution signs for an external service or installs and customizes its own repository, and thus has to support the infrastructure maintenance costs. • DSpace and CKAN can be installed and run completely under the control of the university • both CKAN and Zenodo provide collaborative tools and allow users to fully manage their group members and policies",Metadata,"• All the platforms require descriptions when datasets, DSpace supports domain-specific metadata schemas • DSpace allows for METS metadata records, enabling the ingestion of the packages into a long-term preservation workflow","Interoperability and Dissemination Exposing repository","• Zenodo and DSpace natively comply with OAI-PMH, allowing for interoperability between repositories. • 
Room for improvement regarding the compatibility with the description of data from different research domains, to further improve data reuse.","Bottom Line: 1) depending on local requirements, different data repositories can meet some of the stakeholders' requirements 2) there is still room for improvements, mainly regarding the compatibility with the description of data from different research domains, to further improve data reuse",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Amos 2010,Focus group summary,"• Researchers' data management and data sharing practices were associated with specific research contexts • Research practices were associated with pre-existing patterns of communication and collaboration among researchers at different stages of their research career and in particular fields of research",Data reuse,"Focus groups - disciplines such as astronomy and climatology have long traditions and established methods for “open access” reuse of research data, participants from social science and public health were more inclined to share data within research relationships and partnerships with known colleagues",Frameworks for data storage,"Focus groups - In social science and public health disciplines, a condition of institutional ethics approval to conduct research is that clearly defined protocols for storage, retention and disposal of research data are followed • 
Researchers from physical sciences and engineering had a more subjective and pragmatic approach to determining how research data is stored and disposed of",Maintaining privacy and confidentiality,Focus groups - social science and public health researchers - major consideration in determining whether to provide access to data,Access to results and research data,"Focus groups - physical sciences and engineering - common practice for data and procedures recorded in lab notebooks to be used by other researchers to replicate experiments ",Some key issues were articulated across all focus groups,Importance of secure storage; importance of formal data archiving policies and practices in fostering broader research collaboration; importance of detailed documentation across the research lifecycle for replication of experiments/conducting meta-analyses,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Anagnostou 2015,Most researchers in human paleogenetics share their data,"97.6% had made their genetic information fully available and reusable. Presenting only data-derived statistics was the main modality of withholding data. The five withheld datasets were published in the last 6 years. 58% shared within body text, 18% through online primary databases (e.g. 
GenBank) and 24% through supplementary material",Researchers agreed on the importance of sharing,"97% of questionnaire respondents indicated the importance of ""making my own study open to scientific inquiry"" and 94% said that ""data sharing should be a common practice in scientific research""",Paleogeneticists shared at a higher rate than other fields of genetic research,"A comparison to other studies revealed that 80% in evolutionary genetics, 90% in forensic genetics, and 64% in medical genetics shared",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Anderson 2007,"Theme 1 - current state of data management","there continues to be widespread use of basic general-purpose applications for core data management",Theme 1 (current state) - data handling problems,"• 84% have experienced data management problems • 52% of them sought to solve their data handling problems • clear correlation between the size of a lab and the likelihood it had experienced problems",Theme 1 (current state)- currently use Laboratory Information Management System,"• 14% reported currently having a LIMS • Least likely to have a LIMS - developmental biologists (4%) • Most likely to use a LIMS - proteomicists (18%), pathologists (17%), and cell biologists (16%) • Large labs were most likely to be using a LIMS (22%) but interestingly were closely followed by the smallest labs (18%)",Theme 1 (current state)- storage,"Most researchers (59%) were already storing at least some of their images digitally while roughly a third (34%) partly relied on hard copy archiving",Theme 1 (current state)- data handling,"• 50% structural biologists and proteomicists and 48% of genomicists reported that at least 10 employee-hours per week are spent in data handling tasks • Over 50% of survey respondents reported spending more than 5 person-hours per week in data handling tasks. • 
Larger size labs were shown to spend more employee time each week on data handling.",Theme 1 (current state)- interviews outcomes,"Two core themes surrounding the current state of biomedical data management and analysis emerged: the widespread use of non-specialized applications, and the difficulty of organization, storage, and retrieval of data.",Theme 2 - Data management needs,"There is broad perceived need for additional support in managing and analyzing large datasets (from abstract)",Theme 2 (needs) - Data handling/management caused a backlog in productivity,"• Data handling/management caused backlog - 41% responded “Yes”, 22% “Somewhat.” • problem was greater in larger labs where 51% responded, “Yes.” • most highly reported by structural biologists (62%), followed by proteomicists (59%) and genomicists (50%)",Theme 2 (needs) - Most urgent computational or data handling problem,"• 28% of respondents reported issues involving long-term digital archiving of a variety of data types • 26% mentioned various forms of computerized data analysis (not involving a specific software product) • 21% access to a specific software product would solve most urgent problems • 6% said acquisition of hardware was a problem. • 18% access to some form of computer science or informatics expertise.",Theme 2 (needs) - interviews,"Two common anticipated needs: improved methods of managing large datasets, and improved ways to process and analyze data.",Theme 3 - barriers to data management,"The barriers to acquiring currently available tools are most commonly related to financial burdens on small labs and unmet expectations of institutional support (from abstract)",Theme 3 (barriers) - funding sources,"• The single largest consensus on appropriate funding sources for tools was from indirect costs in research grants (37%), commonly 50% or more of direct costs • 8% felt that tools and support should be funded as subscriptions directly from research grants, • 
5% stated subscriptions funded from other sources, • 38% thought that it would be appropriate to support this through all three of these categories",Theme 3 (barriers)- interviews,"• Lack of Time to Invest in Changing Data Management Practices and Improving Training • Limited Availability of Institutionally Provided Expertise and Systems",,,,,,,,,,,,,,,,,,,,,,,,, Antell 2014,Awareness of NSF Mandate (n=175),"94.9% yes, 5.1% no","Does your university provide support to help scientists develop data management plans? (n=165)","• 60.1% yes • 17.8% being planned • 6.7% no • 15.3% unsure","Which campus entities provide (or will provide) support for scientists developing data management plans? (n=153)","• 78.8% library • 38.4% research offices • 22.5% IT departments • 25.% unsure (multiple responses allowed)",Number of library employees who provide data management assistance (n=153),"• 11.3% 0 library employees • 41.1% 1-5 library employees • 15.9% 6 or more library employees • 31.8% unsure","Job include duties related to institutional repositories, data repositories, or data management? (n=162)","• 40% yes • 16.7% being planned • 43.8% no • comments (n=49) - (39%) said their job included “liaise, consult, or refer”, (16%) “just starting” to perform data management tasks, (14%) their work included “promoting, publicizing, or advocating for” the library's data management services or repositories ","Describe [their] job duties related to institutional repositories, data repositories, and data management (n=82)","• Comments (n=152) - (25.0%) indicated that their job duties include “liaising, consulting, or referring.”, (15.8%) “help researchers develop data management plans,”, (15.1%) “promote, publicize, or advocate” for the library's data management services, (9.2%) “just starting” to work in this area, (9.2%) working to educate themselves about data management ","What skills do you think science librarians need in order to help scientists with data management? (n=136)","• 
Comments (n=333) - “knowledge of the data lifecycle,” (17.1%), “subject-specific knowledge or skills” (13.8%), “communication, networking, and reference skills” (13.2%), “metadata skills” (10.8%), “software or computer skills” (9.9%), “Knowledge of the research process” (6.3%)","Do You Think You Have the Skills Needed to Help Scientists with Data Management? (n=155)","• 23.2% yes • 31.6% actively acquiring • 31% no • 14.2% unsure",Please Describe the Data Management Skills that You Have or Are Acquiring (n=75),"• Comments (n=214) - 80% skills already have, 20% currently acquiring skills - “knowledge of the data lifecycle,” (22.4%) “subject-specific knowledge or skills” (12.1%), “willingness to undertake continuing education” (also 12.1%), “communication, networking, and reference skills” (9.8%), and “experience working with IRs or DRs” (also 9.8%), “metadata skills” (7.9%), “knowledge of the research process” (5.1%), “software or computer skills” (4.2%), and “experience helping researchers develop data management plans” (1.9%). ",Data repositories were much less likely than institutional repositories to be housed in the library,90% for IR vs 36% for DR (note the unsure category was 2% for IR vs 54% for DR so this could be explained by the fact that the DR is not yet developed in many institutions),"Many science librarians do work with IRs, DRs or data management","40% said yes their job includes duties related to those areas, with a further 17% planning to get involved in these areas. 
By far the most common type of involvement was ""Liaise, consult, refer""",Science librarians overall were not confident they had the needed skills for this role.,"23% said yes, they had the skills, 32% said they were acquiring the skills, 14% said they did not have the skills, and 31% were not sure",Knowledge of the data life cycle was seen as the most needed skill,17% of respondents said it was a needed skill (2nd place was subject-specific knowledge at 14%) and 22% said they have or are acquiring knowledge in this area (2nd place was subject-specific knowledge at 12%),,,,,,,,,,,,,,,,,,,,,,,,, Arguillas 2015,Summary,"Case study describes how CISER used “Data ReCap”, a capability model for data repositories. Data ReCap, introduced in the DCC Guide, is intended to help research data management professionals highlight organisational (non-functional) or technical (functional) gaps in their data repository or catalogue capabilities. CISER staff used Data ReCap to support their planning of service improvements. CISER is currently reviewing its technical infrastructure, to map and translate existing tasks and functions to the OAIS reference model. Applying the Data ReCap model identified priority areas for this review.",Challenges were identified,Upgrading an established data archive can be challenging in terms of staff resources; meeting international standards can be demanding and archives may need to prioritize which standards to comply with.,DCC Data ReCap model has been very helpful,"In identifying gaps between capabilities met and unmet, to inform development roadmap",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Averkamp 2012,Data Management Plans,"Nearly one quarter of respondents report writing data management plans as part of their funding requirements • 186/551 (34%) respondents reported they require a DMP as part of a grant proposal",Data Management Needs,"• 
Of those respondents reporting a required DMP 76/273 (27.8%) report not receiving assistance with data management and that they would like assistance • 53/273 (19.4%) respondents said they do not need help with DMPs • Among those who receive assistance, most find help within their departments or colleges 109/273",Data Storage,Availability of storage is a concern for many researchers,Repositories,"34 respondents (out of 784) answered that they store or share data in a disciplinary or institutional repository, though only 18 listed a specific repository.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Aydinoglu 2014,data management,"83% “well-maintained data helps retain data integrity” 82% “reanalysis of data helps verify results data” 78% “data sharing reduces redundant data.” more than 75% ""data sharing encourages collaborative science"" more than 75% ""data sharing reduces redundant data"" more than 70% ""data availability provides safeguards against misconduct, data fabrication and falsification"" 70% ""replication studies help in the training of next generation of researchers"" ",Types of data (top 3),"• 121 - Experimental data (with some manipulation) is the most frequently used type • 99 - observational data (no manipulation involved) and • 74 data models ",Data formats used to store data,"• 128 use spreadsheets • 104 use txt files (ASCII text file: flat, rectangular, hierarchical, comma- or tab-delimited) • 82 use freetext, and • 70 use comma-separated values • 60 use Statistical data package formats • jpg/jpeg are the most common image format",Data back up frequency,"• 19% back up their research data immediately, • 23% daily, and • 20% weekly",Data back up medium,"• First choice is portable backup (external hard drives, 151; thumb drives, 4). • Internal hard drives (68) and cloud (67) are the second choice. • CD/DVDs (41) and magnetic tapes (6) are still around (n = 194). 
Only a small fraction of the participants use institutional repositories (12).",Where they store their research data,"• 148 of them responded “on a personal computer connected to the network”; • 82 “on a computer operated by my school, company, or organization”; • and 81 “on a computer operated by my research team” (n = 194). • Only a limited number of them were publicly accessible, as follows: 34 “on a public website” and 27 “on an open access repository.”",Who do you share data with,"• 71% share their data (n=177); of those • (79%) share it with other members of their research team (n = 178) • 90% “other researchers at my institution,” • 89% “other researchers in my discipline,” and • 88% “scientific community at large” • 25.8% funding agency",Data repositories used,"62 different repositories - most common disciplines represented in these repositories are astrophysics, genetics, exoplanets, and earth sciences research","Why they do not make data available to others","• “People don't need them [research data]” (more than 50) (n = 194). • lack of time (43), • funding (35), and space (32). • “don't have technical skills & knowledge” (22).",Satisfaction with their process/tools through the data life cycle,"• Process for collecting their research data (“agree” 79%, n = 180) and • process for searching their own data (“agree” 73%, n = 179). • process for cataloging/describing research data (“agree” 68%, n = 179) • process for storing data during the life of the project (“agree” 70%, n = 178). • process for storing data beyond the life of the project (“agree” 45%, n = 179), which indicates a data preservation issue. • 19% of the participants are satisfied with the tools they are using for preparing metadata (n = 171) • 43% are satisfied with the tools for preparing their documentation (n = 176). ",Institutional support for data management,"• 36% of the participants' 
organization or project has a formal established process for managing data (n = 181), and • 39% provide the necessary tools and technical support for data management (n = 180). • 25% are provided the necessary funds to support data management (n = 180), • 16% received training on best practices for data management by their organizations and projects (n = 180).",Conditions that encourage sharing,"• Formal acknowledgement of the data providers and/or funding agencies in all disseminated work making use of the data (90.6%); • preliminary data should be labeled as such so that people know when data are not completely vetted (90%); • formal citation of the data providers (87.60%); • the opportunity to collaborate on the project (86.40%); • coauthorship on publications resulting from use of the data (80.50%); and • availability to the team but not outside until publication (79%).",,,,,,,,,,,,,,,,,,,,,,,,,,, Bamkin 2014,General consensus that data should be shared [Focus Group],"• Aids in collaboration with other people • As part of an open ethos or for a researcher to gain the endorsement of their peers.",Data needs to be refined before it can be shared [Focus group],"• Raw data should be refined to eliminate error, to be understandable and anonymised • The context of the data should be taken into consideration • The form of the data • Currency of data","Key drivers to increase access to research data [Cross Sectional Survey] ","• Openness • accountability • increased access (?) • increased efficiency of research resources",Concerns about sharing data [Survey],"1) Attribution of intellectual property right to the data being shared 2) institutional and establishment models and mindset which create barriers to sharing data",Data sharing policies [Survey],"• Most respondents believe that journals should provide data sharing policies (74%) ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Bardyn 2012,How do translational researchers work with data? 
What data workflows are used for translational projects,"• Dedicated IT infrastructure and staff • established routine control procedures for handling data quality ","Where are data stored? ","Third-party commercial vendors to store or process data, or proprietary software to collect and/or analyze data",What data curation techniques are used? What software is used for data curation?,"• Third-party commercial vendors to store or process data, or proprietary software to collect and/or analyze data ",What are the most difficult aspects of working with data?,"Barriers (1) a perceived lack of adequate funding for good data management by federal funding granting agencies (e.g., the NIH); (2) a low perceived ability to access and extract data out of the current EMR; (3) a high perceived ignorance of Web-based open source data sets and how to access Web-based data sources; and (4) the complexity of skills required to adequately analyze and synthesize clinical research data. Challenges • Data relevancy - extent to which data are applicable and helpful • timeliness - extent to which the data are sufficiently up-to-date • completeness - extent to which data are not missing and are of sufficient breadth and depth • accuracy - extent to which data are correct and free of error ",Do translational researchers share data? Do they deposit data into institutional,"• Concerns of balancing access to data with data security, and data practice and policy issues • None of the focus group participants were aware that the university had an institutional repository","What resources do translational researchers wish were available at the university? ","• Repository management, • training in searching databases and open sources for data on the Web, • metadata description and discovery, • data curation assistance and services",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Belter 2014,Implications for data curation,"• 
A data set that is quality controlled, adjusted, and merged with other similar data sets can be used to create a more comprehensive, overarching data product • most articles do not refer to these data sets in a section indexed by Web of Science, calling into question the appropriateness of citation-indexing databases for compiling citation counts for these data sets • there are wide disparities in the methods used to cite or refer to these data sets; although the development of a standard citation format is necessary, that format by itself is not sufficient to guarantee consistent citation of data sets • consistent use of a DOI to refer to a data set would enable a researcher to search full-text or citation-indexing databases for that DOI to retrieve a reasonably accurate set of articles citing that data set",Freely available oceanographic datasets are highly cited,2 of the 3 datasets have higher citation counts than 99% of all articles in oceanography in Web of Science from any single publication year from 1995 to the present,Majority of dataset references occur within the full text rather than in the reference list,,"Methods of citation vary widely, even though data producer suggests a citation format for the datasets",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Berlinicke 2012,A system was developed that can reduce barriers to laboratory data management,,"""Implementation research"" approach may lead to greater productivity in drug discovery research laboratories",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Beskow 2008,Duration of Storage,"• Almost three quarters of interviewees thought their blood would be kept as long as it was needed, or forever • The remainder generally knew that the consent form made reference to indefinite storage, but had doubts about how long their blood would actually last",Medical Record Access,"• Almost all interviewees thought that personal medical information would be collected from their medical record • 
More than half of interviewees said that providing access to their medical record would not affect their decision about participating • those who expressed discomfort about medical record access were asked, “Are there any ways the Biorepository's procedures could be revised to make you more comfortable?” Among 10 interviewees asked this question, 6 said they wanted to be asked permission each time. One wanted an “iron-clad guarantee” of confidentiality, one wanted to know exactly what was being studied, one wanted to designate certain parts of her medical record “off limits,” and one said there were no circumstances under which she would agree to ongoing access.",Access to Research Results,"• Nearly two thirds of interviewees were comfortable with the statement, “You should not expect to get individual results from research done with your blood.” • The remaining third of interviewees expressed concerns about not getting individual results, citing a desire for reciprocity for having rendered assistance, and a desire to be contacted if the findings had clinical relevance",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Beskow 2010,IRB representatives thought more of the form was important than prospective study participants,"IRB reps highlighted 72% of the sentences, researchers 54% and participants 40%.",Rankings: Topics of sentences most often highlighted,"• Participants - individual research, privacy risks, data sharing • researchers - purpose of the biorepository, privacy protections, costs, and participant access to individual results • 
IRB representatives - collection of basic personal information, medical record access, and duration of storage","Proportions: Agreement and disagreement in the sentences most often highlighted","Large-scale data sharing, privacy risks, the privacy protection afforded by not placing research data in medical records, and the narrow circumstances in which individual research results would be offered to biorepository participants.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Bigagli 2014,Lit review outcome,"• Infrastructure and technology challenges are not considered the most important obstacles to Open Access to research data, compared to financial, legal, and policy challenges. This finding is furthermore echoed in our online survey results and in our case study findings, which are presented in chapters 4 and 5 respectively. • five key issues of data: heterogeneity and interoperability, accessibility and discoverability, preservation and curation, quality and assessability, and security ",Survey - general results,"• Out of 45 responses, 62% came from stakeholders self-identifying as Disseminators/Curators, and 29% as Producers • from natural and computer sciences, 7% respondents identified themselves as End users, 2% funder • Heterogeneity and interoperability emerge as important issues overall • The need for data documentation and quality assessment emerge again as key issues • technological issues seem to be rather low on the priorities of those who replied to the survey",Survey - preservation,"• 
Do you think research data should all be preserved indefinitely, in principle - yes 62%, no 38% • If not who should decide what to preserve and until when - data producers 29%, librarians/repo managers 18%, funders 4%, end users 21%, disciplinary assoc/peer review 29% • who should be primarily responsible for storing European research data and making them accessible - data producers 12%, digital libraries/institutions 25%, national institutionalized repos 25%, centralized EU registry 20%, publishers 3%, funders 7%, other 8%",survey - barriers to implementation,"• With the increase of openly accessible data, what factors do you think will have the greatest impact - heterogeneity of data formats 14%, application interoperability 11%, data access/download 9%, catalogues and search engines 7%, storage capacity on the client side 1%, obsolescence of data formats 4%, data preservation 13%, bandwidth 1%, data completeness 3%, data documentation (metadata) 22%, data quality 13%, privacy and security issues 6%, other 1%",Survey - quality of data,"• Who should evaluate the quality of research data - data producers 7%, librarians/repo managers 11%, end users 18%, disciplinary assoc/peer review 38%, nobody 2%, other 24%",Survey - specifically for disseminator/curator,"• Do you have direct experience of implementing open access to research data - yes 46%, no 54% • do you take user feedback into account for improving the quality of your data - yes 54%, no 46% • do you offer tools in order to associate research data to scientific publications - yes 46%, no 54% • what kind of additional information would you consider relevant to complement scientific publications - raw data 28%, data documentation (metadata) 37%, supplemental information 24%, other 11%",Survey - specifically for producers,"• 
Does your institution have a data management plan - yes 31%, no 46%, don't know 23% • do you have direct experience of releasing your data according to an OA policy - yes 46%, no 54%",Case study - particle physics,"• Key technological issues in implementing OA to physics data - bandwidth; technological barriers to access, analysis and use • “what is missing is linked access, discovery mechanisms and suitable provenance tracking systems.” ",Case study - health and clinical science,• Key technological issues in implementing OA to health and clinical data - heterogeneity of data; different disciplines have different views/ preservation/curation of data; security and privacy; more proactive collaboration of end users and producers needed for health information systems design,Case study - bioengineering,• Key technological issues in implementing OA to bioengineering data - Outputs reproducibility; development of ontologies; lack of technical expertise of Disseminator/Curators doing storage and maintenance,Case study - environmental science,"• Key technological issues in implementing OA to environmental data - legal aspects and institutional arrangements are prominent over technical ones; standard metadata; cost of data preservation means community needs to decide what to keep; user feedback for development of datasets",Case study - archaeology,• Key technological issues in implementing OA to archaeology data - heterogeneity of data formats; access control for sensitive data; distributed responsibility for storage and preservation,Interviews - funder perspective,"• 
Technological barriers are not reported as of high priority or concern for implementing Open Access to research data, whereas financial, cultural and legal challenges are higher on the list of concerns",,,,,,,,,,,,,,,,,,,,,,,,, Bishoff 2015,College and Department Analysis,"Of 450 possible DMPs, 182 (41%) were analyzed • science and engineering 80.22% • biological sciences 8.24% • liberal arts 4.4% • food, agricultural and natural resource sciences 3.85% • academic health center 1.65% • education and human development 1.1% • public affairs 0.55% ",Data sharing methods,"• Data sharing mentioned in 96% of DMPs • categories (n=145): publications (specific) (5), publications (non-specific) (129), conference presentations (26), theses or dissertations (8), personal websites (46), other websites (38), disciplinary repositories (70), local institutional repositories (IRs) (14), and sharing by request (79)",Access Levels Based on Sharing Method,"• 40% of all 415 references to data sharing would potentially make the underlying data publicly accessible • public/unrestricted - website (50%), public repository (local 8%, disciplinary 42%) >60% of all data sharing strategies (n=247) would potentially make data inaccessible for certain audiences • public/available by request - available on request (32%) • researchers in any field - publication (54%), thesis (3%) • researchers in PI's field - conference (11%) ",Audience for Reuse,"• 131 plans named at least one intended audience for a total of 202 mentions • 58.4% (118) fall into the public/unrestricted category while 41.6% (84) fall into a more specific audience category: public/restricted (118), researchers in the field (44), public/available by request (16), research in any field (16), students (4), project team (4)",Sharing Timeline and Retention Period,"• 
Less than half of the PIs include a timeline for sharing (43.4%, 79) and even fewer (29.7%, 54) specify a period of data retention • distribution of timelines mentioned for data sharing (n=91): after publication (48), asap (10), after data is generated (10), after project is complete (9), other (5), after quality control (4), before publication (3), upon approval by external entity (2) • PIs in the sample most commonly plan to keep their data for either three years or indefinitely (n=59) • permanently (18), 3 years (18), 5 years (7), 10 years (5), references to NSF (3), >20 years (2), 6 months (2), 1-2 years (2), 6 years (1), 7 years (1)",Private Data,"• 18% (33) of the DMPs included one or more mentions of private or sensitive data • categories of private or sensitive data (n=39): personally identifiable information (22), proprietary data (6), FERPA protected data (4), other (non-human sensitive) (4), HIPAA (1), location data for protected species (1), culturally sensitive material (1)",Data ownership and IP,"• Data Ownership and/or Intellectual Property Rights Mentioned (n=182) • yes (29%), no (66%), NA (5%)",Long-Term Sharing and Archiving,"• Data archiving plans were included in 80% (145) of the DMPs and ranged from well-defined digital archives to more ad hoc, individual techniques (n=201) • remain in storage location post-project (58), individual techniques (46), deposit to national archive (40), deposit to local archive (24), deposit to institutional archive (21), data published in ""archival journal"" (11), deposit to institution archive (non-UMN) (1)",University Services,"• 
University services, including data services available from the Libraries, were mentioned in 36% (65) of plans • The UDC, our campus IR, was mentioned in 11.5% (21) of the plans, • University Libraries was mentioned in 2.2% (4) of the plans • A variety of other campus services for data analysis, storage, and training were mentioned as well, including the local high-performance computing center (n=17), local file storage providers (n=3), and secure data training offered by the Office of Information Technology (n=1)",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Blumenthal 1996,Industry support and refusal to share research,"Faculty members with industry support were significantly more likely than those without such support (11.1 percent vs. 5.8 percent, p=0.008) to report that they had refused requests from other academic scientists to share research results or biomaterials ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Blumenthal 1997,"Reported refusing to share research results or materials with other university scientists in the last 3 years","• 8.9%, of those • 46% to protect their scientific lead, • 27% limited supply or high costs of the materials requested; • 18% a previous informal agreement with a company; • 6% to protect the financial interest of the university; • 4% because of a formal agreement with a company; and • 2% to protect their own financial interests in the results of their research.",Denied access to research results and products from other university scientists,"• 34% • 
Only 4% of those who had never been denied results or materials reported refusing a similar request themselves, compared with 18% of those who had requested and been denied research results of others (P<.001).",Refusal to share,No significant associations between refusal to share and sex or academic rank,Denied access and Academic-industry research relationship,"Faculty with AIRRs were also significantly more likely than faculty without AIRRs to report having denied other university faculty access to research results or biomaterials (11% vs 8%, P<.01).",Sharing and academic-industry research relationship,"14% of respondents receiving less than one third of their research support from industry, 13% with between one third and two thirds of their budget from industry, and 4% with more than two thirds of their funding from industry reported denying others access to research results or materials (P<.05).",Refusing to share and productivity,Faculty who reported refusing to share research results also reported higher rates of publication than those who did not.,Denied access and commercialization of research,"Faculty engaging in commercial activities were also more likely than colleagues without such activities to have denied other scientists access to research results or biomaterials (13% vs 5%, P<.001).",Data withholding is not widespread,20% reported delay of publication of more than 6 months; 9% reported refusing to share results or materials with other university scientists.,Involvement with commercial interests is significantly associated with the tendency to withhold results,"Among faculty who engage in commercialization, 31% reported publication delays longer than 6 months, compared with 11% for those who were not engaged in commercial activities. 13% involved in commercialization had denied other scientists access to results or biomaterials vs 5% who were not involved in commercialization. 
The most commonly reported reason was the need to allow time for filing patent applications.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Blumenthal 1997b,Withholding data,56 vs 40 (genetics researchers with academic-industry research relationships v. non-genetics researchers with academic-industry research relationships) reported keeping research results secret beyond the time required to file a patent application,Denial of request for data (sharing),Genetics researchers with academic-industry research relationships twice as likely as non-genetics researchers with academic-industry research relationships to report they had denied such requests (18 v 9; p=0.002),,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Blumenthal 2006,Sharing discouraged in training,"40% yes, 60% no","Formal instruction on data sharing ","11% yes, 89% no",Sharing outcome,"17% negative outcome only, 37% mixed outcome, 47% positive outcome only",Geneticists more likely to withhold data than non-geneticists,44% of geneticists vs 32% of non-geneticists reported participating in any one of 13 forms of data withholding in the prior 3 years,Several factors significantly affected the likelihood of withholding in both genetics and non-genetics researchers,"The significant factors were male gender, participation in relationships with industry, mentors' discouraging data sharing, receipt of formal instruction in data sharing, and negative past experience with sharing.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Bohemier 2011 (+ Companion),Data policy availability,"• Uneven across institutions • mean number of policies per institution was 3.8, although most institutions had only one data policy • some had more than ten policies • only two had no retrievable data policy ",Disciplines,"• Most policies applied to the disciplinary- or departmental level, although several universities had policies that applied to the entire institution. • 
social sciences and earth sciences provided the most policies",Policies are implemented unevenly across institutions,"Only 15% of all policies applied to the institution as a whole, most applied only to specific disciplines, collections, or projects.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Borgman 2012 (+ Companion),"What are the “data” in science and technology research collaborations ","• Data is a complex notion, and one that is not well understood even by the parties creating and using them. • What one team or individual considers to be data may not be recognized as such by another. • Concepts of data vary considerably by research activity and by individual.","Role of data within and between science and technology collaborations","• Science and technology researchers depend upon each other's data for interpretation of their own data • What are data to one researcher are context to another ",Data curation,• Data curation practices are not consonant with the interdependence of the teams on each other's data.,Data re-use,"Interdependent field research of the type described here tends not to produce data that are easily reusable, particularly for outside researchers. Neither the scientific nor the technological research data can be interpreted without the other, yet these datasets are quickly separated, never to be reconciled again. As a result, none of these data remain useful beyond the teams that generated them. Even those teams may not be able to reuse them, given the difficulty of obtaining the context data necessary for interpretation, which often are held by others",Four main data types were identified,Sensor-collected performance data; sensor-collected proprioceptive data; sensor-collected scientific data; hand-collected scientific data,"For each discipline (science vs technologists), what is data to one is context to the other",Sensor-collected and hand-collected scientific data are data to the science teams and context to the technologists. 
The performance and proprioceptive data are data to the technology teams and context to the science teams,"Science teams were more likely to support data preservation and sharing, whereas technology teams rarely return to the data and don't see their purpose in the future",Based on observation,Different views of data resulted in some tensions when the two groups collaborate,Based on observation,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Borgman 2015,Scale of research,"• The larger projects made more explicit plans for data release and invested more heavily in digital libraries systems. Datasets, as sustained in SDSS digital library systems, are their primary scientific legacy. Data resources are similarly central to the research goals of LSST. These large sites have to negotiate with multiple stakeholders to make their data accessible. In contrast, investments in digital libraries for C-DEBI began in earnest several years into the research project and have yet to be fully implemented. CENS, which was established well before funding agency requirements for data management plans were promulgated, made minimal investments in digital library systems or services. CENS researchers were willing to share their data, but had few mechanisms, incentives, or resources to do so. Negotiations about data access were more often internal to CENS or between CENS researchers and external parties who requested access to their data",Scientific goals,"• Role of digital library may depend not only on the scale of data for a research project, but also on its scientific goals. • The astronomy projects built digital library services into their research goals to ensure that the datasets are a legacy product. Astronomy data are reused for many years after they are collected. Much effort is devoted to the design of systems and the curation of data. • 
In contrast, smaller science research conducted at CENS and C-DEBI is more concerned with immediate scientific breakthroughs than with the data that lead to those findings. The data are a means to an end, which is the publication of findings in scientific papers.",Temporal scale,"? In the smaller scale projects comprising CENS and C-DEBI, data management tools are selected, designed, and used by the same individuals. Technologies can be readily adapted to the problem at hand ? Conversely, in the multi-decade time scale of developing larger research instruments and facilities in astronomy, data management technologies, policies, and practices are designed for anticipated future uses and users. Those developing the digital libraries may be different individuals, with different expertise, than those who curate the data.",Openness and scale,"Types and degrees of openness vary along these dimensions of scale, interacting with many other factors. Data release is central to the scientific goals of SDSS and LSST. Data repositories are part of the initial project goals and design, which leads to standardized methods of data collection and management. However, decisions about what can be released, when, and to whom vary between SDSS and LSST, and between observational data and software code. Other stakeholders, including funding agencies, may be the final arbiters of openness. The two smaller science projects also vary on types and degrees of openness. CENS aspired to releasing more of their data, but had great difficulty finding the means to do so. C-DEBI has similar aspirations and is developing the means to manage, share, and reuse their data. In all four of these sites, releasing software code appears easier to accomplish than is releasing research data. Again, these factors vary considerably by local circumstance.",Openness and timeframe,"The time frames of these projects also influence their types and degrees of openness. 
For example, SDSS and LSST investigators may have proprietary access to the data prior to their public release. LSST is proposing a multipronged approach to data release, which includes making unprocessed data available immediately. CENS launched long before the current pressures from funding agencies to release data at the time of publishing journal articles. Two research domains within CENS, seismology and genomics, had discipline-specific data release requirements, and these obligations were met. C-DEBI launched prior to current NSF data management plan requirements. They implemented these requirements retroactively and are incorporating data management practices in the mid-stages of project life cycles",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Bracke 2011,Data curation librarian role,"Past experience demonstrated that there were departments in the College of Agriculture, especially Agronomy, which saw potential in working with the Libraries to archive and make their data accessible. The agricultural librarian identified the most appropriate faculty member to work with for the pilot data repository, through both personal interactions and the application of the DCP tool for identifying data needs. During the DCP interview, one faculty member expressed frustration with his current data management practices, including storing, organizing, and sharing data, as well as frustration at the lack of data curatorial explorations in his discipline. He also was willing to be part of the research and learning process as the task force developed the prototype. His data collection, ultimately selected for the repository, documented the influence of phosphorus and potassium on alfalfa growth. In addition to metadata, the librarian and the agronomist discussed copyright issues. The agronomist asked that we include links to, or ideally the full-text of, articles and presentations resulting from the data set. 
This opened a conversation about the researcher's need to proactively secure the most rights possible prior to publication, so that the full-text can be legally included in the repository. However, he retains full rights to the primary data; others reusing the data attribute it to the original researcher. The librarian sent the data and all metadata files to the graduate student for ingest into the repository. ",Skills and approaches,"? RELATIONSHIP-BUILDING AND OPPORTUNISTIC THINKING - ? WORK UPSTREAM - meet information needs at earlier stages of research ? TEAMWORK AND EXPERTISE ? EXPERIMENTATION, NOT PERFECTION - no right way to engage in data curation and there are no set answers to data management issues ? LOOK BEYOND THE LIBRARY - be aware of information work being done within the disciplines",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Bradbury 2010,Data management plans applicable to research,"? 5.9% reporting data management plans as not applicable to their research ? 22.4% unaware of how to prepare a DMP",Strongest skills in managing storage needs and backing up data,"? 87.5% reported skill level in backing up digital data as basic or greater ? 78.3% reported skill level in managing digital data storage needs as basic or greater",Assigning descriptors or metadata to datasets,"? 10.6% not applicable to their research ? 28.3% unaware of the practice ? 25.2% aware but have no experience in the practice",Data retention,? 20.5% aware of the need to retain their digital research data according to legislative and funding body requirements but have no experience in doing so.,Data analysis,"Spreadsheets ? 47.2% high rates of competency in managing spreadsheets ? 12.2% requests for training (highest number of training requests) desire for more advanced skills (focus group) MS Access ? 85% rated skills basic or higher ? desire for more advanced skills (focus group) ? 
rely on their own informal networks of expert users and need more distinct support for these products",Sharing,"? 39.4% had made data available to other researchers they had ? 17.3% had difficulty sharing their data (related to size 61.4%, software 50%)",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Bradic-Martinovic 2014,Principal activity of researchers,"? BH - economics (12%), sociology (13%), psychology (8%), education/teacher training (8%), poli sci (7%), journalism (7%), law (21%), other (24%) ? Croatia - economics (16%), sociology (19%), psychology (28%), education/teacher training (14%), poli sci (4%), journalism (2%), law (1%), other (16%) ? Serbia - economics (30%), sociology (9%), psychology (9%), education/teacher training (11%), poli sci (6%), journalism (3%), law (10%), other (22%)",Producing data,"How many datasets were produced during last 5 years? ? In each country over 50% of researchers confirmed that they produced five or more datasets during the past 5 years ? In all countries the largest numbers of researchers have produced between 6 and 10 datasets (BH 19%, Croatia and Serbia 24%) ? in BH 13% of researchers produced 11-20 datasets and 12% produced 21 and more. In Croatia 19% of researchers produced 5 datasets, while only 8% produced 10-20. In Serbia 15% of researchers produced 5 datasets ? Within last five years researchers completed 45% of all datasets in 2012, 36% in 2011 and the rest in 2010 and earlier (growth trend)",Methods of data gathering,"? In BH most researchers used either questionnaires (32%) or interviews (47%) in data collection in the last five years. ? In Croatia, the dominant method was surveys (53%) and quantitative (70%) or qualitative (33%) questionnaires (focus groups and interviews). 
Questionnaire (49%) was the dominant method of data collection in Serbia.",Funding,"In Croatia and Serbia approximately 40% of all research was financed by public funding through national science funding bodies, while in BH most of the research (40%) was financed through international funds/projects while only 7% had the support of national funding. International funding is also provided in the other two countries, but with a much smaller share compared to BH. In Croatia, 17% of research was funded that way, and in Serbia 26%. The rest of the projects had been funded by institutions which conducted projects, publicly funded from other sources, private sector, and other",Archiving practices and preferences - type of stored data,"? Most of the researchers keep the data in the raw form, data prepared for analysis (with transformations, created index, and recorded), or as cleaned data ?very little is well documented with metadata",Archiving practices and preferences - location,"? The dominant number of researchers keep the data in their own computers (in average over 50%) or several copies in different computers (in average over 40%), and only a few of them keep the data in some form of institutional repository (approximately 3%).",Archiving practices and preferences - access,? In all three countries current access to the data is dominantly limited to the research team; however most researchers think that data should be publicly available (open access) or at least available to the broader scientific community.,Archiving practices and preferences - archiving,"? In all three countries want to provide research data to archive if the data would be safe with regulated access because 45% of them (in average) answered with Yes, certainly and 40% answered with Yes, probably.",Use of data and secondary analysis - importance of sharing,"? 
More than half of the respondents (BH 75%, Croatia 51%, Serbia 64%), have stated that the sharing of research data is very important in their discipline and only 2% (on average) find it not very important ? Researchers in Bosnia & Herzegovina were more likely to share data publicly and also to think data sharing was an ideal ? 20% are currently sharing data publicly, vs 2-5% in the other two countries. 44% in B&H thought publicly available data was the ideal situation, compared to 30-35% in the other two countries.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Brandt 2014,The Toolkit questions required little modification for arts and humanities topic,"Where something seemed inapplicable, a relevant analogy was easily made - examples in text narrative","Types of ""data"" produced in Dance research were identified","These included: ephemera, concert dance documentation, collaborative . These items were digitized or in born-digital format such as photographs, graphics files and video.",Technical tools used in Dance were identified,"Tools used are common in digital humanities: Omeka, VenteCo Timeline JS, CSS/HTML/Javascript, Google docs, Youtube",No established metadata standard exists for dance and the performing arts,For Purdue projects they developed a schema constructed using elements of Dublin Core and VRA Core,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Bresnahan 2013,Training is needed for liaison librarians,"Top priorities were ""data lifecycles"" and ""data analysis"" followed closely by ""data sharing"", ""metadata"", ""data preservation"", ""data management plans"", and ""data citation""",Relevance of topics to job," ? All aspects of data are relevant to the job ? relevance scores increased for relevance in 5 years",Frequency of interaction with researchers,"Low for data-related topics - Data Lifecycles, Data Analysis, Data Management Plans, Data Sharing, Metadata, Data Preservation ",Preferred training formats,"? One-day workshops (74 percent), ? 
panels/presentations (68 percent), ? print handouts/guides (63 percent), and ? informal discussions (63 percent). ? online tutorials (47 percent), ? one-on-one consultations (42 percent), ? webinars (32 percent), and ? multi-day workshops (26 percent).",Common concerns ,"? Need for practical, hands-on training ? concerns about how best to provide outreach to researchers ? worried that the disciplinary differences among researchers would prove challenging",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Brewerton 2015,Summary,"? The RDM platform was delivered on-time, within budget and has exceeded the expectations of both the University and its research funders ? It is emerging as a showcase of Loughborough research and means that the University is in a fantastic position to take advantage of funding opportunities and hopefully attract future collaborators ? The project is a great example of what public/private partnerships can achieve and this platform is one that other institutions could readily adopt.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Briney 2015,University type and data services,"? Half (50%) of all the universities studied offered some data services through their library, with larger, more research-focused universities being more likely to offer data services than their counterparts. ? universities with a higher Carnegie classification (χ2 = 63.13, p < 0.01), ARL membership (χ2 = 28.03, p < 0.01), higher research expenditures (χ2 = 58.91, p < 0.01), and larger faculty size (χ2 = 23.65, p < 0.01) are all more likely to offer data services through their university libraries. ? Notably, all universities with over $1 billion per year in research expenditures offer data services and a place to host data. The vast majority (89%) of these institutions also have a data librarian on staff. Additionally, this group shows the highest chance of having a data repository (33%).",University type and data librarian,"? 
Fewer universities have a data librarian on staff (37%) than offer data services. ? the tendency is for larger universities conducting more research to employ a data librarian. This is true for universities with a higher Carnegie classification (χ2 = 50.45, p < 0.01), ARL membership (χ2 = 30.01, p < 0.01), higher research expenditures (χ2 = 55.23, p < 0.01), and larger faculty size (χ2 = 31.54, p < 0.01). ? there is no significant difference between public and private universities (χ2 = 0.06, p < 0.80).",University type and repository,"? More than half of all libraries (65%) offer a place to host research data, either in an IR or in a repository specifically for data. ? However, fewer universities (11%) have dedicated data repositories as compared with IRs that accept data (58%). ? larger universities conducting more research are more likely to have someplace to host research data, as measured by Carnegie Classification (χ2 = 8.98, p < 0.01), ARL membership (χ2 = 5.33, p = 0.02), research expenditures (χ2 = 19.36, p < 0.01), and number of faculty (χ2 = 8.88, p = 0.03).",University type and policy type,"? Out of 206 universities, only 90 had some type of university-level policy covering research data (44%). ?One-third of these university policies are IP policies that specifically include data (15% overall) while the remaining two-thirds are standalone data policies (29% overall). ?correlation exists across Carnegie Classification (χ2 = 6.15, p = 0.01) and ARL membership (χ2 = 5.38, p = 0.02), but not research expenditure (χ2 = 8.64, p = 0.12) or faculty size (χ2 = 2.14, p = 0.54). ?universities conducting more research are more likely to have a standalone data policy. This is significant for Carnegie Classification (χ2 = 9.37, p < 0.01), ARL membership (χ2 = 4.99, p = 0.03), and research expenditure (χ2 = 11.41, p = 0.04). The same trends do not exist for university types when data falls under the normal IP policy",Data services and policy type,"? 
Nearly half (44%) of all universities studied have some type of policy covering research data ?Half of all libraries with data services have some data policy, but this is not a significant difference from the average (χ2 = 3.40, p = 0.07). ?However, universities employing a data librarian are statistically more likely to have some type of data policy (χ2 = 7.38, p < 0.01). ?standalone data policies are more likely to be found at universities with data services (χ2 = 4.23, p = 0.04) and a data librarian (χ2 = 5.76, p = 0.02), but not those that host data in any repository (χ2 = 0.01, p = 0.93) or specifically a data repository (χ2 = 0.68, p = 0.41). ",Policy type and policy contents,"? For the 90 universities that had data policies: over half of the policies designated an owner of research data generated at the university (67%) and required that the data should be retained for some period of time (43% for a specific period and 9% for a vague period). ?IP policies primarily covered data ownership (76%) with little attention given to other data management issues. Standalone data policies, on the other hand, covered many topics ?Over half of the data policies defined data (61%), identified a data owner (62%), stated a specific retention time (62%), identified who can have access to the data (52%), and described what happens to the data when a researcher leaves the university (64%). Almost half of the policies (46%) also designate a data steward.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Brooking 2009,TIMTAM,"? TIMTAM documents precise experimental protocols via textual descriptions or by uploading external documents ? These protocols describe the exact laboratory procedures for each experiment type. ? The Web interface enables team leaders to monitor progress of experiments, anytime and anywhere ? Data input was streamlined and data quality was enhanced through the incorporation of additional services that assist with target selection and construct design ",DIMER,"? 
The Diffraction Image Experiment Repository (DIMER) is an online archive for raw diffraction images ? Provides a secure online indexed storage of the images, prior to analysis, structure determination and publication ? Makes them accessible so they can be linked to from publications, searched by researchers, and integrated into other online databases",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Broom 2009,Qualitative data,"Whereas quantitative research produced 'mere data' that could be abstracted from all context and generalized via statistical modelling, qualitative research involved reciprocity and interactivity that did not easily fit with the idea of the data being taken 'out of the researcher's hands'. The key distinguishing factor permeating each of the focus group discussions was qualitative research as 'art' and 'relationship' - a topic worthy of more in-depth discussion. Researchers viewed the data as their own production, rather than simply the outputs",Sharing,"? The idea that 'no-one else can understand my data' invariably permeated much talk around data sharing ? there was a strong theme in two focus groups around 'public knowledge' and the imperative of data as a community resource, not for the researcher's own professional or research interests",Preservation,An archive was seen to exclude the kinds of intuitive and interactional elements that are key to understanding the nature of the data.,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Bruin 2005,SRB (Storage Resource Broker),"? The SRB provides a single logical file structure and a single access point for users, even though data are distributed over several locations. ? The SRB has a central metadata catalogue (MCAT) server that maintains information about all files within the SRB. ? A user job's compute/data workflow follows a logical sequence: 1. The user places all input files into the SRB. 2. The compute job downloads the relevant files from the SRB to the eMinerals mini-grid clusters. 3. 
The compute job executes, generating several output files. 4. The job's workflow places the output files into the SRB. ? One advantage to this approach is that the job life-cycle process generates an archive of the entire process, which is maintained on the SRB ? This is particularly useful for collaborative work",Overall,Setting up and managing clusters within grid environments is feasible without large investments in support or user testing,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Budroni 2014,"University of Vienna has the organizational framework, policies, infrastructure (data repository), and support services necessary to support RDM","? The paper focuses on the many agents, strategies and techniques who have to work together to keep digital information accessible and usable over the long term and sheds light on three important actors in the whole process: data creators, repositories and downstream users ?What was the starting point? ?What kind of research data is targeted? ?What is the organisational framework? Policies in place? ?What kind of support services are provided to researchers? ?What kind of infrastructure is provided? ?What have you learned so far? What's next?",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Bull 2015,Empirical Research,"? 5 empirical studies found ? study 1 - 5 (24%) of respondents were in favor of sharing data via a databank; the four most common reasons for not sharing data were: inappropriate to conduct analyses not defined in the protocol, sharing on request for a specific purpose instead, other researchers may conduct inappropriate analyses, and further analysis is post hoc and data dredging. ? study 2 - 9 (60%) of authors receive specific requests and 12 (86%) receive general requests; 10 were prepared to release the data in principle subject to further discussions/conditions, 3 would not release data, 1 shared dataset, 1 willing to share dataset, 6 non-committal ? 
study 3 - 1 author asked for further details and then shared the dataset. 4 authors declined to share the data. When reminded of the journal policy, 1 said that a formal request to the research group was required, 2 said it was too much work, 1 said that he was not permitted to share the data and wouldn't have published in the journal if he'd known it was a requirement, 1 declined to share data because more analyses were proposed. ? study 4 - 236 (74%) of respondents supported sharing de-identified data via repositories, and 229 (72%) thought investigators should be required to share data on request. 56 (18%) were required to deposit trial data in a repository by funders and 149 (47%) had received an individual request to share data; Concerns about sharing data through repositories included: potentially misleading secondary analyses, ensuring appropriate data use, ensuring clarity of data elements, indirect costs associated with sharing, colleagues' abilities to publish original research, ability to publish own research, scientific or academic recognition, direct costs associated with sharing, protecting commercially sensitive information, obtaining consent, and maintaining confidentiality ? study 5 - 25 (83%) of respondents thought a central repository would be valuable, 25 (83%) would be willing to deposit data in such a repository provided conditions met; 5 most commonly suggested conditions that needed to be met for deposition were: approval from primary data source, appropriate acknowledgment of primary data source, involvement of original investigators in the process, reassurance about who would access the data, and the presence of a scientific committee to review data access requests. ",Best Practices in Data Sharing - Principles and Considerations ,"? Ensure sufficiently broad access to realize the benefits to scientific innovation and public health, which are the main justification for sharing. ? 
Ensure data are used responsibly so that poor quality analyses do not harm public health. ? Treatment of researchers qualified to access data must be evenhanded. ? Data sharing processes must be accountable and transparent. ? Equitable: The needs of researchers, secondary users, communities, and funders should be recognized and balanced. ? Ethical: The privacy of individuals and dignity of communities should be protected and public health promoted by productive data use. ? Efficient: Proportionate approaches should build on existing practice to improve the quality and value of research. ? Ensure fair trade and not free trade in data. ? Ensure the rights and responsibilities of researchers generating data and data accessors are balanced. ? Ensure the benefits of data sharing outweigh the harms, and consider whether restricting the flow of information to avoid rare adverse events is appropriate. ? Clearly specify public interests in data sharing and clearly specify any legitimate reasons to restrict access to research data (following market approval of an intervention). ? Ensure that the analytic value of the data is preserved during the protection of privacy and confidentiality. ? Ensure data sharing processes are responsive to the context within which datasets were collected. ? Honor the altruism of research participants. ",Governed Data Sharing - Potential Benefits,"? Adequate safeguards can be established, bona fide access restrictions can be put in place. ? Patient privacy is increased. ? Poor quality research, which may lead to erroneous conclusions, can be prevented following review and requirements to adhere to a rigorous analytical plan. ? Permits compliance with legislation and or regulation. ? Promotes adherence to commitments made during the consent process. ? Enables researchers to fulfill responsibilities to ensure data are used ethically. ? Curation can be responsive to the types of data being shared. ? 
Differing approaches can be taken to aggregate and individual-level data, particularly valuable or sensitive datasets, and analyses that require detailed data that could potentially identify participants. ",Priority Areas for Policy Development.,"? Appropriate analytic methods, data and meta-data standards, including means of preserving privacy ? Determining where, how, when, and which data are archived and made available ? Determining for which trials data will be shared, which data and supporting documents will be available, the process for data sharing, how transparent the process will be, who will get access, what types of analyses are permitted, who will decide, what criteria will be used, and what ongoing role the trial sponsor might have. ? Methods to permit evaluation of individual applications, including to ensure that the use does not harm participants and is in conformity with ethical approvals ? Transparent, explicit, and reasonable criteria for case by case decision making ? Requirements and rewards for the collection and curation of datasets for sharing ","Best Practices in Sharing Data From Low- and Middle-Income Settings","? 
Similar to those expressed in higher income settings ?differences include - stakeholders from low income settings suggested that co-authorship or at least the chance to publish an associated response or commentary should be offered to the researchers who produced the dataset; the contribution of data creators may not be sufficient to warrant co-authorship of the secondary ?Although limited resources may be a hindrance to data sharing in higher income settings, they were identified as a very significant barrier in lower income settings - significant investment in human resources, technology, and infrastructure will be required ?Stakeholders from lower income settings also focused on the value of such collaborations to build capacity among researchers generating datasets", ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Burge 2012,Challenges for the future of biocuration,"? 78% indicated that securing funding to maintain and develop biodatabases was the major threat, ? 71% considered that dealing with the increased volumes of data was a significant challenge. ? 57% of respondents pointed to the difficulty of impressing on other scientists the importance and hence the need for funding of biocuration. ? 40% identified with the threat that biocuration might be perceived to be irrelevant if curators cannot keep pace with the current flow of data.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Buys 2015,Understanding of data storage & retention trends and needs,"? 6% use external data repositories ? 66% use computer HDs ? 47% use external HDs ? 50% use departmental servers ? 38% store data on instrument that generated it ?27% use flash drives ?31% use cloud ? most respondents expected to store published data indefinitely ? 31% did not know how much storage they required for their data",Understanding of data sharing behaviours,"? 60% share or plan to share their data 17% unwilling to share their data 23% unsure ? researchers more willing to share their data once results are published ? 
before publication, majority only willing to share within research group ? most common method to data sharing was through personal request (41%); repositories 9%; university-managed repository 7%",Need for consulting services,"? 45% had data management plan (DMP) ? those that did not have DMP, 58% said that they lacked information ? participants selected the following services: long term data access and preservation (63%), services for data storage and backup during active projects (60%), information regarding data best practices (58%), information about developing data management plans or other data policies (52%), assistance with data sharing/management requirements of funding agencies, and tools for sharing research (48%)",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Caetano 2014,Proportion of analysed articles that made data available,"Animal Behaviour (13%) and Behavioral Ecology (7%)",How common data policies are,"In category Behavioral Science in ISI (34%, N=49) explicitly encourage authors to store data in digital repositories or supplementary files, and requirements for data archiving prior to or after publication are nonexistent",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Campbell 2002,Percentage of faculty who made requests for data that were denied,"? 47% reported that at least 1 of their requests had been denied in the preceding 3 years ? 10% of all postpublication requests for additional information were denied ",Percentage of respondents who denied requests,"? 12% they had denied another academician?s request for data in the previous 3 years of those who denied access ? 80% reported that it required too much effort ? 64%, said they were protecting the ability of a graduate student, postdoctoral fellow, or junior faculty member to publish ? 53%, said they were protecting their own ability to publish ? 45% Financial Cost of Actually Providing the Materials or Information Transfer ? 28% Likelihood That the Other Person Will Never Reciprocate ? 
27% Need to Honor the Requirements of an Industry Sponsor ? 23% Need to Preserve Patient Confidentiality ? 21% Need to Protect the Commercial Value of the Results","Influences on and consequences of withholding data","? 28% of geneticists reported that they had been unable to confirm published research because they were denied access to data ? 28% reported that they had ended a collaboration as a result of withholding ? 24% publication significantly delayed ? 21% abandoning a promising line of research ? 18% delaying sharing with that person or group ? 13% refusing to share with that person or group ",Changes over time in perceived willingness to share data,"? 35% said that sharing had decreased during the last decade 14% said sharing had increased",Comparison to other disciplines,"? Geneticists were as likely as other life scientists to deny others' requests (odds ratio [OR], 1.39; 95% confidence interval [CI], 0.81-2.40) and to have their own requests denied (OR, 0.97; 95% CI, 0.69-1.40). ? However, other life scientists were less likely to report that withholding had a negative impact on their own research as well as their field of research.","Factors that were significantly associated with an increased likelihood of denying others' requests","? Received a high number of requests in the last 3 years ? 
having engaged in commercial activities",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Campbell 2003,Measures of Attitudes Towards Data-Sharing and Data-Withholding,"Agree 30.5% Freely share information with all academic scientists prior to publication 98.7% Freely share information with all academic scientists after publication 77.9% Be more cautious when sharing with industry than with other academics 67.5% Keep findings secret to ensure priority in publishing 52% Keep findings secret to ensure priority in commercial applications 66.6% Receive some direct benefit from sharing information 51.3% Refrain from conducting classified research 54.1% Refrain from participating in trade secrecy ",Institutional Policies,"72% Does institution have policy to prohibit research that can never be published without sponsor consent? 37% Does institution have policy to prohibit receipt of biomaterials from other scientists without a MTA? 47% Does institution have policy to prohibit sending biomaterials to scientists without a MTA? 47% Does institution have policy to prohibit agreements that delay publications beyond time needed to file a patent? 93% Does institution have a policy to require invention disclosure prior to seeking independent commercialization? 
0% Do any formal policies apply only to research in genetics?",Impact of Withholding on Technology Transfer,"Experienced impact at least once - In the past year, how often has the publication of research outcomes resulted in: Denial of foreign patent application 94% 74 Denial of US patent application 82% 65 Loss of existing commercial partner 50% 39 Inability to attract new partner 71% 55 Threat of a lawsuit 32% 25 Actual lawsuit 13% 10 In the past year, how often have sponsors' restrictions on data sharing resulted in: Failure to negotiate an MTA with another academic institution 46% 36 Failure to negotiate an MTA with a non-academic institution 85% 66 Failure to grant a license 48% 38 Failure to receive a license 27% 21 ","There was agreement that scientists should freely share data and information after publication, and disagreement about such openness before publication",99% agreed with sharing after publication; 30% agreed and 70% disagreed that data sharing should occur prior to publication. 77% agreed that academic scientists should be more cautious about sharing with industry scientists than with other academics.,Institutional policies existed regarding sharing findings,93% reported their institution had a formal policy requiring investigators to file an invention disclosure prior to independently seeking to commercialize results. 72% said their institution prohibited research being conducted that cannot be published without consent of a sponsor. Around half prohibited research agreements that delayed publication beyond the time needed to file a patent. ,Publication sometimes had a negative impact on technology transfer process,"94% indicated they had a foreign patent application denied because research outcomes were already published. 81% for US patent. 71% said they were unable to attract a commercial partner for this reason, and 50% had lost an existing commercial partner. 
",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Campbell 2015,Comparison of the literature database with the submitted and approved animal ethics applications for New Zealand,"? 49.2% of all permit applications made between 2000 and 2012 remained unpublished in the peer-reviewed scientific literature. ? for the entire Australasian region, we estimate that approximately 600 animal telemetry studies, consisting of approximately 15,000 tagged animals, remain unreported in the peer-reviewed literature ? we estimate that approximately 50% of projects undertaken in 2010, 2011, and 2012 may yet to be published",Search of animal telemetry data-repositories,"? Approximately half of all Australasian telemetry projects undertaken between 2000 and 2012 were not discoverable via on-line resources ? much smaller proportion of animal location data (8.8%) recorded during the same period were accessible for viewing or downloading.",Re-use of animal telemetry data,"Three levels by which animal telemetry data can be shared to the benefit of the wider community of ecosystem researchers: 1) synopses of data through presentations and publications; 2) storage and discovery of project meta-data; and 3) storage and discovery of the raw animal location data with appropriate meta-data. 
To ensure that animal telemetry data-collections are secure, consistent, managed efficiently, effectively disseminated, and not lost over time, we argue that the third level (storage and open-access of the raw animal location data) is the most appropriate action.",Approximately half of all Australasian telemetry projects undertaken between 2000-2012 were not discoverable via online resources,,Telemetry data requires a lot of metadata ,"Datum, coordinate system of locations recorded by the collar, time zone, date and time of release of animal, date and time of collar recovery, detailed descriptions of any filtering or pre-processing, type of capture and tracking technology used, attachment technique, method of capture and release, weight of tag, etc.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Capocasa 2016,Bio-specimens collection,"consent ? open consent - most common ? informed, specific consent - frequent ? opt-out option - very few strategies for collection ? 47.8% store biological samples collected by external research groups",Bio-specimens - accessibility,"? 100% make it possible for researchers to gain access to their bio-specimen collection ? 95.7% require specific conditions to be satisfied in order to give permission to access their samples ? 54.3% refer to a specific legal framework for access to their biological samples.",Data collection ,"? 89.1% store data extracted from the analysis of their own samples. ? of these, 56.1% also store data produced by external research groups that have used their samples. ? 73.9% offer this service only if the external research groups follow the same legal framework of the biobank in question",Data accessibility,"? 85.4% allow external research groups to access their data in compliance with certain conditions ? 57.1% refer to a variety of legal frameworks, depending on the legislation of the country where they mostly operate",Clarity of answers,"? 
Most of the biobanks we contacted provided vague or difficult-to-read information about their accessibility criteria ? This is an important result which shows that there is still little clarity and a certain reluctance to share scientific resources",Summary,"? Applicants are requested to explain what they would like to do with the required resources. ? availability, amount and origin (public or private) of research funds are aspects involved in the establishment of fruitful collaborations between biobanks and research labs ? recognition of co-authorship is a requirement for some biobanks in order to grant access to their data ? To sum up, these results suggest that economic and academic aspects are involved in determining the extent of sharing of samples and data stored in biobanks ? observed high heterogeneity of the requirements to gain access to the biobanks' resources",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Carlson 2011,Faculty interviews - overview,"? Generally, faculty in this study expected their graduate students to be able to carry out data management and handling activities. ? Typical responsibilities of graduate students included processing or cleaning the data to enable use or analysis, assuring quality of the data, compiling data from different sources, and organizing the data in ways that it could be accessed and used by project personnel. ? training graduate students received and the methods through which this training was delivered varied widely ? faculty in this study presented a mixed picture in assessing the work of their students in this area ? The overwhelming majority of researchers in this study felt that some form of data information literacy education was needed for their students",Faculty interviews - graduate students' deficiencies in data management,"? Metadata (how to apply it) ? standardizing documentation processes (high-level and local) ? maintaining relationships among data (master files and versioning) ? 
ethics (intellectual property rights and ownership of data, issues of confidentiality/privacy and human subjects, implications and obligations of sharing (or not sharing) your data with others (including open access), and assigning attribution and gaining recognition of one?s work) ? quality assurance ? basic database skills(relational, SQL) ? preservation (back-ups)",Student results - issues that arose/students struggled with,"? Preservation / Archiving ? Metadata ? Data sharing workflows and technology",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Carlson 2013,Data Documentation and Organization,"? Lack of clear and shared expectations as to how data should be documented and organized ? approaches were largely informed by their previous experiences or training in working with data, to the extent that they had any. ? A significant obstacle cited by students is the lack of known and agreed upon standards for managing, describing, organizing, and sharing data in agronomy and related fields ? Handling and Use of Lab Notebooks - structure of the lab notebooks hinders the utility of the documentation it contains ? Handling and Use of Electronic Data Files - description and organizational frameworks developed by students were generally geared to meet more immediate needs rather than to support mid- to long-term usage of the data",Data Sharing,"? Attitudes toward Sharing Data - In contrast to their general support of sharing research data, it was difficult initially for many of the students to conceptualize what value their data might have for others ? Data Sharing at the Disciplinary Level - disconnection between different fields of study and the intended uses for the data are a serious impediment to sharing data effectively ? Data Sharing at the Local Level - uneven levels of awareness about what data sets were being generated or managed by others at the WQFS ? 
Lack of Models and Structures for Sharing Data - no large data repositories serving as community resources, journals in which they expect to publish do not accept data as a supplementary file for publication",Long-Term Data Management,"? Data Inheritance - data sets they inherited did not contain much in the way of description or documentation. Instead, their advisor served as the primary, if not the only, source of information ? Lack of an Infrastructure to Maintain Digital Data Sets - students had not really given the long term maintenance of their digital data sets much thought or taken action to ensure long term access to their data. This is in contrast to the physical data samples collected. ? Data Tracking and Security - students generally assumed that the computing resources provided by their academic department were adequate and that the security of the data was being addressed by their IT unit, students did report making back-ups of their data, in several different fashions, frequency of their backups varied, kept earlier iterations of their data files",Ownership and Authority over the Data,"? Students did report that they feel their data is not really “theirs” at all, but that it belongs to their advisor and the WQFS lab ? When students do not perceive themselves as having decision making power over the data, they are disinclined to do more with the data beyond that which simply satisfies their immediate and individual research goals.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Castro 2013,Outcome,The new Fracture Mechanics profile (metadata) has been validated by the researchers and will be applied to their datasets,Metadata elements requirements,"? Interoperability ? meet specific research data management needs ? 
retrievability - At first, when confronted with simple and qualified DC elements, the head researcher pointed out that only a few elements were in fact needed to document their data, and that he would be satisfied with a small subset of elements (title; creator; subject; date). Nevertheless he stated that finding a particular document was a major time-consuming activity. This opinion began to change as he became aware of the opportunities brought by more detailed data description, particularly when consider data sharing and retrieval within the research group.",,"Group maintained they did not need a lot of detailed metadata to support their work. However, this opinion did change over the course of the interaction, as the head researcher became aware of the opportunities brought by more detailed description for data sharing and retrieval within the group",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Castro 2015,A systematic process for developing ontologies for description of research data [Take-home Message],? An ontology was created for research data in the context of vehicle simulation.,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Chad 2014,Summary,"? The University is taking a holistic approach and is embedding RDM fully into the wider research management landscape. The focus on quality participation from researchers was a critical success factor, and caused a change of direction in delivering what it is felt is a better solution today.",Key themes identified from interviews,"? There were barriers to sharing information about research. ? Systems were not joined up to enable a comprehensive or coherent approach to provide visibility of local needs and activity across the university. ? There were significant issues about workload and a lack of trust in university systems. ? This contributed to the sensibility of mistrust around any provision of ?top down? solutions that would add to the workload. ? 
Changes to RDM practices were not high on the list of ?problems? as researchers reported they ?just get on with it?, and so were not inclined to change ? concluded that a solution dedicated solely to RDM would be unlikely to gain a sufficient level of engagement as researchers did not see it as a priority when other issues around the management of research were more pressing",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Chao 2012 (+ Companion),Data curation and small science,"The application of Jackson et al.?s typology of rhythms to scientific data use illuminates several key areas where potential data curation and management services could support small science research environments.",Earth science examples,"In considering the features of rhythms, use of physical rock samples and computational model output data were influenced by multiple rhythms that could be attributed to sustaining or ending their utilization.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Charbonneau 2015,Journals with policy,"? 72% (n=36) of the journals contained some form of data policy ? data sharing was ?required? for 40% (n=20) of journals ? ?optional? for 18% (n=9) ? ?other/partial? for 14% (n=7) of the journals with data policies",Accessibility of data,"Out of 26 with journals that discussed accessibility of data ? majority of the journals (n=23) expected the data to be open ? a few noted that the data would be held in closed repositories (n=2) ? access to the data upon request was sufficient (n=1) ? in closed systems data held by publishers",Potential exceptions to sharing data,"? 28% (n=10) of the journals with data policies had exemptions from providing data if it was difficult to obtain, or if reuse was for commercial purposes ? in 2 of the journals? 
policies fees could be applied as a way to recoup the costs associated with providing data",Who should have access,"In 53% (n=19) of the 36 journals with discoverable policies ?reviewer, found in 50% of the journals (n=18) ?for general reader (44.44%, n=16). ?Editors, publishers, and researchers were additional roles mentioned, with each appearing within the journal policies of two (5.55%) websites.",Location,? Some policies specified a repository,Length of time,? Approximately 22% (n=8) of the journals with data policies,Range of data types,"Wide variety of data types appeared in policies ?non-specific datasets, found in roughly 78% (n=28) of the journals? policy documentation ?protein/DNA sequences, discovered in approximately 61% (n=22)",Formats,? 44% (n=16) of the journals with policies mention the topic of file formats in the context of the research data,Metadata,"? 17% (n=6) of the 36 journals with data policies discussed metadata. ?Of these, several journals (11.11%, n=4) indicated that descriptive captions should accompany the data.",Consequence and monitoring,"? None of the journals with data policies discussed if, or how, they would monitor authors? compliance with their guidelines. ?Furthermore, less than half (38.88%, n=14) of the journals stated the actions they would take for author?s non-compliance with their data policies. ?The consequences for failing to comply with publishers? policies consisted of contacting the authors? institution(s) (n=7), and not publishing the manuscript (n=6). A final response, not reviewing the manuscript, was found among several (n=3) of the journals? policies.",Policy strength,"? 56% (n=20) strong (journals required data sharing and accompanying evidence, such as accession numbers) ?44% (n=16) weak (policies merely suggested, or requested, that data be shared, but that data sharing was not enforced)",,,,,,,,,,,,,,,,,,,,,,,,,,,,, Cheah 2015,Participant experience with data sharing,"? Researchers? 
experiences were largely limited to sharing data with collaborators they knew ? Some senior researchers had, however, had experience of making a limited data set of raw data open access following publication of the analysis in journals",Benefits of sharing,"? All of the researchers we interviewed regarded data sharing in a broadly positive light ? Promoting scientific progress ? Better analyses and larger data sets ? Greater accountability and transparency ? Better use of resources ? Benefits for researchers and research groups ",Harms/concerns of sharing,"? Worried about the potential harms of, and barriers to, successful data sharing ? Potential harms to patients and their communities ? Potential harms to researchers and research groups ? Demands on resources ",Suggestions for Best Practices in Data Sharing,"? During their interviews, participants suggested a number of ways in which the harms and worries they have might be addressed ? Resources and capacity to ensure good quality data ? Consent ? Governance ? Open access ? Managed access ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Cheikhi 2013,Dataset openness,"In spite of the large number of datasets available in PROMISE (84), only 37 provide both an available data file and a description of the dataset attributes; the remainder therefore cannot be used directly without contacting the dataset owners",Dataset description,"? The purpose of the PROMISE repository is to provide software engineering communities with reliable and real data that could be used in replication studies. ? However, to conduct this kind of study, information on the past usage of these data, that is, context of use and results, is required. ? This information is only available in published papers that have used these data before. ? 
Unfortunately, past usage information is not readily available for the majority (70) of the 84 PROMISE datasets",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Chen 2013,Disciplinary needs,level of detail required for each metadata element varied greatly between different disciplines,Format field,"Some research datasets contain very complicated file formats, and might not be successfully machine-harvested. Therefore, we open this metadata field to be both automatically generated and manually input by researchers.",levels of metadata,"Researchers often confused our metadata with item-level metadata. Therefore, the concept and advantages of collection-level metadata would have to be promoted and understood by researchers before fully implementing data curation services.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Choudhury 2008,Data curation,"? Astronomers have identified the preservation aspect of data curation as a critical requirement ? the most highly processed datasets that are derived by individuals from analyses of large databases often reside on websites, individual workstations, etc. ? Without a systematic effort to capture and archive them, these datasets (like those from “small science” projects) remain at risk. ? There is a real urgency on the part of the astronomers related to this particular topic ? For this reason, this topic of preserving and representing these aggregations of articles and derived datasets has proven useful for promoting the use of repositories, including the institutional repository. ? While depositing these objects into a repository does not constitute preservation, it is an important step toward systematically capturing objects and attaching or generating preservation metadata (e.g., checksums).",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Colledge 2014,Types of consent,"? General - majority preferred general form of consent rather than a specific IC (i.e. one for the current research project only) ? 
project specific - important limitation to data sharing ",Ethics committee,"? Ethics committees appear to play a crucial role in imposing limitations on biobanking and sample sharing. ",Notable findings,"? Although general consent is not only tolerated but explicitly recommended by the Swiss Academy of Medical Sciences, researchers perceive difficulties in using general consent mechanisms for their research. An important factor seems to be ethics committees' reluctance concerning the approval of general consent forms. ? Instead of obtaining general consent, many researchers reported experiences with using large numbers of samples without explicit consent and the procedure of reconsenting? despite the associated difficulties ? Reconsenting was considered logistically impossible, the concept of using samples without consent was controversial.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Collins 2013,Open data,"Interviewees from industry were generally very supportive of the principle of Open Access and saw no conflict between this and their organisational aims. In both cases, interviewees stressed that external pressure tends to be greater around open data and other kinds of output than OA to journal articles: neither had a firm organisational policy on OA and they did not anticipate this developing in the near future.",Publishing data complications - privacy,"? Publishing data and other kinds of output are very important to funders, although complicated: ?safe data, open by design? - In other words, the presumption is that data will be protected, but the access design will be open to allow researchers with a valid research and analysis plan to reuse the data",Publishing data complications - standards,"Funding and skills needed to make data truly reusable - lack of standardisation and the consequent need for researchers to spend considerable time and effort to explain exactly what the data mean and how they were collected. 
One interviewee mentioned peer review for data",Publishing data - prioritization,"Prioritise the content that researchers would actually want to see, although it is not clear how this prioritisation would be carried out.",Who publishes data,"Not all interviewees were convinced that publishers should become heavily involved in research data.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Conklin 2013,Ecologists do discuss data curation and preservation at their meeting,"17/470 sessions discussed data curation, and 5/470 were on preservation",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Conner 2013,Benefits,"? Determined that the Hydro Server Lite approach to data management was the most cost effective and efficient way to implement a structured data storage and dissemination plan. Because of the introduction of this approach we can facilitate data sharing within the lab and can introduce novel learning activities in classes that require students to use actual data. ? has enforced a data and, more importantly, a metadata collection standard on data collection in our lab. ? a single repository for all of our data, which means that they are now available to any of the researchers in our lab anytime they need them for an analysis ? concrete data management plan, with substance, that we can show to prospective funding agencies as we compete for funding sources.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Cooper 2008,Rate of sharing with participants,93.6% had shared data or results in some way with participants,Frequency/timing of sharing,"When asked how often they had shared ? 33.7% “more or less continually.” ? 30% “quite often,” ? 29% “a few times.” ? 6.7% “once.” Meanwhile, ? 12.7% shared during the course of a project ? 16% after its completion. ? 
71.3% shared both during and after a project.",Sharing and project length,"Those whose research is part of a long-term project appear significantly more likely to share more often than those involved in a single project (chi square = 12.692; p = .005).",Ways sharing occurred,"? Informal discussions 157 83.5% ? Showing photos or other materials 125 66.5% ? Professional publications 117 62.2% ? Oral presentations 114 60.6% ? Returning interview transcripts 70 37.2% ? Depositing materials in an archive 69 36.7% ? Popular publications 66 35.1% ? Focus group discussions 43 22.9% ? People learning about research through media reports 35 18.6% ? People reading publications without having been given them 30 16.0% ? open-ended answers included:",With Whom Sharing Took Place,"? Main consultants 142 75.5% ? Other study participants 135 71.8% ? Community groups 85 45.2% ? Institutional officials 85 45.2% ? Whole community 63 33.5% ? Governmental officials 56 29.8% ",Parts of Research Not Shared,"? Participants' personal data or identifying details 43 29.1% ? Sensitive (including politically sensitive) material 31 20.9% ? Raw data or interim reports 19 12.8% ? Material that was academic, too theoretical or technical 11 7.4% ? Data on illegal activities 9 6.1% ? Material likely to provoke conflict 6 4.1% ? Gossip 6 4.1% ",Rate the experience,"? 43% strongly positive ? 46% positive ? 10% neither positive nor negative ? 1.2% negative or strongly negative. ? researchers with a larger number of career field projects tended to rate their experience lower than those with fewer field projects (Spearman's rho = −.261, p = .001). ? tendency for those who reported having shared more often to evaluate the experience less positively (Spearman's rho = −.281, p < .001)",Problems encountered,"? Practical problems (e.g., time, resources for translating materials, difficulties in contacting people or organizing meetings, career conflicts) 29 15.9% ? No problems reported 23 12.6% ? 
Accessibility of the research to participants (for example, because of the use of technical language or translation issues) 21 11.5% ? Participants having questioned or having been angry about the data or interpretations 18 9.9% ? Participants showing little interest or being unwilling to take the time 14 7.7% ? Internal conflict as a result of sharing (e.g., because of factionalism or the use of the data by community leaders against others) 12 6.6% ? Other problems cited less frequently included finding appropriate means for sharing, internal confidentiality issues, trust issues (for example, researchers thought to be spies or to have breached confidentiality by some), researcher feelings (for example, embarrassment or feeling upset when people attacked the researcher over the results), and conflicts with other researchers on team projects ",Benefits,"Participants ? All benefits to participants 47 17.3% ? Greater knowledge of themselves 10 27.0% ? Feelings of personal recognition 9 24.3% ? Feelings of pride, ownership, or happiness 7 18.9% ? Empowerment 4 10.8% ? More incentive to participate in the research 3 8.1% ? Greater knowledge of others or of their group 3 8.1% Communities or groups ? All benefits to communities or groups 68 25.1% ? Increased group knowledge or awareness 14 25.9% ? Help in solving community problems 10 18.5% ? Maintenance of culture 9 16.7% ? Data useful for future community plans and projects 7 13.0% ? Greater operational insight or improved practices for institutions 4 7.4% ? Advantages vis-à-vis other groups or communities 4 7.4% ? Economic benefits 4 7.4% ? Community pride 4 7.4% Researchers ? All benefits to researchers 151 55.7% ? Improving the quality of the research, by correcting or validating data or interpretations 55 36.4% ? Helping the researcher to gain rapport, credibility, or trust 40 26.0% ? Increased learning about the group 17 18.8% ? The researcher gaining good feelings about him or herself 9 12.3% ? 
Increased learning about oneself and personal growth 6 11.0% ? Helping career development, e.g., through improving the quality of the research 4 8.4% ? Research being viewed more favorably by participants 4 5.8% none ? No benefits reported 4 1.5% ",Harms / threats,"Participants ? All harms to participants 61 34.3% ? Privacy, confidentiality, anonymity 24 39.3% ? Psychological (e.g., embarrassment, sorrow, hurt feelings, etc.) 12 19.7% ? Social (e.g., to reputations, jealousy, risks to relationships, community retribution, community standing) 10 16.4% ? Political (e.g., loss of position or power, personal danger, possible loss of external services) 9 14.8% communities or groups ? All harms to communities or groups 26 14.6% ? Internal conflict 19 73.1% ? Danger from external actors 4 15.4% ? Revealing hidden knowledge or issues 2 7.7% researchers ? All harms to researchers 57 31.8% ? Loss of rapport 13 22.8% ? Threat to future research opportunities 9 15.8% ? Psychological 5 8.8% ? Threatens present research success 5 8.8% ? Inhibits full account of data 5 8.8% ? Physical or political danger or risk of legal sanctions 5 8.8% ? Career concerns 3 5.3% none ? No harms or risks reported 34 19.1% ",Expectations about sharing had changed,"? No expectations of sharing obvious to researcher 44 21.1% ? Expectations had increased or changed 30 14.4% ? Surprise or appreciation at sharing 15 7.2% ? Expectations of receiving finished products 15 7.2% ? No or little change observed 15 7.2% ? Expected to receive transcripts or own data 8 3.8% ? Expectations of receiving other help, not data or results 6 2.9% ? Community never expected to see materials 5 2.4% ",Reasons for Sharing,"All responses on advisability 71 ? Always or generally advisable 33 46.5% ? Indeterminate responses 27 38.0% ? Relative to situation 3 4.2% ? Matters what is shared 3 4.2% ? If researcher feels it will make a difference 2 2.8% All reasons given 89 ? Right, ethical thing to do 22 20.8% ? 
Helps improve data quality and/or interpretations 16 15.1% ? Makes research more useful to the group or community 11 10.4% ? Helps the researcher improve levels of rapport and trust 6 5.7% ? Required by others (i.e., not the group itself) 6 5.7% ? Helps favorably to alter the power balance between the anthropologist and the people being studied 6 5.7% ? Was required by the group or community 5 4.7% ? Presupposed in participatory or collaborative research 5 4.7% ? Represents reciprocity on the part of the researcher 5 4.7% ",Reasons for Not Sharing.,"? To protect confidentiality, anonymity, and privacy for individuals 72 32.9% ? Not to endanger or harm individuals 58 26.5% ? To avoid fueling internal conflict or because of internal power differences 17 7.8% ? Not to endanger the group or community 13 5.9% ? To protect the researcher 10 4.6% ? Because of a perceived lack of interest 9 4.1% ? Practical issues 5 2.3% ? Protect secret or community knowledge 4 1.8% ? Because of language barriers 4 1.8% ",Who Should Decide About Sharing,"? Researchers alone 66 41.8% ? Researcher with select participants, e.g., elders or main consultants 34 21.5% ? Researcher with the group or community, in some cases specifying that it should be through prior negotiation 22 13.9% ? Depends on the research situation 8 5.1% ? Not sure 8 5.1% ? Ethics boards 7 4.4% ? Participants 5 3.2% ","Should be a part of graduate training for cultural anthropologists?",95.5% agreed that it should be,,,,,,,,,,,,,,,,,,,,, Corrall 2013,Current and planned RDM services,"? Assistance to use available technology, infrastructure, and tools ? Guidance on the handling and management of unpublished research ? data, for example data literacy education and/or training ? Support for data deposit in an institutional repository ? Support for data deposit in external repositories or data archives ? Finding relevant external data sets ? Technical aspects of digital curation ? Developing data management plans ? 
Developing tools to assist researchers in managing their data ? Development of institutional policy to manage data (Table 3 breaks down by country)",Target Users for Services,"? Academics/researchers ? research students ? schools/faculties ? university wide ? specific disciplines (Table 5 gives breakdown by country and current and planned services)",Constraints,"? Bibliometrics/RDM is not a priority for your library. ? Bibliometrics/RDM is not perceived by others as a library role. ? There are different levels of demand across academic departments/schools. ? There are different specialist needs across disciplines and subjects. ? Library staff require additional knowledge or skills. ? Library staff require additional confidence to work in this area. (Figure 3 shows data by country on constraints on developing RDM services.)",Current staff education,"? Prior to joining, as part of their LIS or other education ? Within the library, through in-service training or seminars ? Are self-trained ? Learn on-the-job ? Library-funded external professional development ? Other (Table 6 gives numbers by country)",Knowledge and skills needed,"? Data curation skills ? Technical and ICT skills ? Required subject and/or disciplinary knowledge ? Knowledge of research methods ? Knowledge of research processes ? Other (competency in metadata schemas, national legislative and policy context, skills in data discovery and research data interviews) (the areas identified as requiring the most development of knowledge and skills for library staff to work in RDM were data curation, closely followed by technical and ICT skills; Table 8)",Education and training needed,"? Preparatory professional education programs ? 
Continuing professional development options (vast majority (around two-thirds of the total sample) favored externally provided education or training over in-house training and development, Another striking finding across the sample was the emphasis placed on understanding the research environment at both macro and micro levels, including research processes, methodologies, and workflows, as well as university and government agenda)",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Cox 2012,Focus group summary,"Staff desired to move beyond abstract discussion of potential roles, towards gaining hands-on experience",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Cox 2014,?Does your institution have a formal research data management (RDM) policy in place?? ,"? 30.86% Yes ? 43.21% No, but expect to have one in place for the next year ? 17.28% No, with no plans for implementation in the next year ? 8.64% Don't Know",Change in the culture of RDM over the previous year,"? Most of the respondents (53 or 70% of the 76 who answered this question) stated that the ?culture of RDM? had in their view changed in their institution in the last year. ? Another 15 (20%) stated that it had not ? eight respondents (12%) uncertain. ",Current research data management services and future priorities ,"? 2 types of role had significantly more than half of institutions already offering provision: ?raising open access to data and RDM policy issues? (64%) and ?advice on copyright and IPR issues relating to RDM? (63%). ? Around 50% of institutions said they offered some level of service in terms of how to cite data (56%), and awareness of reusable data sources (50%). ? 41% reported they were undertaking early career researcher awareness- raising activities and 36% postgraduate research student training. ? only around 20% of institutions providing any service. ? open-access and policy advocacy is the most common current activity and also the top priority. ? 
Running a data repository is also commonly seen as a priority; here the gap between aspiration and current activity is greatest.",major challenges for libraries with RDM ,"? of 123 different items the most common answers were connected with the issues of skills gaps (20) or resourcing (18). Other challenges included encouraging others to recognise RDM as a priority (10), working with other professional services (9), supporting the wide range of data management practices across different disciplines (7), and getting the library to be taken seriously (7). ",Skills and training needs ,"? About a third of respondents who replied to the question said that they thought the library did have the ?right skills to play a significant role in RDM ? Over 50% said the library staff did not have the right skills, but these replies were qualified too, acknowledging that they had some of the skills needed. ",Charging of services,"? Most chose not to answer this question, which presumably can be taken as a response that charging cannot currently be confidently said to be appropriate for any services. ? Others stated that RDM activity was too immature in their institution to provide an informed answer to this question. ? Of those who did answer the question, 20 suggested that data storage costs might be chargeable, particularly if storage requirements were unusually large ? Some suggested that more work needs to be done to ensure that appropriate levels of external income of various kinds are channelled to support institution-level services, including RDM",Advocacy,"? 28 respondents (37% of the 76 answering this question) stated that in their view RDM was ?best approached through institutional advocacy and support rather than subject-community advocacy and support? and 12 (16%) disagreed ? a large number of respondents (36 or 47%) were reluctant to choose between institutional or subject-community approaches. ? 
Several respondents commented that subject-based support is likely to be variable across different disciplines with institutional approaches filling gaps and providing consistency for reasons such as regulatory compliance",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Cox 2016,Many stakeholders,"? In many cases, a working group or task force chaired by the Pro-Vice Chancellors had been set up reporting to the research committee ? The pattern of who was working on the problem was different between different institutions. Stakeholders included: - Computer/IT services - Research support services - Legal advisory services - Records management services - Research leadership,",leadership,"? There was disagreement about whether librarians should have a leadership role. ? generally felt that there was no professional services stakeholder for whom RDM support is a core capability ? unclear who should own/lead RDM",Engagement/collaboration,"? A larger number reported scepticism (or at least lack of engagement) from the academic community ? The distributed power structure of many universities, where there is no single point of decision and many are loyal to disciplinary (or professional) communities beyond the institution means that reaching decisions in an area of multi-professional concern is complex ? finding who were the right contacts to build basic networks was one of the early challenges",stakeholder perspectives,"?Asked to rate the importance of a number of key drivers (storage, security, sharing, compliance and preservation) most interviewees recognised all as important to some stakeholders. But it seemed that the drivers were stronger for some actors than others. IT staff focused on active data storage and security whereas librarians were more interested in preservation and sharing. Thus stakeholders saw the problem differently.",Solutions ,"? Solutions ranged from deployment of new technologies to the creation of new advisory services. 
?One common approach was to develop a draft policy and use this as a basis for discussion within the institution. ?a sense among interviewees that however problematic now, solutions would emerge, partly because of the external drivers and partly because of community work on solutions and good practices ",Problems,"? The problem was such that RDM was seen as requiring a ?culture change? ?Resourcing for RDM projects and services was also generally unclear for the library and more widely ?Some interviewees reported library organisational restructuring would be necessary in the context of RDM",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Cragin 2010,Forms of sharable data?sharing what and when ,"? High level of variation in what is considered a sharable dataset ? composite datasets common ? Across the 20 cases, image formats (n = 4), databases (n = 4) and tabular (spreadsheet) data (n = 10) were the most common form of data ? Participants generally had positive views of data sharing and expressed openness to sharing their own data, particularly with people in their field to better advance their area of research ?60 per cent (12/20) of the participants identified a need to restrict some or all of their data from public access for some length of time. ? Of the 12, five would still require an embargo period first, ranging from 1?3 months to 2?5 years. ? For 8 participants, sharing any data before publication or embargo was strictly limited to known and trusted individuals who were either immediate collaborators or known associates ? (10%) reported that their data have reuse value for 3?5 years, and four participants (20%) reported a preservation period of 5?10 years. ?13 participants (65%) reported that their data had reuse value for a minimum of 10 years, and 4 of those reported their data having value for reuse for an indefinite period of time. ",Distinguishing private and public sharing ,"? Two fundamentally different needs : keeping data private or making it public ? 
Supplying data involves either targeted transfer of data to and from current collaborators or close colleagues or distribution on request; these have the effect of keeping data private to some extent",Sharing with the wider public ,"? Sharing practices are constrained by data-management pressures and personal experience, which include the desire for personal control of one?s research products and proper recognition or reward ? Supplying data to others beyond bounded collaborations was often based on a decision process that weighs familiarity (i.e. how well known is the person requesting the data) with data-preparation complexity or cost (e.g. the time it takes to prepare data)",Avoiding misuse ,"? Misuse incidents experienced by scientists in this study influenced their views on the appeal of data sharing, decreasing their willingness to share and increasing their cynicism in data-sharing initiatives ? A small number of participants also described incidents of, or concern about, wrong or inappropriate interpretation of data, and some have developed strategies to guard against this kind of misuse",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Creamer 2012,Current services,"? 30 librarians (47.6%) did not currently provide DCM services ? 43 (69.4%) stated their library has developed or is in the process of creating a data management plan for the library or related library services policies or strategic plans. ? majority stated the plans were in the early stages",technical competencies needed (skills),"? I use Web 2.0 technologies: 62.2% (28) ? I provide data archiving and preservation services: 44.4% (20) ? I work with and/or develop virtual tools to manage and curate data 42.2% 19 ? I work with a variety of metadata standards (e.g. interoperability standards and language such as Dublin Core, MODS, and OAI-PMH, etc.) 42.2% 19 ? I build, populate and maintain digital databases 40.0% 18 ? I maintain an institutional repository 40.0% 18 ? 
I provide data mining, interpretation, representation and visualization services 37.8% 17 ? I work with metadata manipulation, crosswalk, validation and portals (e.g. description, indexing, storing, etc.) 28.9% 13 ? I use a variety of programming languages (e.g. XML, SQL, etc.) 28.9% 13 ? I work with and/or develop digital lab notebook applications 22.2% 10 ",Non-technical competencies (services),"? I promote digital data sharing, open access, and/or participation in an institutional repository at my institution 79.5% 35 ? I actively advertise my and the library's data services to researchers at my institution 72.7% 32 ? I perform a ""data interview"" with researchers to assess their data needs at various stages of their research 72.7% 32 ? I consult with researchers about the life cycle of their data and work with them on archival and conservancy issues prior, during and post project 72.7% 32 ? I help patrons understand the intellectual property and copyright issues concerning their data (e.g. provenance, publication, licensing and digital rights) 70.5% 31 ? I work with researchers to help them be compliant with government- sponsored grants' regulations and requests concerning data management (e.g. NSF) 68.2% 30 ? I teach data literacy to patrons 59.1% 26 ? I work with researchers to create a data management plan before they begin data collection/aggregation 56.8% 25 ? I access or locate data sets from the published literature for patrons' original research papers 56.8% 25 ? I work with researchers on data security issues 27.7% 10 ",Suggested competencies for inclusion in curriculum,"? Understanding the taxonomy of data management and curation ? awareness of types of data and their usage ? ability to identify and build collaborations with researchers, the Institutional Review Board (IRB), the Clinical and Translational Science Award (CTSA), and Information Technology (IT) groups ? competencies in policy issues involving intellectual property and privacy ? 
digital project management skills ? cyberinfrastructure competencies, using applications across platforms, and domain knowledge and the principal research problems specific to those domains ",Desired skills for continuing education,"? Foundational knowledge ? hands-on data management and curation instruction ? advanced instruction addressing discipline-specific metadata, metadata for data, and instruction on linking data using the semantic web, and demonstrations of the tools to mine these data ? instruction on the various ways researchers use data sets and transition active data to archived data ? ways to gauge the data needs of patrons and the access to more open-source data management tools ? need for continuing education addressing intellectual property issues concerning data. ",Frequency of data service requests,"? Range from none, few, monthly, biweekly ?majority were grant-related",Obstacles slowing establishment of escience services,"? Lack of funds for training/hiring ?lack of funds for upgrade/purchase of cyberinfrastructure ?evolving speed of technology, policy and data management CEs ?lack of institutional policies/support ?lack of patron awareness ?lack of researchers' trust in library to maintain and secure data",Respondents were asked about their technical data management and curation competencies. There was only one competency that more than half of respondents had.,"62% use web 2.0 technologies. 44% provide data archiving and preservation services. 42% work with or develop virtual tools to manage and curate data. 
42% work with a variety of metadata standards, 40% build, populate and maintain digital databases, 40% maintain an institutional repository, 38% provide data mining, interpretation, representation and visualization services, 29% work with metadata manipulation, validation and portals, 29% use a variety of programming language, 22% use digital lab notebook applications",Respondents were asked about non-technical data management and curation competencies. Overall they were more confident that they had these skills,"80% promote digital data sharing, open access etc, 73% actively advertise library data services to researchers, 73% perform a ""data interview"" with researchers to assess their data needs, 73% consult with researchers about the life cycle of their data and work with them on archival issues, 71% help patrons understand the intellectual property and copyright issues, 69% work with researchers on compliance with grant regulations, 60% teach data literacy to patrons, 57% work with researchers on data management plans, 57% access or locate data sets from the published literature for patrons' original research papers, 28% work with researchers on data security issues",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Cugler 2013,"Anomalies were reviewed by biologists and classified into four categories: A) metadata error; B) outdated metadata; C) errors in the distribution range maps D) possible new species pattern detected","? Approach enables detection of anomalies in both species? reported geographic distributions and in species? identification. Our main contribution is our geographic algorithm that deals with uncertain/imprecise locations. ? Geographic anomalies identified for 12% of 1037 distinct species in the database, with a total of 371 records out of 7049 records",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Daniels 2012,Adding value representing the activities repositories undertake to make data suitable for secondary analysis,"value is added in different ways, ? 
by preparing data so that it can be accessed and reused in different ways during the dissemination process ? during the submission process and off-loading some of the work to others within the designated community",Correcting Errors,"? Both staff and users, at different times, are important sources of error detection for the three repositories. ? Across repositories, the majority of error correction takes place during submission",Creating consistency ,"Consistency created, ? to ensure that data and documentation are discoverable and have the same look and feel from one search to the next ? through the implementation of discipline-specific standards","Changing Representations of Data to Reflect New Knowledge","Datasets are the raw material for new analyses and interpretations, ? new interpretations of some data do not change the representations of the data themselves ? in other instances, interpretations are additive, captured in the bibliography of studies using the repository's datasets ",Responding to Designated Communities,"Each of the repositories we studied responds to the needs of their designated communities, whether providing new content or developing services to better support disciplinary practice. In each case the repositories adjust their own work practices to meet the developing needs of their users. ",Evolving Practices around Collecting ,Staff members from the three repositories we studied reported expanding the kinds of data they collect into new areas. ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Darch 2015,Sources of heterogeneity in data management practices,"? Disciplinary Background ? Career Stage ? Social networks within and without the laboratory ? Physical Access to tools ?Shifting networks of resources",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Dearborn 2014,Design of Repository,"? ISO 16363 audit process is used as a rubric, barometer and set of goals for Purdue University Research Repository ? 
To become a trustworthy repository, the Purdue University Research Repository project team has consistently worked to build a robust, secure, and long-term home for collaborative research. In order to fulfill its mandate, the project team constructed policies, strategies, and activities designed to guide a systematic digital preservation environment. ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Delasalle 2013,In general,"? Focus was on how to write good data management plans when applying for research grants, and how to keep data sufficiently secure. ? differences between researchers? perceptions of what is research data. ? issues in handling data varied between the physicists and the social scientists. ",Other issues,"? Scepticism about the value of sharing data. ? Keeping data might prove more costly than if it were to be reproduced at point of need. ? Feeling threatened by a new (bureaucratic) requirement that might hamper the progress of research itself. ? Any university practices should not interfere with existing practices for data storage and sharing in their own discipline. ? Few researchers had already been required to submit a data management plan: an online tool to produce one and save them time would be welcome. ? Some physicists were sceptical of the role of metadata, describing their data: this was not a priority for them. ? In contrast, social science researchers were interested in submitting data to the UK Data Archive, but in need of time and support to get the data sets into a suitable condition to be archived and shared in this way, and this included the creation of metadata. ? How to encrypt data and how to ensure security of data, in line with data protection requirements? ? How to anonymise personal data? ? How to arrange collective access amongst researchers, that is not open or public access? ? Provide access to specialist IT equipment and support for handling and moving extremely large data sets. 
Moving data from one computer to the next was a big challenge for some researchers. ? Back?up their data centrally for them: amongst interviewees, this was done by the department, or in consortium arrangements with other universities. Amongst event participants, many of whom were PhD students, some did this for themselves. ? For some large studies, processing of data might be outsourced to a third party company: the implications of this practice need to be considered. ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, denBensten 2009,,SwissBioGrid allowed for integration between industry and university. Authors found that this interaction did not present a particular problem in the case of SwissBioGrid. The challenge came instead from the technical and organizational issues of data management and computational problems in the life sciences.,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Denison 2012,"E-Research also brings its own challenges for Participatory Community Research ","? The need for appropriately sensitive institutional commitment to the long-term maintenance of repositories to support continued data storage and curation ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Denny 2015,Predisposition to Data Sharing,"data-sharing practices were, ? ?ad hoc? decisions (post study) and informal practices of exchange between colleagues and interested persons, OR ? formal procedures, enforced by institutional policy in the form of contractual agreements between the principal investigator, their home institution, and the funding body",Why Share Research Data,"Primary global value of data sharing, described by many of the senior researchers, was seen to be its potential to move the field of science forward by opening up new avenues of science and by closing knowledge gaps through collaborative communication between different research programs",Disadvantages of Data Sharing,"? 
The need to protect data for its publication value was identified by several participants to be a key deterrent to releasing data, as some researchers worried that data would lose their value once placed in the public domain ? The potential for shared data to be misused, misunderstood and produce false conclusions that threaten the integrity of the primary research was also highlighted ? The problem of free riders was highlighted",Attitudes Toward Sharing Data,"Disagreement amongst participants about the extent to which research data should be shared beyond sharing de-identified data for academic and public health purposes","Informing Research Participants About Data Sharing","? Conflict between the ability of research participants to control how their data are re-used and the uncertainty of future research endeavors emerged as an ethical dilemma when respondents were asked to consider specific approaches to seeking consent ? agreeing to share data for future research was seen as an altruistic act, but one that needs to be respectful of people's rights.",Preferred Methods of Consent,"Senior participants from all three sites preferred a broad approach to consent, in which consent was obtained for future research related to the primary research area. In this way, senior interview participants felt permitted to conduct future research on existing data or even share data with others when it was within the original field of study, without an explicit indication of data-sharing plans ",Data Management,"? The ethical duty of researchers was described by many interviewees as the provision of accurate data records to nurture professional integrity through transparency of practice and to avoid unauthorized future use of research data ? Almost all participants agreed on the importance of having properly specified metadata in this regard ? 
The process of curating information generated by research in a retrievable and auditable manner raised several views about the need to protect data from misuse and the commitment by researchers to accurately preserve data for future re-use. ",End User of Shared Data,"? Particular concern was the potential threat of research misuse ? issue of whether secondary data users should be asked to consider potential benefits to original participants when requesting data",Benefit Sharing Component,"Datasharing agreements should include clauses requiring the preparation of metadata and dissemination of results with the view to public health implementation",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Diekema 2014,Data management ,"? Faculty generally are unfamiliar with institutional data management requirements ? Faculty have generally not changed their data management practices because of the mandates ? Many sponsored offices now provide data management plan assistance in response to mandates ",Sharing of data ,"? Most faculty are willing to share their data ? Some faculty have conditions for sharing: will share after publication and/or require proper attribution ? Some data cannot be shared because it is proprietary data and/or confidential",Storage of data sets ,"? Most faculty either generate and/or use data sets ? Faculty are somewhat unaware of data repositories ?Sponsored programs are more aware of institutional data repositories than faculty ?Faculty generally store on local computers or external drives ? Faculty who already store data locally are slowly shifting to repositories ? About 1 in 3 faculty store data in library-based, agency sponsored, or commercial repositories ? A majority of institutional repositories handle data set storage",Use of data sets,"? Faculty use data sets stored in repositories more often than they themselves deposit data in those repositories ? 
The extent to which faculty teach students to use data sets varies greatly ranging from intensive, to mere mention of data sets, to no mention at all.",Institutional support for data management,"? Faculty would like assistance ? Desired assistance includes tools for sharing data, assistance accessing data sets, data visualization, workshops on data management, help with data set copyright and intellectual property issues ? Only some institutions offer data management support, mostly through sponsored program office and university library (storage) ? The role university library and information technology departments play in data management has increased as result of mandates ? majority of universities have an institutional repository ? The most common types of repository platforms hosted by universities are Digital Commons, DSpace, Fedora, or homegrown platforms",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Diekmann 2012,"COLLECTING, ORGANIZING, AND PROCESSING RESEARCH DATA","Data were collected at least annually, but frequently data collection occurred multiple times over the duration of an experiment, which often extended throughout multiple years. Field sites and locations varied considerably, and many were routinely switched from year to year",STORAGE AND BACKUP OF RESEARCH DATA,"Storage and backup of research data sets emerged as a highly fragmented activity with each researcher essentially devising his or her own strategy for handling data sets",PLANNING FOR RESEARCH DATA MANAGEMENT,"Virtually none of the respondents was actively engaged in developing formalized data management plans for research projects and, thus, first-hand experiences in formal data management planning were very limited",SHARING AND REUSE OF RESEARCH DATA,"Respondents were generally positive and open about sharing research outputs, including data sets, with other researchers, but they lacked personal experience in doing so",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Dietrich 2012,Overall,"? 
Data policies were missing a significant number of the elements identified in our rubric ? no single policy addressed all 17 of the elements",Data policy parameters are typically general in scope,"? Data policies we surveyed had more general than specific data requirements ? The most commonly found requirements across all policies pertained to general data management activities",Data policies often don't address standards thoroughly,Data policies were not very prescriptive regarding standards for data and metadata.,Data policies concentrate on access more than preservation,Data suggested there was a greater emphasis on access to research data than on preservation,Publications are infrequently mentioned in data policies,Funders were largely silent about open access to the publications resulting from funded research,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Drew 2013,accessibility,"? Only 16.7%, 1,262 from a total of 7,539 publications surveyed, provided accessible alignments/trees ? Our attempts to obtain datasets directly from authors were only 16% successful (61/375), and we estimate that approximately 70% of existing alignments/trees are no longer accessible. ? we conclude that most of the underlying sequence alignments and phylogenetic trees produced by the systematic community during the past several decades are essentially lost, accessible only as static figures in a published journal article with no capacity for subsequent manipulation.",Quality of data ,"? When data are deposited, they are often incomplete ? Are all the phylogenetic trees presented in the publication deposited 75% yes ? Does the algorithm used in the tree file correspond to the algorithm of the tree used in the figure ? based on the label 52% yes ? Do the tips in the tree file correspond to the labels in the figure 97% yes ? Does the tree file have branch lengths 23% yes ? Does the tree file have bootstrap support/posterior probability 8% yes ? 
Are the sequences produced by the study deposited in GenBank 91% yes ? Does the TreeBase ID in the paper match with the one in the database 37% yes ",Change over time,? The situation has barely improved over the 12 years covered in this study,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Ducloy 2006,Creating metadata: thinking about reusability [Case study 1],"Fundamental exportable data for a PhD dissertation repository: 1. Names: Identifying the various statuses or names of a person in order to establish correspondences requires enriching the metadata related to persons. 2. Structure: needs to identify the structure the way it was mentioned in the PhD theses with its equivalent in the institution structural network 3. Partners: The PhD student's enrolment university is mentioned in the thesis. The list of partners and/or the collaboration type will have to be completed. ",Institutional surveys [Case study 2],"Need to establish taxonomy of affiliations, i.e., a consistent metadata schema, such as LEAF[7] one, could offer a solid base which must be completed by a strong study of affiliation links.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Edwards 2011,Clarity of NIH guidelines: data sharing,"? 43% agreed ? 12% disagreed ? 36% did not know ? 9% neutral",Risk of participant reidentification,"? 70% reidentification unlikely to occur ? 
If a study participant were to be reidentified, 76% of respondents thought it was unlikely that the participant would be harmed as a result",Reconsent,"Great diversity of opinions on when it is ethically necessary to obtain reconsent from research participants",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Einbinder 2001,Interaction with data,"Repository created for end-users to interact directly with system - many users prefer to have CDR personnel formulate and submit queries for them",Incremental approach,User interface has gone through several major and minor iterations,Organizational issues,Created for researchers but also used as a nonresearch resource,User survey findings,"? Initial use was best explained by a user's proficiency in pertinent computer applications, familiarity with standard coding conventions, and an understanding of how data are recorded at our institution. ? Compatibility with an individual's work style and skills was strongly associated with satisfaction and continued use of the CDR. ""Ease of use"" was also associated. ? The reason most often cited for not using the CDR was ""not enough time,"" pointing to the time constraints of busy health professionals as an important factor to consider in system design and training. ? The CDR's use of encrypted patient and physician identifiers was viewed as a barrier to usage by 44 percent of respondents. ? Most users were satisfied with the accuracy of data provided by the CDR and believed it would become an important resource at the University of Virginia.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Embi 2013,Theme 1: Data Quality ,"? Data Availability: EHRs were not designed to gather data for research but rather to enable health care processes. Moreover, health care providers are typically neither trained to record data for research purposes nor are they encouraged to do so. These factors often result in missing data. ? Lack of Shared Semantics ? 
Determination of Shared Provenance: A common problem with using electronic records for CER is defining variables that have more than one possible operational definition. ",Theme 2: Data Preparation; Temporal Relationships and Data Synthesis,Lack of time-related data making it difficult to utilise data from EMRs effectively,Theme 3: Sociotechnical Factors ,Regulatory frameworks were obstructive and prevented some participants from contributing data to the study. ,Theme 4: Organizational Factors ,Organizational factors include availability of trained informatics personnel involved in database management as well as institutional support. ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Enke 2012,Willingness to deposit data into a public repository,"? 80% Yes 20% No ? (78%) claimed to already share data generated in their laboratories ? Some are willing to share after the end of a project (25%), immediately (16%) or at retirement. ? In general, responses were similar between researchers from different countries",Main concerns and obstacles to data sharing," ? 53% ""Loss of control"" of data ? 50% Amount of time required ? 41% Spend more than a week preparing data for sharing ? 34% Lack of community-wide standards for managing data sharing ? 40% aware of community-wide standards in their field of research ? 31% Worried someone using their data will draw wrong conclusions ? 27% Legal and confidentiality issues would prevent data sharing ? 60% knew of databases to deposit data ? 40% did not know of data repositories ? 63% of researchers willing to share data state there is a database to do so ? 51% of researchers unwilling knew of a data repository ? 39% have data management plan 10% not sure if they have a data management plan ? 75% of those using a DM plan thought it was a good idea ? 17% use DM plan because of institutional requirement ? 15% use DM because of funding agency ? 13% because of collaborators ? 18% DM plan is extremely important ? 25% DM plan is very important ? 
35% DM plan is important ? 22% somewhat important ? 2% not important at all ? 58% feel they are the owner of the data they produce ? 44% feel the institution owns the data ? 12% Uncertain about the ownership of data ",Motivation for data sharing,"? 72% availability of comparable data sets for comprehensive analysis ? 71% networking with other researchers ? 62% transparency of results ? 48% generation of research data is often publicly funded ? 36% increases one's visibility in the community ? 59% requirements of journals ? 57% requirements of funding agencies ? 58% possibility to cite data sets like publications ? 64% Being recognized or cited for sharing data would better motivate researchers to share their data ? A place to deposit (52%) and funding (49%) to deposit data would encourage researchers to share data ? 71% want to be recognised like a publication in the references; 36% content with mention in acknowledgements; 33% want co-authorship",Technical structures ,"? Data are stored on the researcher's own (70%) or institutional PC (49%) as well as on external hard drives (57%) and USB sticks (38%) ? 38% of the participants state that they store their data on institutional servers. ? Backup of research data is done in nearly equal percentages once a day (23%), once a week (26%) or once a month (23%), 21% backup their data less than once a month. ? Project or institution internal databases are used by 53% of participants ? Conditions that have to be met for researchers to deposit their data in a database include the possibility to edit (67%) and delete (48%) data after uploading ? A user-friendly data upload is required (64%) and the long term maintenance of the database has to be guaranteed (63%). ? Also important are that researchers are contacted if someone else wants to use their data (60%) and that a history of data use is displayed (56%). Guidelines for the reuse of data are important (62%). ? The quality of data has to be secured (55%). ? 
Additional features of an optimised data portal should be the possibility to annotate data and to combine and present different data (e.g. on a certain species) (both 58%) ",Data Reuse,"? More than half (61%) of the overall participants (re)use data generated in other laboratories, 32% do not and 7% are not sure if they do ? Only 40% of those who are unwilling to share data (re)use other researchers' data whereas 63% of those who are willing to share data also (re)use other researchers' data ? The quality of data that researchers would like to reuse is evaluated by the author of the data set (62%) and the quantity of additional information (53%) ? 39% of the participants assume that published data is reliable ? Details on how (91%), by whom (76%), when (66%), where (65%) and why (64%) data were collected are additional information to evaluate the quality of data sets",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Eschenfelder 2014,,"Data access and use controls were highly variable both across and within repositories",,"Amount of restricted data in each repository varied greatly",,"A small subset of our repositories required institutional-level membership for free end user access to data",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Faniel 2010,Number of participants who had reused scientific data,"10/14 had reused data from colleagues ",Assessing whether colleagues' data are relevant,"To assess the relevance of colleagues' data, EE researchers generate criteria related to the problem they want to address. The criteria are based on the EE researchers' domain of expertise and the model they want to validate. ",Assessing whether colleagues' data can be understood ,"Whether EE researchers understand colleagues' data rests on whether they understand how their colleagues conducted the experiment. EE researchers need context information that is more detailed and more subtle than what they use when assessing data's relevance. 
Rather than simply matching their criteria to colleagues' experimental parameters, EE researchers need to have a thorough understanding of the data. ",Assessing the trustworthiness of colleagues' data ,"EE researchers need to ensure that they are measuring data in the same way that their colleagues measured during the experiment. Having context information about the number, type, location, and direction of the sensors that collect the data is critical for ensuring data are reliable. ",Are the data reliable,"? researchers need to ensure that they are measuring data in the same way that their colleagues measured during the experiment ? context information about the number, type, location, and direction of the sensors that collect the data is critical for ensuring data are reliable",Are data valid,"Researchers need to know what problems occurred and how they were resolved",Resources EE researchers use to assess the data reusability,"Researchers learn about potential data for reuse through colleagues' journal articles as well as personal networks",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Faniel 2013,Lack of Context,"The lack of context was a persistent problem encountered during the reuse of archaeological data. At issue were the data collection and recording procedures.","Role of Data Collection Procedures during Data Reuse","? Having access to data collection procedures helped respondents understand and verify the data against the archaeologists' research objectives and interpretations. ? Respondents also relied on archaeologists' presentation of documents created during field work, the reputations of the archaeologists, their scholarly affiliation, and the institutions where the data were housed for additional insight into the data.","Role Additional Context Plays in Data Reuse","Archaeologists record their thoughts and actions in situ - Respondents relied heavily on this kind of information to account for and to link archaeologists' 
actions and interpretations that occurred in the field",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Fear 2012,Using provenance as an indicator of trustworthiness,"researchers expressed satisfaction with what others provided and confidence in their own documentation processes:",Accuracy,"Most important info when evaluating a dataset, most subjects focused on, ? experimental and methodological parameters, such as the instrument used ? the characteristics of the sample used ? the search algorithm or other data processing methods used ",Integrity and authenticity,"In order to indirectly assess data integrity and authenticity, users relied on several heuristics that draw on provenance information, ? media related: published data more credible ? source of data",Using provenance as an indicator of expertise,"Identity of the data producer would play a role in their judgments about the data",Using provenance together with other information,"General approval for the provenance information specified in the MIAPE standard",Connecting data to a publication: users' reliance on the 'archives of science',Critically important to link data with the paper it was first described in,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Fecher 2015,Data donor ,"Comprises factors regarding the individual researcher who is sharing data Factors include: ? Sociodemographic factors (Nationality, age, seniority, career prospects, character traits, research practice) ? Degree of control (Knowledge about data requester, having say in data use, priority rights for publications) ? Resources needed (Time and effort, skills and knowledge, financial resources) ? Returns (Formal recognition, professional exchange, quality improvement)",Research organization ,"Comprises factors concerning the crucial organizational entities for the donating researcher, being the own organization and funding agencies Factors include: ? Data donor's organization (Data sharing policy, organization culture, Data management) ? 
Funding agencies (Funding policy, financial compensation)",Research community ,"Comprises factors regarding the disciplinary data-sharing practices Factors include: ? Data sharing culture ? Standards ? Scientific Value ? Publications",Norms ,"Comprises factors concerning the legal and ethical codes for data sharing Factors include: ? Ethical norms ? Legal norms",Data recipients ,"Comprises factors regarding the third party reuse of shared research data Factors include: ? Adverse Use ? Recipient's Organization",Data infrastructure ,"Comprises factors concerning the technical infrastructure for data sharing Factors include: ? Architecture ? Usability ? Management Software",Sharing analysis scripts and data sets [survey question],"? 120 (25.9%) Willing to share publicly ? 98 (21.1%) Willing to share under access control ? 163 (35.1%) Willing to share only on request ? 83 (17.9%) Not willing",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Federer 2013,Achievements to date,"? Providing basic instruction in best practices for data management for the research team ? assisting the team with preserving its existing digital data. ? collaborating on plans for more substantive projects to enhance the team's data gathering and management work-flows. ? writing a data management plan ? inclusion of the informationist as key personnel in the NSF grant ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Fielding 2008 ,"Among qualitative researchers who have a substantial engagement with computational resources, principal current applications include archiving and database work",Numbers not reported,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Finn 2014,"LEGAL ISSUES - Intellectual Property - Copyright","INTELLECTUAL PROPERTY - Many institutions and organisations are aware of the potential repercussions open access may have for the rights of intellectual property owners. Such intellectual property rights include copyright, trade secrets, database rights and licensing. 
Copyright - Discussions about research undertaken within the RECODE disciplinary case studies highlight problems arising in identifying the true right holders. Copyright may act as a barrier to providing open access to research data. This includes issues around identifying the ?true? copyright holders as well as the retention of rights to gain benefit from their intellectual property though restricting access to the material and/or by trying to gain or preserve their proprietary rights to this material and the benefits such proprietary rights engender.",LEGAL ISSUES - Intellectual Property - Database rights; Licensing,"Database rights - The advent of data storage via cloud computing presented additional difficulties associated with database rights, including how to enforce these rights in a cloud environment. This lack of clarity may represent a significant barrier for researchers and institutions both of which might be reluctant to provide open access to research data without significant safeguards in place. Licensing - In relation to open access to research data, licensing is unique in that it represents both a potential barrier to open access and a potential solution to assist in providing open access to research data. Licensing emerges as a key barrier for organisations, such as the JRC, which purchase data from private data creators or data brokers. In addition to limitations on re-use of the data, these licensing terms also have an economic cost. Thus, licensing arrangements allow private companies to restrict the re-use of their data. It can interfere with some organisations? legal or funding obligations to make their data accessible to the public and available for re-use. Again, this requires organisations that rely on purchased data to navigate conflicting legal regimes in relation to open access to research data. These examples introduce some of the issues central to intellectual property rights that may compromise open access research data. 
All of these issues present limitations on the preservation, dissemination, accessibility and re-use of research data. As such, they have spurred the development of practical solutions to navigate open access to research data. Furthermore, intellectual property rights may conflict with other legal obligations, especially open access mandates that require the provision of open access to research data.",LEGAL ISSUES - Intellectual Property - Data Retention,"Data Retention - There will sometimes be potential tension between expectations or preferences of research participants and the data preservation elements of open access to research data, where obligations to protect researchers or to treat their data ethically may limit the extent to which open access to research data can be realised.",ETHICAL ISSUES - Ethical concerns about open access - Unintended secondary uses and misappropriation,"Unintended secondary uses and misappropriation - The secondary use of data to validate results, address new questions or apply new analytical methods may produce relevant new insights or scientific advances. It may also help to uncover errors or mistakes in research results, which contributes to sound scientific practices. Sometimes, though, data are misinterpreted, taken out of context or used for purposes that the original researchers or research participants did not intend or anticipate. In some cases this might be ? from a researcher?s perspective at least - an undesirable, but nevertheless an acceptable, drawback of publishing research results, be they data or publications. However, in some instances the intended secondary use or misappropriation of research data may cause unacceptable damage or distress to individuals and groups, as well as to research and the scientific enterprise. 
It can harm or wrong research participants or other stakeholders, particularly when results are perceived to be manipulated or distorted or when data are used for purposes that research participants themselves find objectionable. Unintended secondary use can damage identities, reputations and relationships between individuals, and may even endanger research subjects or sites. One concern is that the misinterpretation of publicly available medical health data by patients, for instance, can put these patients at risk. Another concern is that unanticipated or unintended uses may harm the reputation of researchers and the public trust in science or social institutions.",ETHICAL ISSUES - Ethical concerns about open access - Dual Use; Restriction of scientific freedom,"Dual Use - Some data can be used for research that could produce knowledge, products or technologies that benefit society, but could also pose a threat to public health, agriculture, plants, animals, the environment or material. Such dual-use data present an ethical dilemma for data sharing and open access: do the benefits of providing access to research data outweigh the costs? Restriction of scientific freedom - Open access requires that researchers take a particular approach toward the data they collect for a particular research project. In order for data to be locatable, assessable and usable by others, there are different kinds of restrictions upon the choices available to researchers in terms of what they can do and how they must do it. These include, but are not limited to, the attachment of standardised meta-data to the datasets produced or the use of specific technical formats and naming standards.",,"Access Management. Archaeology, physics and clinical data all require some form of professional accreditation or other access management review in order to enable researchers to access data. 
This professional gate-keeping solution allows these disciplines to manage legal and ethical compliance in relation to open access to research data. Specifically, it serves to identify true 'professionals' who will have expertise in research methods or legal requirements such as confidentiality, privacy, data protection and research ethics. This solution ensures that the data is used responsibly and any potential issues associated with misuse are identified and mitigated. It also serves as a mechanism for enforcement, whereby individuals who do not use data responsibly may not be 'approved' a second time.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Flechais 2009,,"? As seen from the case studies, involving stakeholders in the design of security provides a very effective means of identifying their needs. In addition, the presence of stakeholders during security design also provides additional benefits in raising awareness and knowledge of security issues in the system. Finally, understanding stakeholders' capabilities also facilitates the design of appropriate countermeasures, making the final system well suited to its intended users. ? Also, assigning responsibility to individual stakeholders and ensuring their motivation to address the security issues are important factors for a secure system",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Frank 2015,Disciplinary practices,"Archaeologists and zoologists primarily view good data collection as a means of reconstructing the data for analysis. Thus, preservation is viewed largely in terms of reconstruction. 
// Their data management and recordkeeping practices influence their attitudes toward data preservation: researchers in both disciplines view preservation of research data as being important, but are sceptical about the viability of long-term preservation for their data.",Influencing external factors for data preservation,"Funding, legal requirements, and the status of museums and repositories",Data collection,"Archaeologists and zoologists do not explicitly have preservation issues on their minds during data collection. Rather, it is destruction that poses the biggest challenge. Zoologists and archaeologists have responded to this destruction by turning their attention to collecting quality data. Quality is assessed in different ways including capturing multiple data points about a specimen, working from a good research design, and collecting contextual data. Archaeologists and zoologists primarily view good data collection as a means of reconstructing the data for analysis. Thus, preservation is viewed largely in terms of reconstruction. Few of our participants explicitly linked data collection and preservation, and when they did, the association was not positive.",Data management and recordkeeping,"Archaeologists and zoologists learn how to manage the data that they collect through academic advisors who communicate community norms as well as trial and error. Their data management and recordkeeping practices influence their attitudes toward data preservation: researchers in both disciplines view preservation of research data as being important, but are sceptical about the viability of long-term preservation for their data. The costs that researchers associated with data management and recordkeeping, and in particular the higher cost of managing data when also trying to preserve that data, were a common concern among both archaeologists and zoologists. 
Researchers were concerned about the higher costs in terms of time and resources associated with managing large amounts of data. In addition to being costly, managing data for preservation is difficult and takes time away from other activities. This concern about spending time and resources to manage data for preservation only to lose that data was echoed by both archaeologists and zoologists. For both archaeologists and zoologists, preservation was seen to add to the cost of data management and recordkeeping practices. In some cases, these costs became prohibitive and the researcher was put in a position where preserving data was perceived to be detrimental to the actual research. Disciplinary norms guide the formats in which data were collected, and therefore the formats that researchers must be prepared to manage and preserve. For archaeologists working in multidisciplinary teams, this meant collecting data in a format common to the team; for zoologists, this entailed collecting the required information along with the specimen in a way that was easily exported when the specimen was turned over to a museum. Yet, the sheer number of formats used by researchers in these disciplines posed preservation problems. Researchers expressed scepticism that digital data could be preserved long term.",Funding," Researchers who are aware of both the need to preserve their data and the lack of funding that can be allocated toward preservation expressed concern about what will happen to their data in the future. Another way in which this concern about how to fund data preservation efforts was expressed came in the form of respondents discussing concerns about repositories' ability to preserve data. 
This concern is distinct from earlier discussions about how to fund preservation of data in who would care for data rather than how those efforts would be funded or what equipment would be available.",Requirement of data deposit for publishing,"In addition to funding agency mandates for data management, many journals also require data to be deposited in a repository. This influences attitudes around preservation by providing researchers with an incentive to deposit their data. While the requirements do not include mechanisms to ensure or enforce compliance, researchers? behaviour is shifting toward preserving their data in repositories where it will be accessible to more people over the long term. This is particularly true for researchers who want to publish in these journals in the future. While norms for data management in zoology were described by respondents as being focused on depositing specimens into museums, the linking of data deposit to publication also reflected norms about what types of data should be shared and what should be kept private. Researchers generally expressed positive attitudes toward funding agency and publisher requirements about data management and preservation.",Legal requirements,"Legal requirements support preservation for both digital and analogue data, including artefacts and specimens. Both archaeologists and zoologists are required to obtain permits to collect data in the field, and museums and repositories for both disciplines require proof that the proper permits for data collection were obtained before they will accept data, including artefacts and specimens, for deposit. Researchers obtain permits that govern data collection and must then engage in the responsible and ethical data collection practices that the permits require. 
The practices that are required by these permits reflect behaviours that are recognized within the respective communities as being ethical and are required as proof that the ethical norms were upheld when artefacts and specimens were collected in the field. The evidence required to demonstrate compliance with these norms comprises important contextual information that supports preservation. In addition to museums and repositories requiring permits as a condition of accepting data, some also require the researcher to agree to deposit their data into a museum or repository. Overall, the legal requirements that govern how artefacts and specimens are collected and managed contribute to the preservation of research data. They do so by requiring the collection of valuable contextual information that can be used to help make sense of the data over long periods of time, and by ensuring that this information is kept with the artefacts and specimens in repositories and/or museums where researchers are required to deposit their data.",Institutional infrastructure of museums/repositories,"A great deal of data preservation ?preservation of both physical specimens and artefacts as well as digital objects?takes place not at the level of the individual researcher but rather at the level of the museum or repository. Our respondents talked about the ways in which this orientation toward museum- and repository based preservation of research data reflects norms within their respective communities of practice. Zoologists expected that any specimen collected in the field would automatically be placed in a museum. This contextual information that researchers deposit along with their artefacts and specimens helped to ensure that the meaning of the data would be preserved over time. 
Zoologists also discussed the ways in which museums and repositories influenced the preservation of digital data in addition to physical specimens, a much broader spectrum of the available data than for archaeologists. For both of these communities, the institutional infrastructure of museums and repositories affected the ways in which researchers thought about preservation and utilize their options for preserving data. For zoologists, the well-established norms around specimen and data deposit into museums and repositories were reflected in the attitudes about who can access data and how to gain access to data. For archaeologists, the expectation that a researcher would be able to go to a museum to gain access to research data seemed far less certain. Rather than discussing the ways in which museums and repositories provided access to data, archaeologists discussed ways in which museums limited access to data for research in the interest of pursuing goals that focus on data preservation. For both archaeologists and zoologists, the institutional infrastructure of museums and repositories greatly influenced preservation of physical and digital research data. In some cases, museums and repositories furthered preservation goals by aligning them with the goals of access and reuse, and in others, they furthered preservation goals by restricting access and reuse. In both cases, museums and repositories played a role in the preservation of research data. Researchers discussed preservation as something that was someone else?s responsibility. Specifically, that the goal and responsibility of repositories were to preserve data and that once a researcher has deposited her data, it was no longer her responsibility to worry about long-term preservation.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Fry 2009,Dissemination of outputs and by-products,"? 
Most projects were in favour of making the results of the research public (including data) - provision of easy access to project outputs did not always have priority ? The effort to make the data suitable or robust enough to make them into a commonly used resource may be considerable, and thus represent a Catch-22 situation for researchers: a large effort can be made, which may or may not be useful, but if it is not attempted, then it cannot be useful in the first place the highest priority",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Garritano 2009,Skills required by librarians working on e-science projects,"? Library and information science expertise (i.e. development of a DM plan, restrictions on data, reuse of data, conducting a data interview with patrons) ? Subject expertise ? Partnerships and outreach: Internal and external (During a project opportunities may arise to provide data management services, create and approve metadata or technical standards, supply technical equipment or instruments with particular data outputs, or provide similar types of services or products) ? Participating in sponsored research ? Balancing workload",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Goldenberg 2015,Biospecimen sharing with 'outside' institutions,"? Without original researcher involved: 29% required info about proposed study; de-identified biospecimens - 43% require information review; identified specimens - 65% require documentation of outside institution IRB ? With original researcher involved: 55% of the IRBs would require documentation of IRB authorization by the collaborators' external site(s) and 78% would require submission of information about the new study ? 
39% said their IRBs always or usually review the original consent form when biospecimens are coded/de-identified",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Goldman 2015,Data Services and Support,"Two (12%) libraries responded as currently having formal plans in place, seven (44%) are currently in the process of developing a plan, and seven (44%) responded with no plans for creating data management programs. The nine libraries with plans in place or in the development stages described their program or planning process: working with other institutional departments, offering workshops and classes on research data management, and creating subject guides using the LibGuides platform from Springshare. Specifically, one library is piloting a workshop on using the DMPTool for creating data management plans; another is working to implement Mendeley as a tool for managing and sharing research papers; and one library is currently in discussions about how to integrate data management instruction into the graduate school?s curriculum. Librarians were asked to select the data reference and technical services they provide to their users. Eleven (85%) libraries are providing reference support for finding and citing data or data sets; 10 (77%) are discussing research data management with other libraries, people on campus or RDS professionals; nine (69%) are creating LibGuides and finding aids for data, datasets, and repositories; and nine (69%) are providing research data management classes. 
Additionally, some librarians are doing outreach and collaboration with faculty, staff, and students on data management plans and data standards, and are directly participating with researchers as a team member on a project. In terms of processing services, six (86%) libraries work on preparing data and datasets for deposit into a repository (either supported by the library or institution, or a repository service outside of the institution), and five (71%) libraries are identifying data or datasets for repositories, creating or transforming the metadata for these datasets and then ingesting the data or datasets into repositories. Fewer librarians are providing technical support for research data management systems and appraising datasets for curation and preservation. Libraries are providing consultation in many areas related to data management: all 16 (100%) responding libraries provide copyright services, 14 (88%) provide scholarly communication, 13 (81%) provide open access services, and four (25%) provide eScience (team science, networked science or new approaches to scientific research) support. Further service areas that were described by libraries included: digital collections, citation management, data management plan support, consulting for sharing and publishing, research impact, gene set enrichment analysis or data analysis, National Institutes of Health (NIH) requirements, National Science Foundation (NSF) grant writing, and ingestion of data into institutional repositories. To understand the format in which libraries are providing data service topics, the survey asked about libraries' data services workshops and classes. For this survey, workshops were defined as including one or a series of meetings over a short period of time involving instruction on a specific topic or skill, usually involving specific assignments during the course of the workshop session. 
These types of workshops were identified by 12 (83%) of the responding libraries (with five currently running workshops and seven with future plans for workshops), and are strongly geared toward graduate students. Classes or courses were defined as a series of meetings over an extended period of time in which students are taught a particular subject; usually including assignments to be completed outside of the meeting times. Of the eight libraries that currently offer or have plans for these types of classes, five (83%) courses are for professional development, two (33%) offer a for-credit course, and one (17%) library gives a semester for-credit course.",Collaboration and Outreach,"Four (25%) of the responding resource libraries are currently working on collaborations with libraries (branch or main) at the same institution, and three (18%) have plans to collaborate with libraries at other institutions. Thirteen libraries are collaborating with other departments on campus. Nine (69%) libraries are teaming up with different academic departments, and provide and promote services in mainly science departments such as: medicine, nursing, biomedical, biology, genetics, and molecular biology. These nine libraries shared their current, past, and future collaboration projects related to data services: implementing institutional repositories (IRs), working research data management into curriculum and educational materials, starting digital curation projects, providing Information Technology (IT) support in workshops, accessing software and data analysis, and teaming up with the Office of the Vice Provost for Research (VPR) for Responsible Conduct of Research (RCR) training.",Data Positions and Staff Development,"Libraries were asked about their formal staff positions and the type of librarians performing data services. Only three (19%) of the libraries currently have data management-related library staff positions. 
These position titles include metadata archivist, metadata librarian, data librarian, science data management librarian, and repository manager. Because many of these libraries do not have specialized personnel to field data questions, libraries responded that reference, instruction, or subject librarians are performing or providing data services. In addition to their regular tasks, technical services and embedded research librarians may also be called upon to provide data services. Only 13 libraries responded to the question asking how staff develop and acquire the needed skills and knowledge for providing data services. Eleven (85%) libraries identified conferences, 10 (77%) utilize internet-based learning, and eight (62%) provide on-the-job training/learning as the primary preparation for librarians to provide data services. There is less focus on classroom instruction (six, 46%) or attending career development courses (three, 23%).",Digital Data Services,"One section focused on the library's use of online resources when providing or learning about data services. Of the 16 NER libraries participating in the survey, only seven (44%) librarians consult online resources for assistance with inquiries. The most heavily used resources named by these seven libraries are the e-Science Portal for New England Librarians and The New England Collaborative Data Management Curriculum (NECDMC). These librarians are also consulting RDM websites at Minnesota, Purdue, UK Data Archive, MANTRA, DMPTool, and other online resources through California Digital Library (CDL) and Association of Research Libraries (ARL). Libraries were asked about the presence of digital repositories at their institutions. Of the 16 responding libraries, 13 (81%) have a repository at their institution or library.
Of these 13 libraries, 12 (86%) have an institutional repository, four (28%) institutions have research or data repositories, and one (7%) library identified having a disciplinary repository. In terms of what is accepted into these repositories, libraries manage mostly local research outputs. Three (25%) libraries accept data or datasets to their institutional repository, with six (50%) others planning to make this feature available in the future.",New England Region Environmental Scans,"Of the responding libraries, six (38%) have completed an environmental scan. Their major findings included: community need for data storage and preservation; assistance with creating data management plans; being educated about best practices and tools; more outreach and visibility of projects and policies; use of electronic lab notebooks in research labs; the best ways for handling confidential data; and focusing research data management education for graduate students. To address these needs and areas lacking in data education, libraries have created strategic agendas to include providing more instruction, making local and third-party resources available and easily accessible, structuring institutional repositories as sharing platforms, and developing educational and training materials and courses for students.",Challenges of Library Data Services,"The final section of the survey was made up of open-ended questions asking libraries to identify the challenges their institutions have encountered to provide data services, or why their institution does not provide data services. The challenges encountered by the two (12%) libraries currently offering data services included: a lack of data management policies at the institution; difficulty in raising awareness of library services due to other departments on campus also providing data services; and the need for better collaboration between departments and with researchers across campus.
Seven (44%) libraries responding to this survey are currently in the development stages of creating data services and programs. The challenges they identified during the planning process include: confusion surrounding the definition of 'data services'; requiring stakeholder support and time; raising awareness of data management issues; changes in staff and administration; varying patron types with different needs; lack of staff time and expertise; unclear if someone else at the institution is already doing data management; ineffective promotion of the library's role; and researchers who are not willing to share their work or data. There are also seven (44%) NER libraries that are not offering data services and are not currently planning programs to address data management. These institutions identified the need for more personnel in terms of staff size and specialization, and lack of requests or need for data management help. Other major issues cited were the focus of the institution (teaching versus research university, or academic versus hospital library) and decentralized organization adding challenges to working with others at the institution in order to develop data services.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Grace 2015,Lesson learned,"? Leveraging the skills and experience gained from existing in-house repository platforms is likely to be a major factor in the selection of data repository software. ? Outsourcing to a repository hosted service may be a way to access expert guidance. ? Other factors aside, there may be policy reasons not to use a publications repository for datasets, as the constraints on offering open access to data are greater than for publications, making it difficult for a single repository to apply an open access policy consistently. ? Don't necessarily try to solve all issues around the repository before implementation; it is likely that some functionality will need fine-tuning as researchers begin to deposit data. ?
Clearly identify which elements of support for the repository, if any, should be paid for through direct grant recharging.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Green 2015,Most frequently used materials,"100% texts 94% Images 58% Maps 42% Video 39% Audio",Most important needed functionalities for digital collections,"Text: Highly searchable content, download and export functionality, detailed metadata, quality of text, breadth of content Images: Downloading Capabilities, Editing tools, metadata, usability and user-friendly interfaces, searchability Multimedia: Detailed multimedia, Searchability, download capabilities, editing tools",Needs and use of digital collection according to participants [Interview study],"? Better search functionality ? Need for annotation and editing tools ? Improvement in user interface design ? Expanded completeness of collection's content",Digital curation,"Functionalities that enabled scholarly use and reuse of digital materials include, ? detailed metadata ? accessible textual and image files ? temporal coverage of content ? transcriptions ? the inclusion of nontextual sources in collections ? access to broader content",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Grubb 2011,Data and Results Sharing,"? Everyone should have access to research data ? data should be available at no cost ? disagreement about when data should be shared: range of answers from 'as soon as possible', 'after review', 'after publication', 'within reason'",Percentage of experiments replicated and should be replicated,"? 0-19%: 9 responses for % of experiments which are replicated; 4 responses for % of experiments which should be replicated ? 20-39%: 3 responses for % of experiments which are replicated; 1 response for % of experiments which should be replicated ? 40-59%: 2 responses for % of experiments which are replicated; 1 response for % of experiments which should be replicated ?
60-79%: 0 responses for % of experiments which are replicated; 0 responses for % of experiments which should be replicated ? 80-100%: 0 responses for % of experiments which are replicated; 2 responses for % of experiments which should be replicated",Timeframe for data and results availability,"? As soon as possible (4 responses) ? After review (1 response) ? After publication (8 responses) ? Within Reason (1 response)",Classification of scientists in regards to sharing,"1. Those who share their data and results immediately. 2. Those who share their data and results eventually. 3. Those who believe in sharing data and results, but who do not share them due to limiting factors (such as concerns over scooping and/or publisher restrictions). 4. Those who do not believe in sharing data and results (beyond what is included in their published papers). ",Limiting factors,"? Time pressure ? patent pressure ? publication pressure ? publishers restrictions ? infrastructure issues ? scooping",Scooping data,"Respondents experienced, ? non-malicious ? malicious ? copyright infringement",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Guy 2013,Success factors,"A common factor across projects that claim success in reskilling their librarians is that they have considered what distinguishes training for librarians in RDM as opposed to training for other groups: ? Teach about the research process ? Use active learning ? Encourage librarians to see RDM training as career development ? Obtain feedback",Challenges,"? Teaching the right skills to the right people ? Embedding training ? Maturity of RDM (is a relatively new area still in development)",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Haendel 2012,"Uniquely identifying research resources is critical both to enable sharing and to ensure reproducibility of science.","? 85% of the labs visited at Oregon Health & Science University as part of the eagle-i project did not indicate use of a lab inventory system ?
labs that do track resources typically use an informal, often distributed system of spreadsheets or applications ? informal tracking systems often do not contain detailed enough information about a lab's resources that would allow for unique identification and semantic linking to other data.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Hall 2013,Value in sharing research,"Nearly all participants see value in sharing research data. Those who saw little value believed that their data was too narrow to be useful in a different study",Reasons for being hesitant to share data,"Concerns about IRB compliance; protection of research subjects; amount of work required to organize data to make it useful for others; data could be taken out of context and politicized",Senior faculty perceptions,Faculty who already have tenure were more likely to speak favourably about sharing data to increase transparency,Junior faculty perceptions,Expressed concern about being able to publish their own data before someone else could,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Hanauer 2014,Repository data request forms,"? Analysis of research data requests forms revealed considerable heterogeneity in form content, both in the breadth and depth of the topics covered ? most forms over-emphasize the collection of administrative metadata and under-emphasize the collection of important details necessary to communicate a complex data request to a reporting team",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Harris-Pierce 2012,,"? 16 (out of 52) institutions offer courses in data curation (31%) ?
Varied topics found in the available syllabi are an indication of the broad scope of issues that can be covered in a data curation course: 8 Metadata 7 Collection development 7 Hardware and software platforms 6 Data types, standards, lifecycles 5 Digital preservation 4 Case studies 4 Challenges 4 Data quality, discovery, and publishing 4 Project management 4 Provenance 4 Researcher practices, needs and role of data in research 3 Digital scholarship in the humanities 3 Digitization 3 Libraries and Archives 3 Risk Management 3 Role of curators 2 Data analysis 2 Data management plans 2 Records management 1 History 1 Open access 1 Stakeholders 1 Subject knowledge 1 XML",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Hens 2010,Reuse of existing DNA collections and consent,"? Written informed consent was not asked for storage of samples in a diagnostic context ? Centers would not reuse diagnostic samples for research without written consent",Informed consent,"? 7/8 used specific consent for research ? 1/8 had a standard informed consent form ? All centers would agree to a request for withdrawal of a sample from their research and diagnostics collections.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Henty 2008,Provisions & needs related to data management,"Over 90% of respondents reported that their research generates digital data with less than 10% saying that their research does not generate digital data. // About one-third of respondents have less than 1GB of data and a similar proportion between 1GB and 1TB. Less than five per cent, a comparatively small proportion, reported that they have a larger amount of data, over 1TB. // Over 80% of respondents acknowledged that they do not have a formal data management plan. // There was a wide variety of responses to a question about data storage and backup, with most respondents indicating that they use more than one system, and less than 1% saying that they have no system at all in place.
// The overwhelming majority of respondents said that they manage their own data (77.8%). // Over three-fifths of respondents are willing to share their data, whether 'openly' (8.6%), 'via negotiated access' (44.0%), 'only after the formal end of a project' (6.4%) or 'only some years after the end of a project' (2.3%). // Two-fifths of the respondents say that their data is never made available, for unexplained reasons (19.0%) or because of privacy or confidentiality issues (17.6%). About one-quarter of this group indicated that they would be willing to make their data available if 'an easy mechanism' was available to do so. // The majority of respondents access their data as raw data (52.7%), whether using datasets as a whole (40.2%) or in small chunks (35.0%). // Three-quarters of respondents wanted training related to data management planning, either creating a research data management plan at the beginning of a project (52.0%) or after a project has finished (22.4%). Another large group (32.9%) wanted a data 'exit' plan, a topic designed for researchers who might be retiring or leaving the university or completing a postgraduate degree and moving on.
Help with digitisation was also keenly sought by nearly one-third of respondents (30.6%).",University of Queensland Focus Groups - Areas of DM that were of concern,"Vulnerability of existing data; lack of training and support for proper data management; insufficient storage; need to protect the identity of researchers in contentious areas of research; funding needs to be provided if researchers are to be able to make their research data available for others to use and share",Queensland University of Technology Focus group - data management issues,"Ownership of the data; The sharing and publishing of research data; development of guidelines on the authorship of publications if data is shared; incentives and motivations could be developed to encourage sharing",Digital data,"Over 90% of respondents reported that their research generates digital data with less than 10% saying that their research does not generate digital data. It could be seen as surprising that as many as 10% say that they do not generate digital data, as it is hard to imagine in the current environment that there would be any research which does not involve at the very least the digital generation of text. Perhaps what we are seeing here is a perceived divide between data and text, with some not recognising digital text as data.",Non-digital data forms,"The survey asked what kinds of non-digital data were maintained, to get some estimate of what other kinds of research materials are being generated. These might at some future time need to be digitised or otherwise taken care of. The question was, however, flawed, in that it asked for a response only from those who had no digital data. The responses reflected the flaw, and included many indignant comments that research projects tend to generate both digital and nondigital data. Many more responded to this question than had responded to the previous question (that they had no digital data) in order to emphasise the point.
A wide variety of non-digital formats were mentioned in the comments: survey and evaluation forms, laboratory notes, client files, photographs, cardboard, plastic and timber models, drawings, audio tapes, radioactivity data in printed form, jewellery and clothing, rocks and shells, draft manuscripts. Some of these can potentially be digitised; some not. More importantly, some of these (such as survey forms, client files or laboratory notes) could be collected digitally to start off with, removing any need for later digitisation or storage.",Types of digital data,"Spreadsheets and databases are the most common, with two-thirds of respondents having them. Slightly fewer have documents and reports, and just less than one half have data automatically generated from or by computer programs. About forty percent have experimental data and email, with diminishing numbers reporting data collected from sensors or instruments, images, scans or X-rays, fieldwork data, digital audio or video files, web sites, laboratory notes, and blogs or discussion threads. Few researchers generated only one type of data. Other responses included a wide variety of data types: 'bibliographies, biographies and other textual elements,' online surveys, secondary data analysis, questionnaires, bibliographic databases, mathematical models, simulations, interview transcripts, computer programs, satellite imagery, GIS data, CAD models, genotyping and sequencing data, electronic health records, music scores, podcasts, laser scanning imagery, GPS measurements, mind maps, flow cytometry data and spectral data, and 'data in the form of CFD [computational fluid dynamic] codes containing specific models for turbulence, chemistry and the like.'",Size of data collection,"Repository managers and data curators are interested in knowing how large data collections are in order to assess likely storage needs.
Researchers, on the other hand, do not necessarily think in the same terms, unless the data sets are large and have known storage requirements. Table 3 shows that about one quarter of respondents either do not know how large their data is or did not respond to the question. About one-third of respondents have less than 1GB of data and a similar proportion between 1GB and 1TB. Less than five per cent, a comparatively small proportion, reported that they have a larger amount of data, over 1TB. There were many qualifications to these figures in the comments, with estimates provided in terms of the number of CD-ROMs or DVDs held, or the number of pages of text, or the number of video films or segments. Others commented that their collections are growing, or that they have not started collecting yet.",Software used for analysis or manipulation,"The use of different software for data analysis and manipulation can have an impact on data management and curation. The answers to a question about software use demonstrate what a remarkable range of software is in use. Some is proprietary and well known, some is open source and some is being developed in-house for specific purposes.","Software storage and retention ","QUT included a special question: 'how do you store and retain any software used to generate your research data?' The responses, which were all in free text, on occasion showed a degree of puzzlement. Perhaps it had not occurred to some respondents that software storage might be an issue.",Research Data Management Plans,"Research data management would be easier for all concerned if researchers, research units and research organisations all had policies and plans surrounding the creation and management of data. This survey asked whether individual researchers currently have a formal data management plan. Over 80% of respondents acknowledged that they do not have a formal data management plan. This suggests a need for advocacy and training within the universities.
There is currently no formal requirement for researchers in any of the three universities involved in the survey to have a data management plan, although this might change in the future. There is pressure from funders, especially government funders, to ensure that data, once created, is properly managed and stewarded. And there are many who would prefer that the issue of data management is raised at the beginning of the research process rather than later, when it might be too late to prevent data losses and difficulties. An analysis by discipline shows some differences. Current opinion suggests that the science disciplines are more attuned to the need for good data management than those in the humanities and creative arts. In general, the results shown in Table 5 do not show this to be the case, as the largest proportion of those with data management plans are in the Social Sciences (25.5%), and Medicine & Health (21.2%). While the Humanities & Creative Arts appear nearly at the bottom of the list, on only 10.6%, they are only just below those in Science (11.1%), Engineering & Architecture (12.8%), IT (13.0%) and Business & Economics (15.3%). The Humanities & Creative Arts do come in above Law, with only 8.3%, but the figure for Law should be interpreted with caution as the sample is so small.",Data storage and backup,"There was a wide variety of responses to a question about data storage and backup, with most respondents indicating that they use more than one system, and less than 1% saying that they have no system at all in place. The small number who reported that they don't know how their data is backed up (2.6%) at least know that this most basic of data housekeeping is taken care of, even if they don't know who has responsibility. Presumably their backup is provided by their department or the university more broadly, and it is likely that such a central service would be effective.
Whether the other data storage and backup systems mentioned are effective is not apparent. The most frequently mentioned storage and backup systems such as USB/Flash drives (65.2%), CD-ROMs (55.7%) and DVDs (38.8%) may be useful in the short term but are unlikely to have any value over long periods as they deteriorate, can no longer be read or get lost. The Storage Area Network (38.5%), Offsite Storage (22.1%) and Tape Storage (15.2%) would seem to be more reliable, although Offsite Storage can, and did, mean a variety of things. Some of the comments provided more detail about offsite storage, which could mean at home, or emailed to a gmail account or other cyberstore facility, or held by a research partner in another institution. The 11.6% who ticked the 'other' box did not provide a lot of detail about what this might have meant, with one person reporting 'Some simply in boxes in research rooms'.",,,,,,,,,,,,,,,,,,,,,,,,,,,,, Hernandez 2012,Data management,"? 72.3% (95% CI = 6.2) of the students who were still in the process of completing their master's or doctoral research were planning on completing the data life cycle in their research - 65.3% (95% CI = 6.7) of these students intended to archive their research data so that it would be available online ? Of those who had already completed their graduate degree, 63.9% (95% CI = 16.2) stated that they had completed the data life cycle, whereas only 29.3% (95% CI = 13.1) had made it available online",Metadata,"? Almost one-third of the students whose research was in progress did not know what it means to create metadata for their data sets (28.0%, 95% CI = 8.8) ? A similar number (34.7%, 95% CI = 9.3) did not plan to create metadata for their data sets ?
For the students who had finished their research, 25.6% (95% CI = 1.3) created metadata, 63.2% (95% CI = 1.7) did not, but 12.0% (95% CI = 1.3) planned to do so some time in the future",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Herold 2015,Data sharing practices ,"? Seventy-two of 155 (46%) articles indicated that related research data was publicly shared by some method ? The most prevalent method for data sharing was via journal websites, with 91% of data sharing articles using this method ? Ecology, evolution, and behavior scientists shared data at the highest rate (70% of their articles), contrasting with fisheries, wildlife, and conservation biologists (18%), and forest resources (16%)",,"Faculty in EEB (ecology) shared data at the highest rate, with 49 of 70 articles (70%) indicating data sharing. ",,"Sample sizes for some research types were quite small, but results indicated higher prevalence of data sharing for some research types. ",,"? Faculty members with full professor rank shared data for at least one of two reviewed articles at a 50 percent rate (23 of 46) ? Faculty at associate professor rank shared at a 79 percent rate (15 of 19), and those with assistant professor rank shared at a 93 percent rate (13 of 14).",,"The 155 articles reviewed in this study were published in 103 different journals Two or more articles appeared in only 29 (28%) of the 103 journals. The greatest number of articles published in the same journal was five, and this occurred in five journals: Evolution, Northern Journal of Applied Forestry, PloS One, Proceedings of the Royal Society of London Series B-Biological Sciences, and Proceedings of the National Academy of Sciences (PNAS). All articles were published between 1999 and 2014, with 116 (75%) of the articles published in either 2013 or 2014 and only three articles published prior to 2008 (one each in 2005, 2000, and 1999). The 72 data sharing articles were published in 46 journals. 
Thirteen of these journals contained two or more data sharing articles. The journals containing the highest number of data sharing articles were Evolution and PLoS One, with 5 each, followed by Molecular Phylogenetics and Evolution and Proceedings of the Royal Society B, which each contained four data sharing articles. Most (86%) data sharing articles were published in 2013 (23) or 2014 (39). No data sharing articles were published prior to 2010; only 3 prior to 2012. The journal websites containing data sharing articles were a frequent location for hosting some or all of the data that was shared, accounting for 91% of data sharing instances. There were 66 data sharing articles for which some data or all shared data was located on a journal website; forty-four journals hosted data from one or more of these data sharing articles. Evolution (5), Molecular Phylogenetics and Evolution (4), and PLoS One hosted data for the highest number of data sharing articles.",,"Supplemental data hosted on journal websites was most frequently available only to subscribers. Supplemental data from 48 (73%) of the 66 data sharing articles that shared via journal websites were available only to journal subscribers. Data from 18 articles (27%) were available open access, including one where it was explicit that the author paid to make the article and data open access in an otherwise non-open access journal. File format types were identified and recorded for all shared data. In many cases more than one file format was used to share different supplemental data for a single article. The most commonly-used file format for data sharing on journal websites (n=66) was PDF (portable document format) with 39 instances. Second most common were Microsoft Word .doc and .docx formats, with a combined 25 instances. Microsoft Excel was the next most common format, with six instances. Archive (.zip and .gz) files were used 5 times to package groups of files. Plain-text files (.txt) were used 5 times.
A wide variety of other file formats were used to share data, none more than three times. They included: .html (hypertext mark-up language), .jpg (image), .tif (image), .mov (video), EPS (Encapsulated PostScript), .mp3 (audio), .csv (comma-separated values), .r (R statistical software code), .c (C programming code), .nex (NEXUS phylogenetic format), and .tre (tree age decision tree format). Among the thirteen articles with shared data in the Dryad repository, .zip files were used frequently to archive/package groups of files or to compress large files. Plain-text files (.txt) were the most often used format to share text-based data, and comma-separated value (.csv) files the most often used format to share numerical or text tabular data. There were few instances of PDF, Microsoft Excel, or Word formatted files in Dryad. Data were also shared in a variety of plain text file formats for use with specific software packages, such as NEXUS and R.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Higman 2015,RDM Policy,"? Many policies lacked detail and specificity, with only 23 of the 37 having a named owner, 20 stating the aim of the policy and 14 actually defining what they meant by 'research data' ? 8 policies vaguely address how RDM was to be funded, and even fewer explicitly acknowledged the aspirational nature of what was being outlined, a critical issue given most services are still in their infancy ? RDM policies did highlight the importance of research funders, suggesting RDM networks which frequently span institutional boundaries ? Data sharing was the most commonly cited driver of RDM ? Data management planning was the most frequently mentioned stage in the research data lifecycle",The number and proportion of research data management policy documents mentioning key drivers for RDM ,"8 Source of funding for RDM activities 20 Intellectual property rights 31 Security 33 Funders'
requirements 36 Data sharing",The number and proportion of policy documents which mentioned key stages of the research data lifecycle,"37 Data Management Plans 17 Active data management 14 Disposal 36 Preservation",,,,,,,,,,,,,Proportion of access controls for Health repositories,"? Embargo period 100% ? Registration required to access any data for download 100% ? Registration required to access some data for download 100% ? formal application required to access any data for download 100% ? formal application required to access some data for download 100% ?Time limit on access to data 50% ? IP range restrictions 0%",Proportion of access controls for Ecology repositories,"? Embargo period 100% ? Registration required to access any data for download 25% ? Registration required to access some data for download 100% ? formal application required to access any data for download 0% ? formal application required to access some data for download 0% ?Time limit on access to data 25% ? IP range restrictions 50%",Proportion of access controls for Chem/molecular repositories,"? Embargo period 100% ? Registration required to access any data for download 33% ? Registration required to access some data for download 100% ? Formal application required to access any data for download 33% ? Formal application required to access some data for download 33% ?Time limit on access to data 33% ? IP range restrictions 33%",,,,,,,,,,,,,,,,,,,,,,,,,,, Hinnant 2012,Perceptions of ownership of data,"All subjects indicated that the ownership of scientific data generated at the NHMFL rested firmly with the principal investigator of the user team generating the data. ",Data curation perceptions,"They also highlighted how the work orientations of CMP may impact the ability to develop more formalized curation policies, rules and labs at NHMFL. 
One subject commented on how the nature of the work would impact the ability to implement curation ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Hiom 2015,Development of Library RDM services,"? Focus on sustainability planning to ensure that research data management is embedded as a core university service ? ensure that research data management is properly valued and supported within universities",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Hou 2014,Repository profiles,Assessment with 55 criteria provides profiles of the repositories - these are valuable for data curation education,Summary of criteria,"? All repositories concentrated on data in digital formats ? 50%+ used a customized metadata scheme ? Only 5 repositories had preservation policies ? Most repositories primarily acquired data through deposit by members ? Long-term sustainability plan only available for 1 repository",Data formats,"All the repositories concentrated on data in digital formats; however, 18.4% (7) also accepted physical forms, such as microfilms, CD, and hard drives. ",Metadata standards,"While 21.1% (8) of the repositories identified and followed specific metadata standards, such as ISO 19115 and Dublin Core, more than half utilized a customized metadata scheme. ",Preservation Planning,"Only five of the repositories (13.2%) had explicit preservation policies, and only 10.5% (4) provided evidence of complying with an identified certification or accreditation standard (e.g. Information Fair Trade Scheme, ISO 9001, and ISO 14001). ",Data services and support,All but one repository (2.6%) provided information on their services and support functions with 44.7% (17) providing software and tools.,Producers and Consumers,"Most repositories (60.5%; 23) primarily acquired data through deposit by the members of their data communities, and 21.1% (8) of the repositories also received data directly from instruments. 
Although researchers were the main data producers for the repositories, only 10.5% (4) identified researchers as their only consumers. ",Appraisal ,"Although only 15.8% (6) of the repositories had specific data appraisal and selection procedures available, 71.1% (27) indicated preferences for certain data types.",Ingest,"During the ingest process, 31.6% (12) of the repositories addressed issues regarding transfer of rights, but only 7.9% (3) of the repositories specifically stated that their datasets were ?free from copyright.? ",Preservation ,"Nearly half of repositories (47.4%; 18) provided some level of data processing, and importantly, more than half of the repositories (52.6%; 20) used some type of data identification system.",Store,"A long-term sustainability plan was evident for only one repository (2.6%), and only 15.8% (6) of the repositories covered measures for protecting against damages or losses due to environmental factors, such as natural disasters. ",,,,,,,,,,,,,,,,,,,,,,,,,,,,, Hruby 2013,Pre-repository,"Adoption: not possible Workflow efficiency: 1 RA per dataset Publication quantity: 11.5 Publication quality: 1.7 avg impact factor Time to complete a project: 12 months",Post-repository,"Adoption: 5/8 clinical researchers (62.5%); 4 basic science researchers Workflow efficiency: 5+ RA per dataset Publication quantity: 25.6 Publication quality: 3.1 avg impact factor Time to complete a project: <6 Months",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Huang 2012,Ranking of Data Quality Dimensions,"? On average, the participants ranked Accuracy as of the highest importance and Security the lowest",Ranking of Data Quality Skills,"? 
Data-error-detection skill was ranked the highest and the data-quality-cost/benefit skill the lowest",Ranking of data quality dimensions by mean importance,"6.27 Accuracy 6.19 Believability 6.02 Accessibility 5.77 Consistent representation 5.71 Interpretability 5.67 Completeness 5.56 Unbiased 5.56 Understandability 5.44 Ease of Manipulation 5.34 Traceability 5.34 Up-to-date 5.19 Appropriate amount of info 5.14 Relevancy 5.1 Value added 4.98 Reputation 4.82 Concise representation 3.78 Security",Ranking of data quality skills by mean importance,"6.04 Data error detection 5.95 Data mining skills 5.81 DQ measurement 5.73 Data quality implication 5.6 Data Quality dimensions 5.57 Data quality audit 5.54 Statistical techniques 5.51 Data entry improvement 5.5 Software tools 5.38 Organization policies 5.34 Data warehouse setup 5.25 User requirement 5.21 Analytic models 5.2 Change process 5.12 Structured Query Language (SQL) 4.94 Information overload 4.84 Data quality cost/benefit",Ranking of 5 data quality constructs by mean ranking,"6.01 Accuracy 5.80 Accessibility 5.52 Usefulness 5.08 Relevance 4.56 Security",Ranking of 4 data quality-skill constructs by mean ranking,"5.71 data quality literacy skills 5.58 Interpretive Skills 5.46 Technical Skills 5.26 Adaptive skills",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Huang 2012,Attitudes and practices to sharing data,"? Over 90% (91.8%, n = 338) of respondents agreed the sharing of biodiversity data is very important, 7.6% (n = 28) thought it of some importance, and two respondents thought it unimportant ? Over 80% (84.3%, n = 311) of respondents agreed sharing article-related data is a basic responsibility, whereas 11.1% (n = 41) disagreed ? A strong majority of respondents would be willing to share article-related data, but almost two-thirds would prefer not to share before publication.",Experiences of sharing data,"? 85% have always (10.1%, n = 37), often (22.4%, n = 82), or sometimes (52.7%, n = 193) shared article-related data ? 
The most frequent data archiving approach was through files supplementary to articles (51.5%, n = 169), followed by public databases (38.1%, n = 125), websites (25%, n = 82), or personal websites (12.8%, n = 42) ? Many researchers also shared article-related data with colleagues through e-mail.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Huang 2015,Mean Rank of data quality dimensions for end users,"Attribute: ? Accessibility 2.1 ? Accuracy 1.9 ? Appropriate amount of information 2.8 ? Believability 2.7 ? Completeness 2.7 ? Concise representation 4.4 ? Consistent Representation 3.5 ? Ease of Manipulation 3.9 ? Interpretability 3.8 ? Relevance 3.7 ? Reputation 3.3 ? Security 3.8 ? Traceability 3.8 ? Unbiased 3.5 ? Understandability 4.0 ? Up-to-date 3.9 ? Value Added 4.4 ",Mean Rank of data quality dimensions for enduser/curator,"Attribute: ? Accessibility 1.5 ? Accuracy 2.1 ? Appropriate amount of information 2.6 ? Believability 3.6 ? Completeness 3.5 ? Concise representation 5.0 ? Consistent Representation 2.4 ? Ease of Manipulation 2.7 ? Interpretability 3.9 ? Relevance 3.0 ? Reputation 3.0 ? Security 4.0 ? Traceability 3.25 ? Unbiased 1.6 ? Understandability 4.5 ? Up-to-date 4.7 ? Value Added 5 ",Mean Rank of data quality dimensions for Curator,"Attribute: ? Accessibility 1.8 ? Accuracy 1.6 ? Appropriate amount of information 3.3 ? Believability 1.7 ? Completeness 4.0 ? Concise representation 3.0 ? Consistent Representation 3.0 ? Ease of Manipulation 3.5 ? Interpretability 3.5 ? Relevance 2.0 ? Reputation 4.0 ? Security 5.0 ? Traceability 3.9 ? Unbiased 5.0 ? 
Understandability 4.0 ? Up-to-date 3.6 ? Value Added 3.0 ",Difference in mean rankings between End user vs end user/curator vs curator,Chi-square analysis revealed several statistically significant differences (p < 0.05) in the top five DQ dimensions and DQ skills by survey participants who perform different curation roles ,data quality Value to different users,"Curators of genomics data valued quality criteria that can be assessed through direct examination of the data more highly, while end users placed a high value on the quality criteria that can be assessed indirectly, such as believability",Data Quality Skills,"Curators appeared to care more about understanding users' requirements and specific data management skills than end users, while end users valued the skills needed to deal with information overload more highly, namely those needed to identify useful, relevant information from large amounts of data",data quality values differ between disciplines,"Scientists with different curation roles, given common curation tasks with the same skill requirements, prioritized different data quality criteria",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Ioannidis 2009,Publication of data in public repositories,? 16 of 18 studies published data in public repositories ,Reproducibility of studies using shared data,"? 10 of the 16 studies with shared data could not be reproduced ? Inability to reproduce the analyses was mostly due to unavailability of data (no data at all, n = 2; no individual-level reporter-specific data, n = 1; data on a limited set of genes only, n = 1), inability to determine which data corresponded to which analyses (n = 1) or both (n = 1); unavailability of the used algorithm and software in the public domain (n = 1); lack of documentation in preprocessing of data (n = 1); raw cel files not available and reproduction efforts were impossible or gave very different results because crucial analytical choices were not made known (n = 2).",Discrepancies in reproduced results,"? 
Discrepancies were mostly due to incomplete data annotation or specification of data processing and analysis. ? More strict publication rules enforcing public data availability and explicit description of data processing and analysis should be considered","1) The main reason for failure to reproduce was data unavailability, and 2) discrepancies were mostly due to incomplete data annotation or specification of data processing and analysis","We reproduced two analyses in principle and six partially or with some discrepancies; ten could not be reproduced",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Ishida 2014,Benefits and Challenges,"? Benefits of the curriculum: adaptability and flexible framework ? challenges in the pilot: significant amount of time needed to create local content and complement the existing curriculum",Overall,"Pilot of data management training for librarians showed that ? NECDMC (New England Collaborative Data Mgmt Curriculum) is a good, thorough introduction to data management, ? that it was possible to adapt NECDMC to the local and Canadian settings in an effective way.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Jetten 2014,From pilot have learned: ,"? Research data services are best developed by holding in-depth conversations with researchers. ? Research data management is tailor-made and consequently time-consuming for both researchers and supporting staff. ? Different types of research data require different advice and support. This reaffirms the claim that research data management is tailor-made. ? Research data management policies are best made within research institutes as they are aware of the different practices and needs with regard to research data management. ? Big data and long tail data require different infrastructural solutions as existing archives in the Netherlands do not offer solutions for big data (yet) due to their size, only for long tail data.",Challenges that lie ahead: ,"? 
Migrating from pilot to embedded services, both regarding infrastructure (embedding the Donders infrastructure in the Research Institute) and support (embedding the Expertise Centre Research Data in the University Library of the future) ? Migrating from a Donders Infrastructure to a broader Radboud University infrastructure. ? Creating awareness among researchers, which will be stimulated by the research institute policies and protocols in progress as well as by funder requirements with regard to Research Data Management.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Johnston 2010,Where do you store your research data electronically? ,"63% Work Desktop Computer 39% Work Laptop computer 14% Home desktop Computer 24% Home Laptop Computer 40% Department/unit server 22% Collegiate server within 3% Collegiate server outside unit 5% Central facility 14% Central server 5% external server 15% Other external hard drives",Back-up Solutions,"23% Collegiate IT 26% Department IT 14% Central IT 4% External provider 43% Secondary Hard drive 29% CD/DVD or other removable media 9% Other 6% do not back up data",Local point of contact ,"The Local IT point of contact, such as the departmental or collegiate IT admin, is the primary end-user resource for most cyberinfrastructure needs. ","Needs related to data production, access and storage","? Of the 19% of researchers who generate over 1 GB of data per week, only a small fraction generally produce terabytes of data (or 1024 GB) per week. ? Of those who generate over 1 GB of data per week, Biological sciences represent 21% of responses, Physical sciences 15%, Engineering 14%, Health sciences 19%, Interdisciplinary fields 11%, and Social Sciences 15%. Only arts and humanities, each representing 2%, were less evenly distributed. ? Most researchers (64%) access their data every day (Figure 11). 
Daily access is most consistent in the physical sciences (89%) and by many in social sciences (75%), biological, engineering (71%), humanities (75%), and interdisciplinary (62%) fields. Arts (56%) and Health (59%) researchers reported slightly lower daily access to their data. ? 70% of researchers would like to keep their data forever ? Many researchers do not require their data to be securely stored (with 40% of respondents not requiring password authentication and 38% of respondents not needing their data physically secured onsite); therefore, many can take advantage of remote storage options such as data repositories.",Interdisciplinary research involves collaborating and sharing data ,"? Data is shared by 92% of all researchers and primarily with researchers in their own unit or researchers on campus (51%, 18% respectively). ? Only 8% of researchers do not share their data at some level. ? Of those, 52% would share their data, 30% would not and 19% did not know. ? Of those 8% of respondents who are not sharing their data, the arts and humanities disciplines represent the majority of responses with 44% and 50% respectively. ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Johnston 2014,Data Collection/Organization,"? Students used date-based file naming structures, even when they weren't familiar with the concept of a file naming structure ? Students did not consider data security an issue and felt that they had adequate protections in place. ? Back-up of their data was often sporadic or nonexistent ? Students agreed that they had no formal data management instruction but had to rely on their peers, family, and previous experience for direction",Data Processing/Analysis and Results,"? Regardless of format, a process of further manipulation of the data was described ? Students were not receiving all the support they needed in more advanced data analysis",Sharing and Archiving,"? All students had shared data ? 
Graduate students did not see the value in archiving similar data sets together in a subject-based repository structure",Preservation,"Students were unclear who held the responsibility to preserve the data for long term access",Data storage: how and where do you store your data? ,Desktop computer (100%) ,Formal policies: are there formal procedures that you follow in managing your data set?,I don?t know (50%) ,File naming systems: do you use a file naming system to assign names to your data files? ,"? Always (50%) ?by month of date collected? ? Sometimes (25%) ?download files then make a copy and rename them according to when they happened? ? I don't know (25%)",Backup: do you make backup copies of your data? ,"? Always (50%) ?occasionally and sporadic? ? Never (50%) ?sort of by accident I do make backup copies because I have of course these multiple Excel files that have the same data in them.? ",Security: do you take security measures to protect your data? ,"? Always (50%) ?password protection to computer? ? Sometimes (25%) ?authenticated computer and locked lab door? ? Never (25%) ?computer has security and the web site is password protected?",Version control: do you follow a system to identify and track different versions of your data?,"? Always (75%) ?by date, labeled in the filename? ? Never (25%) ",Authenticity: do you provide a means to identify the ?official? or ?authoritative? version of your data? ,"? Always (25%) ?place ?Actual? in the file name to indicate the official version? ? Sometimes (25%) ?email current version to other group members? ? Occasionally (25%) ?just remember it as the one that you work with the most? ? Never (25%) ",Documentation: please describe how this data set is documented and described. ,Varies. ?We write our own descriptions of the data within the Excel files when the data are compiled.?,Reproduction: is there sufficient documentation to reproduce the data? ,? 
Yes (100%) ,,,,,,,,,,,,,,,,,,,,,,,,, Jones 2013,Data management strategy,"? A Research Data Management Strategy and Strategic Plan has been released for the period 2012-2015 ? This puts forward five data management themes, aligning each with other University strategies to show how research data management contributes to the University?s research, education and professional objectives",Data management policy,"? RDM policy and procedures were developed over several years of engagement with researchers. ? Whether policies should pre-date infrastructure is a moot point. Without the infrastructure a policy can be complex to implement, but without the policy it can be hard to leverage investment to develop the infrastructure ? Monash University chose to address both simultaneously, continuing to build the infrastructure and capability needed while developing the policy framework to demonstrate the institution?s commitment to improving RDM.",Guidance and training,"? Webpages have been available since May 2009 ? Training is being addressed in a number of ways at Monash to ensure a range of options are available to meet different audiences and needs ? Two-hour data planning seminars aimed at new postgraduate research students have been run, along with induction and workshop sessions (particularly within faculties and schools), e-Research seminars, workshops and network breakfasts",Research data storage and archiving,"Established in 2006, the Large Research Data Store (LaRDS) is the preferred storage environment for research data at Monash University",RDM platforms,"? Approach of developing data management capability and platforms during the active phase of research ? They have adopted a federated, disciplinary approach to research data management via RDM platforms. ",Metadata,"? Australia aspires to maintain a catalogue of research data in an accessible form, as outlined in the Australian Code for the Responsible Conduct of Research",Data Management Planning,"? 
Monash University guidelines encourage all researchers to undertake data management planning at the start of each research project ? The library developed an initial data plan template in 2007 and trialled this with researchers",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Kansa 2014,Data publishing: Challenges,"Solicitation - Project too ?new? to share publicly Metadata documentation - Incomplete metadata; Crediting data creators in large team projects Review, decoding, and editing - Non-unique primary identifiers; Coded data; Data consistency Linked data annotation - Data annotation Reuse/analysis - Insufficient information for analysis; Poor data modeling practices",Data Editing: Challenges,"? some (two) datasets were not detailed enough to include in the data integration phase of the project (for example, data tables containing summary data rather than record-by-record data) ? two other cases: participants submitted datasets in code, which vastly increased the amount of time we had to spend in preparing datasets for publication",Data Annotation: Challenges,"? Some classifications important to zooarchaeology lacked representation in the EOL or UBERON vocabularies ? in other cases, an existing ontology may have related concepts, but those concepts may map poorly to a specific domain need",Data Interpretation and Reuse,"? The participants in this project had confidence in using the edited and ontology-annotated data for many types of comparative analysis, particularly those forms of analysis less sensitive to sampling biases ? However, certain forms of comparative analysis proved more challenging ? Researchers needed more information about factors that may bias sampling. For example, some datasets in this study contained a large number of molluscs. Researchers needed to know if the absence of molluscs meant that the ancient inhabitants did not exploit marine resources, or that molluscs were simply not recorded in some databases ? Understanding such ?missing data? 
is critical for many forms of reuse and these types of sampling biases need documentation in the project metadata",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Karasti 2006,Science Drivers of the LTER network,"? Site ecosystem Research ? Long-term research ? Global research ? Collaborative research (network of sites and multi/interdisciplinary) ? Publicly funded research (i.e. mandate for open data access)",Challenges of data sharing,"? LTER has open access policies and requires each member site to have primary research data on the internet 2 years after its collection ? LTER scientists are encouraged to prepare data for future use by providing context and documenting with metadata ? One issue scientists face is the time required to prepare data for reuse ? Sites are selected to be part of LTER through a competitive process ? NSF aligned LTER funding with mandated open access policies: primary research data must be available 2 years after collection",Challenges of data stewardship,"? The nature of ecologic data (variability) makes it particularly difficult to describe adequately enough for others to use ? Ecologic datasets are also complex, requiring extensive quality assurance and control before preserving them in a public database ? LTER datasets are also constantly changing, which stresses the need for continuous data management; 'dynamic datasets' accrue annual additions and are subject to various revisions ? More standardized ways of describing data are needed; open access to research data requires more standardized ways of describing data ",Ongoing data managing with extended temporal horizon,"Activities include: ? Recovering legacy datasets ? Attending to ongoing data collection ? Designing for the future",Intensive data description,"Two reasons the LTER requires intensive data description requirements: ? Small and highly diverse ecological data sets themselves are highly variable ? 
The many ways in which data may be used and reused need to be accounted for in the data descriptions Data description is essential however highly time-consuming and labor intensive",LTER data activity characteristics,"Data taking ? Site specific ecological and social science data ? Observational, largely non-reproducible data ? Heterogeneous and complex data (sets) Data preserving ? Dynamic data sets, annually/seasonal updates Data describing ? Multi-site data category building Data using ? Long-term site and network science Data sharing ? Open, public access to data and metadata 2 years after collection Data reusing Appropriate data structure, context and presentation for interoperability All reuses cannot be anticipated",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Kennan 2012,Data types collected,"? Photos (15/15) ? location (13/15) ? ecological vegetation class and habitats (11/15) ? insects/pollinators (10/15) ? birds (10/15) ? plant lists/surveys (9/15) ? seeds (8/15) ? cuttings/specimens (8/15) ? time of day (8/15) ? height or other growth patterns (8/15) ? distortions (6/15) ? fauna (4/15) ? weather (4/16) ? plant behaviour (3/15) ? perfume (2/15).",Data collection methods,"? 15/15 Digital camera ? 8/15 GPS ? maps also considered an important tool ? 11/15 used notes ? 5/15 mental notes",Data Storing,"? Recognition by some participants that not all data collected is worth storing or sharing. ? All participants stored their photos on their personal computers ? Computer back-up: used a variety of methods including memory sticks and external discs, many recognising the precarious nature of the data should a problem occur with their computer. ? Another popular method of storing data was in hand-written note form (often in folders or files), in notebooks, index, or through book annotations (10/15), while others (6/10) saw their memories as storage spaces.",Connecting and sharing: Gaining and providing access to data and information ,"? 
Thus participants reported their own difficulties in gaining access to data and information in which they are interested, even where they had contributed data and information. ? Participants were reluctant to share through formal processes, and they identified reasons such as lack of reciprocal access once they had contributed, lack of acknowledgement and attribution, the difficulty of preparing data for deposit in databases and repositories which may not be constructed with them or their data, values and needs in mind, as well as the time it takes to prepare data for deposit. ? They were also concerned over quality control.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Kennan 2014,Constraints on RDM service development,"? Staff need additional knowledge/skills 76.5% ? Staff need additional confidence 63.6% ? Differing levels of demand 44.7% ? Differing specialist needs 49.1% ? Not a priority 30.3% ? Not perceived by others as a library role 37.9% comments - resourcing (capacity) issues were frequently mentioned as a major constraining factor ",Current staff education and training for RDM,"? Learn on-the-job 80.6% ? Are self-trained 64.3% ? Within the library: in-service training or seminars 54.1% ? Library-funded external professional development 56.1% ? Prior to joining the staff: part of their LIS or other education 28.6% comments - 'other' responses for RDM training and education in Australia mentioned training through the Australian National Data Service (ANDS), specialized research projects involving an in-built training component",Need for RDM education,"As preparatory education Yes ? core curriculum 39.3% Yes ? as elective unit 58.5% No 2.2% As continuing professional development Yes ? as external training 70.1% Yes ? 
as in-house training 29.1% No 0.7% ",Additional knowledge required for RDM services,"Data curation skills 90.2% Technical and ICT skills 78.9% Knowledge of research processes 79.7% Knowledge of research methods 67.5% Subject and/or disciplinary knowledge 43.1% comments - importance of a knowledge of policies (e.g. copyright, open data, ethics), of broader aspects of RDM, and of very specific skills (e.g. metadata, minting of DOIs for research data, and the ability to conduct research data interviews) ",Demand for research support services not currently offered,"? Greatest area of demand was RDM 19% ? Many services were listed by three libraries as having a growing demand: data storage, metadata creation for data discoverability, additional IR services such as statistics and reporting, data preservation, and copyright and intellectual property assistance",Planning for new research support services,"37% planning to offer new research data services (variously reported as data management, preservation, storage, curation, archiving and deposit).",Are these new research support services ongoing?,"? 69% of respondents indicated that these new services are part of their ongoing services with ongoing funding ? 13% of respondents indicated that while these services are currently project funded, they are expecting that after the projects are complete, the services will continue, and only 13% stated that once the projects are complete, they will be unfunded.",Summary,"Findings of this study suggest that LIS educators should take cognizance of the ongoing changes and prevailing trends in the profession with special reference to RDM",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Kerby 2015,Data reuse,"? Over two-thirds of the articles gave no indication that data was reused in the study ? However, in nearly 25% of the articles the authors did acknowledge the receipt and use of data from a colleague ? 
Many authors did this more formally in an acknowledgements section at the end of the article, but some included the acknowledgement in the methods section when explaining how data was generated or collected for the study ? In some cases, the 'data' was a biological substance such as tissue or cell cultures, which obviously needs to be handled much differently than a spreadsheet full of numbers or clinical records ? A few articles noted that the authors reused data from a previous study, for example, where the data collected during one study (or even chain of experiments) was analyzed in different ways to produce information for multiple articles ? Several articles indicated that genetic sequence data from GenBank was reused in the study. ",Data sharing,"? Across all articles, there was very little indication that the authors were publicly sharing their data ? Only 20% stated that data produced in the study was openly available, either through a public repository or on the publisher's website as a supplementary file ? These were the only two methods of sharing mentioned ? The only repository used appeared to be GenBank.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Kervin 2012,Organizational Macro factors in the Lab,? The two coercive macro factors that strongly impacted researchers' data practices in these interviews were publication requirements and funding source mandates.,Individual Micro factors in the lab,"? Both senior and junior researchers described how senior mentors taught junior researchers the normative values surrounding scientific data. ? According to the researchers interviewed in this study, effective data management included maintaining a lab notebook, organizing digital data, and communicating that data through either publication or commercialization.",Sharing,"? 
Widespread data sharing only occurred when external macro pressures were strongly aligned with multiple internal micro pressures ? When data sharing did occur, it was a side effect of publication, and not necessarily a result of idealized scientific norms",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Kervin 2013,Common errors in data organization and metadata completeness,"There were seven overarching error categories - These categories represent errors researchers regularly make at each stage of the Data Life Cycle: ? Collection & Organization ? Assure ? Description ? Preserve ? Discover ? Integrate ? Analyze/Visualize. Collection & Organization and Description errors were some of the most common errors, both of which occurred in over 90% of the papers ? The most common Data Life Cycle Element errors were Description errors (51; 96.2%), and data papers contained an average of 9.3 Description errors ? Many such errors (83.0%) were simple editing errors, including grammatical errors (44; 58.5%) that ranged from awkward sentence structure or wordiness, to simple mistakes that an automatic grammar check would catch, such as missing spaces after a period",Common errors in metadata completeness,"? The most common errors occurred when the researcher did not provide adequate metadata to enable others to interpret and potentially re-use the data ? Errors in descriptive metadata were also very common (39; 73.6%) and many researchers (24; 45.3%) had a tendency to use either vague terms, such as 'moderate' or 'extreme,' or field jargon, such as 'degree of fragmentation,' without clearly defining those terms",,"? Reviewers identified an average of 20.3 errors per data paper ? The numbers of errors identified by reviewers varied yearly and there were no consistent long-term trends with respect to the overall number of errors ? Through all years, the most common errors identified by reviewers were Collection & Organization and Description errors ? 
Reviewers also consistently identified Assure errors each year, although these were significantly less common than Collection & Organization or Description errors. • Most data papers (49 out of 53; 92.5%) had errors associated with Collection and Organization • On average, each data paper had 7.8 Collection and Organization errors • The most common Collection and Organization errors were in the description of collection methods (38; 71.7%); not adequately describing the data collection site or time frame (29; 54.7%); and omitting relevant variables that were important for future analysis of the data set (43.4%) • Nearly half (26; 49.1%) of the papers had an error in the description of the data collection protocol, including errors of omission, such as neglecting to explain how long samples, such as water or soil samples, were stored before analysis • Errors in the description of the data collection site included not describing how the site was determined or subdivided, including whether critical points of plots, such as edges or center points, were clearly marked • Over half of the papers analyzed (28; 52.8%) had errors in the description of Quality Assurance/Quality Control (QA/QC) procedures, with an average of 1.2 errors per paper. Nearly one-third of the papers (32.1%) did not adequately describe their QA/QC procedures. Errors ranged from neglecting to provide basic statistics regarding the data, such as ranges or mean values, to incomplete descriptions of logical consistency checks or benchmarks used to verify the accuracy of the data.",,"Reviewers of data papers noted errors related to the long-term preservation and storage of the submitted data sets in about one in five papers (12; 22.6%). Authors might not provide details regarding the maintenance of the data set, in cases of data sets archived over extended periods. 
Over half of the papers (28; 52.8%) had Discover errors that would affect the ability to discover a particular data set and to assess the data set's utility. Despite the large number of papers with Discover errors, the average number of Discover errors per paper was much lower, with only 1.2 errors per paper. The most common Discover errors were insufficient description of access or use constraints (7; 13.2%); insufficient description of the data set's contributions and limitations (18; 34.0%); and not including information that would make finding the data set easier for potential data reusers (11; 20.8%). Of this last category, 17.0% were a result of authors not including all relevant information in the abstract such as not including the years of data collection or not summarizing the data collection methods. About six percent (3; 5.7%) of these papers had errors in the integration of data sets. Each data paper that had an error of this type failed to properly cite the sources of data that went into the integrated data set. For example, one data paper provided climatic data to supplement the data collected, but neglected to acknowledge the source of the climatic data. Another data paper did not use the most current version of the referenced data source.",,"While most data papers presented raw data sets, numerous papers included some analysis of the data. Seventeen percent (9) of the data papers analyzed had some type of error in the presentation of the analysis or visualization results. These errors included neglecting to include statistical significance of the analysis results, not including all relevant variables, and not explaining how the data changed during the analysis process. 
Of the nine papers that had Analyze/Visualize errors, seven authors did not sufficiently describe their analysis methods, such as not documenting formulas used to create new variables or data sets.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Killeen 2012,Data analysis,"Research data analysis is often far from a well-structured linear process. Therefore, one has to sensibly target appropriate processes to codify in quasi-static workflows.",Data descriptors,Selection of input data can be problematic. This is because the descriptors (e.g. name) for the data may not be standardized (or repeat acquisition name changes confound automatic selection).,Summary,"""Demonstrated that the framework functions satisfactorily, although there are some components still to be implemented and refined. This work demonstrates that we will be able to substantially automate the processing of data held in DaRIS repositories""",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Kim 2015,"Social scientists' data sharing behaviors are mainly driven by personal motivations (i.e., perceived career benefit and risk, perceived effort, and attitude toward data sharing) and perceived normative pressure.",,"Funding agencies' pressure, journals' pressure, and availability of data repositories were not found to be significant factors in influencing social scientists' data sharing.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Kim 2011,Dimensions of work: data,"Six major duties with respect to “data,” including • collecting primary data (cleaning and checking data, collecting original data, and understanding data needs), • collecting secondary data (such as previous literature or public/commercial data sets), • storing data (creating databases, managing metadata, and storing data), • managing data (cleaning, annotating, managing, maintaining, and future planning), • analyzing data (statistical analysis, processing scripts), and • 
presenting data (helping researchers to access data, posting data for wide access, dealing with data ownership, and writing about data). ",Dimensions of work: people,"Major duties for the eScience professionals in terms of working with “people”: • locating collaboration opportunities, • communicating with others, • enabling collaborations and organizing teams, • analyzing researchers' technology needs, • coordinating between researchers and information technology experts (e.g., with technology requirements and specifications), ensuring compliance, and • training researchers and others in using technologies. ",Dimensions of work: things,"Major duties that mainly pertained to the use of “things” - primarily computers and software: • investigating technology solutions, • recommending technology solutions (by comparing technologies), • implementing IT for researchers (installing operating systems, installing software applications, managing collaborative technologies, and configuring systems by using scripting), • maintaining and managing the technologies (administering systems, maintaining tools/technologies, and facilitating IT usage), • preparing, compiling, and managing documents, and • managing budgets and project processes.",Dimensions of work: overall,"We identified eight different skills: • administrative skills, • communication skills, • database management skills, • programming and scripting skills, • project management skills, • research skills, • system administration skills and • general computer skills",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Kim 2012,Sharing practices,"• Most of the interviewees felt that they have limited individual authority to share their data by acknowledging that sometimes they need to seek permission from others for any collaboratively collected data. Only two interviewees (one post-doc and one doctoral candidate) felt they had no authority over sharing the data they collected • 
Most researchers reported internal data sharing within their research teams or among collaborators; they usually used email, FTP servers, and websites as the major internal data sharing methods • Researchers asserted that they share their data upon request; they use email or website upload as the method of fulfilling such requests • Researchers also reported contacting other researchers individually to gain access to their data sets from published articles. • Across different disciplines, this data sharing method was common, and it was the only data sharing method in the disciplines that do not have any informal or formal data repositories • Researchers in certain disciplines such as chemistry - where there are small, but highly structured data sets - share their data as an electronic supplement through the journals' websites.",Motivating Factors,"• The single most significant motivation for scientists' data sharing (giving) is a push by funding agencies to make data from funded projects available • In many disciplines, data sharing is considered part of the professional responsibility; researchers believe that data sharing is one of their missions, and that it will help the development of their research disciplines. • In these same disciplines, researchers reported that they are expected to share their data; they feel pressure from their colleagues to do so. • Researchers reported observing what other researchers do, and they indicated that they tried to follow colleagues' practices that they saw as useful. • A few researchers reported a belief that the research performance of other researchers who use the shared data would improve • Some researchers reported a belief that data sharing could highlight the quality of their work in research. For some, data sharing provided professional “credit” including coauthorship, citation, and acknowledgement, and reputation • 
In terms of using the shared data, researchers also believed that data sharing would improve their research (e.g. time saving in collecting the same data, replicating data for other research, conducting diverse comparison studies and large-scale research) • Many researchers worried about losing publication opportunities by sharing their data. It took a lot of time and effort to collect data, and they desired having as many publications as possible from their data. • These researchers also worried about getting scooped on innovative findings when they shared their data with other researchers • Altruism emerged in about half of the interviews as a factor influencing researchers' data sharing.","Factors Influencing Data Sharing - Institutional Factors","• Pressures by funding agencies, journal publishers, and private funding organizations influenced researchers' data sharing practice. • The single most significant motivation for scientists' data sharing (giving) is a push by funding agencies to make data from funded projects available. • Scientific funding agencies in the U.S. including NSF and National Institutes of Health (NIH) require their awardees to share the research data from projects they fund. Second, journals' requirement of data sharing is another factor • The journals in biology, chemistry, and some in ecology require their researchers to publish their data in any type of data repository • Private and certain government funding agencies restrict researchers' data sharing - For example, some pharmaceutical companies and military agencies typically do not allow their awardees to share their data. • Disciplinary influences also affected researchers' data sharing. In many disciplines, data sharing is considered part of the professional responsibility; researchers believe that data sharing is one of their missions, and that it will help the development of their research disciplines • 
In these same disciplines, researchers reported that they are expected to share their data; they feel pressure from their colleagues to do so • Researchers reported observing what other researchers do, and they indicated that they tried to follow colleagues' practices that they saw as useful • A few researchers reported a belief that the research performance of other researchers who use the shared data would improve.","Factors Influencing Data Sharing - Individual Motivation Factors","• Researchers also gave evidence that they carefully examined pros and cons of data sharing before they committed to sharing data • First of all, some researchers reported a belief that data sharing could highlight the quality of their work in research • For some, data sharing provided professional “credit” including coauthorship, citation, and acknowledgement, and reputation. • In terms of using the shared data, researchers also believed that data sharing would improve their research (e.g. time saving in collecting the same data, replicating data for another research, conducting diverse comparison studies and large scale research) • In contrast, researchers also believed that data sharing imposes costs for them. In some scientific disciplines (e.g. ecology and environmental engineering) researchers saw the importance of data sharing, but they saw data sharing as very costly in time and effort • Due to a lack of established metadata standards and data preparation procedures, they saw the processes of organizing and annotating their data as very expensive • These same researchers also reported technical problems in the data sharing such as data compatibility and interoperability issues • This was a similar finding across each discipline that did not have well-established data sharing standards (metadata), procedures, and repositories • Researchers in those disciplines also reported that it took substantial time to locate and understand other researchers' 
data since the data do not have any established data repositories and standardized metadata • Certain perceived risks by researchers also prevented them from sharing their data with other researchers • Many researchers worried about losing publication opportunities by sharing their data • It took a lot of time and effort to collect data, and they desired having as many publications as possible from their data • These researchers also worried about getting scooped on innovative findings when they shared their data with other researchers • Several researchers considered that misinterpretation and heightened scrutiny of their data would be possible risks if they shared their data.","Factors Influencing Data Sharing - Perceived Controllability: IT Capability Factors","• IT capabilities were found to be important factors influencing researchers' data sharing practice. • We focused our questioning on two distinct areas: an individual's self-perceived capability to work with the relevant IT tools, including local support (internal capability), and the availability of appropriate community tools and infrastructure (external capability). Internal capability included researchers' own expertise in information and technology management in sharing their data, and also included any information management and/or IT support from within their own research team or host organization. • Researchers with strong expertise and internal support in these areas also reported more extensive data sharing and reuse • External IT capability referred to supports for researchers to share their data provided by the research community at large • In this area, researchers reported data repositories, data standards (i.e., metadata standards), and established data sharing procedures as key features. • Biologists and chemists reported that they could easily share their data because they have well developed data repositories, standards, and procedures to share their data with other researchers. • 
Researchers in engineering fields generally did not report any central or domain data repositories • These engineers also reported needing to spend a lot of time to annotate, organize, upload, and manage their data on subject-specific or ad hoc data repositories. • Researchers in ecology reported that they are aware of the importance of data repositories and standards and they have developed domain-specific repositories and subject-specific repositories • Since their data were unstructured, however, they reported that they still needed to develop better metadata standards and data sharing procedures.","Factors Influencing Data Sharing - Altruism","• Unexpectedly, altruism emerged in about half of the interviews as a factor influencing researchers' data sharing • Some researchers reported a strong desire to help their colleagues to save time in collecting data and to avoid replicating experiments unnecessarily • Additionally, these researchers believed that their colleagues could exploit the data in ways that would extend the original findings and thereby benefit the scientific area where they collectively worked • These researchers reported a sense of personal satisfaction coming from sharing their data • A couple of our interviewees mentioned the importance of data sharing across disciplines, not only within a discipline. ",Changes in Data Sharing,"• Interviewees reported that during recent years they had observed changes in their data sharing practices • Many of our interviewees reported researchers' awareness, funding agencies' push, journals' requirements, technological improvements, and increased availability of data repositories as changes they had experienced within recent memory • Just a few mentioned the emergence of data sharing standards as another recent change.",Supports Needed for Data Sharing,"• Ten of our 25 interviewees mentioned they do not need any supports since they are satisfied with their current data sharing practices • 
One biologist and one chemist said that they can easily share their data because they have well-established metadata standards, data sharing procedures, and data repositories • The remainder of our interviewees mentioned that metadata standards and data repositories are the main concerns of their current data sharing practice • Two researchers mentioned that they desired a data portal site where they could search available data sets • Several interviewees indicated that they needed better technology support. In particular, they reported that they needed professionals who could manage data sets, databases, storage, and other IT infrastructure.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Kim 2016,"Pressure by funding agencies would positively influence a scientist's norm of data sharing.",Supported; p < 0.001,Pressure by funding agencies would positively influence a scientist's data-sharing behaviour.,Not supported; p? 0.05,Pressure by journals would positively influence a scientist's norm of data sharing.,Supported; p < 0.001,Pressure by journals would positively influence a scientist's data-sharing behaviour,Supported; p < 0.001,Availability of data repository would positively influence a scientist's norm of data sharing,Supported; p < 0.001,Availability of data repository would positively influence a scientist's data-sharing behaviour.,Supported; p < 0.001,Availability of metadata would positively influence a scientist's norm of data sharing.,Supported; p < 0.001,Availability of metadata would positively influence a scientist's data-sharing behaviour.,Not supported; p < 0.05,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Kim 2016b,"Regulative pressure by journals, normative pressure at a discipline level, and perceived career benefit and scholarly altruism at an individual level had significant positive relationships with data-sharing behaviors.",,Perceived effort had a significant negative relationship,,"Regulative pressure by funding agencies and the availability of data repositories at 
a discipline level and perceived career risk at an individual level were not found to have any significant relationships with data-sharing behaviors.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Kirlew 2011,Annual usage levels for most repositories have increased.,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Knight 2012,Information management practices,"• Both research and business staff recognized the value of storing data in a form that would allow it to be easily accessed and used by others. • However, their perception of data usage differed: business departments focused primarily on access to information for validation purposes or to fulfill business needs, whereas the research groups had developed their thinking to consider the potential for access and reuse of their material to perform other, currently unrecognized types of research • Research groups performed appraisal to verify the accuracy of the content and made changes as necessary to ensure it was up-to-date, perceiving that their digital information had long-term research value and should not be removed",Risk factors limiting access & use,"Many departments/groups failed to monitor the integrity of their digital assets (e.g., creation and validation of MD5, SHA-1, or SHA-2 checksums) and, as a result, were unable to identify early signs of media failure or determine if and when a data change or loss event has occurred • Custom approaches to use of data formats, naming conventions, and structuring • If unaddressed, this may cause confusion with regard to the master copy of a data asset, unnecessary use of disk space through storage of duplicate content, and unnecessary expenditure of staff time to locate information. • Poorly understood retention criteria • As a result, data may be stored “just in case it is needed” 
or removed during the period that it continues to have use, potentially resulting in the breach of legal obligations and limitations.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Knight 2015,"The primary challenge that a small-scale RDM Service must address is how it will support the needs of a large body of academic researchers, while also introducing improvements in practice, in a sustainable and resource-efficient manner.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Kowalczyk 2011,"3 levels of uniqueness for this data: 1) No other holdings exist - therefore preservation-worthy 2) Slides have specific treatment for a specific research question 3) Quantity and quality of the data: the level of uniformity and integration of the data, the breadth of data, longitudinal nature of the data, or the added value of metadata",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Kratz 2015,"Researcher expectations of data publication center on availability, generally through an open database or repository.","• 68% (n = 166) expect a published dataset to be openly available • 54% (n = 133) expect it to be in a repository or database. ","Few respondents expected published data to be peer-reviewed, but peer-reviewed data enjoyed much greater trust and prestige","• 29% (n = 70) expected published data to have been peer-reviewed • 72% (n = 175) said it conferred high or complete confidence • 2% (n = 4) would feel little or no confidence",What do “data publication” and “data peer review” actually mean to researchers? ,"The most prevalent expectations relate to access: • 68% (n = 166) expect a published dataset to be openly available • 54% (n = 133) expect it to be in a repository or database • More researchers expected a published dataset to be accompanied by a traditional publication (43%, n = 105) than by a data paper (22%, n = 55). • 
Only a minority of 29% (n = 70) expected published data to have been peer-reviewed.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Kruse 2014,Data storage,"Apart from a few exceptions, Danish universities, as a rule, provide researchers only with general access to the university computer drives, and not with special facilities for data storage.",Data preservation,"This activity is, in general, considered too resource-demanding for the universities to undertake, and with a 10+ years' time scale, it is not regarded as a natural task for the universities. ",Data archives,"The universities favour a national solution consisting of several data archives as a supplement to, and an expansion of, the existing archives.",Researchers' awareness and research data as university branding,The majority of universities are already carrying out campaigns or other activities - or are planning to do so - with the aim of increasing the researchers' awareness of the potentials of preservation and accessibility of research data,Perspectives,"A general result of the survey is that activities in the field of research data storage, preservation, archiving and sharing vary from university to university.","Same play, new actors - an update ","Since the DEFF project was completed, several different institutions in Denmark have been active in developing infrastructures for research data management.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Kuchinke 2010,"Quality management systems for data management are in place in most centres/units ","• 90% of centres have a Clinical Data Management System in routine use • 50% are commercial systems",,"Most widely used functionalities in ECRIN data centres/units are: • data collection (94% of centres using CDMS routinely) • query management (89%) • reporting (74%) • 
double data entry, safety management and study management are supported in approximately 50% of centres with CDMS in routine use.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Kutay 2014,Preservation activity,"Digitally scanning analog documents is the most popular preservation activity performed on analog documents (18.8%) ",Library partnership,Low percentage of affirmative responses (7.7% of primary research collectors; 9.7% of primary research creators),Familiarity with digital repositories at academic libraries,"• 62.5% stated they were “not familiar” • 30.4% stated they were “somewhat familiar” • 7.1% claimed being “very familiar”.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Lage 2011,Most researchers who participated in this study identified their research data as non-public (20 out of 26),"• Some researchers share their data within their lab or with other collaborators, while others do not share their data at all • Many identified themselves as “gatekeepers,” responding that their data is not public, but they would share their data with another researcher if they considered it appropriate.","While digital storage space is not an issue, server maintenance and management are on-going problems.",,"Many researchers had curation plans in place for much of their data, but also had subsets of orphan data.","Orphan data: • fell outside the main scope of their research and had no curation or maintenance plan • Many researchers realized they did need assistance with this data • Few interviewees had departmental procedures for data preservation • Some researchers participated in disciplinary-based repositories that supported long-term storage of their data while others had regular data back-up procedures for their lab. ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Laney 2015,Management of sensor data,"• 68% (52 out of 77) of respondents store their data in individual lab archives, including local hard drives, cloud storage, or lab notebooks • 
73% (60 out of 82) of respondents use external data that they find online • 57% (47 out of 82) host their own website",Metadata standards,"• Few groups use community-developed metadata standard formats • < 12% (7 out of 60) annotate their data with machine-readable tags",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Lemke 2010,Participants have a concern about sharing,"Approximately half of the public (46%) and NUgene (56%) participants indicated that they were somewhat or very concerned about confidentiality and privacy of medical information","Questions and Concerns about Sharing Genetic Research Data and Need for Transparency in the NIH GWAS Data Sharing Policy","In both focus group types, there were varying views on whether or not genetic research data should be shared with other investigators, and participants discussed what they would require in order to feel comfortable having their data shared.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Lemke 2011,Importance of IRB Guidance,"• 77.2% indicated that such guidance is very or somewhat important for developing a data repository or biobank that includes genetic data • 70% said guidance is very important or somewhat important for researchers who plan to use large-scale data repositories containing genetic information • 82.6% indicated that it is very or somewhat important for those who wish to share genetic research data with other investigators",Risk of potential for individual participants' identities to become known,"• Half of survey respondents considered it somewhat or very unlikely that such identification would occur • If research participants were to be identified individually: • 34.8% of respondents thought that harm would somewhat or very likely result • 45.1% considered this unlikely",Clarity of NIH guidelines for data sharing,"• 36.3% of participants agreed that the NIH guidelines for sharing of data from GWAS are clear • 
16.7% disagreed",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Longstaff 2015,Strengths (neurodevelopmental disorders repositories),"Data access requirements and procedures, and protections for confidentiality were significantly addressed",Gaps (neurodevelopmental disorders repositories),"• Special considerations for minors (absent from 63%) • Controls to check if data and tissues are being submitted (absent from 81%) • Disaster recovery plans (absent from 81%) • Discussions of incidental findings (absent from 88%) ",,"Need to improve the consistency, depth and accessibility of governance and policies on which these collaborations can lean, specifically for vulnerable young populations",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Lucas 2009,,"The importance that the panel members assigned to the various quality dimensions: • The most important are accuracy, coherence and relevance • Interpretability and timeliness come next; • The least important are completeness, accessibility and amount of data and, ultimately, the access security.",,"Final Results: Data quality categories and their data quality dimensions: • Intrinsic: Error-of-observation, Coherence • Accessibility: Accessibility, Access Security, Preservation • Contextual: Relevancy, Timeliness, Completeness, Appropriate Amount of Data • Representational: Interpretability",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Luo 2010,Challenges in data management,"• Scattered clinical information, • lack of awareness of services, and • security issues.",Recommendations,"• Institutionalizing - university needs to establish policies regarding archiving data • organizing work - exchange experiences and work together on solutions to similar problems • enacting technologies - build systems to facilitate data sharing - ""should include metadata that describes current literature, researchers' current work, data sources of various research projects, the pilot work, or permanence of reanalysis section. 
Such a data system should also include information about what instruments are available for researchers to use if they want to know a particular outcome or particular construct, what studies have used those instruments, what are the summary statistics of using those instruments for the relevant population and etc."" ",Institutions should create data archiving policies and build incentive structures to encourage data archiving/sharing,Challenging to find patient data which is scattered between systems,Data security is important but there is a lack of clarity on how to achieve it,Interview quotes used to support,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Luzi 2013,"CNR researchers in the field of environmental sciences tend to work in collaboration, often involved in multidisciplinary projects within the same institutes and with external organisations","• Majority of researchers work in a medium size (47.7%) group • When asked how often and on which occasions they collaborate with multidisciplinary groups, 42% reported that they always do so in international projects and with colleagues of the same Institute (36.9%). Working in multidisciplinary groups occurs sometimes with other CNR institutes (56.2%) and with other Italian institutions and/or Universities (61.6%).","Environmental scientists mainly carry out experimental research","• 
Majority of CNR researchers (77%) carry out experimental research that generally implies the collection as well as an intensive use of data",Data collection is often associated with descriptive metadata that represent a pre-requisite for data reusability and interpretation as well as for preservation,"52% provide metadata related to the date of collection, information on location, type of code used and instrument setting; 5% associate data with additional information on the author, software, code of acquisition; 30.6% associate both types of the above-mentioned metadata",Use of standards is mixed,"39.6% reported that their community of reference doesn't use standards; 34.4% of them don't know about the use of standards in their research field; 26% use standards (mentioned ones include: INSPIRE, SDI, ISO19115, OGC, SEG Y, NetCDF, ISO/WMM)",A relevant number of researchers rely on procedures for data preservation already set up in their institutes or foreseen in the future,"28.9% reported preservation procedures in place in their institutes, 22.3% reported that these procedures are going to be set up in the future",Majority of them do not have any support from specifically trained data managers,The presence of personnel specifically trained to manage data is reported by 15.4% of researchers,"Despite the use of data produced by others, CNR researchers tend to share only a fraction of data they produce",59% indicate that they use data produced by others,Researchers identify highly with reasons to share data,"Researchers find that data availability and preservation foster the process of science (56.8%) and that it also enhances the transparency of research (53.9% very important and 40.7% important). 
Another reason to make data available and preserve them is that research is publicly funded and therefore should be made available to everyone (50.7% very important and 38.6% important)",A relevant number of obstacles are perceived by CNR researchers as rather important,"These are: lack of technical support (41.9% important, 31.4% important) lack of standards (46.3% important, 25.8% very important), but also the fact that data are not evaluated like papers in scientific journals (37.5% very important, 31.5% important).","A clear wish to keep control over their own data also after submission","The majority of researchers find very important to have the possibility to update data after submission (60.2%), to know who is using them, when and for which purpose (53.5%), to be contacted if data are used (52%).",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Lyon 2010,"A central finding is that institutional repositories responsible for curating data produced by their own research community will need to develop domain-specific strategies since a generic approach to data curation will not be sufficient to cope with the different data-related needs and expectations of researchers working in different disciplines.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Magee 2014,Availability of data,"Complete phylogenetic data for ~60% of these studies are effectively lost to science",Increase in archiving,"Results reveal a dramatic increase in the archiving of phylogenetic data since 2011; e.g., datasets from more than half of the studies published in 2013 were deposited in online archives",Journal data sharing policies,"Studies published in journals with strong data-sharing policies are more likely to archive both complete (tree and alignment files) and incomplete (tree or alignment files) phylogenetic data, and are also more likely to provide complete and incomplete phylogenetic data upon direct request",Faculty v. 
student requests for data,Our analyses also indicate that corresponding authors are more likely to grant data requests from faculty than from students.,Journal impact factor,"For non-phylogenetic data, our analyses indicate that studies published in journals with a higher impact factor are more likely to both deposit their phylogenetic data in online archives and provide these data upon direct request",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Manhas 2015,Altruism with limits,"? Parents' motivations were based on altruism, recognition of the potential for knowledge accumulation and a hope to support the greater good. ? Parents recognized the benefits of data sharing (increased efficiency and research opportunities). ? Parents noted the importance of collaboration between all stakeholders to ensure effective and efficient data sharing ? Parents want ? Post-access monitoring ? To remain up-to-date ? Fair treatment of researchers during data access",Privacy concerns,"? Parents' privacy concerns were ? Unwanted publication of personal information ? Unknown third party would contact them for nefarious or unconsented purposes ? Data leading to negative consequences for participant's family or another family ? Parents were concerned about sharing biological data but less concerned about sharing nonbiological data",Relationships,"? Parents' spectrum of trust ? Trusting of primary researcher and their decisions ? Reservations as data become further removed from participant-primary researcher relationship ? It is the RDR's responsibility to ensure compatibility between participants and secondary researchers",Diverse strategies,"? Complexity of the topic of data sharing presented a wide variability of opinions by parents ? Three particularly contentious topics: ? Parents' views on whether child participants and child data were different from adult participants and adult data ? Parents' reservations about secondary research and secondary research environments ? Parents' 
opinions on how to minimize bias in the RDR governance processes",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Manhas 2016,Attitudes on informed consent,"? Participants were supportive of a broad, one-time consent model or a tiered consent model ? Parents? worry about the interrelationships between the validity of the consent processes and secondary data use","Reciprocity: parents want reciprocity among participants, repositories and researchers regarding respect and trust","Parents viewed the consent process through a lens of, ? Respect was connected to recognition, convenience and control ? Parental distaste for the opt-out model related to the lack of recognition of the crucial connection between participants and the data (of themselves and their children) ? The preferred models for obtaining consent were considered respectful through their inclusion and recognition of parent participants in the decision-making process for secondary data use",Accuracy: parents worry about the interrelationships between validity of the consent processes and secondary data use,? These concerns centred on the process of consent and the implications of providing or withholding permission to re-use the data,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Manion 2009,Project structure and governance,"Necessity of a governance structure: Over 85% of individuals expressed the opinion that multi-institutional data sharing through the caGrid requires a governing body Potential functions of a governing body: functions include common guidelines for data use, communitywide IRB functions, risk assessment, general security policies and procedures, audit and oversight, reporting and enforcement, and selection of external standards for operation",Existing organizational infrastructure for data sharing,? Significant variation in the infrastructure existing at these organizations that could support federated data sharing.,Existing organizational decision-making structure related to privacy,"? 
Marked variation in the organizational infrastructure underlying decision-making in the area of privacy ? Determining factor appears to be the relationship of the medical school or university to the health system or hospital, producing a wide variety of configurations",Auditing data portability and secondary uses,"? Several participants raised concerns about the portability of data (from an IT perspective) and unauthorized secondary uses of data (from an IRB perspective), and cited the need to audit data portability as part of our standard compliance checking",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Marcial 2010,Inventory,"Geosciences = 26 Medicine = 20 Biology = 15 Astronomy = 14 Ecology = 4 Physics = 4 Social Sciences = 4 Chemistry = 3 Mathematics = 3 Marine = 3 Multidisciplinary = 4","Essential components in Scientific Data Repositories composition (that may correlate with success) ","? Funding (GrantsContracts and MultipleSponsors), ? size or scope of data collection (HoldingSize), and ? the existence of formal policies regarding long-term storage of data (PreservationPolicy)",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Marcus 2007,Information discovery and access,"? Scientists are generally comfortable with a wide array of technological tools, such as online datasets; seen as indispensable to effective research","Gathering, organizing and sharing not great","? Researchers' practices regarding data curation and preservation are idiosyncratic, haphazard, and in great need of attention. A lack of clear standards for data preservation and assistance to implement and maintain standards results in a messy combination of data stored on hard drives, in offices, on servers, and in the published form of journals. ? Many researchers use the traditional paper lab book, with additional organization in file folders and loose-leaf binders.",Data Preservation,"? Practices vary widely on this topic, from preserving everything to very little ? 
Preservation of data occurs in different ways. Physical copies of lab notebooks still hold high favor",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Marshall 2013,"Identified three main issues to resolve during a data management project: file organization, contextualizing data, and storage and access platforms",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Martinez-Uribe 2007,RDM practices,management of research data occurs with varying degrees of maturity across Oxford University. ,Funding,Researchers tend not to plan the management of the data at the outset of their research project in detail.,Data types,"Data collected were, as expected at the beginning of the project, many and very diverse. The long-term usefulness of the data also varied enormously",Data storage,Mostly stored on personal computers or departmental servers with a variety of security and back up procedures. Very few of the researchers interviewed had deposited any data in domain specific data archives ,Data sharing,"Although researchers tend to feel very attached to their data, they believe that if their research is publicly funded, then their data should be made publicly available.",The top three requirements,"Secure and user-friendly solution that allows storage of large volume of data and sharing ? A sustainable infrastructure that allows publication and long-term preservation of research data ? Advice on practical issues related to managing data across its life cycle.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Massey 2009,Data collection methods,"? Most Perinatal Health Programs collect insufficient data to enable identification of obstetric (and neonatal) practices associated with improved maternal and perinatal outcomes ? There is between-Perinatal Health Programs variability in defining many data fields",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Mattern 2015,Reported Research Data Challenges,"? Disconnect between methodology in the classroom v. methodology in the field ? 
Access to sources for data ? Overly-technical or over-lengthy data management guidance that does not resonate with user groups ? Not knowing what infrastructure and services are available to them",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Mayernik 2016,Institutional support for data & metadata management is not uniform,"? CENS data infrastructures were very different from project to project, and data management and long-term preservation were not centrally supported within the field science-focused projects, including the ecology projects ? Few projects had well defined responsibility for data management, resulting in often-changing routines that were specific to individuals ? Discipline-specific data and metadata standards were either not useful or not well understood ? The LTER network has an official metadata standard, EML, which, though slow to be picked up and problematic to implement, is now strongly institutionalized within the individual sites ? The organizational and technical support for managing, sharing, and preserving digital research data varies considerably across UCAR",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, McDonald 2012,Usability of homeML Application,All participants felt the application was well designed and intuitive; and that both the homeML Toolkit and homeML Repository were easy to use,Recommendations,"Recommendations made included, ? the availability of public data usage statistics, i.e. how many times a dataset has been downloaded or viewed; and ? the ability to view each dataset in tabular form",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, McGuire 2008,Major themes identified,"1. Participants expressed a range of understanding about with whom their DNA data would be shared 2. Most participants expressed an interest in receiving information and generally wanted control over decisions about data sharing 3. 
There was wide variation in their judgments about the trade-off between privacy and the scientific and clinical utility of the data 4. When presented with traditional, binary, and tiered consent, participants were able to understand all three types of consent, readily grasped the differences between them, and were able to identify the pros and cons of each in terms of privacy and utility 5. Although participants generally preferred tiered consent, they were more likely to consent to unrestricted data release if offered traditional or binary consent 6. Most participants felt that sequence information from existing samples should not be publicly released without explicit consent from research participants",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, McGuire 2011 (+ Companion),RCT: Preferences for data sharing,"? Before debriefing, 83.9% of participants chose public data release ? After debriefing, 53.1% chose public data release, 33.1% chose restricted (controlled access database) release, 13.7% opted out of data sharing ? Only one participant declined genome study participation due to data sharing concerns",Interviews: Data sharing preferences,"Most parents (73.5%) and adult participants (90.3%) ultimately consented to broad public release. However, parents were significantly more restrictive in their data release decisions, not because of understanding or perceived benefits of participation but rather autonomy and control. Parents want to be more involved in the decision about DS and are significantly more concerned than adult participants about unknown future risks.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, McGuire 2012,Major ethical legal and social issues identified,"Investigators and recruits were similarly sensitive to the following issues, ? informed consent ? data sharing ? return of results",Accessibility of metadata - Investigators,"? 
Frustration that much of the clinical metadata collected from Human Microbiome Project subjects were not more readily accessible ? Most investigators voiced a general appreciation for the fact that sensitive and/or potentially identifying data needed more security ? Other investigators worried that analyses conducted without metadata could lead to inaccurate conclusions",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, McKay 2010,Researchers' approaches to data,"? 65 out of 85 researchers said that at some point in the past, they had re-used their own research data ? 98% of respondents retained the data they used to write their last research paper ? 45.3% of all respondents said they had used datasets created by other people ? 32 had never shared any data, 30 had definitely had data re-used by others, and 23 weren't sure",Data sharing practice,45.3% of all respondents said they had used datasets created by other people,Attitudes to the institution's role in data curation and sharing,"? Majority of researchers (55, 64.7%) said there wasn't anything University could offer in assistance with managing research data ? Among the remaining 30 researchers the most popular form of assistance was, - archive space, either digital (10, 30.3%) or physical (3, 9.1%) - back-up service - data conversion service - data management training",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, McLure 2014,Nature of the data,"? Majority of participants mentioned using a core group of file formats ? Most participants indicated that their individual project data files are typically small",Management of data,"? Those who work with large files noted that the transfer of files within a lab's computer networks, across the campus network, or to geographically distant sites presents ongoing challenges ? They also expressed concerns regarding file storage, data integrity, data backup, and data transfer ? 
Virtually all participants indicated that they expect their file sizes and their file storage needs to increase in the future","Sharing, Curation, Preservation","Participants indicated several key reasons why they may share data and products: ? mandate by a grant funder, a journal, or a government entity; ? necessity, to functionally accomplish collaborative work such as student theses or research projects involving peer researchers at other institutions; and ? for the benefit of a specific audience. Participants had strong opinions on the topic of data storage, expressing interest in the potential benefits of more centralized campus data storage as well as concerns about the possibility of a concomitant loss of current individual or research unit control",Data-Management Plans,"? Some had never heard of, or had only recently become aware of, the concept of DMPs ? Responses also revealed varied perspectives on what a DMP entails and whether it is only a formal plan or may also name procedural workflows that for many researchers are embedded in their research process",Library Support,"? Participants expressed interest in future training opportunities, for themselves and for graduate students, focused on the digital collection of data (as opposed to the continued use of paper lab notebooks, for example); managing data; new methodologies for recording data; and data-organization approaches and tools. ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Mello 2005,Restrictive contractual provisions,"? Prohibit investigators from sharing data with third parties after the trial is over, - 41% allowed it - 34% disallowed it - 24% were not sure whether they should allow it ? 
75% of administrators reported at least one such dispute in the previous year: - intellectual property (30%) - control of or access to data (17%)",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Menzies 2010,Survey: Repository interoperability,"Twenty Institutional Repositories (64.5 percent of institutions) interoperate with other, internal systems",Case study: Examples of good practice,"? In both case study libraries, staff cataloguing items for, or developing the Institutional Repository, work closely with those concerned with the Learning Mgmt System and its catalogue records, creating formal and informal shared workflows ? This, and the wider discussions with which library systems staff are involved, should be seen as exemplary by other HEIs considering process efficiency, data sharing, and a move towards interoperability",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Milia 2012,"Data sharing is not yet common-practice in studies of human genetic variation","a substantial proportion of datasets (23.2%) is not immediately shared through the published material or information contained therein (body text, supplementary material or online databases), while an important fraction (16.6%) continues to be withheld even after serial e-mail requests to all authors of withheld datasets","There is a substantial variation in sharing rate of primary datasets across distinct research fields",Significantly lower sharing rates in Medical Genetics than in Human Evolutionary Genetics and Forensics,"Adoption of explicit editorial policies or impact factor rank has a limited effect on data sharing rates","A slightly higher sharing rate was observed for datasets published in journals with strong editorial policies and high impact factor rank. However, no difference between classes for each parameter is statistically significant, considering both the total and partial datasets (Figure 3 and Table 1). 
Furthermore, neither factor was associated with a sharing rate beyond 80.5% in the entire dataset (figure 3C, D). As previously observed [9], impact factor ranks and editorial policies were found to be significantly associated (p<0.001; Chi-square test for R×C contingency tables).",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Mills 2015,Attitudes and viewpoints on data access,"? 93% of PIs have historically shared data ? Only 8% were in favor of uncontrolled, open access to primary data while ? 63% expressed serious concern",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Milner 2009,Research needs ,"? 50% of data is estimated to have a useful life of less than ten years; ? 26% is seen as having indefinite retention value",Data sharing,"Most research data is held locally, ? 21% of questionnaire respondents using a national or international facility ? Most researchers share data but mainly within research network teams and collaborators; ? 18% share via a data centre ? 43% would like access to others' data",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Minifie 2011,Data Management,"? The highest rated topic in data management was assuring the accuracy of data - Faculty and students rated the high importance of this topic quite similarly ? Other topics related to data management were rated as having less than average importance by both faculty and student respondents: - storing research data (Item 30) - determining who has authority to access research data (Item 31) - safeguarding electronic research information (Item 32)",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Mischo 2014,Servers and websites,"? Nearly 40% (503) of the DMPs make reference to local storage mediums, such as a PI server ? A total of 667 proposals (52.9%) mention centralized campus resources as data storage or preservation sites. ? There were 276 proposals (21.9%) which included the University of Illinois institutional repository as a data deposit resource.",Template and Repository Usage,"? 
250 (19.8%) proposals used wording from the Grainger Library DMP template ? 276 (21.9%) DMPs, including those using the Grainger Library template, specified the institutional repository as a data deposit and sharing resource.",Confusion around Scholarly Publication of data,"? A high proportion (44.1%) of DMPs specifically mentioned traditional scholarly outputs in their data management plan. ? Very few DMPs were explicit as to how these traditional scholarly products would disseminate data or data sharing methodologies.",No significant differences between funded and nonfunded proposals for storage,"? Tested the frequencies among funded and unfunded proposals of five DMP proposed storage mechanisms: PI server or websites, institutional repository, campus storage, departmental servers, and disciplinary repositories storage. ? The results of our analysis showed that there are no significant differences between funded and unfunded proposals with respect to these four proposed storage venues. For the DMPs included in this study, there was no advantage (in terms of being funded) for proposals specifying disciplinary repositories or the institutional repository as venues for data storage and access.","There was a statistically significant higher use of the Illinois institutional repository and disciplinary or cloud storage solutions in later proposals vs early proposals","? Two groups: prepared before October 1, 2012 and after October 1, 2012 ? calculated the chi-square values for two categories: the proposed use of the institutional repository and disciplinary repository services. ? In both of these cases the chi-square values are statistically significant and indicate that more recent proposals are specifying the use of the University's institutional repository and disciplinary repositories at a higher frequency.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Mohr 2015,"Support for those that use ""data""","Researchers who worked with ""data"" 
wanted more support in preserving data/research materials in the long term (after the research project is completed/published), followed by assistance with preparing their data/research materials for sharing (navigating privacy, copyright issues)","Support for those that use ""research materials""","Researchers who said ""research materials"" were their primary product of research indicated different areas where they wanted more support, with respondents from CSE and CLA wanting less support overall compared to their ""data"" colleagues. Respondents who worked with ""research materials"" in CFANS reported wanting more support than these respondents from other colleges.","Overall, preserving data in the long term is a consistently high need across colleges and types of data/materials",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Mooney 2012,Data citation is poorly practiced across the journals surveyed in the three academic indexes,"? Perfect score on the Data Citation Adequacy Index was not met by any ? majority of papers on low end of DCAI with scores from 2-8. A score of 2 (41.5% of articles) represents common practice: an in-text mention of basic identifying information about the dataset ? scores 4-8 (41.5%) mainly represented a footnote with some kind of explanatory information ? upper range 24-28 (17%) reflect data citation in reference list ? no article included a persistent identifier",Journal author instructions are largely silent on the issue of data citation,"? 14 (32 percent) journals used a house style sheet, ? 28 (64 percent) referred to a style manual, ? and two (4 percent) did not specify any citation style at all. ? The journals using house style sheets are significant in that not one of these individual style sheets provides any instructions for the citation of data. ? 
A few (7 percent) journals include statements noting that statistical data should be cited, but a statistic is distinct from the main data under analysis within an article.",Data citation instructions,Journal author instructions are largely silent on the issue of data citation,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Murillo 2014,Priority data,"Top priority data - Two themes emerged from this question: (1) Difficulty and/or effort (2) Highly valuable and/or irreplaceable",What perceptions do scientists have on the topic of data at risk?,"Four major areas of concern came from asking the scientists how they would describe endangered data. The areas of concern were (1) unavailability, meaning the data did not exist or were restricted, (2) lack of context, meaning the data were lacking metadata or the record keeping was poor, (3) accessibility issues, meaning the data were degrading or were in old format, and (4) potential endangerment, meaning the data were not backed up or not kept properly, hypothetically.",What perceptions do scientists have of data reuse and sharing?,"When asked about their data sharing practices, scientists discussed incentives and disincentives. The disincentives included: scooping/competition, sharing outside of research group, equipment and technical issues, and metadata issues. The incentives included: collaboration, additional publication, and moving science forward. These topics have been discussed thoroughly in data sharing literature.",What opinions do scientists have in regard to the Data At Risk Inventory?,"The researchers asked the participants of the focus groups specifically if they would use this type of resource and also asked for design and function recommendations. Overwhelmingly, the scientists saw the importance of having this type of inventory available for their work and for data that could possibly be lost. 
One of the most common reasons discussed was the possibility of doing meta-analysis with the data that they could locate with this inventory.",Data types,"Non-digital data that participants discussed included physical samples, lab notebooks, and field notebooks. Some participants received their data from private industry",Data curation,Participants felt that these types of curation activities were important in ensuring that data did not become endangered in the future. ,Endangered data,"Participants noted that there were data that had been restricted or were non-existent. Participants also seemed very aware that data that lacked context, even if they had access to these data, would still be unusable. Participants discussed accessibility issues with data by explaining that some data are endangered due to data degradation or old file formats. Participants showed concern for data they considered to be potentially endangered, such as lab notebooks and tissue samples",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Murphy 2012,Comparison with previous surveys from 4 and 6 years prior on IT-supported clinical research,"Adoption rate of 4 prominent areas of IT-supported clinical research had increased remarkably, ? regulatory compliance ? electronic data capture for clinical trials ? data repositories for secondary use of clinical data ? infrastructure for supporting collaboration",Use of data repositories,"The use of clinical data repositories to support clinical research also appears to have arisen as a critical area of emphasis in recent years, with substantial growth in such repositories noted since 2007 (2005-2011: increase = 72%).",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Nicholson 2011,Data mentioned in abstracts,"? 31% of the abstracts stated that data were included in the research; ? counting abstracts that also seemed to suggest (but not explicitly state) that data may be present, the total increased to 54% ? 
education had the highest percentage of dissertations whose abstracts explicitly cited the presence of data (50 percent), while sociology was in close second place with 42 percent in this category",Data mentioned in table of contents,"? Tables of contents that mention data: 63% ? Only 31% of our sample included tables of contents that offered no explicit mention of data. ? among all of the dissertations with abstracts that did not mention the presence of data (44%), their tables of contents revealed that 38% of these actually did offer data-supported research. ? The disciplines that best represented the presence of data in tables of contents were education and sociology",Research data collections may not need further access,"Research data collections support a specific project and may have no applicability beyond the focused research for which they were created. In our analysis, fully 90% of the datasets generated for dissertation research fall into this category",Availability of data in sample,We were able to assert that 67% of the dissertations in our sample made some portion of the data collection available,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Noor 2006,"No journal had complete compliance with its requirement for all DNA sequences to have been submitted to GenBank","? Between 3% and 20% of papers in these journals did not include GenBank accession numbers, ? and between 3% and 15% of studies never submitted their DNA sequences at all.","If this ""oops effect"" 
is common, we expect that the publication dates of the papers completely lacking GenBank submissions should be later on average than the publication dates of papers only missing accession numbers","We find a marginally significant difference in publication month (t = 2.2, p = 0.08) in the expected direction between papers for which sequences were never submitted to GenBank and those for which sequences were submitted but accession numbers were not printed.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Noorman 2014,Longer-term curation of data remains a key problem.,"However, even for such established institutions and for longer running projects and experiments like ATLAS at CERN, long-term curation and preservation pose a problem.",Financial support,"? Funding is a key issue in making and keeping research data openly available ? the long-term preservation of growing volumes of data introduces a whole new set of issues for which there is currently no clear funding strategy","The above analysis shows that institutions have focused primarily on developing strategies to ensure the technical quality of data deposited (e.g. are the correct formats used, is the metadata complete, etc.). Less effort has gone into establishing review practices that focus on the scientific value of data, partly because it is a time-consuming and difficult task. An important barrier that has to be overcome in order to move forward is the lack of incentives for researchers to engage in data review processes.",,"Open access to research data requires specific skills and knowledge that have to be developed and maintained. As the Chapter showed, several institutions have taken up the challenge of educating and training researchers, librarians, information and data scientists and other professionals, building on existing and emerging digital data management practices. 
Libraries, data repositories, data centres and dedicated organizations play an important part in offering workshops, training materials and other kinds of support.",,"Evaluating and maintaining the quality, value and trustworthiness of research data;","Institutions have developed various measures and strategies for evaluating and maintaining the quality and integrity of data as well as determining their value and impact. These include adherence to established best practice, peer review procedures, citation records, clear origins of data, transparent review and publishing practices, standard metadata, etc. Issues still remain.",Educating and training researchers and other relevant stakeholders,Data sharing is not yet a common practice in most disciplines. This is partly because researchers often lack the skills and knowledge to share their data. Researchers may also lack knowledge on data management policies and the legal and ethical aspects of dealing with their particular kind of research,Creating awareness of the opportunities and limitations of open research data,"Besides a lack of skill, there are many other reasons why data sharing and open access are still not the norm in most disciplines. Researchers are reluctant to make their data publicly available because of concerns ranging from their work being scooped or misused, to not having enough time or funding to make their data accessible, to maintaining the privacy and confidentiality of their research",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Oleksik 2012,"Artifact ecology (such as sketches, inscriptions, notes, records, and intermediary reports that enable communication and reuse of information)","? Sole focus on technologies for data access and sharing limits the potential impact of attempts to improve scientific collaboration ? 
successful collaborative research involves intricate interconnections of technical infrastructure, tools, and purposefully created digital artifacts that emerge in various stages of the scientific discovery and fulfill individual and collective needs of the researchers",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Oliver 2012,Judgments about the risks and benefits of data,"Participants were more restrictive in their reported data sharing preferences than in their actual data sharing decisions. ","Comparing Hypothetical and Actual Preferences","83.9% of participants initially consented to public data release when they were enrolled into the genome study, and the majority (53%) chose public data release after debriefing. ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Oushy 2015,"Attitudes to sharing biospecimens and participating in a virtual national biorepository","Participants were more restrictive in their reported data sharing preferences than in their actual data sharing decisions ? (n = 99, 88.4% of total) agreed with the NIH Resource Sharing Plan and indicated they considered ?human biospecimens? as resources that should be shared ? 78 respondents who answered the questions about using a virtual biorepository would be likely or very likely (n = 47, 42.0% of total) to obtain specimens using such a biorepository with similar numbers of respondents (n = 50, 44.6% of total) likely or very likely to share data about their biospecimens in a virtual national biorepository","Perceived reasons individuals refuse to donate specimens. ","% of 112 total participants ? Inconvenience 11.6 ? Health concerns 10.7 ? Recruitment barriers 8.0 ? Privacy and security barriers 8.0 ? Misuse of personal information 7.1 ? Distrust in research/health care system 6.2",Researcher requirements for collaborating and sharing data,"% of 112 total participants ? Collaboration and acknowledgment 12.5 ? Expertise in tissue research 11.6 ? 
Compliance with institutional and federal policies 8.9 ? Data sharing policies 6.2 ? Preservation of resources 4.5",Researcher concerns if unwilling to share specimens.,"% of 112 total participants ? Plausibility of research 10.7 ? Intellectual property rights 6.2 ? IRB Concerns 6.2 ? Costs/lack of reimbursements 6.2 ? Sample issues 6.2 ? Lack of expertise in tissue research 4.5 ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Parsons 2013,The diversity of data types and the strong presence of non-digital data such as lab notebooks.,"? Highest response to type of data created was 16% for documents (rest divided) ? 7% physical notebooks, 7% slides/specimens","There are multiple locations for the data and therefore, the ad-hoc strategies of back-up.","? Strongly indicating that data is stored in multiple locations, with campus computers, laptops, external hard drives, USB drives, University storage, web based storage or paper being the top answers ? only 48% respondents used the University networked storage ? Only 35% of respondents backed up their data daily. For a large percentage of respondents, backing up of data was not done regularly, with another 9% admitting that they did not know when it was backed up and 2% admitting that they never backed it up at all",The range of data sizes means the standard University provision of 4 GB of space may be insufficient. ,"? A tentative conclusion would be that a typical researcher requires 1-500 GB, with some users requiring significantly more in the Sciences, Engineering and MHS Faculties","Sensitive data, IPR rights and the sense of ownership to the data will doubtless, hamper efforts to share data. Overall the responses indicate that certain areas such as the medical fields will require additional effort to investigate if and how sensitive data can be shared.","? Only 25% said their sensitive data would be suitable for sharing publically ? 
25% are the owner of IP rights for research data","Training appears to be high on the agenda for many, with very few expressing no interest at all. Key areas included help with DMPs, metadata, storing data and funding body requirements sessions.","? The greatest demand was for “Developing a research data management plan”, followed by “Storing data”, “Creating metadata for data” and “Documenting data” (details of methodology, equipment used, details of physical specimens etc.)",The funding analysis revealed a surprisingly low awareness of funding requirements regarding data sharing.,"? Only 9% said they were aware of a requirement by their funder to make data available OA ? 78% no response to question, 3% not applicable, 10% not aware of such requirement",Volume of research data,"By far the greatest number of respondents estimated the volume of their research data to be between 1 and 50 GB. Only a small number of respondents estimated the volume of their research data to be greater than 50 TB and a considerable number had no idea of the volume of data they were creating",Backing-up research data,"Only 35% of respondents backed up their data on a daily basis. For a large percentage of respondents, backing up of data was not done regularly, with another 9% admitting that they did not know when it was backed up and 2% admitting that they never backed it up at all. The majority of respondents backed their data to an external hard drive. ",Using metadata to describe research data,The majority of respondents did not record metadata. Those that did were asked whether they used any standards or guidelines when constructing it.,Externally funded research,"The survey results show that 71% of respondents are currently working on a research project that was externally funded, with 29% being internally funded",The development of a research data management plan,"Further analysis of the data showed that Research Fellows, Career researchers and “Other” 
researchers (Professors, Assistant Professors, Research Assistants, Research Associates, Research Officers, Technicians and Managers) were more likely to have developed a research data management plan than post-doctoral researchers or PhD researchers.",,,,,,,,,,,,,,,,,,,,,,,,,,,,, Pearce 2010,Extent of adoption of information computer technology,"? Coefficients for age become positive for awareness of Access Grid nodes and use of a repository ? For gender there are very significant (at 1 per cent) positive relationships for Access Grid nodes and repository use, suggesting that males are more likely to be aware of and use these",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Pejova 2014,Data Storing,"? Most data are not stored centrally in a shared repository, but remain on workstations (computers) of individual researchers (This answer was selected 52x)",Sharing data,"? 44 stated that data is provided in person ? 21 are not willing to share data ? 24 willing to provide data online or via repository",File formats for preservation,"? Data stored and archived in pdf, doc and xls formats were most frequent. XML, CSV or image formats (TIFF, JPEG)","Length of time archiving data ","More than 10 years - 54% do not know - 31% do not archive data for such a long time - 15%","Person taking care of data (could choose more than one response) ",Researchers themselves - 53 respondents,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Pepe 2010,"The CENS case studies presented here reflect the complexity of scholarly and scientific practices that need to be represented. 
Our initial experiments, working in concert with the ORE development process, indicate that ORE offers a feasible and promising approach to information management.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Pepe 2014,Data sharing practices,"1) astronomers produce derived data in standard astronomical formats, 2) they are overwhelmingly willing to share their data with their peers and the public, 3) they are normally unaware of mechanisms for archiving and citing derived data, and 4) they rely upon non-automated, non-standard methods to acquire and provide derived data (e.g., they put derived data on their website and link to it, they contact paper authors to obtain data)",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Peters 2011,,"Data Lifecycle Workflow In terms of planning prior to data collection, practices varied greatly. The majority of respondents indicated that much of their planning occurs during the grant proposal writing process. Three of the ten PIs discussed their use of LabVIEW, software that allows scientists to develop customizable measurement, test, and control systems for large data sets, and that supports a wide variety of file types. In this context, designing the research process was synonymous with planning for data collection. All of the researchers provided milestones for their projects.",,"Data Characteristics Each of the projects selected for this assessment include multiple data types. With the exception of XML and GIS, all of the data types we suggested were used by at least some of the participants, with images, scanned documents, spreadsheets, and text being used almost across the board. According to the data, a majority of the projects included in this study are not expected to produce in excess of 500 GB of data, indicating that for many scientists on campus, organization will be of greater concern than storage space. 
In keeping with this finding, few faculty members currently look to the library for assistance with the technical side of data management. Most of the interview subjects indicated that they are comfortable utilizing departmental and campus IT resources to store and back up their working data, despite the fact that there is no clearly organized information available on campus that documents what particular data-support services are offered and by whom.",,"Data Management When asked who was responsible for managing the data associated with a project, all ten PIs indicated that they have that responsibility, with one claiming that data management is his responsibility and his alone. Nine PIs claimed shared responsibility, with six specifying graduate students, two post-doctorates, and several isolated mentions of technicians, a project manager, and a lab team. In regard to data storage, numerous methods were used by everyone interviewed, the most common being storage on a PC hard drive. Only one PI did not specify any particular location, indicating that students were responsible for storing all of the data. This is interesting given that the same PI claimed sole responsibility for managing data associated with the project in the previous question. This is indicative of a general theme that has emerged from this study, namely that there is much confusion over what constitutes data management. The question concerning data backups was met with particularly high levels of uncertainty and discomfort, so it came as no surprise that responses varied widely among respondents. Only one individual reported that his data is backed up multiple ways and at various times throughout the month, even going so far as to claim that it is also stored in multiple places. Seven indicated that their data is backed up weekly, with two specifying that the backups happen by means of campus IT. 
Although most of the interviewed PIs revealed some sort of plan for obtaining the raw data generated by graduate students during their time at the university upon their departure, this was not always the case. Surprisingly little thought was given to how the transience of students might impact the consistency of data management practices within a lab. The majority of respondents indicated that they plan on storing their data indefinitely. Comments to this effect included the need to have data available should a paper be challenged, as well as the opinion that in the absence of storage space concerns, there is simply no reason to get rid of anything. With this being said, further investigation led to the realization that researchers were largely referring to analyzed, often visualized, publication-level data and not data directly generated through the course of the experiment. In most cases, there was no centralized storage of experimental data. Two individuals did state that they only plan on keeping the data associated with their project for one to five years, one specifying that certain key results will be kept indefinitely. Three indicated that they will keep their data for more than ten years. When asked if they had a data management plan (DMP) for the project in question, seven respondents indicated that they do not; six of them stating that this is because a DMP was not necessary at the time of their proposal submissions. Other reasons included general lack of information about DMPs and the extra demand on their time. The three individuals who claimed to have a data management plan in place stated that it was just good research practice to do so, that it helped them to stay organized, and that data are simply too valuable not to manage.",,"Data Organization As with data backup, there was a great deal of variation when it came to preferred methods of data organization. 
A common theme among researchers was to suggest that their data organization is largely predetermined by experimental design and research practice, and that additional planning is unnecessary and inefficient. Five respondents indicated that they have no real file or folder naming conventions in place, with students often being left to their own devices in terms of data organization. Only one of the PIs that we interviewed provides very specific instructions for his students and lab personnel on how to manage the data associated with his projects. Another two claimed that they use industry standards to organize their data without specifying exactly what those standards are. Five individuals claimed to use specific file- and folder-naming protocols, with one additional person claiming to simply use the file names generated by the equipment. Three of the ten PIs discussed their use of LabVIEW. Although none of the interviewed individuals indicated using any formal metadata standards, two mentioned that their data is automatically time-stamped by their equipment, while one additional respondent mentioned that some of his equipment automatically embeds metadata as the data is generated. Even though these responses varied greatly, all thirteen indicated the belief that their data organization methods were sufficient for others in their field, even those individuals who acknowledged no clear method of data organization.",,"Data Use Every person interviewed indicated that the researchers, students, and staff working on a project have access to project data. Most of the respondents who indicated that they have made data available to people outside of the research group stated that they do so only upon request. Only published data, not raw or experimental data, was ever shared with the public or project sponsors. The difference in responses of researchers working on individual versus group projects was minimal. 
The most commonly cited reason for not sharing data was that it is confidential, proprietary, or classified; but intellectual property concerns, possible misinterpretation of the data, and the time or effort required to share it were all cited as reasons as well. Only one researcher claimed that data would be shared with anyone who asked. One individual who does research in the field of evolutionary biology, an area that is controversial even without the possibility of the misinterpretation of data, claimed that he would share his data with anyone in the scientific community, but probably not the general public. Another researcher pointed out that there was no reason to share raw, unprocessed data with anyone outside of the research group, because it would not make any sense to them. Not a single respondent seemed to consider the possibility that sharing raw data might allow for the independent validation of results or different studies using the same data set. When asked how they currently share their data, the vast majority of researchers indicated that e-mail was their preferred method. This was followed by external storage devices and collaborative web space. 
As stated before, the idea of sharing raw data or making raw data openly available to everyone was simply not a consideration by any of the respondents.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Pham-Kanter 2014,Influence of Data Sharing Policies,"Influence of Data Sharing Policies 65% - NIH policies had been influential in increasing data sharing 35% - journal publication policies had a positive influence towards facilitating data sharing 55% - these policies had had no influence on sharing 39% - formal instruction had an influence on encouraging sharing 58% and 51% - rated the practices of advisors and the norms of their field, respectively, as having been influential in encouraging data sharing 20% - institutional material transfer agreements had been influential in discouraging data sharing 25% - thought the same of other technology transfer policies Industry agreements and commercial activities (such as potential patents and royalties) were thought to impede sharing.",Compliance with Data Sharing Policies,"Compliance With Policy Requirements Compliance with sharing requirements varied across domains and policies. 92% - reported always having submitted, when required to do so, a detailed description of their methods as an online supplement; 8% of respondents only sometimes submitted this description. 89% - reported always submitting, when required, data as an online supplement 90% - reported always submitting, when required, to a third party repository. 83% - compliance with submitting biomaterials to a third party repository Frequency of MTA policy violations Always 5% Sometimes 19% Rarely 10% Never 57% Not aware of MTA policies 9% Reasons for violating policy* MTA takes too much time 85% MTA requires too much red tape 82% MTA negotiations too onerous 78% Philosophically opposed to MTA restrictions 48% Scope of MTA overly broad 38%",Use of data sharing plans,"Policy Tools and Infrastructure Influencing Data Sharing Data sharing plans in grant proposals. 
Of the 735 respondents who had served as grant reviewers in the last 3 years, 27% - quality of data sharing plans important or very important in evaluation of proposals; 43% said the plans had been somewhat important; 30% - data sharing plans not at all important in review of proposals Data sharing infrastructure and tools. 58% - online data and methods supplements helped respondents’ own research 33% - third party data repositories helped their research progress 40% - third party biomaterials repositories helped",,"Sanctions for Noncompliance 4% - had appealed to a funding agency, journal, or professional association in response to another scientist’s failure to share data or biomaterials 17% - stopped collaborating with a nonsharing colleague. 8% - had taken steps to delay sharing 5% - refused to share their own data with the noncompliant scientist",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Pinfield 2014,1. To examine roles and relationships involved in RDM,"? Some participants saw the library as being in an ambiguous position and were themselves uncertain about its role in RDM. Current work was seen as a way of helping to resolve that uncertainty ? There was sometimes disagreement even within individual library organisations about whether in both the short and long-term the library should lead on RDM issues",2. To identify the main components of RDM programmes,"Strategies: defining the overarching vision for research data management within the institution and how it relates to the institutional mission and priorities, and outlining major developmental goals and principles which inform activity. Policies: specifying how the strategies are to be operationalised through regular procedures, including not just an RDM policy but also a set of complementary policy frameworks covering issues such as intellectual property rights and openness that may be relevant. 
Guidelines: providing detail on how the policies will be implemented, often written from the point of view of a particular user group (such as those within a particular disciplinary area) and defining specific activities, and roles and responsibilities. Processes: specifying and regulating activities within the research data life-cycle including research data management planning for individual projects, data processing, ingesting data into central systems, selecting data for preservation, etc., and involving the use of standards and standardised procedures wherever possible. Technologies: underpinning processes with technical implementations including data repositories and networking infrastructures allowing for storage and transport of data. Services: enabling end-user access to systems and providing support for research data life-cycle activities (including supporting the creation of data management plans, providing skills training, and delivering helpdesk services).",3. To evaluate the main drivers for RDM activities,"1. Storage: the need to provide immediate storage facilities for a wide variety of datasets at a scale which anticipates the future requirements of researchers and in a way that represents value for money and is convenient to use. 2. Security: the requirement to ensure that data, particularly that which is confidential or sensitive, should be held securely with relevant authentication and authorisation mechanisms in place. 3. Preservation: the need for medium and long-term archiving of data with associated selection protocols and preservation activities along with a supporting technical infrastructure. 4. Compliance: the need to comply with the requirements and policies of other relevant agencies, particularly funders, as well as legal obligations, such as data protection, and industry good practice. 5. 
Quality: the imperative to maintain and enhance the quality of research activity in general in order to demonstrate the robustness of findings and enable results verification and reproducibility (partly derived from but not limited to the quality of research data itself). 6. Sharing: the need to share data amongst targeted users and also to provide mechanisms and systems to enable open access to data where appropriate. 7. Jurisdiction: the development of a professional narrative around the need to be involved in RDM and how this impacts upon other stakeholders in the institution.",4. To analyse the key factors influencing the shape of RDM developments,"Key Influencing Factors emerging from this research were: Acceptance Cultures Demand Incentives Roles Governance Politics Resources Projects Skills Communications Context",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Piwowar & Chapman 2010,Whether the study was NIH funded had little impact on data sharing,We estimated that the NIH data sharing policy applied to 61 of the studies (the study was submitted to the NIH after 2003 and received more than $500 000 in direct NIH funding per year): these 61 studies had a slightly higher frequency of data sharing in univariate analysis (52% vs. 
46%),"The impact factor of the journal and the experience component for the first and the last authors were significantly associated with an increase in data sharing prevalence, with corresponding author country and journal policy strength trending towards but not reaching statistical significance.","A study with a corresponding address in the USA was twice as likely to publicly share its microarray data; a study published in a journal with an impact factor of 15 was 4.5 times as likely to have shared data compared to a study published in a journal with an impact factor of 5 (assuming all other covariates are held constant), increased author experience suggests increased odds of data sharing, and the odds of data sharing are higher in journals with a weak data sharing policy than in a journal with no data sharing policy, and higher yet in journals with a strong data sharing policy.",Availability of data,"Almost half of the studies made their raw datasets available (47%). The impact factor of the journal and the experience component for the first and the last authors were significantly associated with an increase in data sharing prevalence.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Piwowar 2007,Citation Rates,"? The 48% of trials with publicly available microarray data received 85% of the aggregate citations. ? Publicly available data was significantly (p = 0.006) associated with a 69% increase in citations, independently of journal impact factor, date of publication, and author country of origin using linear regression.",,"Characteristics of Eligible Trials by Data Sharing. High Impact (? 
= 25) 100% Data Shared; Low Impact Journal: 40% Data Shared, 60% Data Not Shared; Published 1999–2000: 83% Data Shared, 17% Data Not Shared; Published 2001–2003: 46% Data Shared, 54% Data Not Shared; Include a US Author: 63% Data Shared, 38% Data Not Shared; No US Authors: 21% Data Shared, 79% Data Not Shared. Citation count for 85 publications - percentage increase: Publish in a journal with twice the impact factor - 84%; Increase the publication date by a month - 3%; Include a US author - 38%; Make data publicly available - 69% ",,Only trial design features such as size and clinical endpoint showed a significant association with citation rate; covariates relating to the data collection and how the data was made available only showed very weak trends.,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Piwowar 2008,Journal policies with respect to data sharing,"? 53 of the 70 journals had explicit mention of sharing publication-related data in their instruction to authors ? 40 journals had sharing policies applicable to gene expression microarray data; 17 were classified as weak, 23 as strong (requiring proof of database submission prior to publication)",Factors associated with data sharing policies in journals,"? Type of journal publisher: 46% of commercial journals had policies vs 82% of journals published by an academic society ? open access: all five open access journals had data sharing policies ? impact factor: journals with no data sharing policies had a median impact factor of 3.6; weak policy 4.9; strong policy 6.2 ? Variables were positively associated with the existence of a microarray data sharing policy: impact factor, open access, and academic society publishing ? Subdisciplines of Biochemistry & Molecular Biology and Oncology were negatively associated with the existence of a microarray data sharing policy",Relationship of data sharing policies and measured data sharing submission in GEO database,"? 
Journals with no data sharing policy, a weak policy, and a strong policy had median data sharing prevalence of 8%, 20%, and 25%, respectively ? Journals with the strongest data sharing policies had the highest proportion of papers with shared datasets ? articles were more likely to have submitted primary data to Gene Expression Omnibus (repository) when they were published in journals with a data sharing policy, published by an academic society, or in the subdisciplines of Genetics & Heredity or Multidisciplinary Sciences",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Piwowar 2010,Prevalence with which investigators publicly share their raw gene expression microarray datasets after study publication,? 2901 of the 11603 (25%) articles published data in GEO or ArrayExpress,Several factors correlated with frequency of data sharing,"Factors correlated with frequency of data sharing: ? Publishing in a journal with a relatively strong data sharing policy ? Having funding from many NIH grants ? Publishing in an open access journal ? Having prior experience sharing gene expression microarray data ",Several factors strongly associated with negative probability of data sharing,"? Increased first author age and increased experience ? having no experience sharing data ? studying cancer ? having human subjects",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Piwowar 2011,Patterns in the frequency with which investigators openly archive their raw gene expression microarray datasets after study publication,"? Authors were most likely to share data if they had prior experience sharing or reusing data, if their study was published in an open access journal or a journal with a relatively strong data sharing policy, or if the study was funded by a large number of NIH grants ? 
Authors of studies on cancer and human subjects were least likely to make their datasets available",,"This data-sharing rate increased with each subsequent article publication year, increasing from less than 5% in 2001 to 30%–35% in 2007–2009. Accounting for the sensitivity of my automated method for detecting open data anywhere on the internet (about 77% [44]), it could be estimated that approximately 45% (0.35/0.77) of recent gene expression studies have made their data publicly available. The data-sharing rate also varied across journals. ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Piwowar 2013,Citation Rates,"Studies that made data available in a public repository received 9% (95% confidence interval: 5% to 13%) more citations than similar studies for which the data was not made available",Data Reuse,"? The level of third-party data use was high: for 100 datasets deposited in year 0, we estimated that 40 papers in PubMed reused a dataset by year 2, 100 by year 4, and more than 150 data reuse papers had been published by year 5. ? Data reuse was distributed across a broad base of datasets: a very conservative estimate found that 20% of the datasets deposited between 2003 and 2007 had been reused at least once by third parties.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,"This is a “here is what I did” piece, not a real study" Polydoratou 2006,The need to link source and output repositories,"? 57% of respondents noted that linking from published output to source would be a significant advantage; 29% a useful feature ? Linking source to output: 41% as a significant advantage; 33% as useful ",Metadata for research data sets,"? Most important types: author name (89%); project description (68%); date and title of the data set (58% each) ? Assign metadata during: file saving (37%); prior to data creation (26%); indexing process for source (24%); after submission to a repository (8%); not sure (5%)",Current access to primary research data,"? 
Academics: control access by storage on private network (21%) ? Postgraduate reseaarch students: control access by storage on private network (32%) ? Contracted researchers: authentication of ID and passwords (100%) ? Research assistants: storage on standalone computers (16%)",Level of metadata considered important,"The majority of the chemistry respondents (89%) noted that the author and/or creator?s name was the most significant metadata element for their data. Other important metadata elements were the project?s description (68%), the project?s title (68%) and the assignment of subject keywords (68%). The date and the title of the data set (each at 58%) were equally important. The least important metadata was considered to be the funding source of the project (13%).",Assignment of metadata,"More than one third of the chemistry respondents (37%) noted that metadata is assigned to resources during file saving which indicates the involvement of software for automatic assignment of metadata. The second most popular choice was that metadata is assigned prior to data creation (26%) while one quarter of the respondents noted that metadata is either assigned as part of the indexing process for source files (24%) or no metadata is assigned (24%).",Barriers in sharing ,"The majority of the responses from the academic staff indicated, ? storage of their data on a private network/intranet (21%) as the main measure to control access. ? The same measure was also employed by a large proportion of the postgraduate research students (32%) All of the contracted researchers noted that they use authentication of ID and passwords for controlling access to their data.",Preferred routes of searching at output repositories,Majority of respondents replied that they preferred to use the simple search option ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Pope 2015,Analysis reproduced from archived data,"? 
Journal?s mandatory archiving policy has had a substantial positive impact, increasing genetic data archiving from 49 (pre-2011) to 98% (2011?present) ? 31% of publicly archived genetic data sets could not be recreated based on information supplied in either the manuscript or public archives, with incomplete data or inconsistent codes linking genetic data and metadata as the primary reasons ? While the majority of articles did provide some geographic information, 40% did not provide this information as geographic coordinates. Furthermore, a large proportion of articles did not contain any information regarding date of sampling (40%). ",Has geographic or temporal information been provided and at what scale? ,"In contrast to the gains in genetic data archiving, the provisioning of geographic and temporal data changed little from 2009 to 2013. All articles for which geography was deemed relevant provided geographic information of some kind. However, over a third of articles provided geographic information as text only (36%), with 18% describing geography in the text at a regional-level only (ocean, country, state, region or province). Only 60% of articles provided geographic coordinates. There has been an increase in the level of precision of geographic coordinates when provided (<1 km increased from 29 to 46%); however, the overall rate of latitude and longitude reporting has remained steady. Similarly, reporting of time of sampling remained fairly constant. Around 40% of articles did not provide any temporal information, and many provided only a range of years (20%). Thus, only 40% of articles reported year of sampling (or greater precision). ",What is the scope for repurposing the associated data for future studies?,"The proportion of data sets available for repurposing will vary depending on the spatio-temporal needs of the new study. 
If temporal information is not required and if authors are willing to use locality text information, in addition to geographic coordinates, a large proportion of recreatable data sets could be reused (83%). However, if latitude and longitude are required, fewer data sets are repurposable (64%), and if latitude and longitude along with year of sampling or better are desired, a much smaller pool of data sets are available for repurposing (21%).",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Procter 2013,"Outreach, training, education",Many researchers had heard about the services - often did not have very good knowledge of the nature of the services offered or what the benefits of using them would be,Need for organizational arrangements: local research support,Lack of support that spans the provision of generic Info/Comm Tech services and the more advanced services used in e-Research,Need for establishing working relationships between technology developers and users.,"Ensure users experience a uniform service and support interface wherever they (or the service) may happen to be located, e.g., research funders to invest in coordinating discipline-specific structures (repositories) driven from within the disciplines themselves",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Prost 2015,Frequency of inclusion of research data in PhD dissertations in social sciences and humanities,? 188 / 283 dissertations contained appendices with research data,The distribution of disciplines per dissertation with appendices is more or less the same than for the overall sample,"? The linear determination coefficient between both variables is high (R2=.91). ? Average disserations with data appendices: 66%",Support presentation and format of data appendices,"? The French official guidelines for PhD dissertations do not specify how to structure or present an appendix. ? 
Highly significant difference in separation of appendices and text between electronic dissertations (52%) and print (38%), ?2=14.32",Variety of data sources used in the research work of the dissertations,"? Most-used: archives; surveys / interviews; text samples ? Less-used: inventories; experiments/observations; internet; photographs",Research data present in the appendices,"? Text samples are the most important data type, followed by tables (spreadsheets), images (including drawings and posters), maps, photographs, statistics, graphs (including figures, charts and visualisations), databases, and timelines (chronologies). ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Pryor 2007 (+ Companion),Workflows and norms of scientific researchers in their use of source and output repositories,"Project StORe bi-directional linking between source and output repositories: endorsed by 85 per cent of respondents to the questionnaire, with 46 per cent believing it promised to deliver significant advantage to their research",Diversity in the application of good data management practice,"Individual institutional repositories will also contain different file types and formats, and will apply different metadata standards",Cross-sectional (survey): Common areas for improvement,"Appropriate assignment of metadata acknowledged as critical/demanding, both intellectually and time required to do it well",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Qin 2010,Disciplines influencing data sharing,"? Whether researchers were frequent data producers depended upon the nature of their discipline and research. ? some disciplines brought in extra data for use more frequently than others ? data producers understood that researchers outside their group might make use of their data",Metadata,"? Metadata entries helpful when obtaining data for use from external research groups: highest in physics, zoology and environmental science faculty ? 
somewhat lower reporting of researchers generating own metadata vs in indicating how helpful metadata was in obtaining external data ? sources used to decide what metadata to create were predominately local (own planning, discussion with lab group, peer researchers) as opposed to guidelines",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Raju 2013,Research Data Management,"Maintain that academic research librarians are the most appropriately equipped to provide required research data services such as data management planning, digital curation (selection, preservation, maintenance, and archiving), and metadata creation and conversion",Partnership/Collaboration,The library as a valuable and active partner in all aspects of the research cycle,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Rans 2013,Several takeaways that are useful to researchers using funding from UK funders,"? Technical elements of a project are integral to the research, and not an afterthought; address them early in the planning process ? allow time for stakeholders and contributors to be integrated into the planning process ? writing a good technical plan requires input from members of the university support staff ? if the project produces a resource that would be of use to a wider community on a long-term basis, consider: How long will the resource retain its utility? Will future projects need to interact with it and how will that relationship be managed, both practically and financially? Most importantly, how will the resource be funded?",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Rathi 2012 (+ Companion),Strong support for and prevalence of sharing data ,"? 236 (74%) thought that sharing de-identified data through data repositories should be required ? 
229 (72%) thought that investigators should be required to share de-identified data in response to individual requests","Concerns with data sharing through repositories","Most common concerns related to appropriate data use: n=205 (65%)","Reasons for granting and denying individual data sharing requests",Most common reasons cited were related to promoting open science: n=248 (78%),Support for data sharing,"No significant differences in support for data sharing in principle between respondents categorized by trialists? academic productivity and geographic location, trial funding source and size, and the journal in which it was published",Concerns with sharing data through repositories,"Majority of respondents (76%) reported at least one experiential or hypothetical concern with sharing data - No significant differences in overall concern about sharing data through repositories between respondents categorized by trialists? academic productivity and geographic location, trial funding source and size, and the journal in which it was published",Reasons for sharing data,"Majority of respondents (78%) identified the promotion of open science as an experiential or hypothetical reason for sharing data from their published study - no significant differences in reasons for sharing data between respondents categorized by trialists? academic productivity and geographic location, trial funding source and size, and the journal in which it was published Exception = would share data from their published study in order to receive academic benefits or recognition, their responses differed significantly based on geographic location (p <0.001).",Reasons for withholding data,"The majority of respondents (74%) identified ensuring appropriate data use as an experiential or hypothetical reason for withholding data from their published study - no significant differences in reasons for withholding data between respondents categorized by trialists? 
academic productivity and geographic location, trial funding source and size, and the journal in which it was published Three exceptions = responses differed significantly based on trialist academic productivity (p = 0.01), trial funding source (p = 0.003), and journal of trial publication (p <0.001)",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Ray 2011,"Differences between choices pertaining to studies conducted within our laboratory, the sharing of specimens with collaborators and sharing of specimens with the NIH-NIDA repository","There was a significant difference overall in the affirmative informed consent rates (p<0.0001): ? 96% of subjects consented to studies conducted within our laboratory; ? 92% to the sharing of specimens with our collaborators; and ? 87% to sharing of specimens with the NIH-NIDA repository (p <0.0001).",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Read 2015,Data organization challenges and needs,"? Basic scientists perceived a lack of available standards and procedures available for them to uniformly collect their data ? basic scientists: disconnect between the different types of data collected; often located in different places and difficult to find ? basic scientists: graduate students work for a limited amount of time in the lab and leave with either the physical data or methodology used to collect the data ? clinical researchers: biggest challenge was quality of data, stemming from multiple people collecting the data and inconsistent collection methods ? clinical researchers: difficulty transfering data from one format to another",Research interest in data sharing,"? Clinical researchers, particularly those in the department population health, willing to share their data with the public as long as they were aware of who was using their data; same researchers were interested in using shared datasets for their own researcher ? 
basic scientists showed little interest in sharing their research data, preferring to share with direct collaborators or with no one; reasons: negative experience with past sharing, concerns about privacy, belief that data is too specialized to help others, insufficient storage options for sharing data publicly, hurdle of having to organize their data prior to sharing",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Reidpath 2001,Cross-sectional: Response rate,"? 9 (60%) authors receiving specific requests responded ? 12 (86%) authors receiving general requests responded",Cross-sectional: Length of time to respond,"Authors who received a specific request for data took significantly longer to respond than did recipients of a general request (Log Rank = 5.26, df = 1, p = 0.0218).",Content Analysis: General receptiveness to sharing,"? Of the 21 respondents, 57% were prepared to release the data at least in principle ? 14% were not happy to release the data ? 29% remained noncommittal about whether or not they would release the data",Content Analysis: Concerns expressed about releasing the data ,"4 categories emerged: 1) corresponding authors who wanted to know why the data were being requested, or what specific analyses were to be conducted 2) who would be happy to release the data at some future time. This included authors who were still analysing the data, or where the data formed a part of a larger and as yet incomplete study. 3) stipulated conditions of use. 
This included payment for the preparation of data, signing a contract of agreed use, and coauthorship or `serious' acknowledgement 4) agreed in principle but had to consult other co-investigators before the data could be released",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Richardson 2012,eResearch,"23% (3) of responding libraries have embraced eResearch as a potential new area of involvement, and have representation on eResearch working groups: ? 46% (6) have integrated eResearch/library services, or are working closely on specific eResearch services or projects ? 15% (2) are working towards greater involvement ? 39% (5) have either no or limited involvement",Data Management,"17% (2) of respondents reported having an active role in mapping research collections and participation in eResearch projects: ? 17% (2) reported having an advisory role ? 25% (3) are developing strategies ? 41% (5) libraries reported having either a limited role or no role at all ? 1 library did not complete this topic",Overall,"Research support is strong: ? research impact ? publication support for researchers ? institutional repositories high level of variance in support: ? eResearch support in general ? research data management in particular",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Rimkus 2014,File format policies have evolved little beyond the document and image digitization standards of traditional library reformatting programs,"? 174 file formats appear in these 118 policies. By type, they break down into the categories Application (14), Audio (19), Computer programs (17), Geospatial (6), Image (28), Presentation (10), Spreadsheet/database (28), Text/document (36), Video (15). ? The five most commonly occurring file formats in all policies are the Tagged Image File Format (extension TIFF, or TIF) (115), the Waveform Audio File Format (WAV) (80), the Portable Document Format (PDF) (74), JPEG (JPG, JPEG) (70), and Plain text document (TXT, ASC) (69). ? 
Practitioners place high levels of confidence in trusted formats for documents and images with origins in library reformatting programs. ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Roos 2014,Take-home message [Case study],"? Research data management requires a broad cooperation within the institution, between institutions, nationally and internationally ? Libraries are neutral players on the field but it is necessary to take researchers' framework and processes into consideration when designing services for them",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Rostami 2009,Review of audit program indicates that higher data quality may be achieved from a series of small audits throughout the trial rather than through a single large database audit at database lock,"? Error rates trended upward from year to year in the period characterized by traditional audits performed at database lock (1997?2000) ? Error rate consistently trended downward after periodic statistical process control type audits were instituted (2001?2006) ? Increases in data quality were also associated with cost savings in auditing, estimated at 1000 h per year, or the efforts of one-half of a full time equivalent (FTE).",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Rousidis 2014,Data quality problems (exist in Dryad). There is an absence of quality control.,"? Creator (Author) 1443 (8.71%) out of 16,568 records demonstrated Creator quality issues ? Date Identified as a quality issue but frequencies not reported ? Type 21.4% was jargon, blank, or irrelevant",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Sacaramozzino 2012,Data preservation,"? 97% report that they are personally responsible for data management ? 40% indicate that no one was responsible",Data backup,"? 94% of respondents report storing the primary copy of their research data on their office computer ? 
30-35% also report storing the primary copy on a lab computer, home computer, USB flash drive, or external hard drive",Data sharing,"? 20% of faculty report being aware of criteria for the creation of descriptive information to aid in discovery and reuse of data ? 65% consider it important to openly share data and that colleagues should do the same",Educational Needs,? Not sure/not confident in their data mgmt skills,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Samuel 2015,Roles and responsibilities,"? In 45% of the DMPs, it was not made clear who was in charge of the data",File formats,"? 55% of the DMPs specifically mentioned file formats or a program that was going to be used to capture data ? 79% met the broader category of mentioning either or both file formats and data types.",Data storage,"? 19 out of 29 DMPs indicated using a department-run or university-run server",Intellectual property rights not included,"Almost half of the DMPs neglected to mention any sort of rights to the data",Data sharing,"? 17% of DMPs did not adequately describe how the researchers were planning to make their data publicly available",Documentation and metadata,"? 29% indicated metadata standards/schemas will be used ? 41% indicated plans for documentation and metadata are specified",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Sands 2014,Performance problems with the data transfer,"? Slow and unreliable data transfer over the network ? Slow and unreliable verification and validation of the data once received ? transfer and integrity checks required 10-15% of the library's preservation specialist's time for the full five year period",Workforce expertise,"? The library team did not have much expertise handling hundreds of TB of scientific research data, but did possess the combined workforce expertise to overcome the challenge ? Expertise included ability to problem-solve, seek out solutions, work in a team and write code ? 
Team wrote new code to address the performance problems with the data transfer and verification ? Lacking astronomy domain knowledge necessary to meet some of the preservation and curation goals ? Realized that an astronomer would be needed to perform the curation work instead of library team to ensure scientific accuracy",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Savage 2009,Authors' response to initial email,"4 refused, 3 did not reply, 1 willing to share data",Reply to follow up email with reminder of PLoS data sharing guidelines for authors who refused,"1 replied forbidden to share data, 2 refused citing ""too much work"", 1 required formal proposal to a trials group.",Second email sent to 3 authors who did not reply to 1st email.,"1 refused: wanted to conduct more analyses, 2 never replied.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Sayogo 2013,Two key determinants affect the researchers' willingness to publish their research data,"? Data management in terms of data management skills and organization support ? Acknowledgement of the data set's originator in terms of appreciation and legal and policy requirements ? The impact that these determinants have is contingent on the amount of data to be published",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Sayongo 2011,"Social, organizational, economic related challenges"," Form of Organizational Support Support on data management: Yes -51% , No - 49% Support on data storing during the project: Yes - 46%, No - 54% Tech. support for data management during the proj.: Yes - 57% , No - 43% Tech. 
support for data management beyond the proj.: Yes - 43%, No - 57% Training for data management: Yes - 27%, No -73% Fund for data management during the project : Yes - 39%, NO- 61% Fund for data management beyond the project: Yes - 27%, No -73%",Form of acknowledgement,"Co-authorship on publications: Yes- 60% , NO -40% Formal acknowledgement of the data providers: Yes - 93%, No- 7% Opportunity to collaborate with others: Yes- 81%, NO -19% Part of cost of data must be recovered: Yes -30%, NO- 70%",Technology related challenges,"Lack of data standards: Yes - 18% , NO -82% Research data collection process: Yes -89%, NO -11% Research data searching process: Yes- 86%, No- 14% Research data cataloging process: Yes -76%, No -24% Research data storing during the project: Yes - 85%, No-15% Research data storing beyond the project: Yes -56%, No- 44%",Legal and context policy challenges,"Place a condition on access to make data available: Yes- 82%, No- 18% Citation is required to use the data: Yes- 98%, No-2% Approval from data providers is a must for data reuse: Yes- 48%, No- 52% Review from data providers is need for data reuse: Yes- 62%, No-38% Reprint copy of products must be given to data owner: Yes- 70%, No- 30% Legal agreement should be obtained for data reuse: Yes- 45%, NO-55% The data provider is agrees to a statement of uses: Yes- 67%, No- 33% The funding agency requires data management plan: Yes- 34% No- 66%",Local context and specificity challenges,"Misinterpretation of data due to complexity of data: Yes -90% , No-10% Misinterpretation of data due to poor quality of data: Yes- 87%, No- 13% Data may be used in other ways than intended: Yes- 91%, No- 9%",Propability to publish data sets,"Not publish in Website: Org - 46.3% ., PI - 56.9%, Not publish in Research network: Regional - 64.7%, National - 45.3% Publish some in Website: Org. 
- 38.6%., PI- 31.8% , Publish some in research network: Regional - 27.7% , National - 37.2% Publish most in Website: Org - 11.2%, PI- 8.3%, Publish most in Research network: Regional- 6.0% , National -14.1% Publish all in website: Org - 3.9% , PI - 3.0% . Publish all in Research network: Regional - 1.6%, National -3.4%",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Schmidt 2014,Implemented the BExIS system to support data sharing and reuse.,,Take-home message [Case-study],"Institutions usually do not have the capacity to initiate the foundations of the infrastructure tasks before the project starts even though basic services are often required early on in the project. It takes time to develop tools such as standards-compliant data creation, whereas researchers usually cannot delay the creation of data",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Schmidt 2016,Access and licensing conditions,"'Public Domain' and 'Attribution' licenses are considered very useful by the majority of respondents, which is in line with most recommendations by advocacy organizations, policy makers and research funders",Guidelines for open data,"Only 23% (216 respondents) were aware of any guidelines for publishing data; all other 733 respondents did not seem to be aware of any specific guidelines",Expectations about functionalities of infrastructures for open data,"? The top four functionalities are that authorship and attribution information are highlighted, data are citable via persistent identifiers, links to publications are provided and restrictions, conditions and/or licensing information is communicated",Importance of open data for disciplinary communities,"? More than 4 out of 5 respondents highlighted that open data is very important for advancing research ? half of respondents consider open data important for supporting applications to societal problems",Barriers for publishing data as open data,"? 
Most important barriers were desire to publish results before releasing data, legal constraints, loss of credit or recognition and possible misinterpretation or misuse",Motivators to publish data as open data,"? Top motivators to publish data: acceleration of scientific research and applications, dissemination and recognition of your work research, personal commitment to open data, requests from data users and funder policy",Discovery of open data,"? References in journal articles, web search engines and data repositories were identified as the most common discovery routes (n = 774 respondents selected at least one option).",Data archives,"? Leading examples as suggested by the respondents included the following data repositories (roughly ordered based on a word cloud): NASA, Dryad, NOAA, GBIF, Pangaea, Figshare, data.gov.au",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Schumacher 2015,Electronic files that they would attempt to recover first in the event of an apparent loss,"? 42 participants (75%) identified electronic files within the category of scholarly materials (including research data and scholarship in a broader sense); 23 (41%) designated teaching materials; ten (17.8%) selected administrative and/or organizational materials; and five (8.9%) named electronic communications. ? They included the following formats: JPEG, PDF and .doc most often",The data suggest that many faculty members are unaware that their data is at risk. They also indicate a strong correlation between faculty members? digital object loss and their data management practices.,"? 
37 (66%) relied on the hard drives of their office computer; 22 (39%) used an external hard drive; 21 (37.5%) used a hard drive as a built-in component of a personal computer; 18 (32%) used cloud-based services; 16 (28.5%) used a Flash/USB drive; ten (17.8%) used their email account(s); six (10.7%) used means or devices not mentioned in the project interview?s list of storage options; and three (5.4%) relied on optical discs like CDs or DVDs. ? 20 (37.5%) employed institutional networked capacity for the storage of work-related material; one participant made use of an external, discipline specific repository ? 31 (55.3%) were aware that they had lost work-related digital objects and were unable to replace them with back up files ? Faculty members able to use college administered networks from off-campus reported slightly lower levels of data loss",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Schwartz 2010,Data sharing,"Respondents strongly endorsed data sharing, with the caveat that principal investigators should choose whether or not to share data they collect",Use of Medical Education Repository,"? Majority believed that a repository would benefit their unit and the field of medical education ? Few respondents reported using existing repositories",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Shen 2015 (+ Companion),"The results suggest that these data assets have huge potential value. They often contain valuable information for new investigations or longitudinal analysis, or support data integration using new analytical techniques.","? A total of 472 faculty responded and over half (57%) considered their data to have long term value. 45% indicated that their data cannot be recreated or recollected, and 44% reported that their data can be repurposed and reused by researchers in the same discipline or sub-discipline(s) ? 
29% believe their data to have important and direct societal, policy or business implications; 28% indicate that their data can be repurposed and reused by researchers in other disciplines",There are differences in openness of data within the college communities at VT,"? There are significant differences among the colleges in their community level of engagement in openness of data (p<0.05) ? The significant differences among the colleges in openness of data suggest different cultures of data sharing activities and community practices.","A significant gap between the rather limited sharing activities and the highly perceived reuse or repurpose values regarding data, indicating that potential values of data for future research are lost right after the original work is done. ","? Reasons researchers who don't make their data openly available after project completion: Confidentiality or data protection issues (58%); time and effort in preparing data for sharing (44%); no incentive because not required to do so (36%) ? 56% of respondents never or seldom reuse existing data; top three concerns: difficulty finding or accessing reusable data, difficulty integrating data, and possible misinterpretation of data. ? 67% of respondents: some of their data could be reused or repurposed; 24%, all of their data could be reused or repurposed (total of 91% who consider their data to have potential reuse value)",Data storage,"? Data storage mainly stays at a personal level, either on personal computers or personal storage devices ? Formal repository systems such as institutional, domain or disciplinary-specific, publisher or publisher-related, and other types of repositories or archives (such as national data centers) are rarely chosen for data storage",Research Data Documentation and Management Practices,? 49% report no standard metadata and documentation schemes in use,Data Management Planning,? 
61% do not DMPs,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Si 2015,Offering research data services ,"Of the 87 university libraries investigated, 50 libraries have offered research data services.","Services can be divided into six aspects: 1) research data introduction, 2) data management guideline, 3) data curation and storage service, 4) data management training, 5) data management reference and 6) resource recommendation","Research data introduction is the most frequently provided (47.13 per cent) followed by data curation and storage services (43.68 per cent) data management guideline (42.53 per cent) data management reference (41.38 per cent) resource recommendation (41.38 per cent) data management training (24.14 per cent)","Case study: The research data service workgroup at Wuhan University invited key stakeholders that would benefit from the work to join the work team at the beginning of the program, and these stakeholders gave much help and advice to the service",,"Case study: the library should play an intermediate role in developing research data services. The workgroup found that many scientists were unwilling to make their research data available to others due to fears that the data would be misused or misinterpreted. To ameliorate this, the workgroup provided the platform for data management and sharing, and let researchers themselves decide which datasets to submit, what information would be visible to others and the level of access to the data.",,"Case study: if actively invited or persuaded, many scientists were willing to open their research data that were not needed to be kept private or confidential, as the researchers recognized the value of data sharing to scientific data research.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Silva 2006,Take-Home Message [Case-Study],"? 
Results indicate that the users found the service easy to learn, comfortable, and useful, which we expect to play a key role in the sustainability of this digital library",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Simons 2013,Engaging with researchers and librarians [Case Study],"Outreach strategy included conversing with subject librarians about citation practices in different disciplines, introducing data citation as part of a standard consultation with a specific research group, and engaging with researchers at the point of data deposit into the institution's data repository",Take-home message [Case study],"? Benefits for researchers are not the same as benefits for institutions or for funders, and this needs to be kept in mind when communicating about citation benefits with people who may feel increasingly pressured by the multiple efforts already in place to measure the value of their research. ? Developing a culture of routine data citation is intricately linked to routine data deposit and data management practices",,,,,,,,,,,,,Burden when accessing and reusing data,"? Paying for access was considered the most significant burden ? about half of all respondents considered the varying degrees of data quality in different datasets, varying standards in how the data has been gathered and varying data formats as an obstacle",Wishes regarding open data,"? Improved access to climate data from specific countries, e.g. China, India, Russia, Asia, South America, France, and developing countries ? support of long-tail research datasets via repositories ? access to private-sector and economic data",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Simukovic 2014,Take-home message [Case study],"? Researchers are generally willing to share their data; ? They need support from central departments to make it easier to comply with data-relevant regulations (e.g. of funding agencies or journals) ? 
Professional skills regarding RDM have to be shared more efficiently, especially with younger researchers ? Legal issues are a very hot topic and some fundamental understanding or framework has to be provided as soon as possible ? A lot of issues cannot be solved by individual institutions alone and require strong cooperation and pooling of collective expertise.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Snadjr 2010,,Outlines the research groups' questions and process regarding the preservation of data and of accompanying descriptive documents in ScholarWorks repository ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Soehner 2010 (+ Companion),Approaches to e-science in Institutions,A high proportion of respondents to the survey indicated that their institutions were providing infrastructure or support services for e-science (45 of 58 or 77.5% of respondents),,,Data Support and Services in Institutions,"? 42 respondents indicated there were no designated units to provide data curation and support for scientific research data on their campus ? 22 respondents indicated that they had both central and distributed data centers for research data on their campuses ? 62% of respondents (26 of 42) indicated that their institution had not conducted an assessment of data resources and needs ? 5 institutions (12%) indicated that their institution supported them ",Approaches to E-science in Research Libraries,"? 73% (29 out of 40) of the respondents indicated the library was involved in e-science support at their institutions and that leadership of these efforts was primarily through a team effort (15 out of 31 respondents) or some combination of individuals, units and teams working together (13 of 31). ? 87% (27 out of 31) of the e-science services offered by libraries are provided in collaboration with other units on campus",Reference and Consultation Service,"Subject librarians provide the following: ? 
Finding and using available technology infrastructure and tools (22 of 29 respondents) ? Finding relevant data (24 of 29) ? Developing data management plans (23 of 29) ? Developing tools to assist researchers (22 of 29)",Resourcing E-Science Activities in Libraries,"? 17 out of 29 respondents indicated they have both individual discipline librarians/staff and dedicated data librarians/specialists taking on these duties ? 18 out of 28 respondents are reassigning existing staff or providing training to existing staff as part of an overall strategy to incorporate e-science responsibilities into their current portfolios",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Stamatolos 2016,"The analysis identified several themes prevalent among the researchers with respect to data management: a) conception of data; b) research process; c) handling of data; d) roles; e) sharing and ethics; f) ownership and stewardship; and g) institutional support"," n/a ",Theme 1- Conception of Data,"? When researchers hear the word 'data', they don't always think of the same things; assumption that it must involve quantitative data",Theme 2- Research Process,"? generally see data as integral to the research process rather than as product in and of themselves ",Theme 3 - Handling of Data,"? Handling of data did not conform to any categorizations by discipline; all have developed some method of backing up their data, most relying on local computers, external drives and campus servers",Theme 4- Roles,? High level of collaboration with partners; all except one of the interviewees were involved with external collaborators,Theme 5-Sharing and ethics,"? When first asked about sharing data, many of these researchers commented that they have already shared data through publications or presentations ? 
sharing data: a few would agree to share cleaned data with anyone, many prefer to share with only those who directly request it; concerns over interpretation of data were a worry for several interviewees",Theme 6- ownership and stewardship,? Control is desired by nearly all of these researchers and this extends to ownership and responsibilities,Theme 7- Institutional support,"? Several researchers expressed specific frustrations with the nature and level of technological and administrative support for their research. While some of these researchers did not mention any specific difficulties, others identified challenges early in the interview when describing their research process.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Stanton 2011,eScience professionals duties ,"? Locating collaboration opportunities ? communicating with others ? enabling collaborations and organizing teams ? analyzing researchers' technology needs ? coordinating between researchers and information technology experts (e.g., with technology requirements and specifications) ? ensuring compliance ? training researchers and others in using technologies","Major duties that mainly pertained to the use of ""things""","Computers and software: ? investigating technology solutions ? recommending technology solutions (by comparing technologies) ? implementing IT for researchers (install O/S, install software/application, manage collaborative technologies, and configure systems by using scripting) ? maintaining and managing the technologies (administer systems, maintain tools/technologies, and facilitate IT usage) ? preparing, compiling, and managing documents ? managing budgets and project processes.","Worker characteristics required for effective performance on the job, including knowledge areas, skills, and abilities.","Knowledge: ? of databases ? of terminology and methods in scientific subject domain areas (e.g., physics) ? of information technology ? of programming or scripting languages. Skills: ? 
administrative skills ? communication skills ? database management skills ? programming and scripting skills ? project management skills ? research skills ? system administration skills ? general computer skills Abilities: ? ability to work well in a team environment ? ability to quickly learn new material ? ability to communicate with others.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Steinhart 2012,Data management planning requirements,? 62% interested in help with writing a data management plan; 13% said not interested,Respondents produce a wide variety of types and formats of data,"? Text, databases, image and code were the most common types of data ? 48% report generating three or more of the data types",Respondents are generally uncertain as to whether the data they produce conform to disciplinary standards,"? 43% said they don't know if their data conform to disciplinary standards, 12% indicated that their data do not conform to disciplinary standards","The majority create no metadata; of those that do, most do not create metadata according to a particular standard","? 42% had created, or plan to create metadata for their data sets; 1/3 of these create metadata conforming to disciplinary standards","Access, sharing, confidentiality, security, and intellectual property","? A majority of respondents (65%) reported no need for assistance from an intellectual property specialist to develop usage statements for or apply licenses to their data sets ? 95% reported that they would be able to share their data at some stage in their research; 68% prefer to wait until at least 6 months after analyzing their data to share",Researchers report using a variety of strategies (on-campus and commercial) for backing up and for providing access to their data sets.,"? 77% plan to share less than 100GB of data; 11% more than 100GB; 4% 1-100TB; 4% more than 100TB ? 
More than 80% indicated they rely on their own infrastructure for backups; 23% use a campus service for backups; 7% use a commercial solution; and 5% reported not backing up their data at all",Potential impact (of NSF requirements) on campus services,"Challenges: ? Considerable confusion exists as to what 'counts' as data, even among researchers who are likely among their discipline's experts ? providers of data services will encounter a very broad array of digital content in the course of planning and delivering data management services ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Strasser 2012,Data management education is not currently a priority for ecology instructors teaching general ecology courses at the undergraduate level. ,"Less than half of the courses surveyed covered any of the 12 data management topics","Undergraduates in ecology are not learning how to properly document, organize, and archive data sets in the context of their relevance to practicing ecology","Few instructors of these courses required that students keep laboratory or field notebooks, although student-generated data was used in more than 50% of the ecology courses","There is an urgent need for trained scientists who are instructing future scientists to understand the importance of data management education.","Notably, 77% of instructors indicated that data management should be taught in a different course. Based on self-reported barriers to teaching, the majority of the instructors surveyed (over 60%) struggled with the amount of time they were given to cover a broad range of ecological topics, theories, and concepts",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Sturges 2014,Publishers vary widely in their approach to sharing data on which articles are based,"We found at the time of analysis that the overall landscape of journal data sharing policies contained patchy and inconsistent coverage. 
Such a situation appeared inadequate in an environment in which the rhetoric and policy advise and encourage data sharing. For example, some journals had multiple policies (two or three) whereas 50% of the journals examined had no data sharing policy at all. Of the 230 journal policies found, 76% were by Piwowar and Chapman's definition weak, with the remaining 24% being strong. Significantly, the journals with high impact factors tended to have the strongest policies. Not only did fewer low impact journals actually have any data sharing policy, the policies they did have were less likely to mandate data sharing.",Summary of main points discovered from survey of journal data policies,"What to deposit Vague terms ? Supporting information; Unspecified data; Other data; Supplemental data (after discussion) Least commonly mentioned ? Structures; Protein; DNA sequencing; Program code; Software Most commonly mentioned ? Data sets; Multimedia; Specimens; Samples; Material Where to deposit: Vague - 7%, Unnamed repository - 17%, Named repository - 15% Expectations of access: Low cost access - 8%, Free access - 2%, Open Access - 1% When to deposit: With submission - 51%, For peer review - 23%, On publication or later - 26%",Stakeholder consultations,"Although all stakeholders purported to be in favour of shared data and were willing to list the benefits of data sharing, they all raised caveats and concerns and identified barriers to the sharing of data. For instance, it was clear from researchers' 
comments during the focus Around 40% of the respondents admitted that they did not allow others access to their data, and the rest mainly shared only with collaborators and colleagues.",,Publishers also indicated a number of concerns about linking data from repositories.,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Sulakhe 2012,"How Globus Online, a hosted service, addresses the data movement challenges by providing high-performance and fire-and-forget data movement capabilities - it is a transfer solution for managing 'enormous' volumes of data with solutions for specific cases","How Globus Online, a hosted service, addresses the data movement challenges by providing high-performance and fire-and-forget data movement capabilities - it is a transfer solution for managing 'enormous' volumes of data with solutions for specific cases",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Suntae 2014,Repository operation status and language of the repository content,"? Repositories operated in Korea, China and Japan account for a large ratio in Asia (42.2 percent) but account for a small ratio (8 percent) of the global total ? The repositories that provide the content in Japanese and Chinese were 5.57 percent and 4.14 percent, respectively. ? The Korean content was 0.72 percent and accounted for 17th place among the 64 languages of content provided.",Registration status,"? OpenDOAR provides 197 repositories ? ROAR provides 288",China/Japan/Korea repository subject,"? The Multidisciplinary subject area ranked first and second in the top of the repositories operated in three countries. ? In Korea and Japan, the repositories in the 'Health and Medicine' area are in the top ? In China, 'Ecology and Environment' ranked at the top. ? China did not have a repository in the 'Health and Medicine' subject area in the top ten. ? 
In Korea and Japan, the repositories in the field of humanities and social sciences, such as 'Law and Politics,' 'Business and Economics,' and 'Arts and Humanities General' all appeared in the top ten. ? For China the repositories in the science and technology related subject areas mainly appeared in the top ten. ",Repository software,"? 'DSpace' software is the most widely utilized repository system in the world and this is the same for China/Japan/Korea ? DSpace (OpenDOAR-Global 1087; CJK 146, ROAR-Global 1446; CJK 224) is followed by 'EPrints,' 'Digital Commons' and 'OPUS'",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Tenopir 2011,Majority of respondents willing to share at least some of their data and re-use others' data pending certain conditions or restrictions on use,"Nearly two thirds of the respondents (65%) reported that they would be more likely to make their data available if they could place conditions on access. Respondents do not differentiate much between what they consider fair conditions for use of others' data and fair conditions for use of their own data ","The results imply that there is a lack of awareness about the importance of metadata among the scientific community, at least in practice","More than half of the respondents (56%) reported that they did not use any metadata standard and about 22% of respondents indicated they used their own lab metadata standard.",Current Data Practices,"When respondents were asked whether their primary funding agency requires them to provide a data management plan, more than half (55%) reported no, 29% yes, and 16% said they do not know. 
Note: At the time of the survey the NSF did not require plans for funded projects",Data use,"An organization-specific system: 351, 38.5%; Long-term Ecological Research Network: 292, 32.1%; Other data access: 246, 27.0%; A Distributed Active-Archive Center: 173, 19.0%; A Global Biodiversity Information Facility: 73, 8.0%; National Biological Information Infrastructure: 70, 7.7%; National Ecological Observatory Network: 64, 7.0%; International Long-term Ecological Research Network: 58, 6.4%; Taiwan Ecological Research Network: 7, 0.8%; South African Environmental Observation Network: 6, 0.7%",Data types,"Experimental: 711, 54.6%; Observational: 63, 48.5%; Data Models: 499, 38.3%; Biotic Surveys: 446, 34.3%; Abiotic Surveys: 442, 33.9%; Remote-Sensed Abiotic: 358, 27.5%; Remote-Sensed Biotic: 264, 20.3%; Social Science Surveys: 251, 19.3%; Interviews: 195, 15.0%; Other: 80, 6.1%",Data practices,"A majority of the respondents are satisfied with their current processes for most of the initial and short-term parts of the research and data lifecycle, including collecting their research data, searching for their data, analyzing their data, and short-term storage of their data. A smaller majority say they are satisfied with cataloging or describing their data (59.8% agree strongly or somewhat). Satisfaction rate for the process of storing their data beyond the life of the project (long-term) is much lower than the short-term, only 45% versus 73%. More than a third (35%) of the respondents stated that they are dissatisfied with the long-term storage process",Data management tools,"Only about a quarter (26%) of the respondents are satisfied with the tools for preparing metadata, while over 32% are dissatisfied. 
A large number of respondents (42%) replied that they neither agree nor disagree",Data management support and policies,"43% of the respondents agreed that their organization or project has a formal established process for managing data during the life of the project. 47% of the respondents disagreed with the statement that their organization or project has a formal established process for storing data beyond the life of the project. 38% of the respondents reported that they have a formal established process for storing data long-term. 45% of the respondents replied their organization provides, to a degree, the necessary tools and technical support for data management during the life of the project (short-term). 35% of the respondents are provided with the necessary tools and technical support for long-term data management. 48% of the respondents reported that their organization or project does not provide the necessary funds to support data management during the life of a research project. 59% indicated that their organization or the project does not provide training on best practices for data management. 59% of the respondents replied that their organization or project does not provide the necessary funds to support data management beyond the life of the project",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Tenopir 2013,"Interaction with faculty, student, and/or staff regarding RDM",More than two-thirds of the 223 respondents have provision of research data services as an occasional or integral part of their job responsibilities,"Do academic librarians have the background, skills, and education to provide library-based research data services (RDS)?","? 78% for whom RDS are an integral part of their job responsibilities somewhat or strongly agreed that they have the necessary skills, knowledge, and training to provide RDS ? 
46% of those who have occasional responsibilities for RDS agreed that they have the necessary skills, knowledge and training.",Responses to the statement that their jobs allow them sufficient time to provide RDS to their patrons,"The majority of respondents for whom RDS are integral to their role feel they have enough time to dedicate to RDS, and that the library provides opportunities to develop skills related to RDS. Training and/or education on RDS.",What are librarian attitudes regarding the importance of RDS for their libraries and their institutions?,"The majority of respondents agree strongly or somewhat. A vast majority of respondents indicated that they feel RDS are necessary services. The majority strongly believe the library is best suited to provide RDS. There was overwhelming agreement that providing RDS will increase the visibility and impact of their institutional research",What are the factors that contribute to or inhibit engagement of librarians in RDS?,"Some of the aspects that motivate librarians currently involved in RDS, from higher to lower, are: A professional interest in RDS; RDS is important to subject disciplines they support; RDS is a primary responsibility of their role; Their role includes facilitating contributions to their IR; Their role includes metadata creation, training and/or management; Their research includes RDS. For librarians not currently involved in RDS, the factors, from higher to lower, are: Patrons requesting RDS; In case RDS became a responsibility in their role; In case their institution becomes more involved with RDS; In case their institution develops an IR that accepts data; If external funding agencies require RDS; If RDS becomes important to subject disciplines they support; Other",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Tenopir 2015,Data sharing and reuse: Perceptions,Respondents were asked about willingness to engage in scientific data sharing and reuse. 
A MANOVA reveals that there has been a significant shift in agreement about these topics since the baseline study,Data sharing and reuse: Practices.,"When reusing data, univariate ANOVAs show that respondents' agreement with the idea that data may be misinterpreted due to complexity of the data has seen a significant increase over time. Respondents' perceptions that data may also be misinterpreted due to poor quality of the data have also increased from the baseline to the current study. Finally, respondents expressed significantly higher agreement in the current study that data may be used in ways other than intended than in the baseline.",Age of researchers with regards to data sharing practices,"Younger researchers are more concerned than older researchers are about the lack of access to data. Younger researchers also express more interest in using the datasets of others if access were easy and in sharing their own data if they could place restrictions",Subject discipline,"In particular, those in Medicine/Health Sciences and others who work with human subjects were significantly less willing to share their data than respondents in other disciplines. Social scientists expressed significantly less satisfaction with their ability to use others' data to address their own research questions. Those from Education, Medicine/Health Science, and Psychology were more inclined than the overall total to agree that their data shouldn't be available for others to use in the first place, and also more inclined to agree that they don't have the right to make their data available in the first place",Satisfaction with data practices.,"Comparing results from the two studies demonstrates that there are some significant changes in the level of satisfaction with different processes within the research and data lifecycle. 
There is less satisfaction with long-term data storage processes and tools for preparing documentation, and processes for searching for respondents' own data; however, respondents continue to be the least satisfied with processes for storing data beyond the life of the project (long-term) and tools for preparing metadata",Perceptions of organizational support,"There is greater agreement about the need for training on data management best practices, and the need for research organizations to provide the funding for data management.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Trimble 2015,Data storage [Case Study],"Creation of a centralized system for geospatial data storage, description and access (essentially, a spatial data infrastructure for Ontario libraries), intended to house consortially licensed geospatial datasets",Preservation [Case Study],"? Work has included sharing scanning specifications and georeferencing techniques, group funding for multi-institution projects, and a push to disseminate the outputs of these projects on platforms such as Scholars GeoPortal, ArcGIS Online, and Google Earth.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Tuyl 2015,Survey and interview results suggest moderate level of awareness of the regulatory environment around research data management.,"? Across all colleges, 64% of respondents are aware of current requirements by funding agencies; ? only 44% have been required to submit a data management plan as part of a grant",Results also present a clear picture of the types and quantities of data being produced at CMU and how these differ among research domains.,"? Nearly 5% of projects at CMU produce more than 10TB of data, 12-15% produce 1GB-10TB, 40% produce less than 1 GB of data, and 18% of projects produce less than 50 MB of data ? estimates of the amount of data shared outside of a project's immediate collaborators are, as expected, lower than the total amount of data produced by the project. ? 
Faculty indicated that they use many types of files for their data, with the five most common formats being: data tables (e.g. Microsoft Excel, comma delimited text), documents (Microsoft Word, PDF), code (e.g. Python, R, MATLAB), text files, and image formats (e.g. JPG, TIFF). Across colleges, uptake of various file formats varies slightly, though major format categories show some uniformity from college to college",Researchers identified a number of services that they would find valuable including assistance with data management planning and backup/storage services.,"? The top services of interest were: help creating data management plans for grant proposals, services for long-term preservation and access to datasets, in-depth data management planning for the lifecycle of the project (i.e. operational data management planning), and providing guidance for identifying appropriate discipline-specific repositories for research data.","Aware of Impending regulatory changes ","? 51% Aware ? 49% Unaware",Data Management,"The major modes of storage for research data at CMU are desktop/laptop computers, external hard drives, cloud storage, and IT infrastructure maintained by the research group or the department",Preference for potential research data management services,"73% DMPs for Grant proposals; 59% Preservation of Data; 51% DMPs - Operational; 45% Domain Repositories; 39% Intellectual property; 38% Privacy and Confidentiality; 35% Documentation and Metadata; 32% Hosting Applications; 32% Impact Metrics; 28% Access controls and Embargoing; 25% Connectivity and Linking; 16% Data Formats",Data management plans [Interviews],"? Researcher use of DMPs, of any type, for research projects was quite limited. ? Formal data management plans were largely used by researchers who were mandated by funding agencies. Informal data plans or standards were more common and were set at the laboratory or project level.",Data backup and storage [Interviews],"? 
Most researchers are storing and backing up data on local computers and external hard drives, with departmental IT units, or in cloud services. ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Valentino 2015,Similarities among the interviews," ? The types of data collected by students vary a great deal, and there are differences in how and whether existing data is repurposed and integrated with data collected by the student. ? the environment in which the graduate students gathered and analyzed their data was also found to vary widely ? All five students understood the benefits of data management even though their understanding of what good data management entails was not fully formulated. ? Each student enthusiastically supported the preservation and sharing of their data ? after its purpose was described, the five students understood the importance of metadata for discovery and sharing.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Van den Eynden 2014,Diverse Modes of Sharing data,"? Sharing data is seen by most researchers interviewed, across all case studies, as part of the normal scientific process. ? Usually an informal agreement of understanding about the ownership of the data and how other researchers can use them ? Researchers frequently share data via publications (i.e. supplementary data). ? journals usually do not set standards or expectations for supplementary files ",Optimal Data Sharing moments ,"Research data tend to be shared at two main phases in the research cycle: 1. early in the research process 2. at the time of publication",Using Data,"Whilst all interviewed researchers share their own research data and many use data from other researchers, not many access data from public data repositories or community repositories. ",Tracking reuse,"Interested in tracking use of their shared data, although they routinely expect citation or co-authorship when data are used by other researchers. 
",Sharing Culture,"Norms within research groups, departments, projects or entire disciplines strongly influence data sharing in these case studies, either in favour of sharing ",Conditions for data sharing," Formal research assessment, with its emphasis on the impact of articles, was seen as doing untold damage to motivations to share data. Researchers find it easier when sharing activities build on normal and routine work practices.",Sharing negative findings and failed experiments,Many researchers in the [Chem] and [Adaptomics] cases regret the fact that currently failed experiments and negative findings are typically not published and therefore the data or information related to these is not shared.,Variation across stage of career ,"Researchers early in their careers, typically experience two competing pressures. First, they confront fears of exposing their work, both out of concern about getting scooped, and also because of potential embarrassment for showing immature, naive, and possibly wrong data or procedures. On the other hand, it is these same early career researchers who are highly motivated to make a name for themselves by getting credit for new methods, procedures, and increasingly, by sharing data. ?For researchers in the middle of their careers, there is less concern about showing immature work, although fears of being scooped can still persist. 
? As researchers enter the later stages of their careers, interest in sharing data can grow quickly in some cases ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, vanderGraaf 2008,Coverage of universities,Between 34% and 47% of the European Universities have a research repository,Annual growth in the number of repositories,"The number of institutional repositories is growing with an annual number of approximately 25 to 35 institutional repositories. The disciplines are fairly evenly represented in the materials, albeit with a slight overrepresentation of Humanities and Social Sciences with 35%. Remarkable increase in the percentage of repositories that use a combination of various workflows",Content,"70% of the participating institutes have one research repository, nearly a quarter more than one. Approximately 8% have outsourced it.",Repositories by institute,68% of the participating institutes maintain one research repository for research output themselves,Types of material and discipline,"The majority of the research repositories of the participating institutes contain the full text of textual materials, the majority of these being journal articles, theses and working papers. The minority contain metadata, and a small number of repositories contain non-textual material such as images and video. The disciplines are fairly evenly represented in the materials, albeit with a slight overrepresentation of Humanities and Social Sciences with 35%",Access,The vast majority of repositories contain textual materials with open access availability,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, VanTuyl 2016,Effectiveness of sharing data by federally funded researchers was poor in the large majority of cases,"Of the 25 project DMPs for which we attempted to locate shared datasets and generate a DATA score, nineteen (76%) had an overall score of 0, one each had a score of 2, 5, 6 or 8, and two had a score of 7. 
(zero is bad)",Researchers are largely unaware of how or where to share their data effectively,"When searching journal articles, data were almost never shared as a citation and were often found in acknowledgements, mentioned or linked in methods (but not cited)",,Creation of a data management plan has very little bearing on whether or how datasets are shared,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Varvel 2012,Course categories,"? Data centric - less than 8% ? data inclusive - 11% ? digital - 27% ? traditional LIS - over 50% ",Concept representation,"Most prevalent terms included: ? metadata, ? preservation ? management ? retrieval ? archiving ? digitization ? human computer interaction",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Varvel 2013,The functions of Data Mgmt Consultants,"Data Mgmt Consultants serve to consult on data management, build data management processes, manage partnerships, market services, and perform early data lifecycle functions",The requirements placed upon ,"? Consultants require a diverse set of knowledge or backgrounds including LIS or data management, domain experience, computer science or information technology, and some statistics knowledge. ? A customer service mindset coupled with project management and technical writing abilities are also necessary in the JHU context",Role Requirements,"? 
Specifically mentioned skills or knowledge required for a DMC from interviews and source documents include: a) LIS degree or background with archival, data curation, data repositories, and intellectual property knowledge; b) Domain expertise (not domain specificity) with research understanding or experience; c) Computer science or information technology background or knowledge with database experience; d) Understanding of data management; e) Statistics knowledge or experience; f) Customer service mindset including communication abilities and collaboration skills; g) Technical writing ability; h) Project management experience; i) Flexibility and quick learning",Data Mgmt Consultant Specific Backgrounds,"Depending on the scope of the services developed within any particular data management service, domain expertise may be more or less important.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Verbaan 2014,Professional views of RDM,"? IT professionals saw RDM as about data storage, especially active data storage and information security. ? Research administrators tended to see it as about data sharing, driven by research quality and compliance with funders' requirements. ? Librarians saw it predominantly as about data storage and preservation, but also advocacy and training. ",Distribution of RDM Roles Between Services,"? The longer term preservation of data was identified as an area where at present only the Library had an interest. ? IT highlighted first of all storage from both an infrastructure (hardware) and an application (software) point of view, guidance, training and support as the areas they were likely to get involved in",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Verbakel 2014,NA [Take-home message],"? Participants enjoyed and appreciated their discussions resulting from the homework assignments, which were seen as the most valuable element of the course. ? 
Four days of face-to-face tuition were seen as a considerable time investment, but useful because of the relevant discussions and networking possibilities. ? Participants were interested to hear from researchers themselves in terms of how they deal with data management issues, and about differences between disciplines. ? Participants missed the opportunity to practice writing an actual data management plan. ? Participants urgently needed practical information about setting up a front office for data management services. ? The participants appreciated the images included in the course material on the website, which they thought were a memorable way to clarify concepts. ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Vines 2013,Archiving policy impact," ? (Likelihood ratio test statistic: 4.27, p = 0.038), such that the odds of getting the data were 25 times higher [95% confidence interval (CI): 1.5-416.7]. ? Having a 'recommend archiving' policy made it 3.6 times more likely that the data were online compared to having no policy. However, the 95% CI overlapped with 1 (0.96-13.6); hence, this increase in the odds is not significant. ? Overall, recommending data archiving is only marginally more effective than having no policy at all. ? The data were 17 times more likely to be available online for journals that had adopted a mandatory data archiving policy but did not require a data accessibility statement in the manuscript. This odds ratio was significantly greater than 1 (95% CI: 3.7-79.6). ",Combination of a mandatory policy and an accessibility statement is much more effective than any other policy type,"For 'mandate archiving' journals where a data accessibility statement is required in the manuscript, the odds of finding the data online were 974 times higher compared to having no policy. The 95% CI on these odds is very wide (97.9-9698.8)",Requesting Data Directly From Authors,"More than one email had to be sent to authors requesting data. 
Not all authors replied, and at times the data sets were incomplete. Requesting data directly from authors can also provide access to research data, but this approach can be hampered by delays and the potential for disagreement between requester and the authors.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Vines 2014,Availability of the data was strongly affected by article age,"Finding a working e-mail address for the first, last, or corresponding author fell by 7% per year; suggests that for every yearly increase in article age, the odds of the data set being extant decreased by 17%; asking authors for their data shortly after publication does yield a moderate proportion of data sets (~40%); when the authors did give the status of their data, the proportion of data sets that still existed dropped from 100% in 2011 to 33% in 1991 ",Age of the paper and contact information to request data from authors,"There was a negative relationship between the age of the paper and the probability of finding at least one apparently working e-mail either in the paper or by searching online: ? finding a working e-mail address for the first, last, or corresponding author fell by 7% per year. ? suggests that for every yearly increase in article age, the odds of the data set being extant decreased by 17%. ? asking authors for their data shortly after publication does yield a moderate proportion of data sets (~40%). ? when the authors did give the status of their data, the proportion of data sets that still existed dropped from 100% in 2011 to 33% in 1991",The conditional probability that the data were extant (either 'shared' or 'exists but unwilling to share') given that an informative response was received,"There was a strong negative relationship between the age of the paper and the probability that the data set was still extant (either 'shared' 
or 'exists but unwilling to share'), given that a response indicating the status of the data was received",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Vlaeminck 2014,Datasets accepted,73.7% of responding organizations accept external datasets for storage,Code accepted ,Almost 70% of organizations responding offer options to store and host computational codes,Common metadata standards in social science,"More than 70% of respondents use DDI, Dublin Core 29%",Support for metadata creation,Majority (almost 65%) of all responding organizations have procedures in place to support researchers in generating the necessary metadata,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Voell 2010,"Culture of sharing exists within the C. elegans community","The majority of worm researchers surveyed report sharing data and research pre- and post-publication, regardless of the requirements of their funding bodies. 518 (76.2%) agreed that one should not restrict access or use of scientific data; 616 (90.6%) reported that they encourage their colleagues to share data and research materials.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Vogeli 2006,Exposure to data withholding,"? 246 trainees (23.0%) reported that they had asked for and had been denied access to information, data, materials or programming associated with published research ? 221 (20.6%) denied access to material for unpublished research ? 85 (7.9%) had denied another academic scientist's request for their own data ",Trainees denied access to research were significantly more likely to report that data withholding had had a negative effect on several aspects of the educational experience.,"? 
533 (50.8%) reported that withholding had had a negative effect on the progress of their research, 508 (48.5%) on the rate of discovery in their lab/research group, 472 (45.0%) on the quality of their relationships with academic scientists, 346 (33.0%) on the quality of their education, and 299 (28.5%) on the level of communication in their lab/research group. ",Predictors of data withholding/ multivariate analyses ,"? Trainees with industry support were significantly more likely than those without industry support to have been denied access to both published (odds ratio [OR] 1.45, 95% confidence interval [CI] 1.01-2.09) and unpublished (OR 1.49, 95% CI 1.03-2.15) information, data, and materials ? Trainees in high-competition research groups were 1.7 times more likely than those in low-competition research groups to have been denied access to unpublished information (95% CI 1.19-2.44). ? A significant predictor of being denied access to published information, data, and materials was the race/ethnicity of the trainee making the request. ? Compared to whites, trainees who classified themselves as Asian were significantly less likely to be denied access (OR .67, 95% CI .46-.96). ? Trainees in high-competition research groups were also almost twice as likely as trainees in low-competition research groups to report having denied another's request for information, data, or materials (OR 1.81, 95% CI 1.02-3.22).",Consequences of withholding ," ? Five hundred thirty-three respondents (50.8%) reported that withholding had had a negative effect on the progress of their own research, 508 (48.5%) on the rate of discovery in their own lab or group, 472 (45.0%) on the quality of their relationships with other academic scientists, 346 (33.0%) on the quality of the education they receive, and 299 (28.5%) on the level of communication in their lab or group. ? 
125 respondents (11.8%) reported that their research had been delayed by more than six months because another academic was unwilling to share data with them. ? On every measure, trainees who had been denied access to information, data, or materials (either published or unpublished) were significantly more likely to report data withholding had had a negative effect (all p <.05). ? Those who had denied others' requests were significantly more likely to report negative effects on communication within their research group (OR 1.74, 95% CI 1.02-2.97), on the quality of the education they receive (OR 1.75, 95% CI 1.04-2.93), and on the quality of their academic relationships (OR 1.90, 95% CI 1.14-3.16). ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Vrana 2013,Need for a digital repository for scientific works and research data,"70.7% had a sufficient quantity of scientific works and research data in order to justify the establishment of a digital repository. 12.1% did not; 17.2% couldn't estimate the quantity",Does the institution in which your library exists have a digital repository of scientific information?,The majority of libraries did not have a repository. 
Less than 25% did.,Does your academic institution have a sufficient number of scientific works and scientific research data in order to justify the establishment of a digital repository?,"41 (70.7%) confirmed that they had a sufficient quantity of scientific works and research data in order to justify the establishment of a digital repository; 7 libraries (12.1%) claimed that they did not have a sufficient quantity of scientific works and research data to justify establishment of a digital repository; 10 libraries (17.2%) couldn't estimate the quantity of scientific content.",Does your academic institution have a written plan of development of a digital repository of scientific information?,"A large percentage (77.6%) of libraries which participated in this research did not have such a plan ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Waddington 2013,Perceived use and value of cloud services,"This is influenced by several factors, including: previous financial or time investments that had been made into local storage/processing infrastructure (IT staff were less willing to consider cloud alternatives, if they had an existing infrastructure in place); staff time and training requirements necessary to transition to cloud services; the costs of using cloud services (cloud services must be affordable and offer better value over storage and processing data on non-cloud systems); awareness of 'pressure points' in existing local infrastructure (e.g. insufficient storage capacity, limited processing capabilities).",Payment for cloud storage,"A 'pay per use' funding model adopted by many cloud-based services makes it an appealing option for researchers who wish to store and/or process data over a short time period or have limited or no existing infrastructure to perform the activity. 
Maintaining data in the cloud beyond the project's lifecycle is more problematic, but increasingly demanded by funders.",Handling Intellectual Property Rights and legal compliance issues,"Data assets created and used by researchers may be subject to Intellectual Property Rights, legal compliance, or contractual requirements that specify the location where data assets may be stored and the security arrangement that must be made to prevent unauthorised access.",Trust ,Many researchers perceived that their data were more secure if stored on portable drives or on local server infrastructure than it would be in commercial cloud,,,,,,,,,"The following factors are seen as most important : The increased visibility for the publications of the academics A simple and user-friendly depositing process A mandatory policy for the deposit of research output by the institute An improvement in the situation with regard to the copyright of published materials Requirements by research funding organisations for the deposit of research output in repositories Awareness campaigns for academics Interest from decision-makers in the institute.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Wallis 2008,Several findings are of particular import for data curation,"? Cumulative effect of decisions made at each stage of the life cycle; decisions made in the experimental design stage determine what data exist for analysis, calibration decisions are essential for interpreting the data, etc. ? balance of decision making with respect to data between scientific and technology research partners",Identified nine stages of the data lifecycle that are common in ecological deployments at CENS,"? 
experimental design, calibration, capture, derivation, cleaning, integration, analysis, publication, preservation",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Wallis 2013,In only a few domain areas do CENS funders or journals require them to deposit data.,,Few repositories exist to accept data in CENS research areas.,,Data sharing tends to occur only through interpersonal exchanges.,,"CENS researchers are willing to share data if they receive credit and retain first rights to publish their results.",,"Neither CENS researchers nor those who request access to CENS data appear to use external data for primary research questions or for replication of studies.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Wang 2014,"A large portion of repository data items can be mapped to the current DO ontology, but document attributes do not always link consistently with Document Ontology axes, and additional values for certain axes, particularly 'Setting' and 'Role', are needed for better coverage. ",,"To achieve a more comprehensive representation of clinical documents, more effort on algorithms, Document Ontology value sets, and data governance over clinical document attributes is needed.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Wang 2015,"Combining their familiarity with emerging professional practices and resources, their efforts to gain a deeper understanding of the specific data management needs of researchers in the department, and their research into the evolving research data infrastructure in that particular discipline, the two are able to successfully connect researchers with the best practices in data management, suitable data repositories, and experts in the campus' 
Computing Services unit.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Weller 2014,Data storage practices,"Most common methods of digital file storage, from most used to least used: Hard drive or a CD; Cloud-based commercial storage like Dropbox or Google Drive images, recordings downloaded web content. For datasets though, a central university server was more common than cloud-based storage. For text files, printing and saving the physical versions of the files is still utilized relatively regularly ",Influences on data storage,"While the data storage practices are relatively consistent across researchers of different methodologies, the reasons they cite for their data storage practices vary by methodology. Historians are most influenced by long-term sustainability. Natural scientists are influenced by grant requirements",Data-related research challenge,"Among all respondents, results were relatively consistent across the first three phases of the research process (identification, acquiring and managing materials) as the most challenging data-related stages.",Data-related future needs,"Overall, researchers primarily anticipate needing assistance with data analysis, data storage and data dissemination.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Weller 2015,"Graduate students and faculty members report differences in their data practices, research challenges and data-related needs.","? Graduate students more likely to rely on cloud-based storage than faculty members; faculty members use university servers more often ? faculty more influenced by ease-of-storage method and grant requirements than graduate students ? graduate students have more difficulty identifying relevant data and managing data than faculty ? Faculty find acquiring access to data significantly more challenging than the graduate students. ? 
Analysis and dissemination are two areas where graduate students anticipate needing more assistance than faculty. Faculty, on the other hand, expect that they will need support with digitization and data storage. ",Usage of Cloud and University Server Storage ,"? 11% of Faculty Used University server ? 4% of students Used University server ? 10% of faculty used Cloud ? 22% of students used cloud",Factors in Data Storage Decisions ,"? 85% (Faculty) and 75% (Students) based on Ease of storage method ? 65% (Faculty) and 71% (Students) based on backup needs ? 55% (Faculty) and 60% (Students) based on File size ? 55% (Faculty) and 58% (Students) based on long-term sustainability ? 38% (Faculty) and 41% (Students) based on privacy/security concerns ? 35% (Faculty) and 49% (Students) based on Physical space requirements ? 25% (Faculty) and 41% (Students) based on cost ? 12% (Faculty) and 10% (Students) based on grant requirements",Most Challenging Phases of the Research Process ,"? 26% (Faculty) and 32% (Students) Identifying relevant data ? 32% (Faculty) and 23% (Students) Acquiring access to data ? 23% (Faculty) and 28% (Students) Managing Data ? 13% (Faculty) and 15% (Students) Disseminating research ",Future Needs with Research Data ,"? 15% (Faculty) and 36% (Students) Writing DMPs ? 41% (Faculty) and 35% (Students) Digitizing resources ? 40% (Faculty) and 62% (Students) Analysis ? 55% (Faculty) and 46% (Students) Storage, archiving, and preservation ? 41% (Faculty) and 62% (Students) Dissemination and publication",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, White 2010,"Contrary to assumptions that organization is completely individual and unique, there are trends in the way evolutionary biologists organize their research data.","? All of the scientists interviewed used some type of metadata or personally created organization scheme to arrange and later use their data collections. ? 
Organizing based on research question was a very popular method either primarily or secondarily for many scientists. ? The perception of organization was mixed among scientists. While some scientists indicated that they organized and acknowledged its importance in their daily work activities, others thought that organization was less important. ? Five participants asserted that organizing behavior and metadata choices would change based on the intended audience of the data set. ? Four out of the seven researchers interviewed expressed some type of organization anxiety.",Scientists' Perception of Organization,"? The perception of organization was mixed among scientists. ? Three participants saw organization as a vital part of the research process. ? One participant explained that organizing data was about research need, not about preference. In this situation, data was gathered, stored, and organized by individual (an easy collection unit) and organization was a powerful tool for answering research questions.",Continuum between practice and perception,"? Five participants asserted that organizing behavior and metadata choices would change based on the intended audience of the data set. ? Four out of the seven researchers interviewed expressed some type of organization anxiety. 
",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Whitmire 2015,"Data types, volume, and storage locations","The most common data types that OSU researchers produce are quantitative data, followed by digital images, and non-digital (handwritten) text. Faculty report that for a 'typical' research project, they generate less than 100 GB. OSU faculty report storing short-term data (data less than five years old) most often on personal computers and external storage devices",Data management tasks and roles,"With the exceptions of data analysis, sharing, and disposal, the survey results indicate that research personnel handle the majority of data management tasks ",Metadata practices,The proportion of faculty who report that they create metadata varies widely by college,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Whyte 2013,NA [Take-home message regarding creation of RADAR the institutional repository],"? A key point for all involved has been the need to build support around the link between managing data assets in the short term and their longer-term visibility. ? Sonic Art Research Unit's data is of highly varied types. ? Defining metadata and populating the repository with this at a suitable level remains a challenge that has required restructuring the collection on several occasions. ? Challenge: Using the collection to demonstrate impact",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Wicherts 2011,The reluctance to share data is associated with weaker evidence and a higher prevalence of apparent errors in the reporting of statistical results. The unwillingness to share data was particularly clear when reporting errors had a bearing on statistical significance.,"? Higher p-values, like those in the interval between .03 and .05 (which have a low chance of occurring regardless of actual effect sizes), were indeed more common in papers from which no data were shared (16.7%) than in the other papers (9.1%).",Responses to Data requests,"? 21 (42.9%) had shared some data ? 
13 corresponding authors (26.5%) failed to respond to the request or either of the two reminders. ? 3 corresponding authors (6.1%) refused to share data either because the data were lost or because they lacked time to retrieve the data and write a codebook ? 12 corresponding authors (24.5%) promised to share data at a later date, but have not done so in the past six years (we did not follow up on it).",Errors in the Reporting of Statistical Results [N=1148 Test statistics],"? 49 of these statistics (4.3%) were inconsistent with the reported (range of) p-values. ? In 47 of the inconsistent results (95.9%), the reported p-value (range) was smaller than the recalculated p-value. ? Although 51.1% (587) of the test statistics were from papers from which no data were shared, most incorrectly reported p-values (36 out of 49; 73.5%) originated from these papers. ? There were 10 cases (from seven papers) in which the recomputed p-value was above .05, whilst the result was presented as being significant (p < .05). None of the authors of these had shared data",Relationship between sharing data and errors in reporting,"The W statistic computed on the basis of actual difference between shared and non-shared gave a p-value of .0298. Hence, the analyses of individual p-values corroborated that p-values were significantly higher in papers from which data were not shared.",,,,,,,,,Metadata [Interviews],"Common themes: ? not knowing enough about metadata standards to create metadata ? the effort required to properly document datasets was disproportionate to the return on investment ? current documentation (simple readme files, etc.) was sufficient ? documentation in program code was sufficient for documenting output (results of the code) ? data was simple and thus did not require metadata ? data was 'self describing in some way' 
and thus did not require metadata ? data would not be usable even if it was well documented or the data creator would need to be involved in data re-use due to data complexity, so why create metadata",Data sharing practices [Interviews],"? Data sharing practices and expectations varied widely across interviews, and it is difficult to describe a pattern across the faculty we spoke to. ? the largest amount of research data is shared within a research group or between project collaborators but less data is shared with those not involved with a project. ? sharing within a research group is relatively common (though not always done), sharing outside of the research group is less common, and sharing with researchers outside of the specialized domain of research is even less common.",Restrictions on sharing [Interviews],"More than half of the researchers interviewed indicated that they had some concerns about data sharing due to a need to publish on data before sharing, privacy, or ownership",Services [Interviews],"Services that interviewees most frequently ranked highest priority were assistance with creating data management plans for grant proposals, providing a data preservation platform, assistance with making datasets citable in other academic works, and providing metrics for use of research data",,,,,,,,,,,,,,,,,,,,,,,,,,, Wiley 2015,Types of Datasets found in the IDEALS repository,"494 files deposited are text files; 240 are Microsoft Excel files; and 219 PDFs are represented in the repository. Text files are the most frequently deposited file type within IDEALS. ",Research Methodologies associated with datasets ,"? Common research methodologies indicated in this table are the total number of datasets that referenced case studies, surveys, questionnaires, conceptual analysis, statistical analysis, framework models, technology reviews, experiments, data files, and databases within their description ? 26% of these types of datasets are represented in the repository. ? 
Bibliographies represent 38% of the datasets ? Farming Inventory lists represent 29% of the databases",Peer Reviewed and funding agency related datasets ,"4% of the datasets were associated with peer-reviewed publications published within library, atmospheric, and biology disciplines",Research Discipline and Community ,"? 21% of the datasets within the repository are associated with research disciplines. Rare Books represents 38%, Illinois Department of Agriculture 35% and Special Collection 7%. ? 2.8% of the datasets represent disciplines or communities that are unknown. ? Research communities represent 78% of the datasets within the repository. ",A broad understanding of the representation of data in the IDEALS.,"? 507 datasets in IDEALS dating from 1905-2015 ? Text files are the most frequently deposited file type; bibliographies represent 34% of the datasets; and, farming inventory lists are 26% of the datasets. ? Various research disciplines represent 18% of the datasets and research communities are associated with 78% of the datasets. ? 7% of the datasets are sponsored by NSF, NIH, IMLS and DOE funding agencies.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Wilkinson 2014,NA [Take-home message regarding case study of RDM infrastructures project at University College London],"? People do not trust new services: Pilot users are generally using the service to 'back up' their data. This is slowly starting to change. ? There is a large skills variance that needs to be addressed. This often requires more one-to-one consultations, which are resource intensive. ? There is a wide variety of user/researcher needs and requirements. ? Technology, particularly infrastructures, are changing rapidly, as are the costs. ? Neither technology nor culture change will work independently. There needs to be a merging of the two and this represents a new professional space. Skilled and enthusiastic people are extremely hard to find and often attract a premium cost. 
Institutes should not ignore this. ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Williams 2012,Data source articles,"Fifty-five of the reviewed articles (44%) used a source of data other than traditional literature citation for information or ideas. ",Data Sources,"The data sources used in these articles varied. Numerous repositories and organizational websites also were the source of data used in these publications. Other sources of data included data from other articles, supplementary files, data from weather stations, and unpublished data files. Data sources varied somewhat depending on research/publication type.",Data Sharing Articles,"24% of the articles reviewed noted publicly sharing data beyond what was published in the journal article. The most common sharing method was via supplementary files published on the journal website. Other methods included supplementary material files, and external sources. The articles that noted data sharing, whether on the journal website or via an external resource, were primarily genetic research articles ",Data type,"The common data types are broadly categorizable in two ways. One category includes data that describe the experiment. The other category covers data generated by the experiment. ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Williams 2013 (+ Companion),Value in offering data training specifically for agricultural graduate students and research assistants and compiling examples of data management plans from successful grant proposals.,"? No respondents used library assistance with author rights, intellectual property, open access, or research data. ? Most respondents (6/7) did not feel well informed about or well prepared for funding agencies' data management requirements ? 
The most frequently selected services that were perceived to be helpful were: data management training for laboratory assistants and graduate students (5), involvement/integration in grant proposals and research projects (4), consultation on data management challenges/questions (3), and data management plan templates/tools (i.e., do-it-yourself resources) (3). ? everyone at the presentation (6) agreed that examples of successful grant proposals would be useful",Reasons for Sharing Data,"? Provide Confidence in research results ? Help the research community ? Meet expectations in this area of research ? Help manage and distribute data within the laboratory ? Meet funding requirements ? Make data more accessible",Supplementary Files,"? None of the participants voiced drawbacks about supplementary files as a data sharing method. ? All found benefit ? All of the participants who had shared data in supplementary files had also reused or tried to reuse data from supplementary files published by other researchers",Disciplinary Repositories,"? Several participants had used disciplinary repositories, and they noted many benefits ? In most cases, the repository served as a backup for local data, but one faculty member said that depositing data in a repository freed her and her computer resources from maintaining old data. Repositories also facilitated data distribution and access ? The most commonly cited drawback of disciplinary repositories was the amount of time and effort required to deposit the data",Challenges of Sharing Data,"? Some participants described general challenges of sharing data, regardless of the method. ? difficulty of organizing data sharing tasks around grant deadlines, academic calendars, and student transitions ? selecting the data sharing method can be challenging. ",Role of the Library,"? 
Most participants said they had not previously considered the library for assistance with data, but when asked how the library could help facilitate data sharing, they had a variety of ideas. ? A few faculty members suggested the library should provide a system that would allow researchers to share data for which no disciplinary repository exists. ? Other ideas focused on data education and training. ",Impact of Funding Agency Requirements,No participants felt their data sharing practices would change as more funding agencies require data to be shared. ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Willis 2012,Significant relationships were found between the domains and objectives of the schemes (p < .05),"? Schemes describing observational data are more likely to have ?scheme harmonization? (compatibility and interoperability with related schemes) as an objective ? schemes with the objective ?abstraction? (a conceptual model exists separate from the technical implementation) also have the objective ?sufficiency? (the scheme defines a minimal amount of information to meet the needs of the community) ? schemes with the objective ?data publication? do not have the objective ?element refinement.?",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Willoughby 2014,Use of metadata fields,"? 70% of ELNs surveyed used the ""section"" metadata five or less times; half of those used it once. Most frequently filled in with a filler, no descriptive value. These two results indicate a 'minimum required' approach to metadata notation ? Almost half of all notebooks used no Keys at all",Characteristics of metadata,"? The results of the survey showed that the majority of the metadata used could be classified as high-level rather than specific, with more than two-thirds of the Section metadata and 80% of the Key metadata classified as High-Level ? Less than 5% of the metadata items are Verb-type. 
Noun-type is the majority of metadata.",Impact on Metadata of Privacy and Number of Notebook Authors,"? The highest numbers of Sections are observed in notebooks with the highest numbers of authors. ? The majority of notebooks in the survey (65%) have only a single author, but 8% of notebooks have five or more","A survey of patterns of metadata use in these notebooks, together with feedback from the user community, indicated that while a few groups are comfortable with metadata and are able to design a metadata structure that works effectively, many users adopt a ?minimum required? approach to metadata.","? Observations from our user studies and the ELN trial with students match the results seen in the metadata survey in that the majority of new users add the minimum amount of metadata required ? The results of these activities have indicated that although some users are comfortable adding metadata to their notebooks and understand the benefit the metadata provides in helping them to locate information, many others have indicated that they felt it was too difficult to use and did not see the benefit in adding it. ",How researchers use metadata: what types of metadata are used and whether it is used effectively,"? Biggest inhibitor to adding useful metadata is the ?blank canvas? effect, where the users may be willing to add metadata but do not know where to start ? making the addition of metadata mandatory without assistance is not helpful at encouraging the creation of meaningful metadata ? metadata is representative of the community that uses it, and therefore, no one set of terminology is likely to work for all users ? users more readily add certain types of information as metadata. For example, the majority of metadata used describes things and objects, rather than activities, indicating that some information that would be useful is not captured as metadata, such as experiment activities and conditions",Patterns of metadata use ,? 
Just providing a mechanism to add metadata is not sufficient,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Wilson 2014,Researchers tend to have different views on the requirements or necessity of any given component of RDM infrastructure depending on their disciplinary background.,"? Those from disciplines such as astronomy, physics, crystallography and so forth, where researchers have generally already recognised the benefits of data sharing and put in place an appropriate infrastructure, may be uncertain of any benefits that central support can bring; Others, particularly those from disciplines where data is very heterogeneous or where researchers often manage their data as individuals, may regard institutional support with RDM as welcome, given that many are uncertain as to what constitutes good practice.",NA [Take-home message regarding case study of RDM infrastructures project at Oxford],"? With so many stakeholders involved, each bringing their own particular interests and understanding of the issue involved to bear on the topic, it can be a challenge to ensure buy-in. ? Much of the work of setting up an RDM infrastructure therefore involves gaining a clear understanding of what is required by whom and ensuring that those who will be involved in resourcing and supporting the infrastructure are aware of the rationale for implementing it. ? Costs are high and many of the benefits are intangible",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Winget 2004,Nature of materials to be archived,"? Humanist: primary materials related to one book of Chinese poetry translations published in the latter half of his career, and consists of both paper and digital formats including a hard copy draft with evaluative comments from two well-known poets, the book pre-print, daily activity and writing logs, correspondence, unpublished poem translations and original poetry. ? 
Scientist: a detailed thematic review of his career, with archival materials (photographs, lab write-ups, unpublished findings) acting as complementary material.",Attitudes towards archiving academic work,"? Both had little understanding of the nature of an archive in a university, and lacked appreciation of the importance of depositing materials in a digital archive ? digital archivists should be able to address the following questions: ""what is an archive?""; ""why my papers""",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Wright 2013,Discovery and a basic public description,"? Researchers strongly in favour of internet discoverability of datasets ? Researchers supported providing public descriptions of their datasets;",Data Citation and Citation Tracking,"? Five researchers considered 'ability to enable version control for this dataset' to be a high priority ? Five of the eight researchers interviewed ranked tracking citations of medium importance, two ranked it of high importance and one researcher did not know how important it was to him personally, ""since he believed people would do this regardless"". For those who did rank it as a high priority, one researcher did specifically mention that ""he considers this the real measure of the value of his data"".",Provenance of data not necessarily a priority for all,? Provenance of the dataset was ranked as a high priority by three researchers interviewed; a medium priority for one and not applicable to another,Lack of interest in formalized metadata standards,? Only one researcher felt this a high priority; three as low; remaining 4 unsure,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Wynholds 2011,"What are the data management, curation, and sharing practices of this community ","? The methods and degree of data preservation in astronomy vary widely, with significant differences based on funding source, project size, and type of instruments ? 
Astronomers reported varying practices for preserving these types of data, usually storing only what they expected to use in the future. None reported having preservation resources for their work outside their group of collaborators ? Many astronomers considered it easier to duplicate the original data manipulations than to follow the opaque footsteps of a third party ? They were equally hesitant to share their own derivative data products openly, citing the amount of work required to create adequate documentation or expressing concerns that data could be misused or misinterpreted, due to an inadequate understanding of the operating constraints. ? Astronomers reported few incentives to document data adequately for sharing.","who uses what data when, with whom, and why","Trust in Sources: astronomers were reluctant to use data that did not come from thoroughly tested, reliable sources","What data are most important to curate, how, and for whom","? Astronomers tend to evaluate and select data for preservation based on anticipated use ? Assessing that value, however, is a nontrivial matter, both for technical and social reasons: immediate utility, not just an historical legacy ? Poor interoperability among archives and a steep learning curve for integrating sources presented both a challenge and a liability",Burden of documentation,"? Documenting datasets is very challenging and time consuming; where astronomers use other data they contact the researchers and enquire. ? Report few incentives to document data adequately for sharing",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Wynholds 2012,"1. What are the characteristics of data use and reuse within each research community? ","? (Astronomy): taking data from public repositories and sky surveys are reuses in the sense of ?using again,? but not in the sense of maintaining a team?s empirical data in ways that those data might be exploited by others in the future. ? 
We found that CENS researchers and astronomers alike describe their data with respect to the purposes for which they are used. Their data exist only in relation to their research question, hypothesis, model, instrument, or study. ? these researchers also act on data in ways that would be considered 'use' by librarians, archivists",2. How do characteristics of data use and reuse vary within and between research communities?,"? Reuse suggests using data repeatedly, whether for the same or different purposes. We identified cases in CENS, particularly, where researchers would maintain some laboratory or field data for later analysis, either alone or in combination with new data. ? Rarely were these data deposited publicly; rather, they were kept for reuse within the research team. ? Some data that are essential to the research process are not kept at all, and thus are not available for future use by the research team or others. In CENS, these include data produced during the long processes",Difference in perception of data sources,"Researchers distinguished between the physical instrument as a data source and data obtained from venues such as repositories or catalogs.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Xia 2013,Both funder-based and publisher-based mandates have a strong impact on scholars? likelihood to contribute to open data repositories.,"? The implementation and revision of the data mandate policy by the NIH had a strong impact on the years following, 2004 and 2008, with significance values of 0.001 and 0.014 respectively","Open access journals have been apparently preferred by those authors whose projects are sponsored by the federal government agencies, and these journals are also highly ranked in the biomedical fields.","? 
The results of our statistical analysis are self-explanatory, namely, the r values are all positive in a range from 0.614 to 0.879, indicating strong correlations of OA journal mandates and data contributions to GEO.",Correlations between funder-based mandate implementations and data contributions over time [Ordinal regression],Only the year immediately following the implementation and revision of the data mandate policy by NIH in 2003 and 2007 returns a low significance value to indicate a change of the data contribution rate. [2003 p=0.000; 2007 p=0.332],Top Journals where GEO data contributors published their articles [Ordinal regression],"? Several journals, including BMC Genomics, Journal of Clinical Investigation, Molecular Systems Biology, the PLOS journals, and RNA have an early implementation of data mandates, yielding a perfect correlation. ? The results indicate r values are all positive in the range from 0.614 to 0.879, indicating strong correlations of the open access journal mandates and data contributions to GEO.",Data contributions and journal reputation rankings by impact factor and h-index. ,"? With regard to impact factor (IF) values, the lowest is Microbiology (IF = 0.718) and the highest is Nature (IF = 36.280). ? More than one third of all journals in the list have an IF value higher than 10, and three fourths of the journals are higher than 5. ? For the h-indexes, the range is 53?734 with approximately eighty-three percent of the journals being higher than 100. ? Correlation tests showed a weak relationship (r=-0.012) between the number of GEO articles and Journal IF scores ? Correlation tests showed a weak relationship (r=-0.229) between the number of GEO articles and h-index values. ? Thus, this suggests the importance of open access mandate policies, rather than reputation of a journal in the decision of making data contributions to a digital data repository for free access. 
",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Xia 2014,Trend 1: Types of employers posting jobs,"College/universities: 50.3% Research Institute: 13.8%",Trend 2: Job titles are diverse,"Librarian: 44.12% Specialist/Consultant: 12.9%",Trend 3: Degree requirements support need for MLS from ALA accredited University,MLS: 31.7%,Trend 4: Years of experience,Less than half ads listed years of experience required: 64.1%,Trend 5: Type of experience required,"GIS/Stat Software: 17.6% Academic library experience: 12.9%",Trend 6: words in job responsibilities for library positions,"Data: 15.6% Services: 4.2%",Job responsibilities in posts synthesized to correspond to the data life cycle stages,"? 27.83% of library job posts had responsibilities corresponding to data management planning ? 57.14% to data discovery ? 44.04% to data collection ? 30.95% to data analysis ? 23.80% to data sharing ? 32.14% to data preservation",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Yardley 2014,"Relationships were significant with respect to trust, sharing data, transparency and clarity, anonymity, permissions, and responsibility.","? The relationship forged between participant, interviewer, and the academic institution was of crucial importance to the research users and the researchers ? The priority of trust in relationships over trust in procedures was particularly strong within our discussion group ? Research users thought that an accredited academic institution should manage any data sharing to ensure use by only bona fide researchers.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Yoon 2014,Users? definition of trust is largely based on a lack of deception,,Several factors develop the user's trust in repositories,"? Organizational attributes, user communities (recommendations and frequent use), past experiences, repository processes (documentation, data cleaning, and quality checking), and users? perception of the repository roles were identified.",Defining trust: what does trust mean to users? ,"? 
Interviews showed that trust became relevant to a particular situation when the trustor was uncertain about something (uncertainty) and when the trustor can depend upon the trustee (dependability). ? Trust was also related to dependability. ? In the context of data repositories, lack of deception had two components: data validity and repositories? integrity. ? The integrity of repositories was mentioned by most of the participants. Participants? belief that organizations will be honest rather than deceitful comprises trust.","Building trust: where does end users? trust originate, and how do users develop trust? ","Regardless of the repository, participants? trust seemed to be based on five broad components: ? organizational attributes ? the internal repository process ? user (or designated) communities ? their own past experiences ? their perceptions of the roles of repositories",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Youngseek 2015,"Attitudinal beliefs, attitude, normative influence and perceived availability of data repositories influence researchers' data sharing behaviours","? Perceived career benefit (?=0.271, p < 0.001), perceived career risk (?=?0.177, p < 0.001), and perceived effort (?=?0.162, p < 0.001) were significant influences on attitudes towards data sharing (R2 =0.388) ? Researchers' attitudes towards data sharing had a significant effect on data sharing behaviour (? = 0.417, p < 0.001) ? Normative influence had a positive, significant effect on researchers' attitudes (?=0.228, p < 0.001) but not on data sharing behaviours (?=0.030, p < 0.05). ? The perceived availability of data repositories was found to have significant effects on both researchers' attitudes toward data sharing (? = 0.083, p< 0.001) and actual data sharing behavior (? = 0.309, p < 0.001). ? The perceived availability of data repositories positively influences the relationship between researchers' attitudes toward data sharing and their actual data sharing behaviors (? 
= 0.224, p < 0.01). ? The perceived pressure by journals has a significant relationship with data sharing behaviors (?=0.172, p < 0.001).",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Zachary 2015,Overlap in the data elements necessary for both cancer registries and researchers,"? A gap exists between data elements collected and the data elements needed for surveillance and research ? there is an overlap in the data elements required for both, however they are respectively needed for different reasons ? 96% of registries indicate the data are also used for research",Data elements in registries,"? Registries have good data on demographics as well as stage, size and histology as a part of registry certification, but are lacking data on treatment variables. ? accuracy and completeness is heavily dependent on site and histology ? 87% of registries had 10 years of data available; 9% had 6-10y; 4% had 0-5y ? 68% receive update information for each case ? When asked if the cancer registries could fill all data requests they receive, 15 (68%) answered that they could not fill all data requests, six (27%) answered that they could fill all data requests, and one (5%) was not sure if they could fill all data requests. ? 77% believe too many data items were required; 23% right amount",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Zenk-Moltgen 2014,"Although only a few sociology journals have explicit data policies, most journals make reference to a common policy supplied by their association of publishers","? 7% had an explicit data policy, 67.1% refer to common policy provided by an association of their publishers, 27.9% do not have a data policy","Among the journals selected, relatively few articles provide data citations and even fewer make data available ? this is true both for journals with and without a data policy.",? 
51.8% of the articles state that data are available,Authors writing for journals with higher impact factors and with data policies are more likely to cite data and to make it really accessible., A positive correlation (Spearman rho = 0.244) can be found between data availability as stated in an empirical paper and the existence of a data policy for the journal. This also correlates with a high journal impact factor,Data policies of sociology journals,"? Seven journals (5 per cent) were found to have an explicit data policy: ASR, SM, Sociological Theory, Sociology of Education, Social Science Quarterly (SSQ), Contemporary Sociology and Teaching Sociology. ? 94 journals (67.1%) refer to a common policy provided by an association of their publishers: the Association of Learned and Professional Society Publishers (ALPSP) ? 101 (72.1%) of journals recommended that authors deposit and share their data ? 39 (27.9%) do not have a data sharing policy in place ? A higher impact factor is positively correlated with the availability of a dedicated data policy",Data Sharing in Sociology papers,"? Just over half of the papers state their data are available (51.8% of papers) ? ASR had the highest rate of papers for which data were available (75.3% of papers) ? Journals with no formal data policy also had a high rate of data availability and thus data availability is not entirely dependent on the journal having a data policy",Author behaviour and journal characteristics,"? A positive correlation found between data availability and the existence of a data policy for the journal ? This also correlates with the journal's impact factor ? English language also positively correlated ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Zhang 2013,User design difficulties in HABRI Central,"? Participants seemed to have difficulty understanding repository and bibliography as two separate spaces on HABRI Central. ? Adding author metadata is complicated as you cannot enter more than one at a time ? 
A number of usability issues related to the layout of interface elements resulted in participants taking a longer time to work out how the interface worked or where to start.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Zimmerman 2003 (+ Companion),Overcoming challenges to ecological data reuse,"? Use formal and informal knowledge that they have gained through disciplinary training and through their own data-gathering experiences to help them overcome hurdles related to finding, acquiring, and validating data collected by others ? ecologists rely on formal notions of scientific practice that emphasize objectivity to justify the methods they use to collect data for reuse.",Informal Knowledge," Ability to comprehend data is the key to their reuse, and ecologists rely heavily on knowledge from their own fieldwork experiences in order to ""reconstruct"" data they did not collect themselves ",Formal Knowledge and Norms of Scientific Practice," Ecologists are heedful of future public scrutiny, and so they work hard to follow norms of scientific practice in their gathering of data for reuse and in their reporting of results based on secondary data use ",Limits of Knowledge,"Ecologists' concerns about the use of data that they did not collect themselves are put to rest by a combination of factors, the most important of which is an ability to understand the data.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Zinner 2016,"Proportion of faculty who had made requests to other academic scientists in the past three years for information, data, or materials concerning published findings",? 2000: 66.7% of respondents reported making at least one request compared with 65.6% in 2013 (p = .65).,"Proportions of faculty who reported receiving requests, denying at least one request, or the percentage of requests denied ","? Receiving requests: (76.0% in 2000 versus 73.4% in 2013, p = .26) ? Denying at least one request (10.1% in 2000 versus 8.7% in 2013, p = .40) ? 
Percentage of requests denied (2.1% in 2000 versus 2.0% in 2013, p = .11) ",The proportion of respondents who made at least one request of another academic scientist decreased for nearly all types of requests for additional information not included in the publication ,"? Lab techniques (54.0% in 2000 versus 39.1% in 2013, p < .001), pertinent findings (33.7% in 2000 versus 24.0% in 2013, p < .001) ? Phenotypic information (15.6% versus 13.4%, p = .11) ? Genetic sequences (17.0% in 2000 versus 10.1% in 2013, p < .001) ? Requests for biomaterials including probes, cell lines, tissues, reagents, and organisms that were mentioned in the publication (58.5% in 2000 versus 44.6% in 2013, p < .001).","Volume of requests that life scientists received for information, data, or materials after publication of a finding significantly declined from 2000 to 2013",Average number (SD) of requests received by respondents in the last three years (among those who received at least one request) was 19.4 (1.2) in 2000 compared with 10.8 (0.6) in 2013 (p < .001).,The total volume of requests received from or made to other scientists dropped substantially ,"19.4 received in 2000 versus 10.8 in 2013, p < .001; 8.4 made in 2000 versus 6.6 in 2013, P < .001",Volume of requests,"? The total volume of requests received from or made to other scientists dropped substantially (19.4 received in 2000 versus 10.8 in 2013, P < .001; 8.4 made in 2000 versus 6.6 in 2013, P < .001).","Characteristics of researchers making, receiving, and refusing requests","? Respondents in 2013 were half as likely as those in 2000 to experience a denial of a request (odds ratio [OR], 0.51; 95% confidence interval [CI], 0.39?0.66). ? 
In 2000, industry support and engagement in commercial activities (e.g., patenting or licensing an innovation) were significant predictors of secrecy and data withholding. In 2013, researchers with industry support were roughly twice as likely as those without industry support to make (OR, 2.34; 95% CI, 1.82?3.01) and to receive (OR, 1.94; 95% CI, 1.47?2.56) at least one request, but they were not more likely to experience a denial of a request or to refuse a request.",Prevalence of online supplement and third-party repository requirements,"? 2013: 44.2% of life scientists required by a journal to submit a detailed description of scientific methods or data as a supplement & 24.8% to a third-party repository; for geneticists: 58.5% as supplements, & 47.1% to a third-party repository; for clinical departments ? In comparison, 36.4% of faculty in clinical departments and 50.7% of faculty in basic life science departments were required to submit additional materials online, and 12.3% of faculty in clinical departments and 30.9% of faculty in basic life science departments made data or materials available to a third-party repository. ? Faculty in 2013 also made an average of 8.4 requests to third-party repositories.",Consequences and perceived effects of sharing and withholding,"? Across both time periods, a similar proportion of faculty reported being ?scooped? by another scientist (28.1% in 2000 versus 25.7% in 2013, P = .24) or that sharing compromised the ability of a junior member of the team to publish (10.1% in 2000 versus 9.4% in 2013, P = .55). ? Researchers in 2013 were less likely to report sharing resulted in new research or collaborations.","Prevalence of other forms of withholding","? 
In 2013, 24.0% of respondents reported that they had intentionally excluded pertinent information from a manuscript submitted for publication to protect their scientific lead, and 39.5% admitted that they had excluded pertinent information from a presentation of published work at a national conference or meeting.",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,