Published October 28, 2024 | Version 1.0.0.
Dataset Open

Dataset for ELGO-DIMITRA Data Management Practices & Requirements: A Scoping Report

Description

This repository contains the complete data from the data management survey carried out in autumn 2023 through a collaboration between the PHIL_OS project and the Research Directorate of the Hellenic Agricultural Organization ELGO-DIMITRA.

Please cite as: 

Tsiroukis F., Leonelli S. and ELGO-DIMITRA (2024) Dataset for ELGO-DIMITRA Data Management Practices & Requirements: A Scoping Report. PHIL_OS Report. DOI: 10.5281/zenodo.14003418

Notes

Navigating the Data Repository

README.txt

Dataset Structure

This dataset contains three main categories of files:

  1. graphs & visualizations ("_Graphs")
  2. data spreadsheets ("_Data")
  3. miscellaneous documentation ("_Questions", "_Logs")

File Naming Protocol

File names follow the rule:

 [(capitalized word) + (underscore)].

Words are ordered hierarchically, from the most general to the most specific: the root "ELGO_Survey", then the category of artifact ("_Data"), the data type ("_Text"), the question (if applicable), and finally the kind of analysis ("_Coding").

Example Structure:

  • ELGO_Survey
    • ELGO_Survey_Data
      • ELGO_Survey_Data_Text
        • ELGO_Survey_Data_Text_Q6-Q9
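
As a rough illustration of the naming rule, the following Python sketch expands a file name into the hierarchy levels it implies. The function `hierarchy_levels` and its `root_parts` parameter are hypothetical helpers introduced here, not part of the dataset, and the sketch assumes the name is given without a file extension.

```python
def hierarchy_levels(name: str, root_parts: int = 2) -> list[str]:
    """Return the cumulative prefixes of an underscore-separated file name.

    root_parts is the number of words forming the root ("ELGO_Survey").
    The name is assumed to carry no file extension.
    """
    parts = name.split("_")
    return ["_".join(parts[:i]) for i in range(root_parts, len(parts) + 1)]


levels = hierarchy_levels("ELGO_Survey_Data_Text_Q6-Q9")

# Print each level indented by its depth, mirroring the example structure.
for depth, level in enumerate(levels):
    print("  " * depth + level)
```

Each additional word narrows the scope by one level, so sorting file names alphabetically also groups them by category and data type.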

Contents

Spreadsheets

  • NOTE: Responses in the dataset are in Greek. Free-text responses were translated with ChatGPT, and the translations were used for the quantitative analysis (see Quantitative Methodologies).
  • The dataset includes the raw data ("ELGO_Survey_Data_Raw"), exported directly from Qualtrics in both .csv and .pdf formats.
  • Qualtrics handles the analysis of standardized choice responses (multiple choice, yes/no, ranked, numerical) for individual questions. Further analysis was therefore conducted on the free-text questions (Q6-Q9), on free-text input from questions that included an "Other" choice (Q3.1, Q4-Q5, Q19), and for more specific quantitative and comparative analyses (Q2) (see Quantitative Methodologies).

Visualizations

  • Visualizations of the data for each question, a comparison graph for the yes/no questions Q10-Q14, and word clouds for the free-text questions Q6-Q9 (in Greek) were produced with Qualtrics. These can be found in the zipped file "ELGO_Survey_Visualizations.zip".
  • NOTE: Some questions had simple answers and visualizations (Q10-Q14), while Q4 and especially Q5, which focused on the diversity of data types, were highly complex in their format, visualization, and analysis. They can therefore be visualized and analyzed in multiple ways, which remain open to further inquiry (see Data Collaboration).

Documentation

  • The miscellaneous documentation contains a file of the full survey contents and questions, and a file with the ChatGPT logs.

Quantitative Methodologies

  • LLM-aided quantitative text analysis: ChatGPT was used to work with text responses from various questions, with four objectives: a) translation from Greek to English, b) identifying main themes and clustering semantically similar responses, c) identifying keywords and their frequency of occurrence, d) deriving quantitative results from the clustered themes and keywords. Logs for questions Q3.1-Q9 can be found in the zipped file "ELGO_Survey_Logs_GPT.zip".
  • Choice Count: For Q2 (Data Storage Formats), an additional analysis was performed on the raw data to determine how many choices each respondent selected and to correlate the exact number of choices with the responses, in order to gauge how diverse the storage practices are. Additionally, the following statistical measures were calculated: Average, Median, SD, Average Deviation (see "ELGO_Survey_Data_Quant_Q2").
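
The Choice Count measures can be reproduced along the following lines. This is a minimal Python sketch under stated assumptions, not the procedure used for the report: the `responses` list is invented sample data, the comma-separated layout of a multi-select answer is assumed and may differ from the actual Qualtrics export, and SD is computed here as the sample standard deviation, which the description does not specify.

```python
import statistics
from collections import Counter

# Hypothetical Q2 answers: one string per respondent, listing the selected
# storage formats separated by commas (assumed layout, invented values).
responses = [
    "Spreadsheets,Paper notebooks",
    "Spreadsheets",
    "Spreadsheets,Databases,Images",
    "Paper notebooks,Images",
]


def choice_counts(rows):
    """Count how many options each respondent selected."""
    return [len([c for c in row.split(",") if c.strip()]) for row in rows]


counts = choice_counts(responses)

mean = statistics.mean(counts)
median = statistics.median(counts)
sd = statistics.stdev(counts)  # sample SD; an assumption, see lead-in
# Average deviation: mean absolute distance of each count from the mean.
avg_dev = sum(abs(c - mean) for c in counts) / len(counts)

# How many respondents selected exactly 1, 2, 3, ... formats.
distribution = Counter(counts)

print(counts, mean, median, round(sd, 3), avg_dev, dict(distribution))
```

A wide spread in `distribution` would indicate that respondents combine many storage formats, which is the kind of diversity the Q2 analysis is meant to capture.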

Data Collaboration

The analysis software Qualtrics allowed for a question-by-question data analysis, but many cross-question comparative analyses remain possible. If you wish to extend the analysis or use the survey data in your own research, you are strongly advised to contact Fotis Tsiroukis at fotis.tsiroukis@tum.de.

Methods

Methods Information

Medium

Online survey (text form) with a variety of question types:

  • multiple choice
  • ranked choice
  • scale
  • yes/no
  • free text

Distribution

Anonymous Link & Email in Mailing List (From Directorate to Institutes to Researchers)

Software

  • Qualtrics (Survey Design, Database & Visualization)
  • Excel (Formatting, Coding, Open Data)
  • ChatGPT (Free Text Quantitative Analysis, Thematic Analysis)

Analysis

  • Quantitative (Percentages, Keyword Analysis, Mean)
  • Qualitative (Thematic Analysis – Open Coding)
  • Mixed (Theme Frequency Analysis)

Sample Information

Location

Greece (Athens, Northern, Central, Peloponnese, Crete, Aegean)

Sample Size

Target Sample: 182

Survey Sample: 103 partial responses, 70 fully completed

The survey results are based on the number of respondents who answered each question, which varies with how much of the survey each respondent completed.
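
Because the number of respondents differs per question, percentages should use a per-question denominator. The Python sketch below shows one way to do this. The `answers` data are invented for illustration, and marking unanswered questions with `None` is an assumption about how the export would be loaded.

```python
# Hypothetical answers: None marks a question the respondent did not answer.
answers = {
    "Q10": ["Yes", "No", "Yes", "Yes", None],
    "Q11": ["Yes", None, "No", None, None],
}


def percentages(values):
    """Return (share of each answer in percent, number of respondents).

    Only respondents who answered the question count toward the denominator.
    """
    answered = [v for v in values if v is not None]
    n = len(answered)
    if n == 0:
        return {}, 0
    shares = {opt: 100 * answered.count(opt) / n for opt in sorted(set(answered))}
    return shares, n


for question, values in answers.items():
    shares, n = percentages(values)
    print(question, f"n={n}", shares)
```

Reporting `n` next to each percentage makes it clear how many responses a given figure rests on.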

Demographic Information

  • ELGO-DIMITRA Employees: Researchers, Directors, Lab Technicians, Support Personnel
  • Gender: 51% male, 41% female
  • Qualifications: 96% PhD, 3% MSc, 1% Diploma

 

Files

Files (2.6 MB)

  • md5:ccac3f037a81f1471505f9f4aec85913 (1.0 MB)
  • md5:b084399e07ab1b4703c526cfceb43518 (824.8 kB)
  • md5:d98da21f862fe23d7c774d2f9fda0fc2 (24.7 kB)
  • md5:55f8eee41905d65788424351a33af692 (767.7 kB)
  • md5:6946ee2457194428fe5204b316b21455 (3.8 kB)
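
The md5 values listed above can be used to check that a downloaded file is intact. The following Python sketch uses only the standard library (Python 3.9 or later). The helper names `md5_of` and `matches` are hypothetical, and since the listing does not show which file each checksum belongs to, the pairing of file and checksum is left to the reader.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, listed: str) -> bool:
    """Compare a file against a checksum written as 'md5:<hex digest>'."""
    expected = listed.removeprefix("md5:").lower()
    return md5_of(path) == expected
```

For example, `matches(Path("downloaded_file"), "md5:<listed value>")` returns True only if the local copy is byte-for-byte identical to the published file; `downloaded_file` is a placeholder path.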

Additional details

Related works

Is supplement to
Report: 10.5281/zenodo.13999260 (DOI)

Funding

PHIL_OS – A Philosophy of Open Science for Diverse Research Environments 101001145
European Commission

Dates

Collected
2023-12
Submitted
2024-11