# Survey on Reproducible Workflows and Open Science in Lattice Field Theory

This repository contains the results of a survey on software workflows and open science in lattice field theory conducted in 2022 by Andreas Athenodorou, Ed Bennett, Julian Lenz, and Elli Papadopolou. These data were collected using [LimeSurvey][limesurvey], and were first presented in [a talk at Lattice 2022 by Andreas Athenodorou][lattice2022-talk].

The analysis is based on Julian Lenz's [LimeSurvey CSV parser][parser].

## Data
The data are included in `survey-results-redacted.csv`, with `;` delimiting fields, and `%%%` separating question codes from question texts in column headings.
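
As a rough illustration of this layout, the sketch below reads `;`-delimited data and splits each column heading on `%%%`, using only the Python standard library. It is not the parser used by the analysis; the helper names `split_header` and `read_survey` and the sample question codes and texts are hypothetical, chosen only for demonstration.

```python
import csv
import io

HEADER_SEPARATOR = "%%%"  # separates question code from question text


def split_header(heading):
    """Split 'CODE%%%Question text' into (code, text).

    Headings without the separator yield an empty text.
    """
    code, _, text = heading.partition(HEADER_SEPARATOR)
    return code.strip(), text.strip()


def read_survey(stream):
    """Return (questions, rows) from a ';'-delimited survey export.

    questions maps question code -> question text;
    rows is a list of dicts keyed by question code.
    """
    reader = csv.reader(stream, delimiter=";")
    headings = [split_header(h) for h in next(reader)]
    codes = [code for code, _ in headings]
    questions = dict(headings)
    rows = [dict(zip(codes, row)) for row in reader]
    return questions, rows


# Made-up sample data; the real file has different questions.
sample = io.StringIO(
    "Q1%%%Which language do you use?;Q2%%%Do you share your code?\n"
    "Python;Yes\n"
    "C++;No\n"
)
questions, rows = read_survey(sample)
print(questions["Q1"])
print(rows[1]["Q1"])
```

To try this on the real data, pass a file object opened on `survey-results-redacted.csv` with `newline=""` instead of the in-memory sample; the appropriate text encoding is not stated here and would need checking.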

The survey structure is included as `survey-structure.lss`, which gives more detail on the options presented for each question. The file was generated using [LimeSurvey][limesurvey] version 5.3.18+220530, and can be reimported into any version of LimeSurvey compatible with files generated by that version.

## Setup
```sh
# Install dependencies
pipenv install --dev

# Set up pre-commit and pre-push hooks
pipenv run pre-commit install -t pre-commit
pipenv run pre-commit install -t pre-push
```

## Usage
With the dependencies installed, it should be sufficient to run:
```sh
make
```

This will run the `analysis.ipynb` Jupyter notebook and generate the plots used in [Andreas Athenodorou's talk at Lattice 2022][lattice2022-talk]. Alternatively, you can open the notebook directly and interrogate it in more detail.

If you have an updated raw data file, place it in `survey-results.csv` (or otherwise specify its location in the `Makefile`). Running `make` will then additionally strip personally identifiable information from it to update the file `survey-results-redacted.csv` before running the notebook.

## Credits
This package was created with Cookiecutter and the
[sourcery-ai/python-best-practices-cookiecutter](https://github.com/sourcery-ai/python-best-practices-cookiecutter)
project template.

The analysis makes use of [a list of common English words provided by Josh Kaufman][10000-words], which is included for convenience as `supporting_data/most_common_words.txt`.

[10000-words]: https://github.com/first20hours/google-10000-english/blob/master/google-10000-english-no-swears.txt
[lattice2022-talk]: https://indico.hiskp.uni-bonn.de/event/40/contributions/695/
[limesurvey]: https://www.limesurvey.org
[parser]: https://github.com/chillenzer/limesurvey-parser
