Published March 30, 2023 | Version v1
Dataset Open

Dataset - Understanding the software and data used in the social sciences

  • 1. University of Edinburgh
  • 2. University of Southampton

Description

This is a repository for a UKRI Economic and Social Research Council (ESRC) funded project to understand the software used to analyse social sciences data.

Any software produced has been made available under a BSD 2-Clause license, and any data and other non-software derivatives are made available under a CC-BY 4.0 International License. Note that the software that analysed the survey is provided for illustrative purposes only: it will not work on the decoupled, anonymised data set.

Exceptions to this are:

Contents

  • Survey data & analysis: esrc_data-survey-analysis-data.zip
  • Other data: esrc_data-other-data.zip
  • Transcripts: esrc_data-transcripts.zip
  • Data Management Plan: esrc_data-dmp.zip

Survey data & analysis

The survey ran from 3rd February 2022 to 6th March 2023, during which 168 responses were received. Of these, three were removed because they were supplied by people from outside the UK without a clear indication of involvement with the UK or associated infrastructure. A fourth response was removed because it was a second submission from the same person, which leaves 164 responses in the data.

The survey responses (questions Q1-Q16) have been decoupled from the demographic data (Q17-Q23). Questions Q24-Q28 are for follow-up and have been removed from the data. The institutions (Q17) and funding sources (Q18) have been provided in separate files as these could be used to identify respondents. Q17, Q18 and Q19-Q23 have all been independently shuffled.
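As an illustration of what independent shuffling means in practice, the sketch below reorders each group of columns with its own random permutation, so that row N of one file no longer belongs to the same respondent as row N of another. This is not the project's actual anonymisation code; the function name, group names and values are invented placeholders.

```python
import random

def shuffle_independently(groups, rng):
    """Return a copy of each group with its rows in an independent random order."""
    shuffled = {}
    for name, rows in groups.items():
        rows_copy = list(rows)   # leave the caller's data untouched
        rng.shuffle(rows_copy)   # a fresh permutation is drawn for every group
        shuffled[name] = rows_copy
    return shuffled

# Invented placeholder rows: one entry per respondent in each group.
groups = {
    "q17": ["institution-a", "institution-b", "institution-c", "institution-d"],
    "q18": ["funder-w", "funder-x", "funder-y", "funder-z"],
    "q19_23": ["demo-1", "demo-2", "demo-3", "demo-4"],
}

shuffled = shuffle_independently(groups, random.Random(0))
```

Each group keeps exactly the same values, so per-question counts are unaffected, but the files cannot be joined row by row to rebuild an individual response.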

The data has been made available as Comma Separated Values (CSV) with the question number as the header of each column and the encoded responses in the column below. To see what a question number and its encoded responses correspond to, consult survey-results-key.csv, which decodes the questions and responses.
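A minimal sketch of this decoding step is shown here. The layout of survey-results-key.csv is not described in this record, so the sketch assumes a hypothetical key with one row per question and code and the columns `question`, `code` and `label`. The sample rows are invented placeholders, not survey data, and the column names would need adjusting to match the real key file.

```python
import csv
import io

# Hypothetical key layout (an assumption): one row per (question, code) pair.
KEY_CSV = """question,code,label
Q1,1,Yes
Q1,2,No
Q2,1,Daily
Q2,2,Weekly
"""

# Invented encoded responses: question numbers as headers, codes below.
RESULTS_CSV = """Q1,Q2
1,2
2,1
"""

def load_key(handle):
    """Build a nested lookup {question: {code: label}} from the key rows."""
    lookup = {}
    for row in csv.DictReader(handle):
        lookup.setdefault(row["question"], {})[row["code"]] = row["label"]
    return lookup

def decode_rows(handle, lookup):
    """Replace each encoded cell with its label; unknown codes are left as-is."""
    decoded = []
    for row in csv.DictReader(handle):
        decoded.append({q: lookup.get(q, {}).get(v, v) for q, v in row.items()})
    return decoded

key = load_key(io.StringIO(KEY_CSV))
decoded = decode_rows(io.StringIO(RESULTS_CSV), key)
```

To use the real files, the `io.StringIO` objects would be replaced with file handles opened with `newline=""`, as the `csv` module recommends.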

A PDF copy of the survey questions is available on GitHub.

The survey data has been decoupled into:

  • survey-results-key.csv - maps a question number and the responses to the actual question values.
  • q1-16-survey-results.csv - the non-demographic component of the survey responses (Q1-Q16).
  • q19-23-demographics.csv - the demographic part of the survey (Q19-Q21, Q23).
  • q17-institutions.csv - the institution/location of the respondent (Q17).
  • q18-funding.csv - funding sources within the last 5 years (Q18).

Please note that the code used for the analysis will not run with the decoupled survey data.

Other data files included

  • CleanedLocations.csv - normalised version of the institutions that the survey respondents volunteered.
  • DTPs.csv - information on the UKRI Doctoral Training Partnerships (DTPs) scraped from the UKRI DTP contacts web page in October 2021.
  • projectsearch-1646403729132.csv.gz - data snapshot from the UKRI Gateway to Research released on the 24th February 2022 made available under an Open Government Licence.
  • locations.csv - latitude and longitude for the institutions in the cleaned locations.
  • subjects.csv - research classifications for the ESRC projects for the 24th February data snapshot.
  • topics.csv - topic classification for the ESRC projects for the 24th February data snapshot.

Interview transcripts

The interview transcripts have been anonymised and converted to Markdown so that they are easier to process. List of interview transcripts:

  • 1269794877.md
  • 1578450175.md
  • 1792505583.md
  • 2964377624.md
  • 3270614512.md
  • 40983347262.md
  • 4288358080.md
  • 4561769548.md
  • 4938919540.md
  • 5037840428.md
  • 5766299900.md
  • 5996360861.md
  • 6422621713.md
  • 6776362537.md
  • 7183719943.md
  • 7227322280.md
  • 7336263536.md
  • 75909371872.md
  • 7869268779.md
  • 8031500357.md
  • 9253010492.md

Data Management Plan

The study's Data Management Plan is provided in PDF format. It shows the different data sets used throughout the study and where they have been deposited, as well as how long the Software Sustainability Institute (SSI) will keep these records.

Files

esrc_data-survey-analysis-data.zip

Files (19.6 MB)

Checksum and size of each file:

  • md5:9618f2fbb2bd4b5307ce2f386c3e5d77 - 52.3 kB
  • md5:bfbdb2838853b93dd729828b8e548ea1 - 341.6 kB
  • md5:2187b5cfbfb14ba07c93c867ce685f54 - 328.2 kB
  • md5:5de298651c5c21042906646d7d590884 - 18.9 MB
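
The md5 values listed above can be used to check that a downloaded archive is intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_stream` and `matches_listing` are illustrative, and the caller supplies the expected checksum, since the listing does not say which value belongs to which archive.

```python
import hashlib

def md5_of_stream(stream, chunk_size=1 << 20):
    """Return the hex MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    # Reading in chunks avoids loading a large archive into memory at once.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()

def matches_listing(path, listed_checksum):
    """Compare a local file against a listed value such as 'md5:<hex digest>'."""
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    with open(path, "rb") as handle:
        return md5_of_stream(handle) == expected
```

For example, `matches_listing(local_path, listed_value)` returns `True` only when the file at `local_path` hashes to the listed value. A result of `False` suggests an incomplete or corrupted download, or that the checksum belongs to a different archive.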

Additional details

Related works

Is supplement to
Report: 10.5281/zenodo.7785707 (DOI)
Software: 10.5281/zenodo.8086305 (DOI)

Funding

UK Research and Innovation
The UK Software Sustainability Institute: Phase 3 EP/S021779/1