Published February 15, 2021 | Version 2.0
Dataset Open

PP-ind: A Repository of Industrial Pair Programming Research Data

  • 1. Freie Universität Berlin

Description

PP-ind is a repository of research data on industrial pair programming sessions. Since 2007, our research group has collected audio-video recordings and questionnaire data at 13 companies. A total of 57 developers worked together (mostly in pairs, but also in groups of three or four) in 67 sessions with a mean length of 1:35 hours. A separate technical report (see Related works) describes in detail how this data was collected.

While we cannot share the original video recordings due to confidentiality agreements, we do provide transcripts of the pairs' dialog in this data set. Since we perform our analyses directly on the video material, we transcribe our data only on an as-needed basis, e.g., in preparation for a publication. This data set therefore contains only a few partial transcripts, which may be supplemented in future versions.

Files named session-<ID>-transcript.txt contain original quotations in the language spoken by the recorded developers. For non-English sessions, we also provide non-authoritative session-<ID>-transcript_translated.txt files (translated on the same as-needed basis as the originals are transcribed). All our analyses, however, are performed on the raw data as reflected in the original transcripts. See the file transcription-notation.txt for details on the special notation we use.
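As an illustration of the naming scheme above, the following minimal sketch indexes the transcript files in an unpacked copy of the archive and pairs each original with its translation, if one exists. The function name index_transcripts and the directory argument are hypothetical, and because the format of <ID> is not specified here, the pattern accepts any non-empty ID; this is an assumption, not part of the data set.

```python
import re
from pathlib import Path

# Pattern derived from the naming scheme described above.
# Assumption: <ID> may be any non-empty string; its exact format is not specified.
TRANSCRIPT_RE = re.compile(
    r"^session-(?P<id>.+?)-transcript(?P<translated>_translated)?\.txt$"
)


def index_transcripts(root):
    """Map each session ID to its original and translated transcript paths.

    Returns a dict of the form
    {session_id: {"original": Path or None, "translated": Path or None}}.
    """
    sessions = {}
    for path in sorted(Path(root).rglob("session-*-transcript*.txt")):
        match = TRANSCRIPT_RE.match(path.name)
        if match is None:
            continue  # skip files that only resemble the naming scheme
        entry = sessions.setdefault(
            match["id"], {"original": None, "translated": None}
        )
        key = "translated" if match["translated"] else "original"
        entry[key] = path
    return sessions
```

Since the translations are non-authoritative, a consumer of this index would typically read the "original" entry for analysis and use the "translated" entry only as a reading aid.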

Files

pp-ind.zip (2.2 MB)
md5:5a7a299db6b64ff2c517bd01da535592
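The MD5 checksum listed for pp-ind.zip can be used to check that a download is complete. The sketch below uses only Python's standard hashlib module; the helper names md5_of_file and is_intact and the local file path are hypothetical.

```python
import hashlib

# Checksum listed for pp-ind.zip on this page.
EXPECTED_MD5 = "5a7a299db6b64ff2c517bd01da535592"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

For example, is_intact("pp-ind.zip") would be called on the downloaded archive, assuming it was saved under that name in the working directory. MD5 detects accidental corruption; it is not a safeguard against deliberate tampering.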

Additional details

Related works

Is documented by
Technical note: arXiv:2002.03121 (arXiv)