Dataset Open Access

PP-ind: A Repository of Industrial Pair Programming Research Data

Zieris, Franz; Prechelt, Lutz

PP-ind is a repository of research data on industrial pair programming sessions. Since 2007, our research group has collected audio-video recordings and questionnaire data at 13 companies. A total of 57 developers worked together (mostly in pairs, but also in groups of three or four) in 67 sessions with a mean length of 1:35 hours. A separate tech report describes in detail how this data was collected.

While we cannot share the original video recordings due to confidentiality agreements, we do provide transcripts of the pairs' dialog in this data set. Since we perform our analyses directly on the video material, we transcribe our data only on an as-needed basis, e.g., in preparation for a publication. This data set therefore contains only a few, partial transcripts, which may be extended in future versions.

Files named session-<ID>-transcript.txt contain original quotations in the language spoken by the recorded developers. For non-English sessions, we also provide non-authoritative session-<ID>-transcript_translated.txt files (translated on the same as-needed basis as the originals are transcribed). All our analyses, however, are performed on the raw data as reflected in the original transcripts. See file transcription-notation.txt for details on the special notation we use.
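The naming scheme above can be used to pair each original transcript with its translation, if one exists. The following is a minimal Python sketch, not part of the data set: the function name index_transcripts and the session IDs in the usage note are hypothetical, and the ID format is matched permissively because it is not specified here.

```python
import re
from pathlib import Path

# Pattern derived from the naming scheme described above. The format of <ID>
# is an assumption: any non-empty string is accepted.
TRANSCRIPT_RE = re.compile(
    r"^session-(?P<id>.+?)-transcript(?P<translated>_translated)?\.txt$"
)


def index_transcripts(filenames):
    """Map each session ID to its original and (optional) translated transcript."""
    index = {}
    for name in filenames:
        match = TRANSCRIPT_RE.match(Path(name).name)
        if match is None:
            continue  # skip other files, e.g. transcription-notation.txt
        entry = index.setdefault(
            match["id"], {"original": None, "translated": None}
        )
        key = "translated" if match["translated"] else "original"
        entry[key] = name
    return index
```

For example, calling index_transcripts on the hypothetical names session-X1-transcript.txt and session-X1-transcript_translated.txt yields one entry for X1 with both files set. A session whose translated entry is None either was recorded in English or has not been translated yet.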

Files (2.2 MB)