# PP-ind: A Repository of Industrial Pair Programming Research Data

Zieris, Franz; Prechelt, Lutz

PP-ind is a repository of research data on industrial pair programming sessions. Since 2007, our research group has collected audio and video recordings as well as questionnaire data at 13 companies. A total of 57 developers worked together (mostly in groups of two, but also in groups of three or four) in 67 sessions with a mean length of 1:35 hours. A separate tech report describes in detail how this data was collected.

While we cannot share the original video recordings due to confidentiality agreements, we do provide transcripts of the pairs' dialog in this data set. Note that we transcribe our data on an as-needed basis. Early versions of this data set will therefore contain only a few, partial transcripts, which will be extended over time.

Files named "session-<ID>-transcript.txt" contain original quotations in the language spoken by the recorded developers. For non-English sessions, we also provide non-authoritative "session-<ID>-transcript_translated.txt" files (translated following the same as-needed rule as the transcription itself). All our analyses, however, are performed on the raw data as reflected in the original transcripts. See the file "transcription-notation.txt" for details on the special notation we use.
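The naming scheme above can be handled programmatically. The following is a minimal sketch, not part of the data set: the names `TRANSCRIPT_RE`, `TranscriptFile`, and `parse_transcript_name` are hypothetical, and the assumption that session IDs consist only of lowercase letters and digits is inferred from the IDs visible in the file listing.

```python
import re
from typing import NamedTuple, Optional

# Pattern derived from the naming scheme "session-<ID>-transcript.txt" and
# "session-<ID>-transcript_translated.txt".
# ASSUMPTION: IDs contain only lowercase letters and digits (e.g. "ca1").
TRANSCRIPT_RE = re.compile(
    r"^session-(?P<id>[a-z0-9]+)-transcript(?P<translated>_translated)?\.txt$"
)


class TranscriptFile(NamedTuple):
    """Hypothetical record describing one transcript file name."""
    session_id: str
    translated: bool  # True for the non-authoritative translation


def parse_transcript_name(name: str) -> Optional[TranscriptFile]:
    """Return the session ID and translation flag, or None for other files."""
    match = TRANSCRIPT_RE.match(name)
    if match is None:
        return None
    return TranscriptFile(
        session_id=match.group("id"),
        translated=match.group("translated") is not None,
    )
```

Since the analyses rely on the original transcripts, a caller would typically keep only entries where `translated` is `False` and consult the translated file as a reading aid.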

Files (50.5 kB)

| Name | Size | Checksum |
|---|---|---|
| session-ca1-transcript.txt | 3.7 kB | (not listed) |
| session-ca1-transcript_translated.txt | 814 Bytes | md5:5841d31fee107e0a35731765441746e1 |
| session-da2-transcript.txt | 16.1 kB | md5:03c6703cdfeca60b165bc6366f22c076 |
| session-da2-transcript_translated.txt | 8.9 kB | (not listed) |
| session-ja1-transcript.txt | 12.8 kB | md5:3f9fec2317748c303194656fb7503432 |
| session-ja1-transcript_translated.txt | 2.6 kB | md5:6ebb839ce2fc4f4d11bf2922bb88940e |
| session-pa3-transcript.txt | 2.6 kB | md5:30bf41152fa6484ba1355b340bc6c82d |
| session-pa3-transcript_translated.txt | 2.0 kB | md5:bb47c41731f257e69e169582fc1c5f29 |
| transcription-notation.txt | 861 Bytes | md5:c12bf3cd5d8bfe89b6a4ec40265376c3 |
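
The MD5 checksums in the listing above can be used to check that a downloaded file is intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `verify_checksum` are hypothetical, and the script assumes the files were downloaded into the current directory. Files without a listed checksum are omitted.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=65536):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path, expected):
    """Compare a file's MD5 digest with a value such as 'md5:<hex>'."""
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected.lower()


# Checksums copied from the file listing above.
EXPECTED = {
    "session-ca1-transcript_translated.txt": "md5:5841d31fee107e0a35731765441746e1",
    "session-da2-transcript.txt": "md5:03c6703cdfeca60b165bc6366f22c076",
    "session-ja1-transcript.txt": "md5:3f9fec2317748c303194656fb7503432",
    "session-ja1-transcript_translated.txt": "md5:6ebb839ce2fc4f4d11bf2922bb88940e",
    "session-pa3-transcript.txt": "md5:30bf41152fa6484ba1355b340bc6c82d",
    "session-pa3-transcript_translated.txt": "md5:bb47c41731f257e69e169582fc1c5f29",
    "transcription-notation.txt": "md5:c12bf3cd5d8bfe89b6a4ec40265376c3",
}

if __name__ == "__main__":
    # ASSUMPTION: the files are located in the current working directory.
    for name, checksum in EXPECTED.items():
        if not Path(name).exists():
            print(f"{name}: missing")
        elif verify_checksum(name, checksum):
            print(f"{name}: OK")
        else:
            print(f"{name}: checksum mismatch")
```

A mismatch may also mean that a newer version of the data set was downloaded, since transcripts are extended over time.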
