Published August 21, 2023 | Version v1
Conference paper Open

Do CONTRIBUTING Files Provide Information about OSS Newcomers' Onboarding Barriers?

Authors/Creators

  • 1. Virginia Commonwealth University

Description

Effectively onboarding newcomers is essential for the success of
open source projects. These projects often provide onboarding
guidelines in their ‘CONTRIBUTING’ files (e.g., CONTRIBUTING.md
on GitHub). These files explain, for example, how to find open tasks,
create contribution packs, and submit code for review. However,
these files often do not follow a standard structure, can be too large,
and miss barriers commonly found by newcomers. In this paper, we
propose an automated approach to parse these CONTRIBUTING
files and assess how they address onboarding barriers. We manually
classified a sample of files according to a model of onboarding
barriers from the literature, trained a machine learning classifier that
automatically predicts the categories of each paragraph (precision:
0.655, recall: 0.662), and surveyed developers to investigate their
perspective of the predictions’ adequacy (75% of the predictions
were considered adequate). We found that CONTRIBUTING files
typically do not cover the barriers newcomers face (52% of the
analyzed projects missed at least 3 out of the 6 barriers faced by
newcomers; 84% missed at least 2). Our analysis also revealed that
information about choosing a task and talking with the community,
two of the most recurrent barriers newcomers face, is neglected in
more than 75% of the projects. We made our classifier available as an
online service that analyzes the content of a given CONTRIBUTING
file. Our approach may help community builders identify missing
information in the project ecosystem they maintain and help newcomers
understand what to expect in CONTRIBUTING files.
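
The coverage figures reported above (projects missing at least k of the 6 barriers) can be illustrated with a minimal sketch. It assumes each project's CONTRIBUTING file has already been reduced to a list of predicted paragraph labels. Only "choose_task" and "talk_to_community" correspond to barriers named in the description; the remaining four label names, and the function names, are hypothetical placeholders and not the authors' implementation.

```python
from typing import Dict, FrozenSet, Iterable

# Assumed label set. Only the first two names reflect barriers mentioned in
# the description; "barrier_3" .. "barrier_6" are illustrative placeholders.
BARRIERS: FrozenSet[str] = frozenset({
    "choose_task",
    "talk_to_community",
    "barrier_3",
    "barrier_4",
    "barrier_5",
    "barrier_6",
})


def missing_barriers(paragraph_labels: Iterable[str],
                     barriers: FrozenSet[str] = BARRIERS) -> FrozenSet[str]:
    """Return the barrier categories that no paragraph was predicted to cover."""
    covered = set(paragraph_labels) & barriers
    return barriers - covered


def share_missing_at_least(projects: Dict[str, Iterable[str]],
                           k: int,
                           barriers: FrozenSet[str] = BARRIERS) -> float:
    """Fraction of projects whose file misses at least k barrier categories."""
    if not projects:
        return 0.0
    hits = sum(
        1 for labels in projects.values()
        if len(missing_barriers(labels, barriers)) >= k
    )
    return hits / len(projects)


if __name__ == "__main__":
    # Toy input: predicted labels per paragraph for two made-up projects.
    example = {
        "project_a": ["choose_task", "barrier_3"],
        "project_b": sorted(BARRIERS),
    }
    print(share_missing_at_least(example, k=3))
```

Under these assumptions, a project counts as missing a barrier when no paragraph in its file receives that label, so the result depends directly on the classifier's precision and recall.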

Files

FSE-2023-master.zip

Size: 88.5 MB
md5:81d82e25d814b1023548250d45632f64