Published January 20, 2022 | Version v1
Journal article Open

Painting the Automotive Software Landscape in GitHub

Authors/Creators

Description

This dataset comprises the raw data that we used to analyze the automotive software landscape on GitHub.

'all topics from search queries' comprises all the topics returned by our initial search to identify automotive software on GitHub.

'selected_topics' consists of the final list of topics that were used to identify automotive software repositories on GitHub.

'Data from GitHub via PyGitHub' comprises a curated list of automotive software projects, their associated metadata, and tags that classify each project along four dimensions: (a) in-vehicle vs. tools, (b) safety-critical vs. non-safety-critical, (c) safety-critical based on application, and (d) Broy's classification.

'auto_projects_data' and 'nonauto_projects_data' comprise the development activities of automotive and non-automotive software repositories, respectively, as derived from the GHTorrent data. For ease of use, the CSV files follow the same schema as the original GHTorrent data and can be imported directly into MySQL for analysis.
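
As a rough illustration of the import step, the sketch below builds a MySQL `LOAD DATA` statement for one CSV file. It is not part of the dataset's tooling. The file path `auto_projects_data/projects.csv`, the table name `projects`, and the CSV dialect (comma-separated, optionally double-quoted fields, newline-terminated rows) are assumptions; check them against the downloaded files and the GHTorrent schema you have created before running the statement.

```python
from pathlib import Path


def build_load_statement(csv_path: str, table: str) -> str:
    """Return a MySQL LOAD DATA statement for one CSV file.

    Assumes the target table already exists with the GHTorrent schema
    and that the CSV is comma-separated with optional double quotes.
    """
    # The table name is interpolated into SQL, so accept only
    # letters, digits, and underscores.
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    # Forward slashes avoid backslash-escaping issues in the SQL string.
    posix_path = Path(csv_path).as_posix()
    return (
        f"LOAD DATA LOCAL INFILE '{posix_path}' "
        f"INTO TABLE `{table}` "
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
        "LINES TERMINATED BY '\\n';"
    )


if __name__ == "__main__":
    # Hypothetical path and table name, for illustration only.
    print(build_load_statement("auto_projects_data/projects.csv", "projects"))
```

The generated statement can be pasted into a MySQL client that has `LOCAL INFILE` enabled. Generating one statement per file keeps the import easy to inspect before anything is written to the database.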


How can I cite this work?
If you find this dataset useful and want to use it in your work, please cite the following paper:

@inproceedings{kochanthara2022,
  title={Painting the Landscape of Automotive Software in GitHub},
  author={Kochanthara, Sangeeth and Dajsuren, Yanja and Cleophas, Loek and van den Brand, Mark},
  booktitle={The 2022 Mining Software Repositories Conference},
  url={https://arxiv.org/pdf/2203.08936.pdf},
  year={2022}
}

Files

Files (922.0 MB)

Checksum Size
md5:7bdb187142dec96d417ffa38f08895be 57.0 kB
md5:d9e873112e0bf76716a9517843752730 12.7 MB
md5:569797ca7990248b57c43fcb16e2ee01 156.1 kB
md5:9bb074d2d73b9243e1139e73746c6b3f 909.1 MB
md5:fd0e8b8a81a1071ccd611eae320dbe89 6.6 kB
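
The MD5 checksums listed above can be used to confirm that a download completed intact. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `matches_listing` are illustrative, and the file path in the usage example is hypothetical, since the listing does not show which file name belongs to which checksum.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for the ~909 MB file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: str, listed: str) -> bool:
    """Compare a file against a listed value such as 'md5:7bdb...'."""
    # Drop the optional 'md5:' prefix used in the file listing.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Hypothetical local file name; substitute the file you downloaded
    # and the checksum shown next to it on the record page.
    ok = matches_listing("downloaded_file.csv",
                         "md5:7bdb187142dec96d417ffa38f08895be")
    print("checksum matches" if ok else "checksum mismatch")
```

A mismatch usually points to an interrupted or corrupted download, in which case fetching the file again is the simplest remedy.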