Published June 8, 2017 | Version v1
Dataset Open

Application Domain of 5,000 GitHub Repositories

  • 1. Federal University of Minas Gerais

Description

We provide a manual classification of the application domain of 5,000 GitHub repositories (the most popular ones, by number of stars, in January 2017).

We classified each system into one of the following application domains:

  • Application software: systems that provide functionality to end users, like browsers and text editors (e.g., WordPress/WordPress and adobe/brackets).
  • System software: systems that provide services and infrastructure to other systems, like operating systems, middleware, and databases (e.g., torvalds/linux and mongodb/mongo).
  • Web libraries and frameworks (e.g., twbs/bootstrap and angular/angular.js).
  • Non-web libraries and frameworks (e.g., google/guava and facebook/fresco).
  • Software tools: systems that support development tasks, like IDEs, package managers, and compilers (e.g., Homebrew/homebrew and git/git).
  • Documentation: repositories with documentation, tutorials, source code examples, etc. (e.g., iluwatar/java-design-patterns).
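
As a rough illustration of how the six categories above might be tallied, the following Python sketch counts repositories per domain. The column names `repository` and `domain`, and the exact label strings, are assumptions made for illustration; the actual headers and labels in the CSV may differ. The in-memory sample reuses only the example repositories named in the list above and is not an excerpt of the dataset.

```python
import csv
import io
from collections import Counter

# The six domain labels from the description (exact spelling in the CSV is assumed).
DOMAINS = [
    "Application software",
    "System software",
    "Web libraries and frameworks",
    "Non-web libraries and frameworks",
    "Software tools",
    "Documentation",
]


def count_by_domain(csv_file, domain_column="domain"):
    """Count rows per domain; unrecognized labels are tallied under 'Unknown'."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        label = (row.get(domain_column) or "").strip()
        counts[label if label in DOMAINS else "Unknown"] += 1
    return counts


# Hypothetical stand-in for the real file, built from the examples in the list.
SAMPLE = """repository,domain
WordPress/WordPress,Application software
torvalds/linux,System software
twbs/bootstrap,Web libraries and frameworks
google/guava,Non-web libraries and frameworks
git/git,Software tools
mongodb/mongo,System software
"""

counts = count_by_domain(io.StringIO(SAMPLE))
for domain in DOMAINS:
    print(f"{domain}: {counts[domain]}")
```

To run this against the downloaded file, one would pass a handle such as `open(path, newline="", encoding="utf-8")` instead of the sample, adjusting `domain_column` to match the real header.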

To cite the dataset, please use the following paper (which proposes and uses a first version of the dataset):

Hudson Borges, Andre Hora, Marco Tulio Valente. Understanding the Factors that Impact the Popularity of GitHub Repositories. In 32nd IEEE International Conference on Software Maintenance and Evolution (ICSME), pages 334-344, 2016.

Files

Domains of 5,000 GitHub Repositories - Public - Domains.csv

Files (884.6 kB)