Dataset Open Access

Joint Autoregressive and Graph Models for Software and Developer Social Networks

Hazra Rima; Aggarwal Hardik; Goyal Pawan; Mukherjee Animesh; Chakrabarti Soumen

This zip archive contains three CSV files and one folder. The dataset covers the ten most recent distributions.

  • developer_attributes.csv: This file has seven columns. "distro" (str) is the distribution name, "source" (str) is the source package name, and "person_id" (str) identifies the developer. "closes" (int), "high" (int), "medium" (int), and "low" (int) are the developer features.
  • source_bugs.csv: This file has three columns. "distro" (str) is the distribution name, "source" (str) is the source package name, and "bug_count" (int) is the number of bugs that source package has in that distribution.
  • source_sizes.csv: This file has three columns. "distro" (str) is the distribution name, "source" (str) is the source package name, and "size" (int) is the size of the package.
  • Dependency folder: This folder contains ten dependency lists. Each file has two columns, "start" (str) and "target" (str), both of which are source packages. A row is read as: the "start" source package depends on the "target" source package.
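The file layout described above can be read with Python's standard csv module. The following is a minimal sketch, not the authors' code. It assumes each file is comma-separated and starts with a header row that uses the column names listed above. The helper names (read_rows, bugs_and_sizes, dependency_graph) and the sample rows are hypothetical and exist only for illustration.

```python
import csv
import io
from collections import defaultdict


def read_rows(handle):
    """Return the rows of a CSV file (with a header line) as a list of dicts."""
    return list(csv.DictReader(handle))


def bugs_and_sizes(bug_rows, size_rows):
    """Join source_bugs and source_sizes rows on the (distro, source) key."""
    sizes = {(r["distro"], r["source"]): int(r["size"]) for r in size_rows}
    joined = {}
    for r in bug_rows:
        key = (r["distro"], r["source"])
        # A package with no size row gets size None rather than being dropped.
        joined[key] = {"bug_count": int(r["bug_count"]), "size": sizes.get(key)}
    return joined


def dependency_graph(edge_rows):
    """Map each "start" package to the set of "target" packages it depends on."""
    graph = defaultdict(set)
    for r in edge_rows:
        graph[r["start"]].add(r["target"])
    return dict(graph)


# Hypothetical sample rows, invented for illustration (not taken from the dataset).
SAMPLE_BUGS = "distro,source,bug_count\ndistro_a,pkg_x,3\ndistro_a,pkg_y,0\n"
SAMPLE_SIZES = "distro,source,size\ndistro_a,pkg_x,1200\n"
SAMPLE_DEPS = "start,target\npkg_x,pkg_y\npkg_x,pkg_z\n"

joined = bugs_and_sizes(
    read_rows(io.StringIO(SAMPLE_BUGS)),
    read_rows(io.StringIO(SAMPLE_SIZES)),
)
graph = dependency_graph(read_rows(io.StringIO(SAMPLE_DEPS)))
```

To use this with the real files, pass a handle such as open(path, newline="") to read_rows in place of the io.StringIO samples. If the files turn out to use a different delimiter or lack a header row, the csv.DictReader call would need the matching delimiter or fieldnames argument.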

Here is the arxiv version of our paper:

Here is the portal link:

Files (8.2 MB)
  • data_share (SWNET).zip (8.2 MB)

