Published November 15, 2018 | Version 06.09.2018-snapshot
Dataset Open

Maven central dependency graph

  • 1. DiverSE team, Inria, Univ Rennes, CNRS, IRISA, France
  • 2. KTH, Sweden

Description

The Maven dependency graph is an open dataset of Maven Central artifacts, their dependencies, and other relationships between them. Its main goal is to make the Maven Central ecosystem in particular, and JVM-based libraries at large, easier for both academics and industry to explore and exploit. It is intended to answer high-level research questions about artifact releases, evolution, and usage trends over time. It can also help researchers select relevant subjects, among the mass of existing software artifacts, for studying particular empirical software engineering challenges. The complexity of these questions can range from simple pattern matching to advanced big data analysis and machine learning techniques.

The paper accompanying this dataset has been accepted for publication in the proceedings of the International Conference on Mining Software Repositories (MSR) 2019 and has received the MSR 2019 Data Showcase Award. The paper is available for download on arXiv.

Notes

The Maven dependency graph is the fruit of a collaboration between the DiverSE team (Inria Rennes, France) and the CASTOR project (KTH, Sweden). Instructions on how to use and reproduce the dataset can be found in the dataset's repository on [GitHub](https://github.com/diverse-project/maven-miner). A complete description of the dataset and its usages can be found in the accompanying [paper](https://arxiv.org/abs/1901.05392).

Files

Files (3.1 GB)

Checksum Size
md5:f95582c55246e826cf2f8bc009746b7c 72.0 MB
md5:247fb32f6d431b59c21c6bd2504b3222 2.8 GB
md5:e34db6419bd1541b1eda86002ff15267 282.9 MB
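
Since the largest file is several gigabytes, it may be worth checking each download against the MD5 checksums listed above before using it. The following is a minimal sketch in Python, not part of the dataset's tooling. The file names are not shown in the listing above, so the path in the usage comment is a hypothetical placeholder.

```python
import hashlib

# MD5 digests copied from the file listing above, paired with the listed sizes.
# The listing does not show file names, so none are assumed here.
LISTED_CHECKSUMS = [
    ("md5:f95582c55246e826cf2f8bc009746b7c", "72.0 MB"),
    ("md5:247fb32f6d431b59c21c6bd2504b3222", "2.8 GB"),
    ("md5:e34db6419bd1541b1eda86002ff15267", "282.9 MB"),
]


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected):
    """Compare a file's MD5 with an expected value, with or without the 'md5:' prefix."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of(path) == expected_hex


# Usage (hypothetical local path, replace with the name of the downloaded file):
#   ok = verify("downloads/maven-graph-file", LISTED_CHECKSUMS[0][0])
#   print("checksum matches" if ok else "checksum mismatch, download again")
```

Reading in fixed-size chunks keeps memory use constant, which matters for the 2.8 GB file.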

Additional details

Funding

STAMP – Software Testing AMPlification 731529
European Commission