Dataset Open Access
Maven Central Dependency Graph
This is an updated version of the artifact at https://zenodo.org/record/1489120
The Maven Dependency Graph is an open dataset of Maven Central artifacts, their dependencies, and other relationships between them. Its main intent is to domesticate the wild within and around the Maven Central ecosystem in particular, and JVM-based libraries at large, making them more harnessable by both academics and industry. It is intended to answer high-level research questions concerning artifact releases, evolution, and usage trends over time. It can also assist researchers in selecting relevant datasets, among the mass of existing software artifacts, for assessing particular empirical software engineering challenges. The complexity of these questions can range from simple pattern matching to advanced big data analysis and machine learning techniques.
The paper accompanying this dataset has been accepted for publication in the proceedings of the International Conference on Mining Software Repositories 2019 and has received the MSR 2019 Data Showcase Award. The paper is available for download on arXiv.
What is new?
The previous version included artifacts until September 6, 2018.
This version includes artifacts until September 10, 2019.
This version includes license information as well as information about the associated code repositories.
This version contains 4 201 392 artifacts (versions) of 308 116 distinct libraries from 47 481 distinct group IDs.
Note that 33 638 artifacts represent version ranges and not actual versions. They can be filtered out by excluding versions containing ','.
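The comma-based filter described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset tooling: it assumes artifact coordinates are available as plain strings of the form groupId:artifactId:version, and the helper names and sample coordinates are hypothetical.

```python
from typing import List, Tuple


def split_coordinates(coordinates: str) -> Tuple[str, str, str]:
    """Split 'groupId:artifactId:version' into its parts (assumed layout)."""
    group_id, artifact_id, version = coordinates.split(":", 2)
    return group_id, artifact_id, version


def is_version_range(version: str) -> bool:
    """Apply the rule stated above: a version containing ',' is a range."""
    return "," in version


def drop_version_ranges(coordinates_list: List[str]) -> List[str]:
    """Keep only coordinates whose version part is not a version range."""
    return [
        c for c in coordinates_list
        if not is_version_range(split_coordinates(c)[2])
    ]


# Hypothetical sample coordinates, for illustration only.
sample = [
    "org.example:lib-a:1.0.0",
    "org.example:lib-a:[1.0,2.0)",
    "org.example:lib-b:(,1.5]",
    "org.example:lib-b:2.3.1",
]

print(drop_version_ranges(sample))
```

The same test can be applied directly to a version column or property when the data is queried from a database instead of loaded as strings.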
# Pull the image and start the container
docker run -d --name mm-neo4j -p 7474:7474 -p 7687:7687 -v /path/to/neo4j-data:/data --env=NEO4J_dbms_memory_heap_max__size=8g lyadis/mm-neo4j:latest