Published March 10, 2024 | Version JSUPE'24
Software Open

alessant/SWHA-license-checker-app: v1.0.0-experiment-reproducibility

  • 1. Università degli Studi di Torino

Description

This repository contains two Jupyter notebooks to reproduce the experiments described in the article _Analyzing FOSS License Usage in Publicly Available Software at Scale via the SWH-Analytics Framework_ by A. Antelmi, M. Torquati, G. Corridori, D. Gregori, F. Polzella, G. Spinatelli, and A. Aldinucci, currently submitted to the Journal of Supercomputing.

Article abstract

The Software Heritage (SWH) dataset represents an invaluable source of open-source code as it aims to collect, preserve, and share all publicly available software in source code form ever produced by humankind. Although designed to archive deduplicated small files thanks to the use of a Merkle tree as the underlying data structure, querying the SWH dataset presents challenges due to the nature of these structures, which organize content based on hash values rather than any locality principle. The magnitude of the repository, coupled with the resource-intensive nature of the download process, highlights the need for specialized infrastructure and computational resources to effectively handle and study the extensive dataset housed within SWH. Currently, there is a lack of infrastructures specifically tailored for running analytics on the SWH dataset, leaving users to handle these issues manually. 

To address these challenges, we presented the SWH-Analytics (SWHA) framework, a development environment that transparently runs custom analytic applications on publicly available software data preserved over time by SWH. Specifically, this work shows how SWHA can be effectively exploited to study usage patterns of free and open-source software (FOSS) licenses, highlighting the need to improve license literacy among developers.

Files

alessant/SWHA-license-checker-app-JSUPE'24.zip

Files (2.1 MB)

Name: alessant/SWHA-license-checker-app-JSUPE'24.zip
Size: 2.1 MB
md5:8b1c39e99254d77f7c8713b9a5845961
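
The MD5 checksum listed for the archive can be used to confirm that a download is intact before opening the notebooks. The following is a minimal sketch, not part of the repository: the helper names `md5_of_file` and `matches_record` are hypothetical, and the local file path passed to them is an assumption about where the archive was saved.

```python
import hashlib

# Checksum listed on this record for the archive.
EXPECTED_MD5 = "8b1c39e99254d77f7c8713b9a5845961"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so the whole file never sits in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """Return True when the file's MD5 equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_record("SWHA-license-checker-app-JSUPE'24.zip")` would return `True` for an uncorrupted copy saved under that (assumed) name in the current directory. MD5 is suitable here only as an integrity check against accidental corruption, not as a security guarantee.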

Additional details

Related works