There is a newer version of the record available.

Published November 30, 2024 | Version 1.0.0
Dataset Open

Replication Package for Wild SBOMs: a Large-scale Dataset of Software Bills of Materials from Public Code

Description

Developers gain productivity by reusing readily available Free and Open Source Software (FOSS) components. Such practices also bring difficulties, such as managing licensing, components, and related security. One approach to handling those difficulties is to use Software Bills of Materials (SBOMs). While there have been studies on the readiness of practitioners to embrace SBOMs and on the SBOM tool ecosystem, a large-scale study of SBOM practices based on SBOM files produced in the wild is still lacking. A starting point for such a study is a large dataset of SBOM files found in the wild. We introduce such a dataset, consisting of over 78 thousand unique SBOM files, deduplicated from those found in over 94 million repositories. We include metadata that contains the standard and format used, the quality score generated by the tool sbomqs, the number of revisions, filenames, and provenance information. Finally, we give suggestions and examples of research that could bring new insights into assessing and improving real-world SBOM practices.

For more details, see the included README file and the companion paper: Wild SBOMs: a Large-scale Dataset of Software Bills of Materials from Public Code.

Files

README.md

Files (6.0 GB)

Checksum (MD5)                          Size
md5:721ca44ad37a930a61f009c5cde2fbfe    122.9 kB
md5:efa1388876f1a69cfbaf91d5a5fe2457    1.9 kB
md5:b4ce993b4bce122101a33e24fe77ee71    10.6 kB
md5:7c74a94004b7d353c0ec0db19a8355cc    54 Bytes
md5:5963d98b0291f0487e278e12e5ac87ad    5.1 GB
md5:9e3ac4a0d24a795cb6b35cbb7fc8cda9    4.1 MB
md5:f89e7b7a07e0a19652ed3b26bb85b032    26.8 MB
md5:b70d82f315168e24cfe2c3161ef2390e    906.2 MB
md5:ea375e0d3505d1295ba94b933d7617ba    2.1 MB
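
Each file in the record is listed with an MD5 checksum, so a download can be checked against the listing before use. The following is a minimal sketch of such a check in Python, not part of the replication package itself. The helper names `md5_of_file` and `matches_listed_checksum` are hypothetical, and because the listing above does not show file names, the caller has to pair each downloaded file with its checksum manually.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks so that
    multi-gigabyte archives do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a local file against a checksum copied from the record page.

    `listed` may be given with or without the leading "md5:" prefix.
    """
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum(local_path, "md5:5963d98b0291f0487e278e12e5ac87ad")` would return `True` only if `local_path` (a placeholder for wherever the 5.1 GB file was saved) has that digest. MD5 is adequate here for detecting corrupted or truncated downloads, but it is not a defense against deliberate tampering.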

Additional details

Dates

Collected
2024-11-30