Dataset Open Access

The Debsources Dataset

Stefano Zacchiroli

The Debsources Dataset is a dump of the database that underpins the most notable instance of the Debsources platform, currently running at

Debsources is a platform that provides Web access to the source code of the Debian operating system. It lets users browse Debian source packages and renders the source code files they contain on the Web. Debsources indexes all Debian source code and supports searching it by various means (defined symbols, checksums, regular expressions, etc.). A notable live instance of Debsources provides access to both current and historical Debian releases dating back to 1998.

The Debsources Dataset contains both Debian metadata (e.g., which software packages are available in which release, which source code files belong to which package, release dates, etc.) and source code information obtained by running popular indexing and measurement tools on Debian source packages. In particular, the source code of all available packages has been subjected to:

  • SHA256 checksum computation on each source file
  • ctags indexing
  • sloccount measurement
  • disk usage measurement
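As a rough illustration of the first and last items, the following Python sketch computes a SHA256 checksum and a byte size for every regular file in an unpacked source tree. It is not the Debsources implementation: the function names `sha256_of_file` and `index_source_tree` are hypothetical, and file size in bytes is used here as a simple stand-in for disk usage, which the real platform may measure differently.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of one file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_source_tree(root):
    """Map each regular file under root to (sha256, size in bytes).

    Hypothetical helper: size in bytes is an assumption standing in
    for a disk usage measurement.
    """
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue  # skip symlinks and special files
            rel = os.path.relpath(full, root)
            index[rel] = (sha256_of_file(full), os.path.getsize(full))
    return index
```

Calling `index_source_tree` on the directory of an unpacked source package returns a dictionary keyed by relative path, which is the kind of per-file record that a checksum search could be built on.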

For more information see the README file.

Files (3.6 GB)
Name                              Size        MD5
COPYING.CC-BY-SA                  20.0 kB     d839aac91bb370e9aca545fd058ca5ba
dbschema.html                     38.1 kB     8218f217e33aa80ab8d4bdb2365c6ec6
dbschema.pdf                      21.2 kB     dec26a9ec9e23f4801029613c462e5fe
debsources-src.07a7eae4.tar.gz    394.3 kB    1b8fe21b892dfacebc0f7256e2c78a97
debsources.1423576120.xz          3.6 GB      08c421a9ef2cbc0feb327d4c062b4cef
LICENSE.txt                       364 Bytes   357c514f1173bc374264d11bb2b230d0
README.txt                        4.9 kB      1e3ab68460991a565ad46c43f6bf6b0d
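The MD5 values in the file listing above can be used to check that a download arrived intact. The sketch below is one minimal way to do that in Python; the helper names `md5_of_file` and `verify_download` are hypothetical, and the code assumes the files were saved locally under the names shown in the listing.

```python
import hashlib

# Checksums copied from the file listing above (two entries shown).
EXPECTED_MD5 = {
    "README.txt": "1e3ab68460991a565ad46c43f6bf6b0d",
    "debsources.1423576120.xz": "08c421a9ef2cbc0feb327d4c062b4cef",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file at path matches the expected MD5 digest."""
    return md5_of_file(path) == expected_md5.lower()
```

For example, `verify_download("README.txt", EXPECTED_MD5["README.txt"])` should return `True` for an uncorrupted copy. Reading in chunks keeps memory use flat, which matters for the 3.6 GB dump.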

