Published January 27, 2023 | Version 1.0.0
Dataset Open

Synthesizing Patches in Dockerfiles: Base Dataset

  • 1. TU Wien

Description

Docker containers are a standardized way of packaging applications and their execution environments reproducibly. This dataset extends an existing Docker dataset with over 100,000 Dockerfiles in 15,000 projects (https://zenodo.org/record/1200869/).

This dataset was used to extract patching patterns for Dockerfiles, with the goal of improving Dockerfile quality automatically.

The extension of the original dataset includes: 

  • Static analysis results of every version of every Dockerfile
  • Vulnerability data from analysing a limited number of built Docker images
  • A second database containing quality patches based on the static analysis results 

Files:

  • msr18_extended
    A compressed, binary PostgreSQL database dump of the Docker dataset extended with analysis results
  • patch_database
    A compressed, binary PostgreSQL database dump of extracted patches
  • patch_datbase.sql
    A PostgreSQL plain SQL database dump of extracted patches

Further information on the artifact used to extract and apply patches, along with instructions for importing the database dumps, is available at: https://github.com/mandoway/dfp
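
For orientation only, the following is a minimal sketch of how the two kinds of dump listed above might be loaded with the standard PostgreSQL client tools. It assumes the compressed binary dumps can be read by `pg_restore` and that the plain SQL dump can be fed to `psql`; the target database names (`msr18_extended`, `patches`) and the helper `restore_commands` are hypothetical and chosen for illustration. The instructions in the linked repository take precedence.

```python
import shlex


def restore_commands(dump_path, db_name, plain_sql=False):
    """Return the commands (as argument lists) that would load one dump.

    Assumption: binary dumps are restored with pg_restore, plain SQL
    dumps are replayed with psql. Nothing is executed here.
    """
    create = ["createdb", db_name]
    if plain_sql:
        load = ["psql", "--dbname", db_name, "--file", dump_path]
    else:
        load = ["pg_restore", "--no-owner", "--dbname", db_name, dump_path]
    return [create, load]


if __name__ == "__main__":
    # Database names are hypothetical; file names are taken from the listing.
    jobs = [
        ("msr18_extended", "msr18_extended", False),
        ("patch_database", "patches", False),
    ]
    for dump, db, plain in jobs:
        for cmd in restore_commands(dump, db, plain):
            print(shlex.join(cmd))
```

The sketch only prints the commands, so they can be reviewed and adapted (for example, to add connection options) before anything is run against a server.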

Files (1.6 GB)

  • md5:c677a5be71c48e0345e98245313d0b3b (1.6 GB)
  • md5:58cf5a32b46bcf691426ddb5e34cfe58 (962.8 kB)
  • md5:53a6b288a45ddac9bcc3b48fd4494a71 (7.1 MB)
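
The MD5 checksums listed above can be used to check that a download is intact. The sketch below is one way to do that in Python. Because the listing does not state which checksum belongs to which file name, it only tests whether a file's digest matches any of the three published values; the helper names `md5_of` and `is_listed` are hypothetical.

```python
import hashlib

# Checksums copied from the file listing; the mapping from checksum to
# file name is not given there, so they are kept as an unordered set.
EXPECTED_MD5 = {
    "c677a5be71c48e0345e98245313d0b3b",
    "58cf5a32b46bcf691426ddb5e34cfe58",
    "53a6b288a45ddac9bcc3b48fd4494a71",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_listed(path):
    """Return True if the file's digest matches a published checksum."""
    return md5_of(path) in EXPECTED_MD5
```

Reading in chunks keeps memory use low for the 1.6 GB dump. A call such as `is_listed("msr18_extended")` would return `True` only if the local copy matches one of the published digests.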