Structured Information on State and Evolution of Dockerfiles on GitHub
Description
Docker containers are standardized, self-contained units of applications, packaged with their dependencies and execution environment. The environment is defined in a Dockerfile that specifies the steps to reach a certain system state as infrastructure code, with the aim of enabling reproducible builds of the container. To lay the groundwork for research on infrastructure code, we collected structured information about the state and the evolution of Dockerfiles on GitHub and release it as a PostgreSQL database archive (over 100,000 unique Dockerfiles in over 15,000 GitHub projects). Our dataset enables answering a multitude of interesting research questions related to different kinds of software evolution behavior in the Docker ecosystem.
Files (1.5 GB)
| Name | Size |
|---|---|
| md5:cf7583effc1d699a3f6dcccf34f1e2e0 | 1.5 GB |
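
Because the archive is large (1.5 GB), it can be worth checking the download against the MD5 checksum listed for it before restoring the database. The following is a minimal sketch of such a check, not part of the dataset itself. The function names `md5_of_file` and `matches_expected` are illustrative, and the path of the downloaded file is left to the caller because the file name is not given in this record.

```python
import hashlib

# MD5 checksum as listed for the archive in this record.
EXPECTED_MD5 = "cf7583effc1d699a3f6dcccf34f1e2e0"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file.

    The file is read in chunks (1 MiB by default) so that a
    multi-gigabyte archive does not have to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_expected(path_to_archive)` with the location of the downloaded file returns `True` only if the digest equals the listed checksum. A mismatch usually points to an incomplete or corrupted download. MD5 is used here only because it is the checksum the record provides, and it detects accidental corruption rather than deliberate tampering.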