There is a newer version of the record available.

Published April 17, 2020 | Version 2020-01-24
Dataset Open

An Annotated Dataset of Stack Overflow Post Edits

  • 1. The University of Adelaide


To improve software engineering, software repositories have been mined for code snippets and bug fixes. Typically, this mining takes place at the level of files or commits. To dig deeper and extract insights at a higher resolution, we present an annotated dataset that contains over 7 million edits of code and text on Stack Overflow. Our preliminary study indicates that these edits might be a treasure trove for mining information about fine-grained patches, e.g., for the optimisation of non-functional properties.


Files (175.9 MB)

Name Size
9.5 kB
11.0 kB
941 Bytes
175.9 MB
2.9 kB