
Published May 6, 2022 | Version 1.0.0
Dataset Open

GouDa - Generation of universal Data Sets

  • 1. FernUniversität in Hagen

Description

GouDa is a tool for generating universal data sets to evaluate and compare existing data preparation tools and new research approaches. It supports diverse error types and arbitrary error rates, and it provides the ground truth as well. This permits better analysis and evaluation of data preparation pipelines and makes results easier to reproduce.
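To make the idea of an error rate with accompanying ground truth concrete, here is a minimal sketch in plain Python. It is not GouDa's implementation or API: the function `inject_missing_values`, its parameters, and the sample table are hypothetical and only illustrate one error type (missing values) applied at a chosen rate, with the original values recorded for later evaluation.

```python
import random


def inject_missing_values(rows, error_rate, seed=0):
    """Blank out a fraction of cells and return (dirty_rows, ground_truth).

    Hypothetical helper for illustration only; not part of GouDa.
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    rng = random.Random(seed)  # seeded so the corruption is reproducible
    # Every (row index, column index) position in the table.
    cells = [(r, c) for r, row in enumerate(rows) for c in range(len(row))]
    # Number of cells to corrupt, rounded to the nearest integer.
    n_errors = round(error_rate * len(cells))
    corrupted = set(rng.sample(cells, n_errors))
    # Copy the table, replacing each corrupted cell with None.
    dirty = [
        [None if (r, c) in corrupted else value for c, value in enumerate(row)]
        for r, row in enumerate(rows)
    ]
    # Ground truth: the original value of every corrupted cell.
    ground_truth = {(r, c): rows[r][c] for (r, c) in corrupted}
    return dirty, ground_truth


# Made-up sample data (assumption, not taken from the published files).
clean = [
    ["Alice", 34, "Hagen"],
    ["Bob", 27, "Berlin"],
    ["Carol", 45, "Bonn"],
    ["Dave", 52, "Kiel"],
]
dirty, truth = inject_missing_values(clean, error_rate=0.25, seed=42)
print(len(truth))  # 3 of the 12 cells are corrupted
```

Because the ground truth maps each corrupted position to its original value, the output of a data preparation tool can be compared cell by cell against what the clean data contained.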

Publication: V. Restat, G. Boerner, A. Conrad, and U. Störl. GouDa - Generation of universal Data Sets. In Proceedings of Data Management for End-to-End Machine Learning (DEEM’22), Philadelphia, USA, 2022. https://doi.org/10.1145/3533028.3533311

Notes

Please use the current version 1.1.0!

Files

doc-error-generating-functions.md

Files (405.8 MB)

Checksum                                Size
md5:bf1f438131d7448692d18d28778b8641    6.8 kB
md5:363b71da174e54649ed3765724c15bb0    8.1 kB
md5:a83d3fb7cde6f93b7cc5d1fe7178d655    387.2 MB
md5:f380531908326c326fb8cf6f5ffc2277    1.9 MB
md5:c4decba6977798552966bbb1c65ff0dd    201.7 kB
md5:f30f99c3f87dbd44857ffb69798b0332    399.6 kB
md5:8639efad757f01d6059487ac251b21e8    559.1 kB
md5:d7535368be34faf95f9fb266be530c2e    3.0 MB
md5:cc578b51557a37703733994c0851b0c9    12.4 MB
md5:2c433a714daafd706beef0dc52e94efa    4.3 kB
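
The MD5 checksums listed above can be used to confirm that a download arrived intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_checksum` are illustrative, and the file path in the usage comment is a placeholder because the file names are not shown in the listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or bare hex."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Usage (placeholder path; substitute the file you actually downloaded):
# matches_checksum("path/to/downloaded_file", "md5:a83d3fb7cde6f93b7cc5d1fe7178d655")
```

Reading in chunks matters mainly for the largest file in the record (387.2 MB); the smaller files could be hashed in one read.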