Dataset Open Access

LakeBench: Benchmarks for Data Discovery over Data Lakes

Anonymous

The labels.json data is shared under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license: https://creativecommons.org/licenses/by-sa/4.0/

The data in the tables folder comes from different sources under various open licenses, as detailed in the README.txt file in each folder. All the datasets included in the benchmark have been verified to have a public license that allows distribution, derivatives, and commercial use.

THIS DATA IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

 

Files (4.4 GB)

Name                       Size      MD5
ckan-subset.tar.bz2        1.5 GB    74662d5e7476219952e02a427f258361
ecb-join.tar.bz2           1.3 GB    b68cc3eaaddd06cf0d14822e9840546a
ecb-union.tar.bz2          13.7 MB   fd772b8e142c13b1f4895c7ecf99653a
spider-join.tar.bz2        1.5 GB    b0c360777bbe4c74198f0e91375d1077
tus-santos.tar.bz2         84.7 kB   9b1c44b34b508cbf1ab8f9a057d5e80f
wiki-containment.tar.bz2   7.0 MB    e5699ef5a81e30b97f0dff7d0da9749a
wiki-jaccard.tar.bz2       5.6 MB    792e01dbda0e646cca82ba1da113af9e
wiki-union.tar.bz2         28.5 MB   9b35d2efd0509991a55be9090690eb79
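
The MD5 checksums in the file listing can be used to confirm that an archive downloaded intact before it is unpacked. The following is a minimal Python sketch of that check, not part of the dataset itself. The helper names (md5_of_file, verify_and_extract) and the destination directory are illustrative assumptions. Only two of the listed checksums are copied into the lookup table, and the layout of the extracted folders is not assumed.

```python
import hashlib
import tarfile
from pathlib import Path

# Checksums copied from the file listing (two shown; add the others as needed).
EXPECTED_MD5 = {
    "tus-santos.tar.bz2": "9b1c44b34b508cbf1ab8f9a057d5e80f",
    "wiki-jaccard.tar.bz2": "792e01dbda0e646cca82ba1da113af9e",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_and_extract(archive_path, expected_md5, dest_dir):
    """Extract a .tar.bz2 archive only if its MD5 matches; return True on success."""
    if md5_of_file(archive_path) != expected_md5.lower():
        return False  # checksum mismatch: leave the archive unextracted
    # Hypothetical destination; created only after the checksum has matched.
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:bz2") as archive:
        archive.extractall(dest_dir)
    return True
```

For example, after downloading tus-santos.tar.bz2 into the working directory, calling verify_and_extract("tus-santos.tar.bz2", EXPECTED_MD5["tus-santos.tar.bz2"], "lakebench") should return True and unpack the archive into a lakebench folder. A False result suggests a corrupted or incomplete download.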