LakeBench: Benchmarks for Data Discovery over Data Lakes
Description
Version 3 adds the wiki-join-search benchmark used in the "join search" experiments in our paper.
The data in the labels files (i.e., labels.json files or files under the labels folder) are shared under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license: https://creativecommons.org/licenses/by-sa/4.0/
The data in the tables folder comes from different sources under various open licenses, as detailed in the README.txt file in each folder. All the datasets included in the benchmark have been verified to have a public license that allows distribution, derivatives, and commercial use.
THIS DATA IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Files (4.5 GB)
| MD5 checksum | Size |
|---|---|
| cf20548495b7df8aa6ea95299ee52880 | 1.5 GB |
| b68cc3eaaddd06cf0d14822e9840546a | 1.3 GB |
| fd772b8e142c13b1f4895c7ecf99653a | 13.7 MB |
| 650f0516173d611fe9876ed58d90d5fa | 1.5 GB |
| 9b1c44b34b508cbf1ab8f9a057d5e80f | 84.7 kB |
| e5699ef5a81e30b97f0dff7d0da9749a | 7.0 MB |
| 792e01dbda0e646cca82ba1da113af9e | 5.6 MB |
| c0fa67a58e273800fabfdb885fda7e90 | 78.8 MB |
| 9b35d2efd0509991a55be9090690eb79 | 28.5 MB |
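
Because several of the archives are over a gigabyte, it may be worth checking a download against the md5 value listed for it before unpacking. The following is a minimal sketch of such a check using only the Python standard library. The helper names `md5_of_file` and `matches_checksum` are illustrative, not part of the benchmark, and the file name in the usage comment is a placeholder because the listing above does not include file names.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so multi-gigabyte archives are never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's MD5 digest with a listed value such as 'md5:cf20...'."""
    expected = expected_md5.strip().lower()
    if expected.startswith("md5:"):
        # Accept the 'md5:' prefix exactly as it appears in the file listing.
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Usage (hypothetical file name; substitute the archive you downloaded):
# ok = matches_checksum("downloaded_archive.tar.gz",
#                       "md5:cf20548495b7df8aa6ea95299ee52880")
```

A mismatch most often indicates a truncated or corrupted download, in which case fetching the file again is usually the simplest remedy.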