BabyLM Evaluation Data

Published March 21, 2023 | Version v2

Dataset Open

Evaluation data for the BabyLM Challenge. We keep only examples in which every word appears at least twice in our strict-small dataset.
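The filtering rule described above can be sketched as follows. This is only an illustration under stated assumptions, not the script used to produce the released files: the tokenization (lowercasing and splitting on whitespace) and the function names `build_word_counts` and `keep_example` are hypothetical and were chosen for brevity.

```python
from collections import Counter


def build_word_counts(corpus_lines):
    """Count word occurrences in a training corpus.

    Assumption: words are lowercased and split on whitespace. The
    tokenization used for the released data may differ.
    """
    counts = Counter()
    for line in corpus_lines:
        counts.update(line.lower().split())
    return counts


def keep_example(example, counts, min_count=2):
    """Return True if every word of `example` occurs at least `min_count` times."""
    return all(counts[word] >= min_count for word in example.lower().split())


# Toy corpus standing in for the strict-small dataset (illustrative only).
toy_corpus = ["the cat sat", "the cat ran", "a dog sat"]
toy_counts = build_word_counts(toy_corpus)

candidates = ["the cat sat", "the dog sat"]
filtered = [ex for ex in candidates if keep_example(ex, toy_counts)]
```

In this toy setting, "the dog sat" is dropped because "dog" occurs only once in the corpus, while "the cat sat" is kept.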

Files

Name	Size
filter_data.zip (md5:00dee5ddc45a1b622e942e069a2b5ad7)	51.5 MB
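A downloaded copy of the archive can be checked against the MD5 checksum listed above. The snippet below is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `verify_archive` are hypothetical, and the local path passed to them is an assumption about where the file was saved.

```python
import hashlib

# Checksum as listed for filter_data.zip in the file table.
EXPECTED_MD5 = "00dee5ddc45a1b622e942e069a2b5ad7"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected checksum."""
    return md5_of_file(path) == expected
```

For example, `verify_archive("filter_data.zip")` should return True for an intact download saved in the current directory.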


DOI

Resource type

Dataset

Publisher

Zenodo

Languages

English

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.