Published March 21, 2023 | Version v2

BabyLM Evaluation Data

  • 1. Johns Hopkins University
  • 2. ETH Zurich
  • 3. IBM Research
  • 4. Massachusetts Institute of Technology
  • 5. UNC Chapel Hill

Description

Evaluation data for the BabyLM Challenge. We keep only those examples in which every word appears at least twice in our strict-small dataset.
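The filtering criterion above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' actual script: the record does not specify tokenization or casing, so whitespace splitting and lowercasing are assumed here, and the function names (`count_words`, `keep_example`, `filter_examples`) are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def count_words(corpus_lines: Iterable[str]) -> Counter:
    """Count tokens in the training corpus.

    Assumption: tokens are whitespace-separated and lowercased; the
    real pipeline may tokenize differently.
    """
    counts: Counter = Counter()
    for line in corpus_lines:
        counts.update(line.lower().split())
    return counts


def keep_example(example: str, counts: Counter, min_count: int = 2) -> bool:
    """Return True if every word of the example occurs at least
    min_count times in the corpus counts."""
    return all(counts[word] >= min_count for word in example.lower().split())


def filter_examples(
    examples: Iterable[str], counts: Counter, min_count: int = 2
) -> List[str]:
    """Keep only the examples whose words are all frequent enough."""
    return [ex for ex in examples if keep_example(ex, counts, min_count)]
```

In use, the counts would be built once from the strict-small training text and then applied to each evaluation example, with `min_count=2` matching the "at least twice" threshold in the description.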

Files

filter_data.zip (51.5 MB)
md5:00dee5ddc45a1b622e942e069a2b5ad7