There is a newer version of the record available.

Published March 7, 2022 | Version v0.2.0
Software Open

EleutherAI/lm-evaluation-harness: v0.2.0

Description

Major changes since 0.1.0:

  • added blimp (#237)
  • added qasper (#264)
  • added asdiv (#244)
  • added truthfulqa (#219)
  • added gsm (#260)
  • implemented description dict and deprecated provide_description (#226)
  • new --check_integrity flag to run integrity unit tests at eval time (#290)
  • positional arguments to evaluate and simple_evaluate are now deprecated
  • added _CITATION attribute on task modules (#292)
  • many bug fixes and task fixes (always report task versions so that results are comparable!)
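
The deprecation of positional arguments noted above can be illustrated with a small, generic sketch. This is not the harness's own code: the decorator name `warn_on_positional`, the stand-in function `run_eval`, and the placeholder model and task names are all hypothetical, chosen only to show how a function can keep accepting positional arguments while warning callers to switch to keywords.

```python
import functools
import warnings


def warn_on_positional(fn):
    """Hypothetical decorator: warn when `fn` receives positional arguments."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        if args:
            warnings.warn(
                f"Positional arguments to {fn.__name__}() are deprecated; "
                "pass them by keyword instead.",
                DeprecationWarning,
                stacklevel=2,
            )
        # The call still goes through, so existing callers keep working.
        return fn(*args, **kwargs)

    return wrapper


@warn_on_positional
def run_eval(model=None, tasks=(), num_fewshot=0):
    # Hypothetical stand-in for an evaluation entry point; it only echoes
    # its arguments so the example stays self-contained.
    return {"model": model, "tasks": list(tasks), "num_fewshot": num_fewshot}


# Preferred style: every argument is named, so no warning is emitted.
result = run_eval(model="dummy", tasks=["task_a"], num_fewshot=0)
```

Calling `run_eval("dummy", ["task_a"])` would return the same kind of dictionary but also emit a `DeprecationWarning`, which gives callers time to migrate before positional use is removed.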

Files

EleutherAI/lm-evaluation-harness-v0.2.0.zip

Files (731.0 kB)

md5:903ef491489d5b3e98ac87af7ac3886d (731.0 kB)

Additional details