Published June 13, 2024 | Version v1
Software
Open
Autonomous Assessment of LLM Truth Maintenance in Formal Translation Tasks without Human Labeling: Dynamic Datasets, Assessment Paradigms, and End-to-End Benchmarks
Authors/Creators
Description
Code and datasets for the NeurIPS 2024 Datasets and Benchmarks track submission.
Files
| Name | Size | Checksum |
|---|---|---|
| autoeval.zip | 346.8 MB | md5:2980cda4b6636be6538608a33ff40ea7 |