Autonomous Assessment of LLM Truth Maintenance in Formal Translation Tasks without Human Labeling: Dynamic Datasets, Assessment Paradigms, and End-to-End Benchmarks

Published June 13, 2024 | Version v1

Software | Open

Code and datasets for the NeurIPS-24 dataset track submission.

Files

Name	MD5	Size
autoeval.zip	2980cda4b6636be6538608a33ff40ea7	346.8 MB
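
The MD5 digest listed for autoeval.zip can be used to check that a download completed intact. The following is a minimal sketch, not part of the record itself: the helper names md5_of_file and matches_listed_checksum are hypothetical, and the archive is assumed to have been saved as autoeval.zip in the current working directory.

```python
import hashlib
from pathlib import Path

# MD5 digest as listed for autoeval.zip in the Files table of this record.
EXPECTED_MD5 = "2980cda4b6636be6538608a33ff40ea7"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so the 346.8 MB archive never has to fit in memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, expected=EXPECTED_MD5):
    """Compare a file's MD5 digest with the expected hex string."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("autoeval.zip")  # assumed local download location
    if archive.exists():
        ok = matches_listed_checksum(archive)
        print("checksum OK" if ok else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it from this record first")
```

A mismatch usually indicates a truncated or corrupted download. MD5 is adequate for detecting accidental corruption but is not a safeguard against deliberate tampering.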

Views: 155

Downloads: 100


DOI

Resource type: Software

Publisher: Zenodo

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.