Published June 13, 2024 | Version v1
Software
Open
Autonomous Assessment of LLM Truth Maintenance in Formal Translation Tasks without Human Labeling: Dynamic Datasets, Assessment Paradigms, and End-to-End Benchmarks
Authors/Creators
Description
Code and datasets for the NeurIPS 2024 Datasets and Benchmarks track submission.