SemEval-2025 Task 5: LLMs4Subjects--LLM-based Automated Subject Tagging for a National Technical Library's Open-Access Catalog
Description
This presentation, delivered at the ACL 2025 conference on July 31, 2025, reports the outcomes of SemEval-2025 Task 5: LLMs4Subjects, a shared task on automated subject indexing of scientific and technical records in English and German using the GND (Gemeinsame Normdatei) taxonomy. Participants developed LLM-based systems that recommend the top-k subject terms for each record; the systems were evaluated with quantitative metrics (precision, recall, F1-score) and with qualitative assessments by subject specialists. The presentation summarizes the task setup, data sources, evaluation methodology, and key findings, including insights into multilingual performance, ensemble approaches, and the role of synthetic training data.
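As an illustration of the quantitative metrics named in the description, the following is a minimal sketch of precision, recall, and F1-score at a cutoff k for a single record. It is not the task's official scorer: the function name, the choice to divide precision by k, and the placeholder term identifiers are all assumptions made for this example.

```python
def precision_recall_f1_at_k(predicted, gold, k):
    """Score the top-k predicted subject terms against the gold terms.

    Assumption (not taken from the task description): precision is
    computed as hits / k, and recall as hits / number of gold terms.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")

    gold_set = set(gold)
    # Keep the ranking order but ignore duplicate predictions.
    top_k = list(dict.fromkeys(predicted[:k]))
    hits = sum(1 for term in top_k if term in gold_set)

    precision = hits / k
    recall = hits / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical term identifiers, used only for illustration.
predicted_terms = ["term-A", "term-B", "term-C", "term-D"]
gold_terms = ["term-B", "term-D", "term-E"]

for cutoff in (3, 4):
    p, r, f = precision_recall_f1_at_k(predicted_terms, gold_terms, cutoff)
    print(f"k={cutoff}: precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Averaging these per-record scores over all records, and repeating for several values of k, is one common way to report results for a top-k recommendation task; whether the task aggregates in exactly this way is not stated here.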
Files
| Name | Size | Checksum |
|---|---|---|
| semeval2025-task5-overview.pdf | 1.7 MB | md5:c3f156e1502386a87d84d24713a44ec0 |
Additional details
Dates
- Created
- 2025-07-31
Software
- Repository URL
- https://github.com/sciknoworg/llms4subjects
- Programming language
- JSONLD, Python
References
- D'Souza, J., Sadruddin, S., Israel, H., Begoin, M., & Slawig, D. (2025). SemEval-2025 Task 5: LLMs4Subjects--LLM-based Automated Subject Tagging for a National Technical Library's Open-Access Catalog. arXiv preprint arXiv:2504.07199. https://arxiv.org/abs/2504.07199