There is a newer version of the record available.

Published August 1, 2025 | Version v1
Presentation | Open access

SemEval-2025 Task 5: LLMs4Subjects--LLM-based Automated Subject Tagging for a National Technical Library's Open-Access Catalog

  • 1. Technische Informationsbibliothek (TIB)

Description

This presentation was delivered at the ACL 2025 conference on July 31, 2025. It reports the outcomes of SemEval-2025 Task 5: LLMs4Subjects, a shared task on automated subject indexing of scientific and technical records in English and German using the GND (Gemeinsame Normdatei) taxonomy. Participants developed LLM-based systems that recommend the top-k subject terms for each record. The systems were evaluated with quantitative metrics (precision, recall, and F1-score) and with qualitative assessments by subject specialists. The presentation summarizes the task setup, data sources, evaluation methodology, and key findings, including insights into multilingual performance, ensemble approaches, and the role of synthetic training data.
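As an illustration of the quantitative metrics named above, the following is a minimal sketch of precision, recall, and F1-score computed over a top-k list of recommended subject terms. It is not the task's official scoring script. The function name `precision_recall_f1_at_k`, the subject labels, and the choice to divide precision by k (rather than by the number of terms actually returned) are assumptions made for this example.

```python
from typing import Sequence, Set, Tuple


def precision_recall_f1_at_k(
    ranked: Sequence[str], gold: Set[str], k: int
) -> Tuple[float, float, float]:
    """Score the first k recommended subjects against the gold subjects.

    Hypothetical helper for illustration only; assumes `ranked` is ordered
    best-first and that precision is normalized by k.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")

    # Keep only the top-k recommendations; a set ignores accidental repeats.
    top_k = set(ranked[:k])
    hits = len(top_k & gold)

    precision = hits / k
    recall = hits / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up subject labels, not real GND identifiers.
ranked_subjects = ["subject-a", "subject-b", "subject-c", "subject-d", "subject-e"]
gold_subjects = {"subject-b", "subject-e", "subject-z"}

for k in (2, 5):
    p, r, f = precision_recall_f1_at_k(ranked_subjects, gold_subjects, k)
    print(f"k={k}: precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In this toy case, raising k from 2 to 5 recovers a second gold subject, so recall rises while precision drops; reporting scores at several values of k makes that trade-off visible.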

Files

semeval2025-task5-overview.pdf (1.7 MB)
md5:c3f156e1502386a87d84d24713a44ec0

Additional details

Dates

Created
2025-07-31

Software

Repository URL
https://github.com/sciknoworg/llms4subjects
Programming language
JSON-LD, Python

References

  • D'Souza, J., Sadruddin, S., Israel, H., Begoin, M., & Slawig, D. (2025). SemEval-2025 Task 5: LLMs4Subjects--LLM-based Automated Subject Tagging for a National Technical Library's Open-Access Catalog. arXiv preprint arXiv:2504.07199. https://arxiv.org/abs/2504.07199