Published September 1, 2021 | Version v1
Dataset Restricted

LHA Sentence Alignments Extracted From the Austria Press Agency Corpus

  • University of Zurich


Conference paper: Exploring German Multi-Level Text Simplification

Sentence alignments extracted with LHA (Nikolov and Hahnloser, 2019) from the Austria Press Agency (Austria Presse Agentur, APA) corpus. The dataset contains alignments from news items published between August 2018 and April 2021. Alignments are provided from each of the CEFR levels A2 and B1 to the original standard German text.


Nikola I. Nikolov and Richard Hahnloser. 2019. Large-scale hierarchical alignment for data-driven text rewriting. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), pages 844–853, Varna, Bulgaria. INCOMA Ltd.



