There is a newer version of the record available.

Published July 1, 2023 | Version 1.0
Dataset Restricted

DEplain-APA

  • 1. Heinrich Heine University

Contributors

Contact person:

Data collector:

Producer:

  • 1. Heinrich Heine University
  • 2. APA - Austria Press Agency eG

Description

DEplain: A corpus for German Text Simplification

This repository contains DEplain-APA, a corpus for German text simplification at both the document and the sentence level. The corpus consists of Austrian news texts provided by the APA - Austria Presse Agentur eG. All sentence-level alignments of complex-simple pairs were created manually. The following table summarizes the most important metadata of the corpus.

 

Metadata                                    Value
language                                    DE-AT (Austrian German)
domain                                      news
source language level                       B1
target language level                       A2
# document pairs (total, train/dev/test)    483 (387/48/48)
# sentence pairs (total, train/dev/test)    13,122 (10,660/1,231/1,231)
# complex sentences                         25,607
# simple sentences                          26,471
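
The split sizes in the table can be sanity-checked with a few lines of Python. The following is only an illustrative sketch: the `SPLITS` dictionary and the `split_shares` helper are hypothetical names introduced here, not part of the corpus or its tooling, and the numbers are copied by hand from the table above.

```python
# Split sizes copied by hand from the metadata table (not read from the corpus files).
SPLITS = {
    "document pairs": {"total": 483, "train": 387, "dev": 48, "test": 48},
    "sentence pairs": {"total": 13122, "train": 10660, "dev": 1231, "test": 1231},
}


def split_shares(counts):
    """Return each split's share of the total, after checking the parts sum to it."""
    parts = {name: size for name, size in counts.items() if name != "total"}
    if sum(parts.values()) != counts["total"]:
        raise ValueError("split sizes do not add up to the stated total")
    return {name: size / counts["total"] for name, size in parts.items()}


for unit, counts in SPLITS.items():
    shares = split_shares(counts)
    # Print the train/dev/test proportions as percentages.
    print(unit, {name: f"{share:.1%}" for name, share in shares.items()})
```

The helper raises a `ValueError` if the train, dev, and test sizes do not add up to the stated total, which makes transcription mistakes easy to catch.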

 

For more information, please have a look at our paper. If you use this corpus, please also cite our paper and name APA - Austria Presse Agentur eG as data provider:

Regina Stodden, Omar Momen, and Laura Kallmeyer. 2023. DEplain: A German Parallel Corpus with Intralingual Translations into Plain Language for Sentence and Document Simplification. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 16441–16463, Toronto, Canada. Association for Computational Linguistics.

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.

Request access

If you would like to request access to these files, please fill out the form below.

For your request to be accepted, you must confirm that you accept the following conditions:

  • You agree to use the dataset for non-commercial research purposes only.
  • You agree that you may not share the data with others. Others must request their own access.
  • You agree to keep the dataset secure and to prevent access to the data by others.
  • You agree not to change the wording of the record, or to mark any change as such.
  • You agree to irrevocably and demonstrably delete the dataset within less than three months upon request. Such a request would be made if the data provider terminates the data usage contract.

Please request access to the data by providing your name, affiliation, and a brief explanation of how you would like to use the data. Please also add a short statement that you agree to the terms above.

Thank you very much!
