Florian Zipser
Thomas Krause
Anke Lüdeling
Arne Neumann
Manfred Stede
Amir Zeldes
2015-05-24
<p>Information structure, like many other linguistic phenomena, influences different linguistic levels at the same time (stress, word order, definiteness, etc.). PAULA is a human- and machine-readable XML format for storing linguistic data that are annotated on multiple layers. Corpus-based research on information structure therefore needs access to different types of annotation (Lüdeling et al., to appear). There are now many multi-layer corpora with annotations of linguistic phenomena on several levels (see, e.g., Tüba-D/Z (Telljohann et al. 2009), Falko (Reznicek et al. 2012) or PCC (Stede & Neumann 2014)). Unfortunately, most tools use different formats that may not be interoperable, which means that data can hardly be exchanged between tools. Furthermore, there is no possibility of analysis across multiple layers.</p>
<p>Goals:<br />
1. Merging different types of annotations of the same primary text into a single corpus → Pepper<br />
2. Storing different types of annotations in a single format → PAULA<br />
3. Searching different corpora for different phenomena in a single system → ANNIS</p>
https://doi.org/10.5281/zenodo.20713
oai:zenodo.org:20713
Zenodo
https://zenodo.org/communities/eu
https://zenodo.org/communities/linguistics
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
Final Conference of the SFB 632 Information Structure: Advances in Information Structure Research 2003 - 2015, Berlin, 08-09 May 2015
ANNIS
Salt
Pepper
PAULA
corpus
corpora
linguist
multilayer
ANNIS, SaltNPepper & PAULA: A multilayer corpus infrastructure
info:eu-repo/semantics/conferencePoster