Published May 12, 2023 | Version v1
Dataset Open

Supplementary material for "Using a parallel corpus to study patterns of word order variation: Determiners and quantifiers within the noun phrase in European languages"

  • 1. Universität des Saarlandes

Description

- output-{ciep,treebanks}-full.csv: frequency and entropy for all categories, computed for four types of layer combinations;
- plots.R: R script to draw plots from the output files;
- readReport-{CIEP+,treebanks}.R: R scripts to extract frequencies and compute entropy from the report files (the report files are not included);
- ud-wordorder.py: Python script to extract word order pairs from CoNLL-U files and write them to report files.

Unfortunately, I cannot include the report files, as CIEP+ is protected by copyright; the analysis can, however, be replicated on the UD Treebanks.
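For readers who want a feel for the pipeline before running the scripts on the UD Treebanks, the following is a minimal sketch of the general idea: count how often a dependent precedes or follows its head in CoNLL-U data, then compute the entropy of that distribution. It is not the code in ud-wordorder.py or the readReport scripts. The function names `order_counts` and `entropy`, the choice of the `det` relation, the use of Shannon entropy in bits, and the toy two-sentence sample are all assumptions made for illustration.

```python
import math
from collections import Counter

def order_counts(conllu_text, deprel="det"):
    """Count dependent-before-head vs. head-before-dependent for one relation.

    Hypothetical helper: it reads the ID (column 1), HEAD (column 7) and
    DEPREL (column 8) fields of each 10-column CoNLL-U token line.
    """
    counts = Counter()
    for line in conllu_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        tok_id, head, rel = cols[0], cols[6], cols[7]
        # Skip multiword token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if not tok_id.isdigit() or not head.isdigit():
            continue
        # Compare on the universal relation, ignoring subtypes such as det:poss.
        if rel.split(":")[0] != deprel or int(head) == 0:
            continue
        order = "dep-head" if int(tok_id) < int(head) else "head-dep"
        counts[order] += 1
    return counts

def entropy(counts):
    """Shannon entropy (in bits) of a frequency table; 0.0 for an empty table."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(-(n / total) * math.log2(n / total) for n in counts.values() if n > 0)

def to_conllu(sentences):
    """Build CoNLL-U text from (id, form, head, deprel) tuples; other fields are '_'."""
    blocks = []
    for rows in sentences:
        blocks.append("\n".join(
            "\t".join([i, form, "_", "_", "_", "_", head, rel, "_", "_"])
            for i, form, head, rel in rows
        ))
    return "\n\n".join(blocks) + "\n"

# Toy data (an assumption, not taken from the dataset): one prenominal
# and one postnominal determiner.
sample = to_conllu([
    [("1", "the", "2", "det"), ("2", "dog", "0", "root")],
    [("1", "casa", "0", "root"), ("2", "aceasta", "1", "det")],
])

counts = order_counts(sample)
print(dict(counts), entropy(counts))
```

With one instance of each order, the two outcomes are equally likely, so the entropy is at its maximum of 1 bit; a language with a rigid determiner position would score close to 0.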

Files

Files (59.8 kB)

Checksum Size
md5:3490e726cb7ae8a7c542ccca25c186fd 7.4 kB
md5:072c99d74c87d44fba01c4b617c78fe1 7.1 kB
md5:21a0d14aa0deccce8c71eae61276e2ec 4.4 kB
md5:558f3a82a8d7db6a8fb2abb2d8ea2590 745 Bytes
md5:403d8311e213527f72cf5110528dd1ca 17.2 kB
md5:4a47852787b422a40ad876cff96dd2db 18.0 kB
md5:a4926c54ca22506b318d7ab9a0c94308 5.0 kB
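
The MD5 checksums listed here can be used to confirm that a downloaded file is intact. The following is a minimal sketch using only the Python standard library; the helper names `md5_of` and `matches_checksum` are hypothetical, and the file name in the usage comment is a placeholder, since the listing does not say which checksum belongs to which file.

```python
import hashlib

def md5_of(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_checksum(path, expected):
    """Compare a file against a checksum written as 'md5:<hex>' or plain hex."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of(path) == expected_hex

# Usage (placeholder file name; pair each file with its own checksum):
# matches_checksum("downloaded-file.csv", "md5:<checksum from the listing>")
```

MD5 is adequate for detecting accidental corruption during download; it is not a safeguard against deliberate tampering.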