Published December 15, 2022 | Version v1
Workflow Open

Supplement to the manuscript "Sourcing Data from Wikipedia for the Study of Language Contact: the csbwiki"

  • 1. Institute of Slavic Studies, Polish Academy of Sciences

Description

Data and workflow supplement to the paper "Sourcing Data from Wikipedia for the Study of Language Contact: the csbwiki"

Author: Robert Borges

Affiliation: Institute of Slavic Studies, Polish Academy of Sciences

Funding acknowledgment: This work results from research conducted under the project New Speakers of Minority Languages: Proficiency, Variation, and Change (2021–2023), hosted at the Institute of Slavic Studies, Polish Academy of Sciences, and funded by the Polish National Science Centre (NCN) under the “POLS” instrument financed by Norway Grants. Contract no. 2020/37/K/HS2/02779.

Included Files:

  • babel_by_language.json
  • babel_by_user.json
  • ck-filtered-current.py
  • count-filtered.py
  • count-moe.py
  • count_oe-a_variants.py
  • count-pages.py
  • count-words.py
  • csbwiki-20220501-pages-meta-history.xml
  • csbwiki-20220520-babel.sql
  • csbwiki-latest-pages-articles-multistream.xml
  • file-list.txt
  • jsonify-babel.py
  • ma-dump_filtered-1.json
  • ma-dump_filtered-2.json
  • ma-dump_filtered-3.json
  • ma-dump_filtered-4.json
  • ma-dump_he-has.json
  • ma_non-NEG_variants-dump.json
  • ma_variants-dump.json
  • mk_oe-a_variants.py
  • moe-ma_variants-dump.json
  • moe_variants-dump.json
  • ni-moe.txt
  • oe-a_counts.json
  • oe-a_variants.json
  • oe-words.txt
  • raw_oe-a_counts.json
  • README.md
  • README.pdf
  • re-dump-variants.py
  • re-filter-dump.py

Files

csbwiki-manuscript-supplement.zip (37.2 MB)
md5:d2f67f80976a6fc444dae6dabde9dda7
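
The MD5 checksum listed for csbwiki-manuscript-supplement.zip can be used to confirm that a download is intact. The following is a minimal sketch in Python, not part of the supplement itself; the function names `md5_of_file` and `matches_checksum` are hypothetical, and the script assumes the archive has been saved under its original name in the current working directory.

```python
import hashlib
from pathlib import Path

# Checksum as listed on this record for csbwiki-manuscript-supplement.zip.
EXPECTED_MD5 = "d2f67f80976a6fc444dae6dabde9dda7"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so the whole archive never sits in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected=EXPECTED_MD5):
    """Compare a file's MD5 digest with the expected hex string."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local file name; adjust if the archive was saved elsewhere.
    archive = Path("csbwiki-manuscript-supplement.zip")
    if archive.exists():
        print("checksum OK" if matches_checksum(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually points to an incomplete or corrupted download, in which case fetching the archive again is the simplest remedy.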