Published October 5, 2023 | Version 1.0
Dataset Open

SoftwaresOccitanTranslations corpus

  • 1. Lo Congrès

Description

The SoftwaresOccitanTranslations corpus was compiled and aligned from the translations of many open source software projects. This work was carried out by Lo Congrès permanent de la lenga occitana (https://locongres.org) as part of its "Còrpus" project (http://abrac.at/corpusproject).

There are three corpora:

  • A bilingual corpus: files with Occitan sentences aligned with their translations in another language.
  • A bivariety corpus: files with Occitan sentences in one variety aligned with their transcription in another Occitan variety.
  • A monolingual corpus: files with Occitan sentences in one variety.

Thanks to all the people who contributed to the translations we used to build this corpus.

Files

bilingual_files.zip

Files (238.6 MB)

Checksum                                Size
md5:313ce93ee1e18ef914c76fb171628ee9    237.0 MB
md5:ab50bb7befde7046be0adfcca417f6b3    620.8 kB
md5:54df1e94a46354bf2efed80f23bbc30a    11.0 kB
md5:a5b2585de5ee43c10799f282d9db03da    979.2 kB
md5:37f1b9092f6759d3d4ec052a425885cd    1.5 kB
md5:112053087dfb779b3cd27e29812a0ea9    2.2 kB
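
The md5 values in the file listing can be used to check that a download arrived intact. The following is a minimal sketch using only Python's standard library, not a tool provided with the corpus. The helper names `md5_of_file` and `matches_listing` and the path `downloaded_file.zip` are hypothetical, and the pairing of a given checksum with a given file is an assumption to confirm on the record page.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a file against a checksum written as 'md5:<hex>' or plain hex."""
    expected = listed_checksum.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Hypothetical local path: replace with the file that matches the checksum.
    local_copy = Path("downloaded_file.zip")
    if local_copy.exists():
        ok = matches_listing(local_copy, "md5:313ce93ee1e18ef914c76fb171628ee9")
        print("checksum matches" if ok else "checksum mismatch")
```

Reading in chunks keeps memory use flat for the largest archive. MD5 is suitable here only as an integrity check against accidental corruption, not as a security guarantee.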