Published July 8, 2022 | Version v1
Conference paper
Open
Building of Parallel and Comparable Cybersecurity Corpora for Bilingual Terminology Extraction
Authors/Creators
- 1. Vytautas Magnus University
- 2. Mykolas Romeris University
Description
This paper presents English-Lithuanian corpora for bilingual term extraction (BiTE) in the cybersecurity domain, built within the framework of the DVITAS project. It argues that a system of parallel, comparable, and training corpora for BiTE is particularly useful for less-resourced languages, as it makes it possible to efficiently combine the strengths and avoid the weaknesses of comparable and parallel resources. Special attention is given to the availability of sources in the cybersecurity domain and to issues related to copyright-protected publications, as well as to the data curation performed when building the corpora and depositing them in the CLARIN-LT repository.
Files
| Name | Size | Checksum |
|---|---|---|
| CLARIN2021_published article.pdf | 566.3 kB | md5:7530794a65b73ec1537dc5b9b25036dc |