Hartung, Matthias
Orlikowski, Matthias
Veríssimo, Susana
2020-03-12
<p>Rolling out text analytics applications, or individual components thereof, to multiple input languages of interest requires scalable workflows and architectures that do not rely on manual annotation efforts or language-specific re-engineering per target language. These scalability challenges are aggravated further when specialized technical domains are targeted in multiple languages. Recent work has shown that cross-lingual projection of sentiment models in deep learning frameworks based on bilingual sentiment embeddings (BLSE) is feasible without any annotated data in the target language, relying only on monolingual embeddings and a bilingual translation dictionary (Barnes et al., 2018). We apply their framework to multilingual text analytics problems in the pharmaceutical domain in order to (i) investigate under which conditions the BLSE approach also scales to technical domains, and (ii) assess the impact of different configurations of the underlying lexical resources. For the language pair English/Spanish, our findings corroborate the strength of cross-lingual projection approaches such as BLSE in technical scenarios, provided that bilingual resources are available that offer broad lexical coverage, on the one hand, and complementary domain- and task-specific knowledge, on the other.</p>
https://doi.org/10.5281/zenodo.3707940
oai:zenodo.org:3707940
Zenodo
https://zenodo.org/communities/pret-a-llod
https://doi.org/10.5281/zenodo.3707939
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
Multilingual Text Analytics
Cross-lingual Projection
Linguistic Linked Open Data
Sentiment Analysis
Pharmaceutical Domain
Evaluating the Impact of Bilingual Lexical Resources on Cross-lingual Sentiment Projection in the Pharmaceutical Domain
info:eu-repo/semantics/workingPaper