Working paper Open Access

Evaluating the Impact of Bilingual Lexical Resources on Cross-lingual Sentiment Projection in the Pharmaceutical Domain

Hartung, Matthias; Orlikowski, Matthias; Veríssimo, Susana

Rolling out text analytics applications, or individual components thereof, to multiple input languages requires scalable workflows and architectures that do not rely on manual annotation or language-specific re-engineering for each target language. These scalability challenges become even more severe when specialized technical domains are targeted in multiple languages. Recent work has shown that cross-lingual projection of sentiment models in deep learning frameworks based on bilingual sentiment embeddings (BLSE) is feasible without any annotated data in the target language, relying only on monolingual embeddings and a bilingual translation dictionary (Barnes et al., 2018). We apply this framework to multilingual text analytics problems in the pharmaceutical domain in order to (i) investigate under which conditions the BLSE approach scales to technical domains, and (ii) assess the impact of different configurations of the underlying lexical resources. For the language pair English/Spanish, our findings corroborate the strength of cross-lingual projection approaches such as BLSE in technical scenarios, provided that bilingual resources are available that offer both broad lexical coverage and complementary domain- and task-specific knowledge.
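
As an illustration of what "broad lexical coverage" of a bilingual resource could mean in practice, the following minimal sketch measures which share of a source-language vocabulary is reachable through dictionary pairs whose two sides both have monolingual embeddings. The function name `dictionary_coverage`, the toy dictionaries and the toy vocabularies are assumptions made for this example; the abstract does not state how coverage is measured in the paper.

```python
def dictionary_coverage(dictionary, source_vocab, target_vocab):
    """Return usable translation pairs and the share of source_vocab they cover.

    A pair is usable only if both words have a monolingual embedding,
    i.e. appear in the respective vocabulary (hypothetical criterion).
    """
    usable = [(s, t) for s, t in dictionary
              if s in source_vocab and t in target_vocab]
    covered = {s for s, _ in usable}
    coverage = len(covered) / len(source_vocab) if source_vocab else 0.0
    return usable, coverage


# Toy English/Spanish data, invented for illustration only.
general_dict = [("good", "bueno"), ("bad", "malo"), ("pain", "dolor")]
domain_dict = [("nausea", "náusea"), ("dose", "dosis")]

en_vocab = {"good", "bad", "pain", "nausea", "dose", "rash"}
es_vocab = {"bueno", "malo", "dolor", "náusea", "dosis"}

_, cov_general = dictionary_coverage(general_dict, en_vocab, es_vocab)
_, cov_combined = dictionary_coverage(general_dict + domain_dict,
                                      en_vocab, es_vocab)

print(f"general dictionary only: {cov_general:.2f}")
print(f"general + domain dictionary: {cov_combined:.2f}")
```

On this toy data, adding the domain-specific pairs raises coverage of the source vocabulary, which is the kind of effect a comparison of lexical resource configurations would need to quantify.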
