THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES

Martina Toshevska, Frosina Stojanovska and Jovan Kalajdjieski

doi:10.5281/zenodo.3943770

Published July 14, 2020 | Version v1

Journal article Open

Distributed language representation has become the most widely used technique for representing language in natural language processing tasks. Most deep learning models for natural language processing use pre-trained distributed word representations, commonly called word embeddings. Determining the highest-quality word embeddings is of crucial importance for such models. However, selecting appropriate word embeddings is a perplexing task, since the projected embedding space is not intuitive to humans. In this paper, we explore different approaches for creating distributed word representations. We perform an intrinsic evaluation of several state-of-the-art word embedding methods. Their performance in capturing word similarities is analysed using existing benchmark datasets of word-pair similarities. We conduct a correlation analysis between ground-truth word similarities and the similarities obtained by the different word embedding methods.
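
The correlation analysis described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure or correlation coefficient the paper uses, so the choice of cosine similarity and Spearman rank correlation here is an assumption, although both are common in intrinsic evaluation of word embeddings. The three-dimensional vectors, the word pairs, and the human scores are invented toy values, not data from the paper or from any benchmark dataset.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy embeddings (illustrative values only).
embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.7, 0.3, 0.1],
    "car": [0.1, 0.9, 0.2],
    "truck": [0.0, 0.8, 0.3],
}

# Hypothetical benchmark: (word1, word2, human similarity score).
benchmark = [
    ("cat", "dog", 8.5),
    ("car", "truck", 8.0),
    ("cat", "car", 2.0),
    ("dog", "truck", 1.5),
]

human_scores = [score for _, _, score in benchmark]
model_scores = [
    cosine_similarity(embeddings[w1], embeddings[w2]) for w1, w2, _ in benchmark
]

rho = spearman(human_scores, model_scores)
print(f"Spearman correlation: {rho:.3f}")
```

A coefficient close to 1 would indicate that the embedding method orders word pairs much as human annotators do. Comparing this value across embedding methods on the same benchmark is one way to carry out the kind of intrinsic evaluation the abstract describes.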
