Published April 20, 2020 | Version v1
Conference paper Open

Capturing Evolution in Word Usage: Just Add More Clusters?

  • 1. Jozef Stefan Institute
  • 2. LIMSI - CNRS
  • 3. University of Helsinki

Description

The way words are used evolves over time, mirroring the cultural or technological evolution of society. Semantic change detection is the task of detecting and analysing word evolution in textual data, even over short periods of time. In this paper, we focus on a new set of methods relying on contextualised embeddings, a type of semantic modelling that has recently revolutionised the NLP field. We leverage the ability of the transformer-based BERT model to generate contextualised embeddings capable of detecting semantic change of words across time. Several approaches are compared in a common setting in order to establish the strengths and weaknesses of each. We also propose several improvements, which drastically improve the performance of existing approaches.
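The description does not spell out a concrete pipeline, so the following is only a minimal sketch of one clustering-based way to score semantic change from contextualised embeddings, not necessarily the procedure used in the paper. It assumes each occurrence of a target word has already been turned into a vector (for example by BERT); here those vectors are replaced by synthetic Gaussian "senses" so the example runs with NumPy alone. The function names (`kmeans`, `cluster_distribution`, `jensen_shannon_divergence`, `semantic_change_score`), the farthest-first initialisation, the choice of k = 2, and the use of Jensen-Shannon divergence are all illustrative assumptions.

```python
import numpy as np


def kmeans(points, k, n_iter=20):
    """Lloyd's k-means with a deterministic farthest-first initialisation."""
    centroids = [points[0]]
    for _ in range(1, k):
        # Distance from every point to its nearest already-chosen centroid.
        d = np.min(
            [np.linalg.norm(points - c, axis=1) for c in centroids], axis=0
        )
        centroids.append(points[d.argmax()])
    centroids = np.array(centroids, dtype=float)

    for _ in range(n_iter):
        # Assign each point to its closest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its members (skip empty clusters).
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return labels


def cluster_distribution(labels, k):
    """Share of a period's usages that falls into each cluster."""
    counts = np.bincount(labels, minlength=k).astype(float)
    return counts / counts.sum()


def jensen_shannon_divergence(p, q):
    """JSD with base-2 logarithms, so the value lies in [0, 1]."""
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def semantic_change_score(emb_t1, emb_t2, k=2):
    """Cluster usages from both periods jointly, then compare the
    per-period cluster distributions."""
    labels = kmeans(np.vstack([emb_t1, emb_t2]), k)
    p = cluster_distribution(labels[: len(emb_t1)], k)
    q = cluster_distribution(labels[len(emb_t1):], k)
    return jensen_shannon_divergence(p, q)


# Synthetic stand-ins for contextualised embeddings (ASSUMPTION: real
# vectors would come from a model such as BERT, one per word occurrence).
rng = np.random.default_rng(0)
dim = 8


def sense_a(n):
    return rng.normal(0.0, 0.5, size=(n, dim))


def sense_b(n):
    return rng.normal(5.0, 0.5, size=(n, dim))


# A "stable" word uses both senses equally in both periods.
stable_score = semantic_change_score(
    np.vstack([sense_a(50), sense_b(50)]),
    np.vstack([sense_a(50), sense_b(50)]),
)

# A "shifted" word moves from sense A only to mostly sense B.
shifted_score = semantic_change_score(
    sense_a(100),
    np.vstack([sense_a(20), sense_b(80)]),
)

print(f"stable:  {stable_score:.3f}")
print(f"shifted: {shifted_score:.3f}")
```

In this toy setting the shifted word receives a clearly higher score than the stable one. With real data, the number of clusters and the clustering algorithm are design choices that would need to be tuned and compared.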

Files

WWW_2020_Capturing Evolution in Word Usage.pdf (576.2 kB)
md5:8270379254df254538390fc1729b81d2

Additional details

Funding

European Commission
NewsEye: A Digital Investigator for Historical Newspapers (grant 770299)