Published September 30, 2020 | Version v1
Conference paper Open

Evaluation of related news recommendations using document similarity methods

  • 1. TRIKODER DOO, Zagreb
  • 2. Jožef Stefan Institute
  • 3. University of Ljubljana, Ljubljana, Slovenia

Description

A set of related articles is a useful addition to a newly published news story. Such articles supply more context and background information and provide a richer experience for the reader. Currently, related articles are often found manually by the journalists writing the story. The process can be automated by suggesting relevant articles based on their similarity to the new article. We compare several link recommendation methods on the news archive of the popular Croatian website 24sata. Our results show that tf-idf weighting applied to a bag-of-words document representation matches the links manually selected by journalists better than more sophisticated approaches, such as latent semantic indexing, doc2vec, and the multilingual contextual embeddings BERT and XLM-R.
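As an illustration of the tf-idf approach named above, the following is a minimal sketch of ranking archived articles by cosine similarity to a new article. It is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, and the function names (`tokenize`, `tfidf_vectors`, `cosine`, `recommend`) are all assumptions chosen to keep the example self-contained.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real system would
    # likely add stop-word removal and lemmatization for Croatian.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (dict term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    # Assumption: smoothed IDF, so every weight stays positive.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: f * idf[t] for t, f in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(archive, new_article, k=3):
    """Return indices of the k archive articles most similar to new_article."""
    vectors = tfidf_vectors(archive + [new_article])
    query = vectors[-1]
    scored = [(cosine(query, v), i) for i, v in enumerate(vectors[:-1])]
    # Sort by descending similarity, breaking ties by archive position.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:k]]


# Hypothetical toy archive, for illustration only.
archive = [
    "zagreb football club wins the derby",
    "government announces new tax reform",
    "football derby in zagreb ends in a draw",
]
print(recommend(archive, "zagreb football derby preview", k=2))
```

In this sketch the recommended links would then be compared with the links a journalist chose by hand, which is the kind of evaluation the description refers to; the comparison metric itself is not specified here.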

Files

Pranjic2020.pdf (825.6 kB)
md5:6f006a9a69e5b993dba90de2158878e2

Additional details

Funding

European Commission: EMBEDDIA – Cross-Lingual Embeddings for Less-Represented Languages in European News Media (grant 825153)