EMBEDDIA at SemEval-2022 Task 8: Investigating Sentence, Image, and Knowledge Graph Representations for Multilingual News Article Similarity

Elaine Zosa; Emanuela Boros; Boshko Koloski; Lidia Pivovarova

doi:10.5281/zenodo.6369944

Published March 19, 2022 | Version v1

Conference paper Open

In this paper, we present the participation of the EMBEDDIA team in SemEval 2022 Task 8 (Multilingual News Article Similarity). We cover several techniques and propose different methods for estimating multilingual news article similarity by exploring the dataset in its entirety. We take advantage of the textual content of the articles, the provided metadata (e.g., titles, keywords, topics), the translated articles, the images (where available), and knowledge graph-based representations of the entities and relations present in the articles. We then compute the semantic similarity between the different features and predict the similarity scores through regression. Our findings show that, while the other methods we explored obtained promising results, exploiting semantic textual similarity with sentence representations performed best. Finally, in the official SemEval 2022 Task 8 evaluation, we ranked fifth in the overall team ranking for the cross-lingual results and second for the English-only results.
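
The pipeline described in the abstract (one similarity per feature, then a regressor over those similarities) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature names in `FEATURE_KEYS`, the choice of cosine similarity, the plain gradient-descent linear regressor, the zero fallback for a missing image, and all toy vectors and scores are assumptions made only for illustration.

```python
import math

# Hypothetical feature names; the paper's actual feature set may differ.
FEATURE_KEYS = ("text", "title", "image")


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pair_features(article_a, article_b):
    """One similarity per representation; 0.0 when a representation is missing.

    The 0.0 fallback (e.g. for articles without an image) is an assumption.
    """
    feats = []
    for key in FEATURE_KEYS:
        if key in article_a and key in article_b:
            feats.append(cosine_similarity(article_a[key], article_b[key]))
        else:
            feats.append(0.0)
    return feats


def predict(weights, bias, x):
    """Linear model: bias + w . x."""
    return bias + sum(w * xi for w, xi in zip(weights, x))


def fit_linear_regression(X, y, lr=0.1, epochs=2000):
    """Least-squares linear regression fitted with batch gradient descent."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict(weights, bias, xi) - yi
            for j, xij in enumerate(xi):
                grad_w[j] += err * xij
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Toy article pair with made-up embedding vectors; b has no image.
a = {"text": [1.0, 0.0], "title": [0.5, 0.5], "image": [0.2, 0.8]}
b = {"text": [1.0, 0.0], "title": [0.5, 0.5]}
features = pair_features(a, b)
```

In practice the vectors would come from sentence, image, or knowledge graph encoders, and the regressor would be trained on the task's annotated similarity scores; both are outside the scope of this sketch.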

Files

Name	Size	MD5
SemEval_2022___28_February_2022___5_pages___EMBEDDIA_at_SemEval_2022_Task_8__Investigating_Sentence__Image__and_Knowledge_Graph_Representations.pdf	2.3 MB	152af77ead089f38cf624d3830cafb3d

Additional details

Funding

European Commission: NewsEye - A Digital Investigator for Historical Newspapers (grant 770299)
European Commission: EMBEDDIA - Cross-Lingual Embeddings for Less-Represented Languages in European News Media (grant 825153)
