Query reformulation based on word embeddings: A comparative study

Panos Panagiotou; George Kalpakis; Theodora Tsikrika; Stefanos Vrochidis; Ioannis Kompatsiaris

doi:10.5281/zenodo.3947769

Published April 30, 2020 | Version 1.0

Book chapter | Open access

1. Information Technologies Institute, Centre for Research and Technology Hellas

Formulating effective queries for retrieving domain-specific content from the Web and social media is very important for practitioners in several fields, including law enforcement analysts involved in terrorism-related investigations. Query reformulation aims to transform the original query so as to increase search effectiveness by addressing the vocabulary mismatch problem. This work presents a study comparing the performance of global versus local word embedding models when applied to query expansion. Two query expansion methods (CombSum and Centroid) are employed for identifying the terms most similar to each query term, based on GloVe pre-trained global embeddings and on local models trained on four large-scale benchmark datasets and one terrorism-related dataset. We assessed the performance of the global and local models on the benchmark datasets using commonly used evaluation metrics, and performed a qualitative evaluation of the respective models on the terrorism-related dataset. Our findings indicate that the local models yield promising results on all datasets.
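
As a rough illustration of the two expansion strategies named in the abstract, the sketch below uses common textbook formulations, which are not necessarily the exact definitions used in the chapter. Centroid averages the query term vectors and ranks candidate terms by cosine similarity to that average. CombSum sums each candidate's cosine similarities to the individual query terms. The toy 3-dimensional vectors and the function names (cosine, expand_centroid, expand_combsum) are hypothetical and chosen only for this example.

```python
import math

# Toy embeddings invented for illustration; a real setup would load
# pre-trained (global) or corpus-trained (local) vectors instead.
EMBEDDINGS = {
    "attack":  [1.0, 0.0, 0.0],
    "bomb":    [0.9, 0.1, 0.0],
    "assault": [0.8, 0.2, 0.1],
    "plot":    [0.1, 1.0, 0.0],
    "scheme":  [0.2, 0.9, 0.1],
    "banana":  [0.0, 0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def _split(query_terms, embeddings):
    """Return (known query terms, candidate expansion terms)."""
    known = [t for t in query_terms if t in embeddings]
    candidates = [w for w in embeddings if w not in known]
    return known, candidates


def expand_centroid(query_terms, embeddings, k=2):
    """Rank candidates by similarity to the mean of the query vectors."""
    known, candidates = _split(query_terms, embeddings)
    if not known:
        return []
    dim = len(embeddings[known[0]])
    centroid = [sum(embeddings[t][i] for t in known) / len(known)
                for i in range(dim)]
    ranked = sorted(candidates,
                    key=lambda w: cosine(embeddings[w], centroid),
                    reverse=True)
    return ranked[:k]


def expand_combsum(query_terms, embeddings, k=2):
    """Rank candidates by the sum of their similarities to each query term."""
    known, candidates = _split(query_terms, embeddings)
    if not known:
        return []
    scores = {w: sum(cosine(embeddings[w], embeddings[t]) for t in known)
              for w in candidates}
    ranked = sorted(candidates, key=lambda w: scores[w], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    query = ["attack", "plot"]
    print("Centroid:", expand_centroid(query, EMBEDDINGS, k=3))
    print("CombSum: ", expand_combsum(query, EMBEDDINGS, k=3))
```

In this formulation, switching between a global and a local model only changes which vectors are loaded into the embeddings dictionary. The ranking logic stays the same.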

Files

Name	Size
Query reformulation based on word embeddings.pdf (md5:d1a1b274a55c547cc112a79775f08162)	563.0 kB

Additional details

Funding

European Commission: TENSOR - Retrieval and Analysis of Heterogeneous Online Content for Terrorist Activity Recognition (700024)
European Commission: CONNEXIONs - InterCONnected NEXt-Generation Immersive IoT Platform of Crime and Terrorism DetectiON, PredictiON, InvestigatiON, and PreventiON Services (786731)
