Published September 28, 2022 | Version v1
Journal article Open

Approximate Nearest Neighbor Search on Standard Search Engines

  • 1. ISTI CNR, Pisa, Italy

Description

Approximate search for high-dimensional vectors is commonly addressed using dedicated techniques often combined with hardware acceleration provided by GPUs, FPGAs, and other custom in-memory silicon. Despite their effectiveness, harmonizing those optimized solutions with other types of searches often poses technological difficulties. For example, to implement a combined text+image multimodal search, we are forced first to query the index of high-dimensional image descriptors and then filter the results based on the textual query or vice versa. This paper proposes a text surrogate technique to translate real-valued vectors into text and index them with a standard textual search engine such as Elasticsearch or Apache Lucene. This technique allows us to perform approximate kNN searches of high-dimensional vectors alongside classical full-text searches natively on a single textual search engine, enabling multimedia queries without sacrificing scalability. Our proposal exploits a combination of vector quantization and scalar quantization. We compared our approach to the existing literature in this field of research, demonstrating a significant improvement in performance through preliminary experimentation.
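As a rough illustration of the surrogate-text idea described above, the following minimal sketch encodes a vector as a string of repeated terms so that term frequencies stand in for component values. It covers only a simple scalar-quantization step, not the vector-quantization part of the proposal, and it is not the authors' implementation. The function names `surrogate_text` and `tf_dot`, the `scale` parameter, the `f<i>` term naming, and the assumption of non-negative components are all hypothetical choices made for this example.

```python
from collections import Counter


def surrogate_text(vec, scale=10, prefix="f"):
    """Encode a vector as a whitespace-separated string of terms.

    Assumption: components are non-negative. Component i is scalar-quantized
    to round(scale * v_i), and the term "<prefix><i>" is repeated that many
    times, so its term frequency approximates the component value.
    """
    terms = []
    for i, value in enumerate(vec):
        repetitions = round(scale * max(value, 0.0))
        terms.extend([f"{prefix}{i}"] * repetitions)
    return " ".join(terms)


def tf_dot(text_a, text_b):
    """Score two surrogate texts by the dot product of their term frequencies.

    This mimics, in a simplified way, how a term-frequency-based text scorer
    could approximate the inner product of the original vectors.
    """
    tf_a = Counter(text_a.split())
    tf_b = Counter(text_b.split())
    return sum(count * tf_b[term] for term, count in tf_a.items())


# Toy data: one query vector and two "indexed" vectors.
query = [0.9, 0.0, 0.3]
docs = {"near": [0.8, 0.1, 0.4], "far": [0.0, 0.9, 0.1]}

q_text = surrogate_text(query)
scores = {name: tf_dot(q_text, surrogate_text(vec)) for name, vec in docs.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In a real deployment the surrogate strings would be indexed as ordinary text fields, which is what would let a vector query be combined with a full-text clause in a single request; how closely the engine's scoring function tracks the inner product depends on its similarity configuration, which this sketch does not model.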

Files

2022_sisap_postprint.pdf (515.1 kB)
md5:fb88a71c3998c0497af0af01fffc12c7

Additional details

Funding

European Commission
AI4Media - A European Excellence Centre for Media, Society and Democracy (grant 951911)