Published June 6, 2017 | Version v1
Conference paper Open

Balanced Search Space Partitioning for Distributed Media Redundant Indexing

  • 1. Universidade NOVA de Lisboa

Description

This paper addresses the problem of balanced, redundant indexing of media information. Our goal is to partition and distribute the search index so as to exploit the properties of distributed systems: balanced load across nodes, redundancy when a node goes down, and efficient node usage under concurrent querying. We follow an information compression approach and propose to represent data with overcomplete codebooks, where each document is represented by only a few codewords and each indexing node is responsible for several codewords. Quantization algorithms are designed to fit the original data as closely as possible, which biases them towards codewords that fit the principal directions of the data. In this paper, we propose the balanced KSVD (B-KSVD) algorithm, which distributes the allocation of data across a balanced number of codewords, according to the global distribution of the data. Indexing experiments showed that B-KSVD can achieve 38% 1-recall by inspecting only 1% of the full index, distributed over 10 partitions. Traditional methods based on k-means need either larger codebooks or inspection of a larger portion of the index to achieve the same retrieval performance.
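The indexing layout described above (a few codewords per document, several codewords per node) can be illustrated with a minimal sketch. It is not the B-KSVD algorithm: the codebook here is random rather than learned, codewords are selected by largest absolute inner product, and codewords are assigned to nodes round-robin. All function names and parameters (`top_codewords`, `build_index`, `search`, `node_loads`, `s`, `n_nodes`) are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def top_codewords(vec, codebook, s):
    """Indices of the s codewords with the largest |inner product| with vec."""
    scores = [abs(sum(v * c for v, c in zip(vec, cw))) for cw in codebook]
    return sorted(range(len(codebook)), key=lambda k: -scores[k])[:s]


def build_index(docs, codebook, s, n_nodes):
    """One posting list per codeword, stored on the node that owns the codeword."""
    nodes = [defaultdict(list) for _ in range(n_nodes)]
    for doc_id, vec in enumerate(docs):
        for k in top_codewords(vec, codebook, s):
            # Assumption: round-robin ownership, codeword k lives on node k % n_nodes.
            nodes[k % n_nodes][k].append(doc_id)
    return nodes


def search(query, docs, codebook, nodes, s):
    """Inspect only the posting lists of the query's codewords, then rank exactly."""
    candidates = set()
    for k in top_codewords(query, codebook, s):
        candidates.update(nodes[k % len(nodes)].get(k, []))

    def sq_dist(i):
        return sum((a - b) ** 2 for a, b in zip(query, docs[i]))

    return sorted(candidates, key=sq_dist), len(candidates)


def node_loads(nodes):
    """Number of postings held by each node, a rough measure of load balance."""
    return [sum(len(postings) for postings in node.values()) for node in nodes]


# Toy data: random vectors and a random (not learned) overcomplete codebook.
rng = random.Random(0)
dim, n_docs, n_codewords = 8, 200, 40
docs = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_docs)]
codebook = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_codewords)]

nodes = build_index(docs, codebook, s=3, n_nodes=10)
ranked, inspected = search(docs[0], docs, codebook, nodes, s=3)
print("best match:", ranked[0], "| inspected:", inspected, "of", n_docs)
print("postings per node:", node_loads(nodes))
```

In this toy setting, the spread of `node_loads(nodes)` shows how unevenly postings can fall across nodes when the codebook ignores the data distribution, which is the imbalance that a balanced codebook is meant to reduce.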

Files

00-sig-alternate.pdf

Files (2.1 MB)

md5:a697b2580c1b1e367382819cdb0b189a

Additional details

Funding

European Commission
COGNITUS - Converging broadcast and user generated content for interactive ultra-high definition services (grant 687605)