There is a newer version of the record available.

Published May 26, 2026 | Version 1
Preprint Open

Barycentric Simplicial Hashing for Approximate Nearest Neighbor Search: A Four-State Topological Hash Competitive with Industry-Standard Product Quantization at Half the Memory

Description

We present Barycentric Simplicial Hashing (BSH), a data-dependent binary indexing method for approximate nearest neighbor (ANN) search that uses the local triangulation of a vector space as a discrete coordinate system. Given a set of database vectors partitioned into Voronoi cells, we construct a k-NN simplicial complex within each cell and assign each vector a compact binary code by evaluating its barycentric zone relative to every triangle in the complex.
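
The Voronoi partitioning step mentioned above can be illustrated with a minimal sketch. It assumes the cells are defined by a set of centroids under squared Euclidean distance; the function name `voronoi_cell` and the flat `std::vector<float>` layout are hypothetical and not taken from the preprint.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: returns the index of the centroid nearest to x,
// i.e. the Voronoi cell that x falls into (squared Euclidean distance assumed).
inline std::size_t voronoi_cell(const std::vector<float>& x,
                                const std::vector<std::vector<float>>& centroids) {
    assert(!centroids.empty());
    std::size_t best = 0;
    float best_d = std::numeric_limits<float>::max();
    for (std::size_t c = 0; c < centroids.size(); ++c) {
        assert(centroids[c].size() == x.size());
        float d = 0.0f;
        for (std::size_t i = 0; i < x.size(); ++i) {
            const float diff = x[i] - centroids[c][i];
            d += diff * diff;
        }
        // Strict comparison: on ties the lowest index wins (an assumption).
        if (d < best_d) {
            best_d = d;
            best = c;
        }
    }
    return best;
}
```

The k-NN simplicial complex would then be built only among vectors that share a cell index; that construction is not sketched here because the record does not describe it in detail.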

The key contribution is a four-state quantization per triangle: a vector is assigned to the zone of the nearest vertex (states 0, 1, 2) or to the barycentre zone (state 3) when the triangle centroid is the closest reference point. Empirical measurement confirms that the barycentre state is activated for over 51% of all triangle-vector assignments in 24-dimensional subspaces, making it the primary discriminator rather than a rare case.
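
As a rough illustration of the four-state rule described above, the following sketch compares a vector against the three vertices of one triangle and against their centroid, then returns the state of the closest reference point. The function names, the use of squared Euclidean distance, and the tie-breaking order are assumptions made for the example, not details confirmed by the preprint.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Squared Euclidean distance between two vectors of equal dimension.
inline float squared_distance(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Hypothetical helper: returns 0, 1, or 2 when that vertex is the nearest
// reference point, or 3 when the triangle centroid is strictly nearer.
// Ties resolve toward the lower state (an assumption).
inline std::uint8_t four_state_zone(const std::vector<float>& x,
                                    const std::array<std::vector<float>, 3>& tri) {
    const std::size_t dim = x.size();
    std::vector<float> centroid(dim, 0.0f);
    for (const auto& v : tri) {
        assert(v.size() == dim);
        for (std::size_t i = 0; i < dim; ++i) centroid[i] += v[i] / 3.0f;
    }
    std::uint8_t best = 0;
    float best_d = squared_distance(x, tri[0]);
    for (std::uint8_t s = 1; s < 3; ++s) {
        const float d = squared_distance(x, tri[s]);
        if (d < best_d) {
            best_d = d;
            best = s;
        }
    }
    if (squared_distance(x, centroid) < best_d) best = 3;
    return best;
}
```

For a triangle with vertices (0, 0), (4, 0), and (0, 4), a point close to (4/3, 4/3) lands in the barycentre zone, while a point close to any corner takes that corner's state.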

On out-of-sample queries drawn from an independent distribution (separate random seed, never seen during index construction), the four-state hash achieves 84-90% Recall@1 within the top 10% of candidates in a Voronoi cell, using only 34-38 bytes per vector. Industry-standard FAISS IVF-PQ achieves comparable recall (85-90%) at 64 bytes per vector, so BSH delivers competitive recall at roughly half the memory footprint (about 53-59% of the IVF-PQ code size). The methods are hardware-agnostic by design; the reported results were validated empirically on an ARM Cortex-X3 using NEON intrinsics.
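
To make the memory figures concrete, here is a minimal sketch of one possible code layout. It assumes each triangle state is stored in 2 bits (four states per byte) and that candidates are compared by counting triangles with identical states. Both are assumptions for illustration; the names `pack_states` and `matching_states` are hypothetical, and the preprint may use a different encoding or similarity measure. Under this assumption, a 34-byte code would hold 136 triangle states and a 38-byte code 152.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layout: one 2-bit state (0..3) per triangle, four states per byte,
// with triangle t stored at bit offset 2 * (t % 4) of byte t / 4.
inline std::vector<std::uint8_t> pack_states(const std::vector<std::uint8_t>& states) {
    std::vector<std::uint8_t> code((states.size() + 3) / 4, 0);
    for (std::size_t t = 0; t < states.size(); ++t) {
        assert(states[t] < 4);
        code[t / 4] |= static_cast<std::uint8_t>(states[t] << (2 * (t % 4)));
    }
    return code;
}

// Counts the triangles on which two packed codes hold the same state.
// A higher count is treated as "more similar" (an assumed ranking criterion).
inline std::size_t matching_states(const std::vector<std::uint8_t>& a,
                                   const std::vector<std::uint8_t>& b,
                                   std::size_t num_triangles) {
    assert(a.size() == b.size());
    assert(num_triangles <= a.size() * 4);
    std::size_t matches = 0;
    for (std::size_t t = 0; t < num_triangles; ++t) {
        const unsigned sa = (a[t / 4] >> (2 * (t % 4))) & 3u;
        const unsigned sb = (b[t / 4] >> (2 * (t % 4))) & 3u;
        if (sa == sb) ++matches;
    }
    return matches;
}
```

A scalar loop is used here for clarity; a NEON implementation would presumably process many states per instruction, but that is outside the scope of this sketch.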

Files

pirolo2026_simplicial_hashing_v1.pdf

Files (304.8 kB)

Name Size
md5:96742944ec8c96b5b927a1f503cbfb26 151.9 kB
md5:7e7ef9d4590fce33c83a4981d74cb463 152.9 kB

Additional details

Software

Repository URL
https://zenodo.org/records/20389817
Programming language
C++
Development Status
Active