Published February 27, 2026 | Version v1
Preprint Open

An Empirical Study of Sparse, Dense, and Hybrid Retrieval for StackOverflow Question Answering

Authors/Creators

Description

We present a systematic empirical evaluation of sparse (TF-IDF), dense (MiniLM embeddings), and hybrid retrieval methods on a 19,965-document StackOverflow question–answer corpus. Retrieval performance is measured with Recall@K and Mean Reciprocal Rank (MRR) over 1,000 benchmark queries. Dense retrieval achieves Recall@5 = 0.779 and MRR = 0.670, substantially outperforming the sparse baseline (Recall@5 = 0.394, MRR = 0.292). A weighted hybrid method (α = 0.8) raises Recall@5 slightly, from 0.779 to 0.789, and improves Recall@10 to 0.843, while marginally reducing ranking precision. These findings highlight a trade-off between recall coverage and ranking precision in domain-specific question answering systems.
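To make the metrics and the weighted hybrid concrete, the following is a minimal sketch rather than the authors' implementation. It assumes each query has exactly one relevant document, that α weights the dense score and (1 − α) the sparse score, and that scores are min-max normalized per query before fusion. The abstract states none of these details, and the function names (`recall_at_k`, `mean_reciprocal_rank`, `hybrid_rank`) are hypothetical.

```python
from typing import Dict, List, Sequence


def recall_at_k(ranked: Sequence[Sequence[str]], relevant: Sequence[str], k: int) -> float:
    """Fraction of queries whose relevant document appears in the top k results.

    Assumption: one relevant document per query.
    """
    hits = sum(1 for docs, rel in zip(ranked, relevant) if rel in docs[:k])
    return hits / len(relevant)


def mean_reciprocal_rank(ranked: Sequence[Sequence[str]], relevant: Sequence[str]) -> float:
    """Mean of 1/rank of the relevant document; a query with no hit contributes 0."""
    total = 0.0
    for docs, rel in zip(ranked, relevant):
        if rel in docs:
            total += 1.0 / (list(docs).index(rel) + 1)
    return total / len(relevant)


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so dense and sparse scores are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(dense: Dict[str, float], sparse: Dict[str, float], alpha: float = 0.8) -> List[str]:
    """Rank documents by alpha * dense + (1 - alpha) * sparse.

    Assumption: alpha weights the dense score; the abstract does not say which
    component alpha applies to, nor how scores are normalized.
    """
    d, s = min_max(dense), min_max(sparse)
    fused = {
        doc: alpha * d.get(doc, 0.0) + (1 - alpha) * s.get(doc, 0.0)
        for doc in set(d) | set(s)
    }
    # Sort by descending fused score; break ties by document id for determinism.
    return sorted(fused, key=lambda doc: (-fused[doc], doc))
```

Under these assumptions, a large α keeps the ranking close to the dense ordering while letting strong lexical matches pull extra documents into the top K. That is one plausible mechanism for gaining recall at a small cost in ranking precision.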

Files

research (2).pdf (197.5 kB)
md5:cbc55baebe349ee451f33aa2d9a800a0

Additional details

Dates

Submitted
2026-02-28