An Empirical Study of Sparse, Dense, and Hybrid Retrieval for StackOverflow Question Answering
Authors/Creators
Description
We present a systematic empirical evaluation of sparse (TF-IDF), dense (MiniLM embeddings), and hybrid retrieval methods on a 19,965-document StackOverflow question–answer corpus. Retrieval performance is evaluated using Recall@K and Mean Reciprocal Rank (MRR) across 1,000 benchmark queries. Dense retrieval achieves Recall@5 = 0.779 and MRR = 0.670, significantly outperforming the sparse baseline (Recall@5 = 0.394, MRR = 0.292). A weighted hybrid method (α = 0.8) slightly improves Recall@5 to 0.789 and Recall@10 to 0.843 while marginally reducing ranking precision. These findings highlight trade-offs between recall coverage and ranking precision in domain-specific question answering systems.
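The weighted hybrid scoring and the two metrics named above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: it assumes that α weights the dense score and (1 − α) the sparse score, that both score lists are min-max normalized before mixing, and that each query has exactly one relevant document. None of these details are stated in the description, and all function names are hypothetical.

```python
def min_max(scores):
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def hybrid_scores(dense, sparse, alpha=0.8):
    """Weighted sum of normalized scores (assumption: alpha weights dense)."""
    d, s = min_max(dense), min_max(sparse)
    return [alpha * di + (1 - alpha) * si for di, si in zip(d, s)]


def rank_documents(scores):
    """Return document indices sorted by descending score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def recall_at_k(ranked_ids, relevant_id, k):
    """1.0 if the single relevant document is in the top k, else 0.0."""
    return 1.0 if relevant_id in ranked_ids[:k] else 0.0


def reciprocal_rank(ranked_ids, relevant_id):
    """1 / rank of the relevant document, or 0.0 if it was not retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id == relevant_id:
            return 1.0 / rank
    return 0.0


def evaluate(rankings, gold, k=5):
    """Average Recall@k and MRR over all queries."""
    n = len(gold)
    recall = sum(recall_at_k(r, g, k) for r, g in zip(rankings, gold)) / n
    mrr = sum(reciprocal_rank(r, g) for r, g in zip(rankings, gold)) / n
    return recall, mrr


if __name__ == "__main__":
    # Toy scores for three documents and one query (made-up numbers).
    dense = [0.9, 0.2, 0.5]
    sparse = [1.0, 3.0, 2.0]
    ranking = rank_documents(hybrid_scores(dense, sparse, alpha=0.8))
    print(ranking)
    print(evaluate([ranking], gold=[2], k=2))
```

Under this formulation, raising α moves the ranking toward the dense ordering, while lowering it gives more influence to documents that score well on the sparse side; this is one way the recall versus ranking precision trade-off described above could arise.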
Files
| Name | Size |
|---|---|
| research (2).pdf (md5:cbc55baebe349ee451f33aa2d9a800a0) | 197.5 kB |
Additional details
Related works
- Is supplemented by
- Model: https://github.com/aditig80/Hybrid-Sparse-Dense-Retrieval-for-Improving-LLM-Grounded-Responses (URL)
Dates
- Submitted
- 2026-02-28
Software
- Repository URL
- https://github.com/aditig80/Hybrid-Sparse-Dense-Retrieval-for-Improving-LLM-Grounded-Responses
- Programming language
- Python
- Development Status
- Active