Published December 2024 | Version v1
Report Open

Hybrid Retrieval Systems for Enterprise Knowledge

Description

Enterprise knowledge retrieval must search across policy documents, product manuals, incident reports, source repositories, tickets, relational records, and domain-specific analytical stores. Pure lexical retrieval is robust and transparent but can miss paraphrased evidence. Dense retrieval improves semantic recall but may ignore entity scope, temporal validity, permissions, and structured constraints. This paper presents Governed Hybrid Evidence Retrieval (GHER), a retrieval architecture that combines sparse lexical matching, dense passage retrieval, relational filtering, late-interaction reranking, and evidence-pack assembly for enterprise knowledge tasks. GHER treats every answerable query as a constrained evidence request: passages are retrieved only within authorized entity scope, scored by multiple retrieval views, fused using calibrated rank aggregation, and returned with structured provenance. A prototype evaluation over policy, support, analytics, and operations corpora shows that GHER improves evidence recall@10 from 0.64 to 0.78, raises constraint-correct evidence coverage from 0.71 to 0.90, and reduces unsupported response cases by 37.5% compared with dense-only retrieval. The results suggest that enterprise retrieval should be hybrid by design because correctness depends on both semantic relevance and governed structured constraints.
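The constrained evidence request described above (scope filtering, multi-view ranking, rank fusion, provenance) can be illustrated with a minimal sketch. This is not the GHER implementation: the description does not specify its calibrated rank aggregation, so plain reciprocal rank fusion (RRF) is swapped in as a stand-in, and the `Passage` fields, the function names `rrf_fuse` and `assemble_evidence_pack`, the constant `k=60`, and the toy data are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    doc_id: str
    entity: str   # hypothetical entity-scope label used for authorization
    source: str   # hypothetical provenance field (where the passage came from)


def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: score(d) = sum over views of 1 / (k + rank).

    Stand-in for the calibrated rank aggregation named in the description.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc_id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def assemble_evidence_pack(passages, rankings, allowed_entities, top_n=10):
    """Filter to the authorized scope, fuse the views, attach provenance."""
    in_scope = {p.doc_id: p for p in passages if p.entity in allowed_entities}
    # Out-of-scope passages are dropped before fusion so they cannot
    # influence the ranks of authorized passages.
    filtered = [[d for d in ranking if d in in_scope] for ranking in rankings]
    fused = rrf_fuse(filtered)
    return [
        {"doc_id": d, "entity": in_scope[d].entity, "source": in_scope[d].source}
        for d in fused[:top_n]
    ]


# Toy corpus and two retrieval views (lexical and dense), all hypothetical.
passages = [
    Passage("p1", "hr", "policy/leave.md"),
    Passage("p2", "hr", "policy/travel.md"),
    Passage("p3", "finance", "reports/q3.md"),
    Passage("p4", "hr", "tickets/1042"),
]
lexical_ranking = ["p3", "p1", "p2"]
dense_ranking = ["p2", "p3", "p4", "p1"]

pack = assemble_evidence_pack(
    passages, [lexical_ranking, dense_ranking], allowed_entities={"hr"}
)
for entry in pack:
    print(entry)
```

In this sketch the finance passage `p3` ranks first in the lexical view yet never reaches the evidence pack, because the scope filter runs before fusion. A production system would more likely push the filter into the retrievers themselves, and the late-interaction reranking step is omitted here.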

Files

paper.pdf

Files (130.0 kB)

md5:e17cce559ee0513587ffde3d160df3a9
130.0 kB