Hound: Relation-First Knowledge Graphs for Complex-System Reasoning in Security Audits

Bernhard Mueller

doi:10.5281/zenodo.17129271

Published September 15, 2025 | Version v2

Preprint Open

Hound is a graph-based audit agent that improves system-level reasoning across interrelated components in complex codebases. Instead of relying on broad file chunks or language-specific tooling, Hound builds flexible, analyst-defined knowledge graphs (e.g., monetary/value flows, authentication/authorization roles, call graphs, invariants) with compact annotations. Investigations are planned in two phases: a Coverage sweep to quickly map components, then an Intuition/Saliency phase that targets high-impact, contradiction-rich leads. A persistent belief system tracks hypotheses with explicit evidence and confidence, while a QA Finalizer reviews high-confidence items over full source context to confirm or reject findings.
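The graph-plus-belief design described above can be sketched in a few lines. This is a minimal illustration, not Hound's actual API: the class and function names (`AuditGraph`, `Hypothesis`, `add_evidence`, `ready_for_review`), the node names (`Vault.withdraw`, `Token.transfer`), the relation labels, and the 0.8 review threshold are all hypothetical and chosen only to show how typed relations allow exact cross-component lookups and how a hypothesis accumulates evidence before review.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Edge:
    """One typed relation between two components, with a compact annotation."""
    src: str
    relation: str
    dst: str
    note: str = ""


@dataclass
class Hypothesis:
    """A tracked suspicion with explicit evidence and a confidence in [0, 1]."""
    claim: str
    confidence: float = 0.5
    evidence: List[str] = field(default_factory=list)
    status: str = "proposed"


class AuditGraph:
    """Hypothetical relation-first graph: edges are stored and queried by relation type."""

    def __init__(self) -> None:
        self.edges: List[Edge] = []

    def add(self, src: str, relation: str, dst: str, note: str = "") -> None:
        self.edges.append(Edge(src, relation, dst, note))

    def related(self, node: str, relation: str) -> List[Edge]:
        # Exact retrieval: every edge of the given type that touches the node.
        return [e for e in self.edges
                if e.relation == relation and node in (e.src, e.dst)]


def add_evidence(h: Hypothesis, evidence: str, delta: float) -> None:
    """Record a piece of evidence and shift confidence, clamped to [0, 1]."""
    h.evidence.append(evidence)
    h.confidence = min(1.0, max(0.0, h.confidence + delta))


def ready_for_review(hypotheses: List[Hypothesis], threshold: float = 0.8) -> List[Hypothesis]:
    """Select still-open hypotheses confident enough for a final review pass."""
    return [h for h in hypotheses
            if h.status == "proposed" and h.confidence >= threshold]


# Illustrative data only: an invented vault contract with value-flow and role edges.
graph = AuditGraph()
graph.add("Vault.deposit", "value_flow", "Token.transferFrom", "pulls user funds")
graph.add("Vault.withdraw", "value_flow", "Token.transfer", "pays out shares")
graph.add("Vault.withdraw", "requires_role", "Owner")

h = Hypothesis("withdraw pays out before updating the share balance")
for edge in graph.related("Vault.withdraw", "value_flow"):
    add_evidence(h, f"{edge.src} -> {edge.dst}: {edge.note}", 0.4)
```

Storing edges by relation type means a question such as "which value flows touch `Vault.withdraw`?" is answered by an exact filter rather than by similarity search over file chunks, which is the property the description attributes to relation-first graphs.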

On a five-project subset of ScaBench, Hound raises micro recall and F1 over a baseline LLM analyzer (recall 31.2% vs. 8.3%; F1 14.2% vs. 9.8%) at a modest precision trade-off typical of exploratory audits. Gains stem from relation-first graphs that enable exact, cross-component retrieval and a disciplined hypothesis lifecycle. The artifact includes code, graph builders, benchmark harnesses, and scripts to reproduce tables and HTML reports.
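For readers unfamiliar with micro-averaged metrics, the sketch below shows how they are typically computed and what precision the reported figures imply. It assumes the standard definitions (counts pooled across projects, F1 as the harmonic mean of precision and recall); the function names and the per-project counts are hypothetical and are not taken from the paper. Under those assumptions, the reported recall and F1 pairs imply a precision of roughly 9% for Hound and 12% for the baseline, subject to rounding in the published numbers.

```python
from typing import Iterable, Tuple


def micro_prf(counts: Iterable[Tuple[int, int, int]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, F1 from per-project (tp, fp, fn) counts."""
    counts = list(counts)
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def implied_precision(recall: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, giving P = F1 * R / (2R - F1)."""
    return f1 * recall / (2 * recall - f1)


# Hypothetical per-project counts, for illustration only.
example_counts = [(1, 3, 1), (2, 5, 4)]
p, r, f1 = micro_prf(example_counts)

# Precision implied by the recall and F1 values reported in the description.
hound_precision = implied_precision(recall=0.312, f1=0.142)
baseline_precision = implied_precision(recall=0.083, f1=0.098)
```

Micro-averaging pools true positives, false positives, and false negatives before dividing, so projects with more findings weigh more heavily than they would under a per-project (macro) average.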

Files

mueller_hound_2025_v2.pdf (371.3 kB, md5:6f2e6fb0e0f0cf646f94206cc1a3b639)

Additional details

Repository URL: https://github.com/muellerberndt/hound
Programming language: Python

