There is a newer version of the record available.

Published September 15, 2025 | Version v2
Preprint Open

Hound: Relation-First Knowledge Graphs for Complex-System Reasoning in Security Audits

Description

Hound is a graph-based audit agent that improves system-level reasoning across interrelated components in complex codebases. Instead of relying on broad file chunks or language-specific tooling, Hound builds flexible, analyst-defined knowledge graphs (e.g., monetary/value flows, authentication/authorization roles, call graphs, invariants) with compact annotations. Investigations are planned in two phases: a Coverage sweep to quickly map components, then an Intuition/Saliency phase that targets high-impact, contradiction-rich leads. A persistent belief system tracks hypotheses with explicit evidence and confidence, while a QA Finalizer reviews high-confidence items over full source context to confirm or reject findings.
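To make the description above concrete, here is a minimal sketch of a relation-first graph with compact annotations and a hypothesis record that carries explicit evidence and confidence. It is an illustration only and is not taken from the Hound repository: the class names (`Edge`, `RelationGraph`, `Hypothesis`), the relation labels, the example contract names, and the 0.8 review threshold are all assumptions chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    source: str
    relation: str   # hypothetical labels, e.g. "transfers_value_to", "calls"
    target: str
    note: str = ""  # compact annotation attached to the relation


@dataclass
class RelationGraph:
    name: str
    edges: list = field(default_factory=list)

    def add(self, source, relation, target, note=""):
        self.edges.append(Edge(source, relation, target, note))

    def neighbors(self, node, relation=None):
        """Exact lookup of edges leaving `node`, optionally filtered by relation."""
        return [
            e for e in self.edges
            if e.source == node and (relation is None or e.relation == relation)
        ]


@dataclass
class Hypothesis:
    claim: str
    confidence: float = 0.5
    evidence: list = field(default_factory=list)
    status: str = "open"  # assumed lifecycle: open -> confirmed | rejected

    def update(self, evidence, delta):
        """Record one piece of evidence and shift confidence, clamped to [0, 1]."""
        self.evidence.append(evidence)
        self.confidence = min(1.0, max(0.0, self.confidence + delta))


def ready_for_review(hypotheses, threshold=0.8):
    """Select open hypotheses confident enough for a final review pass."""
    return [h for h in hypotheses if h.status == "open" and h.confidence >= threshold]


# Hypothetical usage: a value-flow graph over made-up contract functions.
value_flow = RelationGraph("value_flow")
value_flow.add("Vault.withdraw", "transfers_value_to", "Token.transfer",
               note="no balance check before call")
value_flow.add("Vault.withdraw", "calls", "Oracle.price")

h = Hypothesis("Vault.withdraw can pay out more than the caller's balance")
for edge in value_flow.neighbors("Vault.withdraw", "transfers_value_to"):
    h.update(f"{edge.source} -> {edge.target}: {edge.note}", +0.2)
h.update("Oracle.price is read after the transfer", +0.15)

queue = ready_for_review([h])
```

In this sketch, filtering by relation is what makes retrieval exact: the lookup returns only the edges of the requested kind rather than a broad chunk of surrounding text, and every confidence change is tied to a stored evidence string.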

On a five-project subset of ScaBench, Hound improves micro recall and F1 over a baseline LLM analyzer (recall 31.2% vs. 8.3%; F1 14.2% vs. 9.8%) at the cost of somewhat lower precision, a trade-off typical of exploratory audits. The gains stem from relation-first graphs that enable exact, cross-component retrieval and from a disciplined hypothesis lifecycle. The artifact includes code, graph builders, benchmark harnesses, and scripts to reproduce the tables and HTML reports.
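The record reports recall and F1 but not precision. Assuming the F1 figures are the standard harmonic mean of micro precision and micro recall, the implied precision can be recovered by solving F1 = 2PR / (P + R) for P. The sketch below does that arithmetic; the helper name `implied_precision` is hypothetical, and because the reported figures are rounded, the results are approximate.

```python
def implied_precision(f1, recall):
    """Solve F1 = 2PR / (P + R) for P. Inputs and output are percentages."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("F1 must be smaller than twice the recall")
    return f1 * recall / denom


# Figures quoted in the description (percent).
hound_precision = implied_precision(f1=14.2, recall=31.2)
baseline_precision = implied_precision(f1=9.8, recall=8.3)

print(f"Hound:    {hound_precision:.1f}%")     # about 9.2%
print(f"Baseline: {baseline_precision:.1f}%")  # about 12.0%
```

Under that assumption, the precision trade-off amounts to roughly 12.0% for the baseline versus roughly 9.2% for Hound, while recall nearly quadruples.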

Files

mueller_hound_2025_v2.pdf (371.3 kB)
md5:6f2e6fb0e0f0cf646f94206cc1a3b639

Additional details

Software

Repository URL
https://github.com/muellerberndt/hound
Programming language
Python
