Grep is All You Need: Zero-Preprocessing Knowledge Retrieval for LLM Agents
Description
Retrieval-Augmented Generation (RAG) has become the dominant paradigm for grounding Large Language Model (LLM) agents in domain-specific knowledge. The standard approach requires selecting an embedding model, designing a chunking strategy, deploying a vector database, maintaining indexes, and performing approximate nearest neighbor (ANN) search at query time. We argue that for domain-specific knowledge grounding — where the vocabulary is predictable and the corpus is bounded — this entire stack is unnecessary.
We present *Knowledge Search*, a two-layer retrieval system composed of (1) `grep` with contextual line windows over raw source texts and (2) `grep` over LLM-compiled per-source concept and FAQ files generated nightly by a free, local, autonomous compilation pipeline. The system is deployed in production across **76 specialized LLM agents** serving three knowledge domains (Traditional Chinese Medicine, Christian spiritual classics, U.S. civics). It is grounded in approximately **500 primary source texts and ~180 MB of corpus** spanning two languages and four-and-a-half millennia of human thought, and is served by a single Mac mini. In this deployment, our approach achieves 100% retrieval accuracy with sub-10 ms latency, zero per-query preprocessing, zero additional memory footprint, and zero infrastructure dependencies.
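The two layers described above can be illustrated with a minimal Python sketch. It is not the authors' code: it stands in for `grep -i -n -C <window>` with a pure-Python equivalent, and the function names (`grep_with_context`, `knowledge_search`), the file names, and the default window size are all assumptions made for illustration.

```python
import re
from typing import Dict, List


def grep_with_context(text: str, pattern: str, window: int = 2) -> List[str]:
    """Return merged line windows around every line matching `pattern`.

    Roughly mirrors `grep -i -n -C <window>`; an illustrative stand-in only.
    """
    lines = text.splitlines()
    regex = re.compile(pattern, re.IGNORECASE)
    hits = [i for i, line in enumerate(lines) if regex.search(line)]

    # Merge overlapping or adjacent windows so no line is returned twice.
    spans: List[List[int]] = []
    for i in hits:
        start, end = max(0, i - window), min(len(lines) - 1, i + window)
        if spans and start <= spans[-1][1] + 1:
            spans[-1][1] = max(spans[-1][1], end)
        else:
            spans.append([start, end])

    # Prefix each line with its 1-based line number, as `grep -n` does.
    return [
        "\n".join(f"{n + 1}: {lines[n]}" for n in range(start, end + 1))
        for start, end in spans
    ]


def knowledge_search(
    query: str,
    raw: Dict[str, str],
    compiled: Dict[str, str],
    window: int = 2,
) -> Dict[str, List[str]]:
    """Search layer 1 (raw sources) and layer 2 (compiled concept/FAQ files).

    Both arguments map hypothetical file names to file contents.
    """
    results: Dict[str, List[str]] = {}
    for layer in (raw, compiled):
        for name, text in layer.items():
            # The query is treated as a literal string, like `grep -F`.
            found = grep_with_context(text, re.escape(query), window)
            if found:
                results[name] = found
    return results
```

Under these assumptions, a call such as `knowledge_search("ginseng", raw, compiled)` returns the matching windows per file, which an agent could paste into its context verbatim. There is no index to build or refresh; the only state is the text on disk.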
We also document a reproducible failure-and-recovery cycle (0/5 fabricated quotes → 4/4 grep-verified quotes after a 25-minute fix that touched only text files on disk), which demonstrates that the architecture's safety properties are recoverable through prompt hygiene alone, with no retraining and no infrastructure change. The key insight is simple: retrieval does not need intelligence. The LLM is the intelligence.
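One way to read "grep-verified quotes" is as a literal substring check of each quoted span against the source texts. The sketch below illustrates that reading only. The abstract does not describe the actual verification procedure, so the helper names (`normalize`, `extract_quotes`, `verify_quotes`), the whitespace and case normalization, and the minimum quote length are all assumptions.

```python
import re
from typing import Dict, Iterable, List


def normalize(s: str) -> str:
    """Collapse whitespace and lowercase so line wrapping does not hide a match."""
    return re.sub(r"\s+", " ", s).strip().lower()


def extract_quotes(answer: str, min_len: int = 12) -> List[str]:
    """Pull double-quoted spans (straight or curly quotes) out of an answer."""
    found = re.findall(r'"([^"]+)"|“([^”]+)”', answer)
    quotes = [straight or curly for straight, curly in found]
    # Skip very short spans, which are usually single terms rather than quotations.
    return [q for q in quotes if len(q) >= min_len]


def verify_quotes(answer: str, corpus: Iterable[str]) -> Dict[str, bool]:
    """Map each quoted span to True if it occurs verbatim in some source text."""
    haystack = [normalize(text) for text in corpus]
    return {
        quote: any(normalize(quote) in text for text in haystack)
        for quote in extract_quotes(answer)
    }
```

A quote mapped to `False` has no verbatim match in the supplied texts and can be flagged as possibly fabricated. Because the check reads only plain text, it can be rerun after any edit to prompts or source files.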
Bilingual paper (English and Chinese). Production system: https://faith.localkin.ai · https://heal.localkin.ai. Code: https://github.com/LocalKinAI/grep-is-all-you-need.
Files
| Name | Size |
|---|---|
| grep_is_all_you_need.pdf (md5:5000eea5bec7184fe7025bbabd898234) | 1.6 MB |
Additional details
Related works
- Is identical to
- Preprint: https://www.localkin.dev/papers/grep-is-all-you-need (URL)
- Is part of
- Software: https://github.com/LocalKinAI (URL)
- Is supplemented by
- Software: https://github.com/LocalKinAI/grep-is-all-you-need (URL)
Dates
- Created
- 2026-04-08: Original draft (v1.0)
Software
- Repository URL
- https://github.com/LocalKinAI/grep-is-all-you-need
- Programming language
- Shell, Python, Go
- Development Status
- Active