Aegis Insight: Knowledge Graph Infrastructure for Detecting Suppression and Coordination Patterns in Document Corpora
Authors/Creators
Description
Current retrieval-augmented generation (RAG) systems optimize for information retrieval accuracy while remaining blind to the epistemological structure of information landscapes: specifically, whether high-quality research is systematically marginalized or whether consensus is manufactured through coordinated messaging. Unlike misinformation detection systems that identify false claims, we address fundamentally different questions: whether true claims are systematically suppressed and whether consensus around accurate information is artificially manufactured. We present Aegis Insight, an open-source knowledge graph system that detects three categories of information manipulation patterns: suppression (quality-visibility gaps, network isolation, institutional dismissal without engagement), coordination (temporal clustering, language similarity, synchronized emotional triggers), and cross-cultural anomalies (isolated cultures exhibiting identical complex patterns). The system employs a seven-dimensional extraction pipeline that processes documents into typed claims with entities, temporal markers, geographic references, citations, emotional content, and authority-domain relationships. Detection algorithms implement threshold-based "Goldfinger" scoring, in which isolated indicators score minimally but accumulated indicators trigger exponential escalation. We validate against historical ground truth: Thomas Paine (documented state suppression, 1790s), Smedley Butler (Business Plot suppression, 1930s), and Yellow Journalism coverage of the USS Maine explosion (documented coordination, 1898). Results demonstrate effective discrimination: suppressed figures score 0.78–0.83 (CRITICAL) while the non-suppressed control (Benjamin Franklin) scores 0.39 (MODERATE) with zero suppression indicators.
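The "Goldfinger" scoring idea described above (isolated indicators count for little, accumulated indicators escalate exponentially) can be illustrated with a minimal sketch. This is not the scoring formula used by Aegis Insight; the function name `goldfinger_score` and the parameters `threshold`, `base`, and `growth` are hypothetical and chosen only to show the shape of such a curve.

```python
import math

def goldfinger_score(indicators, threshold=2, base=0.1, growth=1.0):
    """Combine boolean indicators into a score in [0, 1].

    Hypothetical parameters (assumptions, not taken from the paper):
      threshold: number of fired indicators still treated as isolated
      base:      score contributed by each isolated indicator
      growth:    exponential rate applied to indicators beyond the threshold
    """
    count = sum(1 for fired in indicators if fired)
    if count <= threshold:
        # At or below the threshold: small, linear contribution.
        return base * count
    # Beyond the threshold: escalate exponentially, capped at 1.0.
    excess = count - threshold
    return min(1.0, base * threshold * math.exp(growth * excess))

# Usage: one fired indicator scores low; three fired indicators score
# disproportionately higher than three times the single-indicator score.
single = goldfinger_score([True, False, False, False])
triple = goldfinger_score([True, True, True, False])
```

Capping the result at 1.0 keeps the score comparable across documents with different numbers of indicators; a real system would also need to calibrate the cutoffs between severity labels, which this sketch does not attempt.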
Coordination detection correctly identifies the Yellow Journalism campaign with 66 near-identical phrases across sources, 24 claims within a 14-day window, and synchronized emotional triggers (41.7% fear, 30.1% urgency). The system runs entirely locally on consumer hardware via Docker using local LLM inference through Ollama, with checkpointing for multi-day extraction of large corpora. We release Aegis Insight as open-source infrastructure for epistemological analysis, suitable for integration with existing RAG systems via Model Context Protocol (MCP) endpoints.
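The temporal-clustering signal mentioned above (many claims falling inside a short window) can be sketched as a sliding-window count over claim dates. This is an illustrative sketch only, assuming each claim carries a single calendar date; the function name `max_claims_in_window` and its `window_days` parameter are hypothetical and do not come from the Aegis Insight codebase.

```python
from datetime import date, timedelta

def max_claims_in_window(claim_dates, window_days=14):
    """Return the largest number of claims falling in any window of
    `window_days` consecutive calendar days (hypothetical helper)."""
    ordered = sorted(claim_dates)
    window = timedelta(days=window_days)
    best = 0
    start = 0
    for end, current in enumerate(ordered):
        # Advance the left edge until the span is shorter than the window.
        while current - ordered[start] >= window:
            start += 1
        best = max(best, end - start + 1)
    return best

# Usage with made-up dates: three claims close together, one far away.
example_dates = [
    date(1898, 2, 16),
    date(1898, 2, 17),
    date(1898, 2, 20),
    date(1898, 6, 1),
]
densest = max_claims_in_window(example_dates, window_days=14)
```

A count like this is only one input; the abstract pairs it with language similarity and emotional-trigger statistics before treating a cluster as coordinated.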
Files
| Name | Size | Checksum |
|---|---|---|
| Aegis_Insight_Paper_v5.pdf | 199.4 kB | md5:ee62e3d92eba9c2b6344285ee41b4859 |
Additional details
Dates
- Submitted
- 2026-01-11 (Initial Submission)
Software
- Repository URL
- https://github.com/Eleutherios-project/Eleutherios-docker
- Programming language
- Python
- Development Status
- Active