Published January 11, 2026 | Version 1.0
Data paper | Open

Aegis Insight: Knowledge Graph Infrastructure for Detecting Suppression and Coordination Patterns in Document Corpora

Authors/Creators

Description

Current retrieval-augmented generation (RAG) systems optimize for information retrieval accuracy while remaining blind to the epistemological structure of information landscapes, specifically whether high-quality research is systematically marginalized or whether consensus is manufactured through coordinated messaging. Whereas misinformation detection systems identify false claims, we address fundamentally different questions: whether true claims are systematically suppressed and whether consensus around accurate information is artificially manufactured. We present Aegis Insight, an open-source knowledge graph system that detects three categories of information manipulation patterns: suppression (quality-visibility gaps, network isolation, institutional dismissal without engagement), coordination (temporal clustering, language similarity, synchronized emotional triggers), and cross-cultural anomalies (isolated cultures exhibiting identical complex patterns). The system employs a seven-dimensional extraction pipeline that processes documents into typed claims with entities, temporal markers, geographic references, citations, emotional content, and authority-domain relationships. Detection algorithms implement threshold-based "Goldfinger" scoring, in which isolated indicators score minimally but accumulated indicators trigger exponential escalation. We validate against historical ground truth: Thomas Paine (documented state suppression, 1790s), Smedley Butler (Business Plot suppression, 1930s), and Yellow Journalism coverage of the USS Maine explosion (documented coordination, 1898). Results demonstrate effective discrimination: suppressed figures score 0.78–0.83 (CRITICAL), while non-suppressed controls (Benjamin Franklin) score 0.39 (MODERATE) with zero suppression indicators. Coordination detection correctly identifies the Yellow Journalism campaign, finding 66 near-identical phrases across sources, 24 claims within a 14-day window, and synchronized emotional triggers (41.7% fear, 30.1% urgency). The system runs entirely locally on consumer hardware via Docker, using local LLM inference through Ollama, with checkpointing for multi-day extraction of large corpora. We release Aegis Insight as open-source infrastructure for epistemological analysis, suitable for integration with existing RAG systems via Model Context Protocol (MCP) endpoints.
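
The "Goldfinger" scoring idea described above (isolated indicators score minimally, accumulated indicators escalate exponentially) can be illustrated with a minimal Python sketch. This is not the project's implementation: the function name `goldfinger_score`, the parameters `threshold`, `minimal_weight`, and `growth`, and all default values are assumptions chosen only to show the shape of such a curve.

```python
import math


def goldfinger_score(indicator_count: int,
                     threshold: int = 3,
                     minimal_weight: float = 0.05,
                     growth: float = 0.9) -> float:
    """Return a score in [0, 1] for a number of triggered indicators.

    Hypothetical sketch: every parameter name and default here is an
    assumption, not a value taken from Aegis Insight.
    """
    if indicator_count < 0:
        raise ValueError("indicator_count must be non-negative")
    if indicator_count < threshold:
        # Below the threshold, each isolated indicator adds only a
        # small, linear amount to the score.
        return min(1.0, minimal_weight * indicator_count)
    # At or above the threshold, the score grows exponentially with
    # each further indicator and is capped at 1.0.
    excess = indicator_count - threshold + 1
    raw = minimal_weight * threshold * math.exp(growth * excess)
    return min(1.0, raw)


if __name__ == "__main__":
    for n in range(6):
        print(n, round(goldfinger_score(n), 3))
```

With these assumed defaults, one or two indicators stay at or below 0.10, the third indicator more than triples the score, and five or more saturate at 1.0. A real detector would presumably weight individual indicators differently rather than simply counting them.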

Files

Aegis_Insight_Paper_v5.pdf (199.4 kB)
md5:ee62e3d92eba9c2b6344285ee41b4859

Additional details

Dates

Submitted
2026-01-11
Initial Submission

Software

Repository URL
https://github.com/Eleutherios-project/Eleutherios-docker
Programming language
Python
Development Status
Active
