Evaluating Graph-RAG Systems for Historical Transport Networks: A Berlin Case Study
Description
This research project develops and evaluates specialised Retrieval-Augmented Generation (RAG) approaches for historical knowledge graphs, using Berlin's public transportation system during the Cold War era (1945-1989) as a case study.
While traditional RAG systems excel at processing unstructured text, they often struggle with the complex, highly structured temporal data found in historical databases. This study addresses that gap by establishing a comprehensive evaluation framework for "Graph-RAG" systems in the digital humanities.
Key Findings
The research demonstrates that no single pipeline excels at all historical query types. While NL-to-Cypher approaches dominated factual retrieval, they struggled with interpretive synthesis. The study found that a multi-pipeline architecture, which intelligently routes user queries to the most appropriate retrieval strategy, achieved a marked improvement in answer quality compared to single-pipeline approaches. Additionally, the findings highlight significant limitations in applying standard community detection algorithms (like Leiden) to sparse, linear infrastructure networks.
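The routing idea can be illustrated with a minimal sketch. This is not the project's implementation: the `route_query` function, the `interpretive_synthesis` label, and the keyword cues are all hypothetical, and the real router may use an LLM classifier or another strategy entirely. Only the NL-to-Cypher approach is named in this record.

```python
# Hypothetical pipeline labels; only NL-to-Cypher is named in the record.
FACTUAL = "nl_to_cypher"
INTERPRETIVE = "interpretive_synthesis"

# Assumed cue phrases hinting that a question asks for interpretation
# rather than a fact that a graph query can look up directly.
INTERPRETIVE_CUES = ("why", "how did", "impact", "compare", "significance")


def route_query(question: str) -> str:
    """Return the label of the pipeline that should handle the question."""
    text = question.lower()
    if any(cue in text for cue in INTERPRETIVE_CUES):
        return INTERPRETIVE
    return FACTUAL


# Illustrative questions, not taken from the evaluation set.
print(route_query("Which stations did line U6 serve in 1964?"))  # nl_to_cypher
print(route_query("Why were stations closed after 1961?"))       # interpretive_synthesis
```

A keyword rule like this is only a stand-in; the point is that the routing decision is made before retrieval, so each pipeline sees only the query types it handles well.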
Files in this Record:
- Abschlussbericht.pdf: The complete final research report (13 pages) detailing methodology, pipeline architecture, evaluation framework, and findings.
- question_design_methodology.md: A comprehensive guide to the user-centered evaluation framework, including theoretical grounding in digital humanities tool adoption.
- rubric.md: The detailed scoring criteria (0-3 scale) used for evaluating historical accuracy, context retention, and explanatory capability across 33 questions.
- berlin_transport_questions.csv: The complete taxonomy of 33 evaluation questions spanning 6 user personas, 5 difficulty levels, and 5 dimensional categories.
- user_personas.md: Six detailed user archetypes (from neighborhood historians to data journalists) representing the target audience for the public-facing system.
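As a rough sketch of how rubric scores on the 0-3 scale might be summarised per pipeline, consider the following. The field names (`pipeline`, `historical_accuracy`, `context_retention`, `explanatory_capability`) are assumptions derived from the criteria listed above, not the actual columns of the published files, and `summarise_scores` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean

# Assumed field names, modelled on the three rubric criteria named in the record.
CRITERIA = ("historical_accuracy", "context_retention", "explanatory_capability")


def summarise_scores(rows):
    """Average each rubric criterion per pipeline.

    `rows` is an iterable of dicts with a 'pipeline' key and one
    0-3 score per criterion (ints or numeric strings).
    """
    collected = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for criterion in CRITERIA:
            score = int(row[criterion])
            if not 0 <= score <= 3:
                raise ValueError(f"score out of range for {criterion}: {score}")
            collected[row["pipeline"]][criterion].append(score)
    return {
        pipeline: {c: mean(scores) for c, scores in by_criterion.items()}
        for pipeline, by_criterion in collected.items()
    }
```

Rows could come from `csv.DictReader` once the real column names are known; rejecting values outside 0-3 catches transcription errors before they distort the averages.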
Related Resources
- Live Demo: https://berlin-transport-history.de/chat
- Code Repository: https://scm.cms.hu-berlin.de/baumanoa/graph-rag
Funding Acknowledgement
This project was funded by the BMFTR data competence centre HERMES (Datenkompetenzzentrum HERMES).
Files
Total size: 780.2 kB
| MD5 checksum | Size |
|---|---|
| 7a84b6d4d8141bc357c3c6adee0563c4 | 621.5 kB |
| 1cc0ff3dec4f656d36a8687ebbc5c431 | 6.3 kB |
| c39e9d49d675de9dd6d00e95a0afdead | 38.9 kB |
| bcc60b42e6d5dce57fe9a2f76b2d94d0 | 1.9 kB |
| 7826fade7e3c329012b8469d99ca8aa5 | 91.2 kB |
| 8859a7e01b8b216a82aa041ae2d706ac | 20.4 kB |
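A downloaded file can be checked against its published MD5 checksum with Python's standard library. The sketch below is illustrative: `md5_matches` is a hypothetical helper, and which checksum belongs to which file has to be taken from the record page itself, since the listing here does not pair them.

```python
import hashlib
from pathlib import Path


def md5_matches(path, expected_hex, chunk_size=65536):
    """Return True if the file's MD5 digest equals the expected hex string."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()
```

MD5 is adequate here for detecting corrupted or truncated downloads; it is not a safeguard against deliberate tampering.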
Additional details
Software
- Repository URL: https://scm.cms.hu-berlin.de/baumanoa/graph-rag
- Programming language: Python, JavaScript, TypeScript
- Development Status: Active