Published January 15, 2026 | Version v1
Publication Open

Designing a Controlled Retrieval-Augmented System for Behavioral Case Analysis: Methodological Foundations of the PROTEX Prototype

Description

The increasing use of retrieval-augmented and large language model–based systems in criminal justice research has raised concerns regarding transparency, auditability, and control over evidentiary boundaries in behavioral case analysis. Many existing applications emphasize prediction or generative synthesis, which limits their suitability for analytically sensitive and high-risk domains.

This article presents Stage I of the PROTEX project as a methodological proof of concept demonstrating how a retrieval-augmented architecture can be designed as a controlled, deterministic fact-extraction framework for criminal justice case analysis. The system operates on a structured corpus of fifty historically documented violent offender cases and enforces a strict case-boundary mechanism to prevent cross-case data contamination.
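
The case-boundary idea can be illustrated with a minimal sketch. It is not the PROTEX implementation: the `Passage` type, the `retrieve` and `answer` functions, the keyword-overlap scoring, the refusal message, and the placeholder corpus are all assumptions made for illustration. The sketch restricts the search to one case before any matching takes place, orders results deterministically, and refuses when the selected case holds no relevant passage.

```python
from dataclasses import dataclass

# Hypothetical refusal message; the actual PROTEX wording is not given in the source.
REFUSAL = "No relevant data found in the selected case."


@dataclass(frozen=True)
class Passage:
    case_id: str      # identifier of the case the passage belongs to
    passage_id: int   # stable position of the passage within its case
    text: str


def retrieve(corpus, case_id, query, top_k=3):
    """Return up to top_k passages from ONE case, ranked by keyword overlap."""
    terms = set(query.lower().split())
    # Case boundary: passages from other cases are discarded before scoring,
    # so they can never reach the answer.
    scoped = [p for p in corpus if p.case_id == case_id]
    scored = []
    for p in scoped:
        overlap = len(terms & set(p.text.lower().split()))
        if overlap > 0:
            scored.append((overlap, p))
    # Deterministic order: higher overlap first, ties broken by passage_id.
    scored.sort(key=lambda item: (-item[0], item[1].passage_id))
    return [p for _, p in scored[:top_k]]


def answer(corpus, case_id, query):
    """Return cited passages verbatim, or refuse if the case has no match."""
    hits = retrieve(corpus, case_id, query)
    if not hits:
        return REFUSAL
    return " ".join(f"[{p.case_id}#{p.passage_id}] {p.text}" for p in hits)


# Fictional placeholder records, not drawn from the PROTEX corpus.
demo_corpus = [
    Passage("case-A", 1, "offender used a rented vehicle"),
    Passage("case-B", 1, "offender used a stolen vehicle"),
    Passage("case-A", 2, "arrest occurred in spring"),
]

print(answer(demo_corpus, "case-A", "vehicle"))
print(answer(demo_corpus, "case-A", "stolen"))
```

In this sketch the second query is refused even though "stolen" appears in the corpus, because the only matching passage belongs to a different case. Every returned passage carries its case and passage identifiers, so an answer can be traced back to its source.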

System performance was evaluated using structured test queries assessing extraction accuracy, refusal behavior in the absence of relevant data, and comparative analytical tasks. Results demonstrate reliable single-case extraction and consistent enforcement of evidentiary constraints, indicating that conservatively designed retrieval-augmented systems can support transparent, auditable, and accountable criminal justice analysis.

Files

Designing_a_Controlled_Retrieval-Augmented_System_PROTEX_Stage_I.pdf (232.4 kB)