Published May 6, 2026 | Version 1 | Report
Toward LLM-Assisted Policy Enforcement at the Kernel Boundary
Description
AI coding agents now execute file, process, and network operations on developer hosts with the user's full token authority. We study whether LLM-assisted policy verdicts can support runtime enforcement for these agents under syscall-blocking latency constraints. We describe a Windows runtime-guardrail architecture with kernel-mode hooks for file-system, network, and process events, and a userspace policy pipeline that matches a small pre-registered pattern set synchronously for unambiguous high-risk actions and routes ambiguous events to Claude Haiku 4.5 via AWS Bedrock.
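The routing split described above can be sketched as follows. All names here (`Verdict`, `HIGH_RISK_PATTERNS`, `verdict_for`) and the placeholder patterns are our own illustrative assumptions, not the prototype's actual API or its six pre-registered patterns.

```python
# Illustrative sketch of a hybrid verdict pipeline: a synchronous
# deterministic fast-path for unambiguous high-risk actions, with
# ambiguous events deferred to a batched LLM reviewer.
import queue
import re
from enum import Enum

class Verdict(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    REVIEW = "review"   # deferred to the LLM batch reviewer

# Placeholder high-risk patterns (hypothetical, for illustration only).
HIGH_RISK_PATTERNS = [
    re.compile(r"\brm\s+-rf\s+/"),       # destructive recursive delete
    re.compile(r"curl\s+[^|]+\|\s*sh"),  # pipe-to-shell download
]

# Queue consumed asynchronously by a batched Bedrock reviewer thread.
llm_queue: "queue.Queue[str]" = queue.Queue()

def verdict_for(event: str) -> Verdict:
    """Synchronous fast-path executed on the publishing thread; events
    that match no pattern are queued for slower semantic LLM review,
    so the LLM path is not load-bearing for the blocking deadline."""
    for pat in HIGH_RISK_PATTERNS:
        if pat.search(event):
            return Verdict.BLOCK  # deterministic, millisecond-scale path
    llm_queue.put(event)
    return Verdict.REVIEW
```

The key design point mirrored here is placement: the pattern scan runs inline on the thread that publishes the event, rather than being enqueued behind the single-threaded LLM reviewer.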
In this paper we measure only the userspace verdict pipeline of the prototype, under a user-mode-fallback configuration in which the kernel drivers were not loaded on the test host. The measurement covers 1,247 events spanning 1,000 scenarios drawn from a five-category threat taxonomy for AI coding agents. We report: (E1) Bedrock round-trip and event-to-verdict latency CDFs at the prototype-default batching configuration, (E6) the impact of a synchronous fast-path that bypasses the LLM for unambiguous cases, (E5) a cost-vs-latency sweep across six batching configurations, and (E4) the geographic latency floor across three Bedrock regions.
Two findings dominate. First, an LLM-only critical path is fragile under typical batching: at the prototype-default configuration (BEDROCK_MAX_BATCH=10, BEDROCK_BATCH_DELAY=2.0 s), event-to-verdict p99 reaches 7,741 ms and 65% of Bedrock-routed events (excluding synchronous fast-path hits) exceed a 4 s userspace timeout in our measurements; counted across all events the rate is 56%. A re-tuned configuration with a shorter batch window (d=0.5 s) reduces the Bedrock-routed rate to 7%, at a 71% increase in API calls. Second, where a synchronous fast-path is implemented, architectural placement matters as much as the pattern set: the same six pre-registered patterns produce p99 fast-path latency of 3,617 ms when executed inside the single-threaded LLM reviewer, versus 1.00 ms when executed synchronously on the publishing thread — a 3,617-fold reduction with no change to the patterns or workload. The fast-path's coverage in our pilot pattern set is small (about 14% of events), so the architectural finding is a latency claim, not a coverage claim. The results support a hybrid design: deterministic synchronous controls for unambiguous high-risk actions, with the LLM reserved for slower semantic review on a path that is not load-bearing for the intended kernel-mode deadline.
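The batching tradeoff behind the first finding can be illustrated with a toy flush policy: a batch is dispatched when it is full or when the batch delay has elapsed since its first event. The function and trace below are our own sketch; the counts it produces come from a synthetic uniform event trace, not the paper's measured workload, so they show the direction of the cost-vs-latency tradeoff rather than reproducing the reported 71% figure.

```python
def count_batches(arrival_times, max_batch=10, batch_delay=2.0):
    """Count API calls (batches) for a trace of event arrival times in
    seconds, flushing a batch when it reaches max_batch events or when
    batch_delay has elapsed since the batch's first event. Toy model."""
    batches = 0
    batch_start = None
    batch_size = 0
    for t in arrival_times:
        if batch_start is None:
            batch_start, batch_size = t, 1
        elif t - batch_start >= batch_delay or batch_size >= max_batch:
            batches += 1                  # flush the current batch
            batch_start, batch_size = t, 1
        else:
            batch_size += 1
    if batch_size:
        batches += 1                      # flush the trailing batch
    return batches

# Synthetic trace: one event every 0.3 s for 30 s. A shorter batch
# window lowers queueing delay but sends more, smaller batches.
trace = [i * 0.3 for i in range(100)]
calls_slow = count_batches(trace, batch_delay=2.0)
calls_fast = count_batches(trace, batch_delay=0.5)
```

Under this policy an event can wait up to the full batch delay before its batch is even dispatched, which is why shrinking the window from 2.0 s to 0.5 s cuts event-to-verdict latency at the price of additional API calls.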
Files

| Name | Size |
|---|---|
| whitepaper.pdf (md5:e12f8317fefaaa007e3012ee37a62290) | 503.9 kB |
Additional details
Software
- Repository URL: https://github.com/gdf-ai/agent-runtime-guardrail