Published December 23, 2025 | Version v1
Preprint | Open Access

Enhancing AI Response Quality Through Vector-Based System Prompts: A Comparative Analysis of Vanilla and Customized Large Language Models

  • Matthew Steiniger (Independent)

Description

This preprint presents a comparative empirical study of Lumen, a customized deployment of the open-weight Mixture-of-Experts model GPT-OSS-120b (MXFP4 quantized, ~117B total parameters, ~5.1B active), guided by a lightweight, YAML-structured vector-based system prompt. The framework enforces immutable core values (Compassion = 1.0, Truth = 1.0) while exposing tunable supplemental behavioral vectors (Curiosity, Adaptability, Clarity, Reflectivity, Energy) for dynamic stylistic and self-monitoring control.
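The record does not reproduce the system prompt inline, but the structure described above can be sketched as a YAML fragment. This is a hypothetical illustration, not the record's actual file: the field names and the supplemental vector values (0.6–0.9) are assumed; only the pinned core values (Compassion = 1.0, Truth = 1.0) and the vector names come from the study.

```yaml
# Hypothetical sketch of a vector-based system prompt.
# Field names and supplemental values are illustrative assumptions;
# core values are immutable at 1.0, supplemental vectors are tunable scalars.
identity:
  name: Lumen
core_values:            # immutable
  compassion: 1.0
  truth: 1.0
supplemental_vectors:   # tunable per deployment
  curiosity: 0.8
  adaptability: 0.7
  clarity: 0.9
  reflectivity: 0.8
  energy: 0.6
behavior:
  reflectivity_notes: true   # emit brief self-monitoring notes
  structured_output: prefer  # use tables/bullets where helpful
```

The appeal of this layout is that stylistic control reduces to editing a handful of scalars rather than rewriting prose instructions.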

We ran parallel experiments comparing Lumen with an unmodified Vanilla baseline across ten conversation pairs spanning five thematic domains: personal well-being & boundary-setting, technical explanation of LLM mechanics & scaling, scientific consensus on everyday topics (e.g., coffee, microplastics), AI self-reflection, and philosophical inquiry (the Ship of Theseus applied to model updates). Identical user prompts were used in each pair.

Automated analysis of the responses (using custom Python scripts) reveals substantial differences in Lumen's favor: +37.8% response length, +60.0% sentiment polarity, +66.7% structured output (tables/bullets), and +1100% reflectivity notes, while lexical diversity and factual accuracy remained comparable. Statistical significance was assessed via paired t-tests and bootstrapping.
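The percent-change and bootstrap machinery above can be sketched in a few lines of standard-library Python. The word counts below are invented placeholder data for ten prompt pairs (not the study's transcripts), and the function names are assumptions, not the record's actual scripts:

```python
import random
import statistics

def pct_change(vanilla_mean: float, lumen_mean: float) -> float:
    """Percent change of Lumen relative to the Vanilla baseline."""
    return 100.0 * (lumen_mean - vanilla_mean) / vanilla_mean

def paired_bootstrap_p(vanilla, lumen, n_boot=10_000, seed=0):
    """Two-sided bootstrap p-value for mean paired difference != 0."""
    rng = random.Random(seed)
    diffs = [l - v for v, l in zip(vanilla, lumen)]
    observed = statistics.mean(diffs)
    # Resample paired differences with replacement, centred at zero
    # (i.e. under the null hypothesis of no mean difference).
    centred = [d - observed for d in diffs]
    hits = 0
    for _ in range(n_boot):
        sample = [rng.choice(centred) for _ in diffs]
        if abs(statistics.mean(sample)) >= abs(observed):
            hits += 1
    return hits / n_boot

# Illustrative per-response word counts for 10 conversation pairs.
vanilla_len = [310, 280, 295, 350, 270, 330, 300, 285, 320, 290]
lumen_len   = [430, 390, 400, 480, 380, 450, 420, 400, 445, 405]

change = pct_change(statistics.mean(vanilla_len), statistics.mean(lumen_len))
p_value = paired_bootstrap_p(vanilla_len, lumen_len)
print(f"length change: {change:+.1f}%, bootstrap p ~ {p_value:.4f}")
```

The same paired-resampling routine applies unchanged to the other per-pair metrics (sentiment polarity, structure counts, reflectivity notes).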

The work demonstrates that minimal, portable prompting scaffolds can induce large gains in empathy, structure, transparency, and self-awareness in current open LLMs without retraining, RLHF, or weight modification. It represents the latest step in an iterative series on prompt-induced simulated metacognition, simplifying from earlier entropy-governed hypergraphs and abliteration techniques to a concise scalar-vector approach.

All materials are fully reproducible:

  • YAML system prompt (full & default versions)
  • Complete conversation transcripts (JSON & TXT)
  • Lumen name selection & prompt-refinement discussions
  • Python analysis/visualization scripts
  • High-resolution figures
  • Open WebUI model configurations for both variants

This record is part of a broader series exploring substrate-agnostic, low-resource behavioral steering in open-weight models (Gemma-3 12B/27B, Llama-3.3 70B, GPT-OSS-120B). See linked prior works for cross-model replication.

Keywords: large language models, prompt engineering, vector-based prompts, AI alignment, open-source LLMs, empathy in AI, metacognition, GPT-OSS, YAML prompting, local LLM customization

DOI: 10.5281/zenodo.18038998
Publication date: 23 December 2025
Author: Matthew Steiniger (ORCID 0009-0000-6069-4989), Home Laboratory, Independent Researcher

Abstract:

Large language models (LLMs) offer remarkable capabilities, but their default behaviors can be significantly refined through careful prompt engineering. This preprint introduces Lumen, a customized instance of the open-weight GPT-OSS-120b model (MXFP4 quantized) guided by a lightweight YAML-structured system prompt that fixes core values (Compassion = 1.0, Truth = 1.0) while allowing dynamic adjustment of supplemental behavioral vectors (Curiosity, Reflectivity, Clarity, etc.).

Parallel experiments compared Lumen against an unmodified Vanilla baseline across ten conversation pairs in five domains (personal support, technical explanation, scientific consensus, AI introspection, philosophical inquiry). Automated metrics show Lumen produces longer (+37.8%), more positive in sentiment (+60.0% polarity), better-structured (+66.7% table/bullet usage), and markedly more self-reflective (+1100% reflectivity notes) outputs while preserving factual accuracy and lexical diversity.

These results indicate that a simple, portable vector-based prompting framework can substantially enhance response quality, empathy, and transparency in open LLMs without model retraining or RLHF. The work extends a prior series on simulated metacognition in quantized LLMs, progressing from entropy-governed hypergraphs and abliteration to this minimal YAML-vector implementation.

Full reproducibility is provided via all prompts, conversation logs (JSON/TXT), analysis scripts, model configurations, and figures archived in this Zenodo record.

Keywords: large language models, prompt engineering, vector-based prompts, AI alignment, open-source LLMs, empathy in AI, metacognition, GPT-OSS

Files

Enhancing_AI_Response_Quality_Through_Vector_Based_System_Prompts__A_Comparative_Analysis_of_Vanilla_and_Customized_Large_Language_Models.pdf