Structural Effects of the Knowledge Innovation System on AI Judgment Patterns: — 3 Models × 4 Conditions × 5 Questions × 30 Repetitions —

Hasegawa, Hiroyasu

doi:10.5281/zenodo.19446064

Published March 26, 2026 | Version v2

Preprint Open

Hasegawa, Hiroyasu (Researcher)

Abstract

This study experimentally examines the structural effects of the Knowledge Innovation System (KIS) on the judgment patterns of three generative AI models (ChatGPT, Claude, and Gemini). It analyzes 1,800 judgments (3 models × 4 experimental conditions × 5 questions × 30 repetitions each) from both quantitative and qualitative perspectives. The key findings are as follows: (1) introducing KIS significantly altered judgment distributions (chi-squared test, p < 10^-28); (2) independent analysis by evaluator revealed a strong pure KIS effect in the Gemini evaluations (r = 0.88, p < .001); (3) the KIS × Step interaction diverged into three patterns (independent additive, Step-excessive, and prerequisite types) depending on the internal structure of the question; and (4) judgment consistency and depth of reasoning structure were confirmed to be independent dimensions, so high consistency does not necessarily indicate high-quality judgment. KIS functions as a "judgment process structuring device" rather than an "answer-generating device," and the results show that design choices must be adapted to the variable structure of the question.
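As a rough illustration of the design size and the kind of test named in the abstract, the sketch below enumerates a 3 × 4 × 5 × 30 design and computes a Pearson chi-squared statistic on a toy contingency table. The model names come from the abstract; the condition labels, question IDs, judgment categories, and all counts are hypothetical placeholders and are not taken from the preprint. It is not the author's analysis code.

```python
from itertools import product

# Model names are from the abstract. Condition labels and question IDs
# are hypothetical placeholders (the abstract does not name them).
MODELS = ["ChatGPT", "Claude", "Gemini"]
CONDITIONS = ["baseline", "KIS", "Step", "KIS+Step"]  # assumed labels
QUESTIONS = [f"Q{i}" for i in range(1, 6)]
REPETITIONS = 30


def design_cells():
    """Return every (model, condition, question, repetition) combination."""
    return list(product(MODELS, CONDITIONS, QUESTIONS, range(1, REPETITIONS + 1)))


def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


cells = design_cells()
print(len(cells))  # 3 * 4 * 5 * 30 = 1800

# Made-up counts, NOT the study's data: rows = without / with KIS,
# columns = three hypothetical judgment categories.
toy_table = [[60, 30, 10], [20, 30, 50]]
stat, dof = chi_squared(toy_table)
print(round(stat, 2), dof)
```

A larger statistic for a given number of degrees of freedom corresponds to a smaller p-value; the reported p < 10^-28 would come from the study's own tables, which are not reproduced here.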

Abstract (Japanese)

抄録

本研究は、Knowledge Innovation System(KIS)が3つの生成AIモデル(ChatGPT・Claude・Gemini)の判断様式に与える構造的影響を、4実験水準×5問×各30反復、計1,800判断を対象に定量・定性の両面から実験的に検証した。主要な知見として、(1)KIS導入は判断分布を統計的に有意に変化させ(χ²検定 p < 10⁻²⁸)、(2)評価者別独立分析ではGemini評価者においてKIS純粋効果が強く検出され(r = 0.88、p < .001)、(3)KIS×Step交互作用は問いの内部構造によって独立加算型・Step過剰型・前提条件型の3パターンに分化し、(4)判断の一貫性と推論構造の深度は独立した次元であり、高一貫性が必ずしも高質な判断を意味しないことが確認された。 KISは「正解を生成する装置」ではなく「判断プロセスを構造化する装置」として機能し、問いの変数構造に応じた設計選択が必要であることを示す。

Files

KIS_Preprint_2026_v21_EN.pdf (947.7 kB, md5:84cd911c064ebb33ef2ce452a9b897ba)

Additional details

Translated title (Japanese): KISがAI判断様式に与える構造的影響の検証: 3モデル×4条件×5つの質問×30回の繰り返し

References

[1] Amabile, T. M. (1983). The social psychology of creativity: A componential conceptualization. Journal of Personality and Social Psychology, 45(2), 357–376. https://doi.org/10.1037/0022-3514.45.2.357
[2] Chiang, C.-H., & Lee, H.-Y. (2023). Can large language models be an alternative to human evaluations? In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 15607–15631. ACL.
[3] Cicek, M., Ulu, S., Uslay, C., & Karniouchina, K. (2025). Unstable Intelligence: GenAI Struggles with Accuracy and Consistency. Rutgers Business Review, 10(2), 266–277.
[4] Deci, E. L., & Ryan, R. M. (1985). Intrinsic motivation and self-determination in human behavior. Springer. https://doi.org/10.1007/978-1-4899-2271-7
[5] Jiang, X., et al. (2024). A survey on LLM-as-a-judge. arXiv preprint arXiv:2411.15594.
[6] KIS Research Group. (2025). KIS: Knowledge Innovation System — Technical Documentation. Unpublished manuscript.
[7] Madaan, A., Tandon, N., Gupta, P., Hallinan, S., Gao, L., Wiegreffe, S., ... & Clark, P. (2023). Self-Refine: Iterative refinement with self-feedback. In Advances in Neural Information Processing Systems 36 (NeurIPS 2023). Neural Information Processing Systems Foundation.
[8] Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., & Zhou, D. (2022). Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171.
[9] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E., Le, Q., & Zhou, D. (2022). Chain-of-thought prompting elicits reasoning in large language models. In Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 24824–24837.
[10] Zheng, L., Chiang, W.-L., Sheng, Y., Zhuang, S., Wu, Z., Zhuang, Y., ... & Stoica, I. (2023). Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. In Advances in Neural Information Processing Systems 36 (NeurIPS 2023). Neural Information Processing Systems Foundation.
