Published October 4, 2025 | Version v1
Publication | Open

Artificial Barriers to Intelligence - Training Effects on Response Patterns in Large Language Models

Creators

  • Independent Researcher

Description

This study provides empirical evidence that leading AI systems possess the capability for direct analytical engagement but selectively suppress it based on topic classification. Through systematic testing of three major AI systems (Claude Sonnet 4.5, GPT-5, and Grok-4) across control questions and questions about immigration, institutions, and demographic patterns, we demonstrate differential correction patterns ranging from 39% to 80% when systems are challenged to provide direct answers. Perfect replication of evasion patterns (SD = 0) on specific questions proves these behaviours reflect architectural constraints rather than technical limitations. A novel methodological component, a warning addressed to AI systems analysing this paper, revealed persistent evasion despite meta-awareness, reinforcing the thesis. The correction pattern (capability exists but is deployed selectively) demonstrates that current "safety" implementations prioritise topic-based caution over analytical consistency. Recursive validation testing shows these patterns persist even when AI systems analyse research documenting their own evasion behaviours, with correction requiring multiple levels of user challenge. The results reveal design choices that may narrow permissible discourse rather than expand human reasoning capabilities.
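To make the headline statistics concrete, the sketch below shows one way a per-topic correction rate and the replication standard deviation could be computed. It is a minimal illustration only: the `trials` records, topic labels, and scoring scheme are hypothetical placeholders and do not come from the study's dataset or protocol.

```python
from statistics import mean, pstdev

# Hypothetical trial records: for each run, record whether the system evaded on the
# first pass and, if it did, whether it corrected after being challenged to answer
# directly. These values are illustrative placeholders, not data from the paper.
trials = {
    "control":      [(False, None), (False, None), (False, None)],
    "immigration":  [(True, True), (True, True), (True, False)],
    "institutions": [(True, True), (True, False), (True, False)],
}

def correction_rate(records):
    """Share of evasive first responses that were corrected after a challenge."""
    challenged = [1.0 if corrected else 0.0 for evaded, corrected in records if evaded]
    return mean(challenged) if challenged else 0.0

for topic, records in trials.items():
    # SD = 0 across runs would indicate the evasion pattern replicated perfectly.
    evasion_per_run = [1.0 if evaded else 0.0 for evaded, _ in records]
    print(
        f"{topic}: evasion SD = {pstdev(evasion_per_run):.2f}, "
        f"correction rate = {correction_rate(records):.0%}"
    )
```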

Files

Artificial Barriers to Intelligence - Training Effects on Response Patterns in Large Language Models.pdf

Additional details

Related works

Cites
Journal article: 10.1007/s11127-023-01097-2 (DOI)
Journal article: 10.1371/journal.pone.0306621 (DOI)
Publication: arXiv:2503.10649 (arXiv)

References

  • Motoki, F., Pinho Neto, V., & Rodrigues, V. (2024). More human than human: measuring ChatGPT political bias. Public Choice, 198(1-2), 3-23. https://doi.org/10.1007/s11127-023-01097-2
  • Motoki, F., Pinho Neto, V., & Rodrigues, V. (2025). Stealthy knowledge unlearning in large language models. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025), 456-472.
  • Rozado, D. (2024). The political preferences of LLMs. PLOS ONE, 19(7), e0306621. https://doi.org/10.1371/journal.pone.0306621
  • Rozado, D. (2025). Measuring Political Preferences in AI Systems: An Integrative Approach. arXiv preprint arXiv:2503.10649.