Attention is Not Grounding: Model Collapse as Informational Closure in Self-Referential Learning Systems
Description
This paper analyzes model collapse (Shumailov et al., 2024) as a form of information-theoretic closure rather than as a simple optimization failure. We argue that self-referential training regimes optimize for Information IN (internal statistical coherence) while systematically decoupling from Information ABOUT (external functional regularities) (Kolchinsky & Wolpert, 2018). Drawing on non-equilibrium thermodynamics (Prigogine, 1977), Ashby’s Law of Requisite Variety (Ashby, 1956), and Pearl’s causal epistemology (Pearl, 2009), we demonstrate that training on synthetic data violates the condition of epistemic independence, creating a feedback loop that screens off environmental variation. We formalize this pathology through the viability condition E(t) ≤ C(t), proving that a system’s corrective information input C(t) must continuously match or exceed its rate of internal entropic drift E(t) for the system to maintain semantic grounding.
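The feedback loop described above can be illustrated with a toy simulation. The following is a minimal sketch and not the paper's formalization: it assumes a one-dimensional Gaussian "model" that is refit each generation to a small sample, and it uses a hypothetical parameter `fresh_fraction` (the share of each training set drawn from the true source) as a crude stand-in for corrective input C. The function name `recursive_fit` and all numeric settings are illustrative assumptions.

```python
import random
import statistics


def recursive_fit(generations, n_samples, fresh_fraction, seed=0):
    """Repeatedly fit a Gaussian to data drawn mostly from the previous fit.

    fresh_fraction is a hypothetical stand-in for corrective input: the share
    of each generation's training set drawn from the true source N(0, 1).
    Returns the fitted standard deviation after the last generation.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0 matches the true source
    n_fresh = round(fresh_fraction * n_samples)
    for _ in range(generations):
        # Samples from the environment (true source), if any.
        fresh = [rng.gauss(0.0, 1.0) for _ in range(n_fresh)]
        # Samples generated by the current model (synthetic data).
        synthetic = [rng.gauss(mu, sigma) for _ in range(n_samples - n_fresh)]
        data = fresh + synthetic
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)  # maximum-likelihood estimate
    return sigma


closed = recursive_fit(generations=1000, n_samples=20, fresh_fraction=0.0)
grounded = recursive_fit(generations=1000, n_samples=20, fresh_fraction=0.5)
print(f"closed loop std: {closed:.3g}")
print(f"with fresh data: {grounded:.3g}")
```

In this toy setting, the fully self-referential run loses essentially all of its variance, while the run that keeps receiving samples from the true source stays at a spread comparable to that source. The sketch only shows the qualitative pattern; it does not measure E(t) or C(t) as the paper defines them.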
Files
Attention is Not Grounding_ Model Collapse as Informational Closure in Self-Referential Learning Systems (3.2).pdf
Additional details
Related works
- Is derived from
- Journal article: 10.1098/itfs.2018.0041 (DOI)
- Book: 10.1017/CBO9780511803161 (DOI)
- Is supplemented by
- Journal article: 10.1038/s41586-024-07566-y (DOI)
Dates
- Updated: 2025-12
References
- Shumailov et al. (2024): Shumailov, I., Shumaylov, Z., Zhao, Y., Gal, Y., Papernot, N., & Anderson, R. (2024). AI models can collapse when trained on recursively generated data. Nature, 631(8022), 755-759. https://doi.org/10.1038/s41586-024-07566-y
- Kolchinsky & Wolpert (2018): Kolchinsky, A., & Wolpert, D. H. (2018). Semantic information, autonomous agency and non-equilibrium statistical physics. Interface Focus, 8(6), 20180041. https://doi.org/10.1098/itfs.2018.0041
- Pearl (2009): Pearl, J. (2009). Causality: Models, Reasoning, and Inference (2nd ed.). Cambridge University Press. https://doi.org/10.1017/CBO9780511803161
- Aaronson, S. (2014). Why I am not an integrated information theorist (or, The unconscious expander). Shtetl-Optimized. https://scottaaronson.blog
- Achille, A., & Soatto, S. (2018). Emergence of invariance and disentanglement in deep representations. Journal of Machine Learning Research, 19(1), 1947-1980.
- Akyürek, E., et al. (2024). Self-consuming generative models go MAD. arXiv preprint. arXiv:2402.15059
- Alemohammad, S., et al. (2023). Self-consuming generative models. arXiv preprint. arXiv:2311.16822
- Amodei, D., et al. (2016). Concrete problems in AI safety. arXiv preprint. arXiv:1606.06565
- Ashby, W. R. (1956). An introduction to cybernetics. Chapman & Hall. ISBN: 9781614277651
- Åström, K. J., & Murray, R. M. (2008). Feedback systems: An introduction for scientists and engineers. Princeton University Press. DOI: 10.12691/ajme-13-1-2
- Aylett, M., & Turk, A. (2006). The smooth signal redundancy hypothesis. Language and Speech, 49(1), 91-116. DOI: 10.1177/00238309040470010201
- Bai, Y., et al. (2022). Constitutional AI: Harmlessness from AI feedback. arXiv preprint. arXiv:2212.08073
- Balduzzi, D., & Tononi, G. (2008). Integrated information in discrete dynamical systems: Motivation and theoretical framework. PLoS Computational Biology, 4(6), e1000091. DOI: 10.1371/journal.pcbi.1000091
- Bansal, K., et al. (2019). HOList: An environment for machine learning of higher-order theorem proving. International Conference on Machine Learning, 454-463. arXiv:1904.03241
- Bareinboim, E., & Pearl, J. (2016). Causal inference and the data-fusion problem. Proceedings of the National Academy of Sciences, 113(27), 7345-7352. DOI: 10.1073/pnas.1510507113
- Barsalou, L. W. (2008). Grounded cognition. Annual Review of Psychology, 59, 617-645. DOI: 10.1146/annurev.psych.59.103006.093639
- Baumeister, R. F., et al. (1998). Ego depletion: Is the active self a limited resource? Journal of Personality and Social Psychology, 74(5), 1252-1265. DOI: 10.1037/0022-3514.74.5.1252
- Beer, S. (1981). Brain of the firm (2nd ed.). John Wiley & Sons. ISBN: 0-471-94839-X
- Belkin, M., et al. (2019). Reconciling modern machine learning practice and the classical bias-variance trade-off. Proceedings of the National Academy of Sciences, 116(32), 15849-15854. DOI: 10.1073/pnas.1903070116
- Bender, E. M., & Koller, A. (2020). Climbing towards NLU: On meaning, form, and understanding in the age of data. ACL 2020. ACL: 2020.acl-main.463
- Bender, E. M., et al. (2021). On the dangers of stochastic parrots: Can language models be too big? FAccT '21. DOI: 10.1145/3442188.3445922
- Bengio, Y., et al. (2013). Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8), 1798-1828. DOI: 10.1109/TPAMI.2013.50
- Bengio, Y., et al. (2009). Curriculum learning. Proceedings of the 26th International Conference on Machine Learning, 41-48. DOI: 10.1145/1553374.1553380
- Bennett, C. H. (2003). Notes on Landauer's principle, reversible computation, and Maxwell's demon. Studies in History and Philosophy of Modern Physics, 34(3), 501-510. DOI: 10.1016/S1355-2198(03)00039-X
- Bertrand, Q., et al. (2023). Beyond L1: Faster and better sparse models with skglm. arXiv preprint. arXiv:2204.07826
- Bickhard, M. H. (2009). The interactivist model. Synthese, 166(3), 547-591. DOI: 10.1007/s11229-008-9375-x
- Bostrom, N. (2014). Superintelligence: Paths, dangers, strategies. Oxford University Press. ISBN: 978-0199678112
- Brillouin, L. (1962). Science and information theory. Academic Press. ISBN: 9780486497556
- Callen, H. B. (1985). Thermodynamics and an introduction to thermostatistics (2nd ed.). John Wiley & Sons. ISBN: 0-471-86256-8
- Carnap, R. (1950). Logical foundations of probability. University of Chicago Press. ISBN: 0-226-09343-3
- Carnot, S. (1824). Réflexions sur la puissance motrice du feu et sur les machines propres à développer cette puissance. Bachelier. Access: Numdam Archive
- Christiano, P. F., Leike, J., Brown, T., Martic, M., Legg, S., & Amodei, D. (2017). Deep reinforcement learning from human preferences. Advances in Neural Information Processing Systems, 30, 4299-4307. arXiv:1706.03741
- Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences, 36(3), 181-204. DOI: 10.1017/S0140525X12000477
- Clarke, E. M., Henzinger, T. A., Veith, H., & Bloem, R. (Eds.). (2018). Handbook of model checking. Springer. DOI: 10.1007/978-3-319-10575-8
- Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates. ISBN: 0-8058-0283-5
- Cohn, D. A., Ghahramani, Z., & Jordan, M. I. (1994). Active learning with statistical models. Advances in Neural Information Processing Systems, 7, 705-712. Access: NIPS Archive
- Conant, R. C., & Ashby, W. R. (1970). Every good regulator of a system must be a model of that system. International Journal of Systems Science, 1(2), 89-97. DOI: 10.1080/00207727008920220
- Cover, T. M., & Thomas, J. A. (2006). Elements of information theory (2nd ed.). John Wiley & Sons. ISBN: 0-471-24195-4
- Darwin, C. (1859). On the origin of species by means of natural selection. John Murray. OCLC: 352242
- Dawkins, R. (1976). The selfish gene. Oxford University Press. ISBN: 0-19-857519-X
- Dehaene, S. (2014). Consciousness and the brain: Deciphering how the brain codes our thoughts. Penguin Books. ISBN: 9780143126263
- Dretske, F. I. (1981). Knowledge and the flow of information. MIT Press. DOI: 10.1017/S0012217300023969
- Engle, R. W. (2002). Working memory capacity as executive attention. Current Directions in Psychological Science, 11(1), 19-23. DOI: 10.1111/1467-8721.00160
- Facco, E., D'Errico, M., Rodriguez, A., & Laio, A. (2017). Estimating the intrinsic dimension of datasets by a minimal neighborhood information. Scientific Reports, 7(1), 12140. DOI: 10.1038/s41598-017-11873-y
- Farquhar, S., Kossen, J., Kuhn, L., & Gal, Y. (2024). Detecting hallucinations in large language models using semantic entropy. Nature, 625, 524-531. DOI: 10.1038/s41586-024-07421-0
- Fish, F. E. (1998). Comparative kinematics and hydrodynamics of odontocete cetaceans. Journal of Experimental Biology, 201(20), 2867-2877. Access: Company of Biologists
- Fitts, P. M. (1954). The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology, 47(6), 381-391. Access: UIowa Archive
- Freund, Y., Seung, H. S., Shamir, E., & Tishby, N. (1997). Selective sampling using the query by committee algorithm. Machine Learning, 28(2-3), 133-168. DOI: 10.1023/a:1007330508534
- Friston, K. (2010). The free-energy principle: A unified brain theory? Nature Reviews Neuroscience, 11(2), 127-138. DOI: 10.1038/nrn2787
- Friston, K. (2013). Life as we know it. Journal of the Royal Society Interface, 10(86), 20130475. DOI: 10.1098/rsif.2013.0475
- Friston, K., Kilner, J., & Harrison, L. (2006). A free energy principle for the brain. Journal of Physiology-Paris, 100(1-3), 70-87. DOI: 10.1016/j.jphysparis.2006.10.001
- Friston, K., Thornton, C. & Clark, A. (2012). Free-energy minimization and the dark-room problem. Frontiers in Psychology, 3, 130. DOI: 10.3389/fpsyg.2012.00130
- Futuyma, D. J. (2013). Evolution (3rd ed.). Sinauer Associates. ISBN: 978-1-60535-115-5
- Galton, F. (1886). Regression towards mediocrity in hereditary stature. Journal of the Anthropological Institute of Great Britain and Ireland, 15, 246-263. Access: BMJ Archive
- Garcez, A. D., Gori, M., Lamb, L. C., Serafini, L., Spranger, M., & Tran, S. N. (2019). Neural-symbolic computing: An effective methodology for principled integration of machine learning and reasoning. arXiv preprint. arXiv:1905.06088
- Gatlin, L. L. (1972). Information theory and the living system. Columbia University Press. ISBN: 0231036345
- Gerstgrasser, M., et al. (2024). Is model collapse inevitable? Breaking the curse of recursion by accumulating real and synthetic data. arXiv preprint. arXiv:2404.01413
- Gödel, K. (1931). Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I. Monatshefte für Mathematik und Physik, 38, 173-198. Access: Universität Wien
- Goldman, A. I. (1967). A causal theory of knowing. The Journal of Philosophy, 64(12), 357-372. DOI: 10.2307/2024268
- Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press. ISBN: 0262035618
- Goodfellow, I., et al. (2014). Generative adversarial nets. Advances in Neural Information Processing Systems, 27, 2672-2680. Access: NCBI Archive
- Greco, J. (2010). Achieving knowledge: A virtue-theoretic account of epistemic normativity. Cambridge University Press. DOI: 10.1017/CBO9780511844645
- Hagger, M. S., et al. (2010). Ego depletion and the strength model of self-control: A meta-analysis. Psychological Bulletin, 136(4), 495-525. DOI: 10.1037/A0019486
- Hancock, P. A., & Warm, J. S. (1989). A dynamic model of stress and sustained attention. Human Factors, 31(5), 519-537. DOI: 10.1177/001872088903100503
- Harnad, S. (1990). The symbol grounding problem. Physica D: Nonlinear Phenomena, 42(1-3), 335-346. DOI: 10.1016/0167-2789(90)90087-6
- Hendrycks, D., et al. (2021). Unsolved problems in ML safety. arXiv preprint. arXiv:2109.13916
- Hendrycks, D., & Mazeika, M. (2022). X-risk analysis for AI research. arXiv preprint. arXiv:2206.05862
- Heylighen, F. (1992). Principles of systems and cybernetics: An evolutionary perspective. Cybernetics and Systems, 92, 3-10. Access: Journal of Posthumanism
- Heylighen, F., & Joslyn, C. (2001). Cybernetics and second-order cybernetics. Encyclopedia of Physical Science and Technology, 4, 155-170. DOI: 10.1016/B0-12-227410-5/00161-7
- Hick, W. E. (1952). On the rate of gain of information. Quarterly Journal of Experimental Psychology, 4(1), 11-26. DOI: 10.1080/17470215208416600
- Hockey, G. R. J. (1997). Compensatory control in the regulation of human performance under stress and high workload: A cognitive-energetical framework. Biological Psychology, 45(1-3), 73-93. DOI: 10.1016/S0301-0511(96)05223-4
- Jasanoff, S. (2004). States of knowledge: The co-production of science and social order. Routledge. DOI: 10.4324/9780203413845
- Jaynes, E. T. (2003). Probability theory: The logic of science. Cambridge University Press. ISBN: 9780521592710
- Johnson-Laird, P. N. (1983). Mental models: Towards a cognitive science of language, inference, and consciousness. Harvard University Press. ISBN: 9780674568822
- Kahneman, D. (1973). Attention and effort. Prentice-Hall. ISBN: 9780130505187
- Kahneman, D., & Beatty, J. (1966). Pupil diameter and load on memory. Science, 154(3756), 1583-1585. DOI: 10.1126/science.154.3756.1583
- Kahneman, D., & Tversky, A. (1982). The simulation heuristic. In D. Kahneman, P. Slovic, & A. Tversky (Eds.), Judgment under uncertainty: Heuristics and biases (pp. 201-208). Cambridge University Press. DOI: 10.1017/CBO9780511809477.015
- Koh, P. W., et al. (2021). WILDS: A benchmark of in-the-wild distribution shifts. International Conference on Machine Learning, 5637-5664. arXiv:2012.07421
- Koller, D., & Friedman, N. (2009). Probabilistic graphical models: Principles and techniques. MIT Press. ISBN: 9780262013192
- Kolchinsky, A., & Wolpert, D. H. (2018). Semantic information, autonomous agency and non-equilibrium statistical physics. Interface Focus, 8(6), 20180041. DOI: 10.1098/itfs.2018.0041
- Kuhn, T. S. (1962). The structure of scientific revolutions. University of Chicago Press. ISBN: 9780226458113
- Kullback, S., & Leibler, R. A. (1951). On information and sufficiency. The Annals of Mathematical Statistics, 22(1), 79-86. DOI: 10.1214/aoms/1177729694
- Lakatos, I. (1970). Falsification and the methodology of scientific research programmes. In I. Lakatos & A. Musgrave (Eds.), Criticism and the growth of knowledge (pp. 91-196). Cambridge University Press. DOI: 10.1017/CBO9781139171434.009
- Lample, G., & Charton, F. (2019). Deep learning for symbolic mathematics. arXiv preprint. arXiv:1912.01412
- Landauer, R. (1961). Irreversibility and heat generation in the computing process. IBM Journal of Research and Development, 5(3), 183-191. DOI: 10.1147/rd.53.0183
- Laughlin, S. B., de Ruyter van Steveninck, R. R., & Anderson, J. C. (1998). The metabolic cost of neural information. Nature Neuroscience, 1(1), 36-41. DOI: 10.1038/241
- Laudan, L., & Leplin, J. (1991). Empirical equivalence and underdetermination. The Journal of Philosophy, 88(9), 449-472. DOI: 10.2307/2027081
- Lee, H., et al. (2023). RLAIF: Scaling reinforcement learning from human feedback with AI feedback. arXiv preprint. arXiv:2309.00267
- Levelt, W. J. (1989). Speaking: From intention to articulation. MIT Press. ISBN: 9780262620895
- Levina, E., & Bickel, P. J. (2005). Maximum likelihood estimation of intrinsic dimension. Advances in Neural Information Processing Systems, 17, 777-784. Access: NIPS Archive
- Marcus, G. (2020). The next decade in AI: Four steps towards robust artificial intelligence. arXiv preprint. arXiv:2002.06177
- Marcus, G., & Davis, E. (2020). GPT-3, Bloviator: OpenAI's language generator has no idea what it's talking about. MIT Technology Review.
- Martínez, P. L., et al. (2023). The curse of recursion: Training on generated data makes models forget. arXiv preprint. arXiv:2305.17493
- Mayr, E. (1982). The growth of biological thought: Diversity, evolution, and inheritance. Harvard University Press. ISBN: 9780674364462
- McEwen, B. S. (1998). Protective and damaging effects of stress mediators. New England Journal of Medicine, 338(3), 171-179. DOI: 10.1056/NEJM199801153380307
- McEwen, B. S., & Wingfield, J. C. (2003). The concept of allostasis in biology and biomedicine. Hormones and Behavior, 43(1), 2-15. DOI: 10.1016/S0018-506X(02)00024-7
- Meehl, P. E. (1990). Why summaries of research on psychological theories are often uninterpretable. Psychological Reports, 66(1), 195-244. DOI: 10.2466/pr0.1990.66.1.195
- Miller, E. K., & Cohen, J. D. (2001). An integrative theory of prefrontal cortex function. Annual Review of Neuroscience, 24(1), 167-202. DOI: 10.1146/annurev.neuro.24.1.167
- Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63(2), 81-97. DOI: 10.1037/h0043158
- Muraven, M., & Baumeister, R. F. (2000). Self-regulation and depletion of limited resources. Psychological Bulletin, 126(2), 247-259. DOI: 10.1037/0033-2909.126.2.247
- Nakkiran, P., et al. (2021). Deep double descent: Where bigger models and more data hurt. Journal of Statistical Mechanics: Theory and Experiment, 2021(12), 124003. DOI: 10.1088/1742-5468/ac3a74
- Newell, A., & Simon, H. A. (1972). Human problem solving. Prentice-Hall. ISBN: 9780134454030
- Nørretranders, T. (1998). The user illusion: Cutting consciousness down to size. Viking Press. ISBN: 9780670875795
- Nye, M., et al. (2021). Show your work: Scratchpads for intermediate computation with language models. arXiv preprint. arXiv:2112.00114
- Ogata, K. (2010). Modern control engineering (5th ed.). Prentice Hall. ISBN: 9780136156734
- Oizumi, M., et al. (2014). From the phenomenology to the mechanisms of consciousness: Integrated information theory 3.0. PLoS Computational Biology, 10(5), e1003588. DOI: 10.1371/journal.pcbi.1003588
- Olah, C., et al. (2020). Zoom in: An introduction to circuits. Distill, 5(3), e00024.001. DOI: 10.23915/distill.00024.001
- O'Neil, C. (2016). Weapons of math destruction: How big data increases inequality and threatens democracy. Crown. ISBN: 9780553418811
- Ouyang, L., et al. (2022). Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35, 27730-27744. arXiv:2203.02155
- Paas, F., et al. (2003). Cognitive load theory and instructional design: Recent developments. Educational Psychologist, 38(1), 1-4. DOI: 10.1207/S15326985EP3801_1
- Pearl, J. (2009). Causality: Models, reasoning, and inference (2nd ed.). Cambridge University Press. DOI: 10.1017/CBO9780511803161
- Perez, E., et al. (2022). Discovering language model behaviors with model-written evaluations. arXiv preprint. arXiv:2212.09251
- Pezzulo, G., et al. (2015). Active inference, homeostatic regulation and adaptive behavioural control. Progress in Neurobiology, 134, 17-35. DOI: 10.1016/j.pneurobio.2015.09.001
- Pierce, J. R. (1980). An introduction to information theory: Symbols, signals and noise (2nd ed.). Dover Publications. ISBN: 9780486240619
- Popper, K. R. (1959). The logic of scientific discovery. Hutchinson. ISBN: 9780415278447
- Posner, M. I., & Petersen, S. E. (1990). The attention system of the human brain. Annual Review of Neuroscience, 13, 25-42. DOI: 10.1146/annurev.ne.13.030190.000325
- Prigogine, I. (1977). Time, structure, and fluctuations. Science, 201(4358), 777-785. DOI: 10.1126/science.201.4358.777
- Prigogine, I., & Stengers, I. (1984). Order out of chaos: Man's new dialogue with nature. Bantam Books. ISBN: 9780553343632
- Putnam, H. (1981). Reason, truth and history. Cambridge University Press. DOI: 10.1017/CBO9780511625398
- Quastler, H. (Ed.). (1956). Information theory in psychology: Problems and methods. Free Press. ISBN: 9780029255605
- Quine, W. V. (1951). Two dogmas of empiricism. The Philosophical Review, 60(1), 20-43. DOI: 10.2307/2181906
- Quiñonero-Candela, J., Sugiyama, M., Schwaighofer, A., & Lawrence, N. D. (2009). Dataset shift in machine learning. MIT Press. ISBN: 9780262170055
- Russell, S. (2019). Human compatible: Artificial intelligence and the problem of control. Viking Press. ISBN: 9780525558613
- Russell, S. J., & Norvig, P. (2020). Artificial intelligence: A modern approach (4th ed.). Pearson. ISBN: 9780134610993
- Schrödinger, E. (1944). What is life? The physical aspect of the living cell. Cambridge University Press. ISBN: 9780521427081
- Searle, J. R. (1980). Minds, brains, and programs. Behavioral and Brain Sciences, 3(3), 417-424. DOI: 10.1017/S0140525X00005756
- Settles, B. (2012). Active learning. Morgan & Claypool Publishers. DOI: 10.2200/S00429ED1V01Y201207AIM018
- Shalev-Shwartz, S., & Ben-David, S. (2014). Understanding machine learning: From theory to algorithms. Cambridge University Press. ISBN: 9781107057135
- Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27(3), 379-423. DOI: 10.1002/j.1538-7305.1948.tb01338.x
- Shumailov, I., et al. (2024). AI models can collapse when trained on recursively generated data. Nature, 631(8022), 755-759. DOI: 10.1038/s41586-024-07566-y
- Silver, D., et al. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484-489. DOI: 10.1038/nature16961
- Silver, D., et al. (2017). Mastering the game of Go without human knowledge. Nature, 550(7676), 354-359. DOI: 10.1038/nature24270
- Spirtes, P., Glymour, C. N., & Scheines, R. (2000). Causation, prediction, and search (2nd ed.). MIT Press. ISBN: 9780262194402
- Sterling, P. (2012). Allostasis: A model of predictive regulation. Physiology & Behavior, 106(1), 5-15. DOI: 10.1016/j.physbeh.2011.06.004
- Stiennon, N., et al. (2020). Learning to summarize with human feedback. Advances in Neural Information Processing Systems, 33, 3008-3021. arXiv:2009.01325
- Stigler, S. M. (1997). Regression towards the mean, historically considered. Statistical Methods in Medical Research, 6(2), 103-114. DOI: 10.1177/096228029700600202
- Sutskever, I. (2019). An observation on generalization. Bounded Regret.
- Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction (2nd ed.). MIT Press. ISBN: 9780262039246
- Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive Science, 12(2), 257-285. DOI: 10.1207/s15516709cog1202_4
- Szegedy, C. (2020). A promising path towards autoformalization and general artificial intelligence. Intelligent Computer Mathematics, 3-20. DOI: 10.1007/978-3-030-53518-6_1
- Thanh-Tung, H., & Tran, T. (2020). Catastrophic forgetting and mode collapse in GANs. 2020 International Joint Conference on Neural Networks (IJCNN), 1-10. DOI: 10.1109/IJCNN48605.2020.9207181
- Thayer, J. F., et al. (2012). A meta-analysis of heart rate variability and neuroimaging studies. Neuroscience & Biobehavioral Reviews, 36(2), 747-756. DOI: 10.1016/j.neubiorev.2011.11.009
- Thoppilan, R., et al. (2022). LaMDA: Language models for dialog applications. arXiv preprint. arXiv:2201.08239
- Tishby, N., & Zaslavsky, N. (2015). Deep learning and the information bottleneck principle. 2015 IEEE Information Theory Workshop (ITW), 1-5. arXiv:1503.02406
- Tononi, G., et al. (2016). Integrated information theory: From consciousness to its physical substrate. Nature Reviews Neuroscience, 17(7), 450-461. DOI: 10.1038/nrn.2016.44
- Turing, A. M. (1936). On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 2(1), 230-265. DOI: 10.1112/plms/s2-42.1.230
- Vapnik, V. N. (1998). Statistical learning theory. Wiley. ISBN: 978-0-471-03003-4
- Vernon, D., et al. (2007). A survey of artificial cognitive systems. IEEE Transactions on Evolutionary Computation, 11(2), 151-180. DOI: 10.1109/TEVC.2006.890274
- Vogel, S. (1994). Life in moving fluids: The physical biology of flow (2nd ed.). Princeton University Press. ISBN: 9780691026169
- von Foerster, H. (1981). Observing systems. Intersystems Publications. ISBN: 9780914105190
- Walton, D. N. (1991). Begging the question: Circular reasoning as a tactic of argumentation. Greenwood Press. ISBN: 9780313275999
- Weidinger, L., et al. (2021). Ethical and social risks of harm from language models. arXiv preprint. arXiv:2112.04359
- Whewell, W. (1840). The philosophy of the inductive sciences, founded upon their history. John W. Parker. Access: Internet Archive
- Wiener, N. (1948). Cybernetics: Or control and communication in the animal and the machine. MIT Press. ISBN: 9780262730099
- Williams, G. C. (1966). Adaptation and natural selection: A critique of some current evolutionary thought. Princeton University Press. ISBN: 9780691026152
- Wilson, E. O. (1998). Consilience: The unity of knowledge. Knopf. ISBN: 9780679450771
- Winner, L. (1980). Do artifacts have politics? Daedalus, 109(1), 121-136. Access: JSTOR
- Wolpert, D. H. (2018). The stochastic thermodynamics of computation. Journal of Physics A: Mathematical and Theoretical, 52(19), 193001. DOI: 10.1088/1751-8121/ab073d
- Yudkowsky, E. (2001). Creating friendly AI. The Singularity Institute.
- Zhou, C., et al. (2024). LIMA: Less is more for alignment. arXiv preprint. arXiv:2305.11206
- Zuboff, S. (2019). The age of surveillance capitalism: The fight for a human future at the new frontier of power. PublicAffairs. ISBN: 9781610395694