Attention is Not Grounding: Model Collapse as Informational Closure in Self-Referential Learning Systems

Haslam, Benjamin

doi:10.5281/zenodo.18039989

Published December 23, 2025 | Version 3.1

Preprint Open

Haslam, Benjamin (Researcher)

This paper analyzes model collapse (Shumailov et al., 2024) through the lens of information-theoretic closure rather than simple optimization failure. We argue that self-referential training regimes optimize for Information IN (internal statistical coherence) while systematically decoupling from Information ABOUT (external functional regularities) (Kolchinsky & Wolpert, 2018). Drawing on non-equilibrium thermodynamics (Prigogine, 1977), Ashby’s Law of Requisite Variety (Ashby, 1956), and Pearl’s causal epistemology (Pearl, 2009), we demonstrate that training on synthetic data violates the condition of epistemic independence, creating a feedback loop that screens off environmental variation. We formalize this pathology through the viability condition E(t) ≤ C(t), proving that a system’s corrective information input (C) must continuously match or exceed its rate of internal entropic drift (E) to maintain semantic grounding.
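The feedback loop described above can be illustrated with a toy simulation. The sketch below is not taken from the paper: it repeatedly refits a one-dimensional Gaussian to samples drawn partly from a fixed real source and partly from the previous fit. The function name `recursive_fit`, the parameter `real_fraction`, and the reading of fresh real samples as a stand-in for C(t) and finite-sample estimation error as a stand-in for E(t) are all illustrative assumptions, not the author's formalization.

```python
import random
import statistics


def recursive_fit(generations, n_samples, real_fraction, seed=0):
    """Refit a Gaussian each generation on a mix of real and self-generated samples.

    Hypothetical toy model: the real source is N(0, 1). Returns the fitted
    standard deviation after the last generation.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0 matches the real source
    n_real = round(real_fraction * n_samples)
    for _ in range(generations):
        # Fresh samples from the environment (the assumed corrective input).
        real = [rng.gauss(0.0, 1.0) for _ in range(n_real)]
        # Samples generated by the previous model (the self-referential part).
        synthetic = [rng.gauss(mu, sigma) for _ in range(n_samples - n_real)]
        data = real + synthetic
        # Maximum-likelihood refit on the mixed training set.
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
    return sigma


closed_loop = recursive_fit(generations=2000, n_samples=40, real_fraction=0.0)
grounded = recursive_fit(generations=2000, n_samples=40, real_fraction=0.5)
print(f"closed loop, no real data: fitted std = {closed_loop:.3g}")
print(f"half real data each step:  fitted std = {grounded:.3g}")
```

With no real data the fitted spread shrinks toward zero over many generations, while a steady share of real samples keeps it near the true value of 1. The sample size and mixing fraction are arbitrary choices made only to keep the contrast visible.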

Files

Name	Size
Attention is Not Grounding_ Model Collapse as Informational Closure in Self-Referential Learning Systems (3.2).pdf md5:55bb190c7effa7d904a644278f9ac9fa	930.9 kB

Additional details

Is derived from: Journal article: 10.1098/rsfs.2018.0041 (DOI); Book: 10.1017/CBO9780511803161 (DOI)
Is supplemented by: Journal article: 10.1038/s41586-024-07566-y (DOI)

Updated: 2025-12

Shumailov et al. (2024): Shumailov, I., Shumaylov, Z., Zhao, Y., Gal, Y., Papernot, N., & Anderson, R. (2024). AI models can collapse when trained on recursively generated data. Nature, 631(8022), 755-759. https://doi.org/10.1038/s41586-024-07566-y
Kolchinsky & Wolpert (2018): Kolchinsky, A., & Wolpert, D. H. (2018). Semantic information, autonomous agency and non-equilibrium statistical physics. Interface Focus, 8(6), 20180041. https://doi.org/10.1098/rsfs.2018.0041
Pearl (2009): Pearl, J. (2009). Causality: Models, Reasoning, and Inference (2nd ed.). Cambridge University Press. https://doi.org/10.1017/CBO9780511803161
Aaronson, S. (2014). Why I am not an integrated information theorist (or, The unconscious expander). Shtetl-Optimized. https://scottaaronson.blog
Achille, A., & Soatto, S. (2018). Emergence of invariance and disentanglement in deep representations. Journal of Machine Learning Research, 19(1), 1947-1980. arXiv:1706.01350
Akyürek, E., et al. (2024). Self-consuming generative models go MAD. arXiv preprint. arXiv:2402.15059
Alemohammad, S., et al. (2023). Self-consuming generative models. arXiv preprint. arXiv:2311.16822
Amodei, D., et al. (2016). Concrete problems in AI safety. arXiv preprint. arXiv:1606.06565
Ashby, W. R. (1956). An introduction to cybernetics. Chapman & Hall. ISBN: 9781614277651
Åström, K. J., & Murray, R. M. (2008). Feedback systems: An introduction for scientists and engineers. Princeton University Press. ISBN: 9780691135762
Aylett, M., & Turk, A. (2006). The smooth signal redundancy hypothesis. Language and Speech, 49(1), 91-116. DOI: 10.1177/00238309040470010201
Bai, Y., et al. (2022). Constitutional AI: Harmlessness from AI feedback. arXiv preprint. arXiv:2212.08073
Balduzzi, D., & Tononi, G. (2008). Integrated information in discrete dynamical systems: Motivation and theoretical framework. PLoS Computational Biology, 4(6), e1000091. DOI: 10.1371/journal.pcbi.1000091
Bansal, K., et al. (2019). HOList: An environment for machine learning of higher-order theorem proving. International Conference on Machine Learning, 454-463. arXiv:1904.03241
Bareinboim, E., & Pearl, J. (2016). Causal inference and the data-fusion problem. Proceedings of the National Academy of Sciences, 113(27), 7345-7352. DOI: 10.1073/pnas.1510507113
Barsalou, L. W. (2008). Grounded cognition. Annual Review of Psychology, 59, 617-645. DOI: 10.1146/annurev.psych.59.103006.093639
Baumeister, R. F., et al. (1998). Ego depletion: Is the active self a limited resource? Journal of Personality and Social Psychology, 74(5), 1252-1265. DOI: 10.1037/0022-3514.74.5.1252
Beer, S. (1981). Brain of the firm (2nd ed.). John Wiley & Sons. ISBN: 0-471-94839-X
Belkin, M., et al. (2019). Reconciling modern machine learning practice and the classical bias-variance trade-off. Proceedings of the National Academy of Sciences, 116(32), 15849-15854. DOI: 10.1073/pnas.1903070116
Bender, E. M., & Koller, A. (2020). Climbing towards NLU: On meaning, form, and understanding in the age of data. ACL 2020. ACL: 2020.acl-main.463
Bender, E. M., et al. (2021). On the dangers of stochastic parrots: Can language models be too big? FAccT '21. DOI: 10.1145/3442188.3445922
Bengio, Y., et al. (2013). Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8), 1798-1828. DOI: 10.1109/TPAMI.2013.50
Bengio, Y., et al. (2009). Curriculum learning. Proceedings of the 26th International Conference on Machine Learning, 41-48. DOI: 10.1145/1553374.1553380
Bennett, C. H. (2003). Notes on Landauer's principle, reversible computation, and Maxwell's demon. Studies in History and Philosophy of Modern Physics, 34(3), 501-510. DOI: 10.1016/S1355-2198(03)00039-X
Bertrand, Q., et al. (2023). Beyond L1: Faster and better sparse models with skglm. arXiv preprint. arXiv:2204.07826
Bickhard, M. H. (2009). The interactivist model. Synthese, 166(3), 547-591. DOI: 10.1007/s11229-008-9375-x
Bostrom, N. (2014). Superintelligence: Paths, dangers, strategies. Oxford University Press. ISBN: 978-0199678112
Brillouin, L. (1962). Science and information theory. Academic Press. ISBN: 9780486497556
Callen, H. B. (1985). Thermodynamics and an introduction to thermostatistics (2nd ed.). John Wiley & Sons. ISBN: 0-471-86256-8
Carnap, R. (1950). Logical foundations of probability. University of Chicago Press. ISBN: 0-226-09343-3
Carnot, S. (1824). Réflexions sur la puissance motrice du feu et sur les machines propres à développer cette puissance. Bachelier. Access: Numdam Archive
Christiano, P. F., Leike, J., Brown, T., Martic, M., Legg, S., & Amodei, D. (2017). Deep reinforcement learning from human preferences. Advances in Neural Information Processing Systems, 30, 4299-4307. arXiv:1706.03741
Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences, 36(3), 181-204. DOI: 10.1017/S0140525X12000477
Clarke, E. M., Henzinger, T. A., Veith, H., & Bloem, R. (Eds.). (2018). Handbook of model checking. Springer. DOI: 10.1007/978-3-319-10575-8
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates. ISBN: 0-8058-0283-5
Cohn, D. A., Ghahramani, Z., & Jordan, M. I. (1994). Active learning with statistical models. Advances in Neural Information Processing Systems, 7, 705-712. Access: NIPS Archive
Conant, R. C., & Ashby, W. R. (1970). Every good regulator of a system must be a model of that system. International Journal of Systems Science, 1(2), 89-97. DOI: 10.1080/00207727008920220
Cover, T. M., & Thomas, J. A. (2006). Elements of information theory (2nd ed.). John Wiley & Sons. ISBN: 0-471-24195-4
Darwin, C. (1859). On the origin of species by means of natural selection. John Murray. OCLC: 352242
Dawkins, R. (1976). The selfish gene. Oxford University Press. ISBN: 0-19-857519-X
Dehaene, S. (2014). Consciousness and the brain: Deciphering how the brain codes our thoughts. Penguin Books. ISBN: 9780143126263
Dretske, F. I. (1981). Knowledge and the flow of information. MIT Press. DOI: 10.1017/S0012217300023969
Engle, R. W. (2002). Working memory capacity as executive attention. Current Directions in Psychological Science, 11(1), 19-23. DOI: 10.1111/1467-8721.00160
Facco, E., D'Errico, M., Rodriguez, A., & Laio, A. (2017). Estimating the intrinsic dimension of datasets by a minimal neighborhood information. Scientific Reports, 7(1), 12140. DOI: 10.1038/s41598-017-11873-y
Farquhar, S., Kossen, J., Kuhn, L., & Gal, Y. (2024). Detecting hallucinations in large language models using semantic entropy. Nature, 630, 625-630. DOI: 10.1038/s41586-024-07421-0
Fish, F. E. (1998). Comparative kinematics and hydrodynamics of odontocete cetaceans. Journal of Experimental Biology, 201(20), 2867-2877. Access: Company of Biologists
Fitts, P. M. (1954). The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology, 47(6), 381-391. Access: UIowa Archive
Freund, Y., Seung, H. S., Shamir, E., & Tishby, N. (1997). Selective sampling using the query by committee algorithm. Machine Learning, 28(2-3), 133-168. DOI: 10.1023/a:1007330508534
Friston, K. (2010). The free-energy principle: A unified brain theory? Nature Reviews Neuroscience, 11(2), 127-138. DOI: 10.1038/nrn2787
Friston, K. (2013). Life as we know it. Journal of the Royal Society Interface, 10(86), 20130475. DOI: 10.1098/rsif.2013.0475
Friston, K., Kilner, J., & Harrison, L. (2006). A free energy principle for the brain. Journal of Physiology-Paris, 100(1-3), 70-87. DOI: 10.1016/j.jphysparis.2006.10.001
Friston, K., Thornton, C. & Clark, A. (2012). Free-energy minimization and the dark-room problem. Frontiers in Psychology, 3, 130. DOI: 10.3389/fpsyg.2012.00130
Futuyma, D. J. (2013). Evolution (3rd ed.). Sinauer Associates. ISBN: 978-1-60535-115-5
Galton, F. (1886). Regression towards mediocrity in hereditary stature. Journal of the Anthropological Institute of Great Britain and Ireland, 15, 246-263. Access: BMJ Archive
Garcez, A. D., Gori, M., Lamb, L. C., Serafini, L., Spranger, M., & Tran, S. N. (2019). Neural-symbolic computing: An effective methodology for principled integration of machine learning and reasoning. arXiv preprint. arXiv:1905.06088
Gatlin, L. L. (1972). Information theory and the living system. Columbia University Press. ISBN: 0231036345
Gerstgrasser, M., et al. (2024). Is model collapse inevitable? Breaking the curse of recursion by accumulating real and synthetic data. arXiv preprint. arXiv:2404.01413
Gödel, K. (1931). Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I. Monatshefte für Mathematik und Physik, 38, 173-198. Access: Universität Wien
Goldman, A. I. (1967). A causal theory of knowing. The Journal of Philosophy, 64(12), 357-372. DOI: 10.2307/2024268
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press. ISBN: 0262035618
Goodfellow, I., et al. (2014). Generative adversarial nets. Advances in Neural Information Processing Systems, 27, 2672-2680. Access: NCBI Archive
Greco, J. (2010). Achieving knowledge: A virtue-theoretic account of epistemic normativity. Cambridge University Press. DOI: 10.1017/CBO9780511844645
Hagger, M. S., et al. (2010). Ego depletion and the strength model of self-control: A meta-analysis. Psychological Bulletin, 136(4), 495-525. DOI: 10.1037/A0019486
Hancock, P. A., & Warm, J. S. (1989). A dynamic model of stress and sustained attention. Human Factors, 31(5), 519-537. DOI: 10.1177/001872088903100503
Harnad, S. (1990). The symbol grounding problem. Physica D: Nonlinear Phenomena, 42(1-3), 335-346. DOI: 10.1016/0167-2789(90)90087-6
Hendrycks, D., et al. (2021). Unsolved problems in ML safety. arXiv preprint. arXiv:2109.13916
Hendrycks, D., & Mazeika, M. (2022). X-risk analysis for AI research. arXiv preprint. arXiv:2206.05862
Heylighen, F. (1992). Principles of systems and cybernetics: An evolutionary perspective. Cybernetics and Systems, 92, 3-10. Access: Journal of Posthumanism
Heylighen, F., & Joslyn, C. (2001). Cybernetics and second-order cybernetics. Encyclopedia of Physical Science and Technology, 4, 155-170. DOI: 10.1016/B0-12-227410-5/00161-7
Hick, W. E. (1952). On the rate of gain of information. Quarterly Journal of Experimental Psychology, 4(1), 11-26. DOI: 10.1080/17470215208416600
Hockey, G. R. J. (1997). Compensatory control in the regulation of human performance under stress and high workload: A cognitive-energetical framework. Biological Psychology, 45(1-3), 73-93. DOI: 10.1016/S0301-0511(96)05223-4
Jasanoff, S. (2004). States of knowledge: The co-production of science and social order. Routledge. DOI: 10.4324/9780203413845
Jaynes, E. T. (2003). Probability theory: The logic of science. Cambridge University Press. ISBN: 9780521592710
Johnson-Laird, P. N. (1983). Mental models: Towards a cognitive science of language, inference, and consciousness. Harvard University Press. ISBN: 9780674568822
Kahneman, D. (1973). Attention and effort. Prentice-Hall. ISBN: 9780130505187
Kahneman, D., & Beatty, J. (1966). Pupil diameter and load on memory. Science, 154(3756), 1583-1585. DOI: 10.1126/science.154.3756.1583
Kahneman, D., & Tversky, A. (1982). The simulation heuristic. In D. Kahneman, P. Slovic, & A. Tversky (Eds.), Judgment under uncertainty: Heuristics and biases (pp. 201-208). Cambridge University Press. DOI: 10.1017/CBO9780511809477.015
Koh, P. W., et al. (2021). WILDS: A benchmark of in-the-wild distribution shifts. International Conference on Machine Learning, 5637-5664. arXiv:2012.07421
Koller, D., & Friedman, N. (2009). Probabilistic graphical models: Principles and techniques. MIT Press. ISBN: 9780262013192
Kolchinsky, A., & Wolpert, D. H. (2018). Semantic information, autonomous agency and non-equilibrium statistical physics. Interface Focus, 8(6), 20180041. DOI: 10.1098/rsfs.2018.0041
Kuhn, T. S. (1962). The structure of scientific revolutions. University of Chicago Press. ISBN: 9780226458113
Kullback, S., & Leibler, R. A. (1951). On information and sufficiency. The Annals of Mathematical Statistics, 22(1), 79-86. DOI: 10.1214/aoms/1177729694
Lakatos, I. (1970). Falsification and the methodology of scientific research programmes. In I. Lakatos & A. Musgrave (Eds.), Criticism and the growth of knowledge (pp. 91-196). Cambridge University Press. DOI: 10.1017/CBO9781139171434.009
Lample, G., & Charton, F. (2019). Deep learning for symbolic mathematics. arXiv preprint. arXiv:1912.01412
Landauer, R. (1961). Irreversibility and heat generation in the computing process. IBM Journal of Research and Development, 5(3), 183-191. DOI: 10.1147/rd.53.0183
Laughlin, S. B., de Ruyter van Steveninck, R. R., & Anderson, J. C. (1998). The metabolic cost of neural information. Nature Neuroscience, 1(1), 36-41. DOI: 10.1038/241
Laudan, L., & Leplin, J. (1991). Empirical equivalence and underdetermination. The Journal of Philosophy, 88(9), 449-472. DOI: 10.2307/2027081
Lee, H., et al. (2023). RLAIF: Scaling reinforcement learning from human feedback with AI feedback. arXiv preprint. arXiv:2309.00267
Levelt, W. J. (1989). Speaking: From intention to articulation. MIT Press. ISBN: 9780262620895
Levina, E., & Bickel, P. J. (2005). Maximum likelihood estimation of intrinsic dimension. Advances in Neural Information Processing Systems, 17, 777-784. Access: NIPS Archive
Marcus, G. (2020). The next decade in AI: Four steps towards robust artificial intelligence. arXiv preprint. arXiv:2002.06177
Marcus, G., & Davis, E. (2020). GPT-3, Bloviator: OpenAI's language generator has no idea what it's talking about. MIT Technology Review.
Shumailov, I., et al. (2023). The curse of recursion: Training on generated data makes models forget. arXiv preprint. arXiv:2305.17493
Mayr, E. (1982). The growth of biological thought: Diversity, evolution, and inheritance. Harvard University Press. ISBN: 9780674364462
McEwen, B. S. (1998). Protective and damaging effects of stress mediators. New England Journal of Medicine, 338(3), 171-179. DOI: 10.1056/NEJM199801153380307
McEwen, B. S., & Wingfield, J. C. (2003). The concept of allostasis in biology and biomedicine. Hormones and Behavior, 43(1), 2-15. DOI: 10.1016/S0018-506X(02)00024-7
Meehl, P. E. (1990). Why summaries of research on psychological theories are often uninterpretable. Psychological Reports, 66(1), 195-244. DOI: 10.2466/pr0.1990.66.1.195
Miller, E. K., & Cohen, J. D. (2001). An integrative theory of prefrontal cortex function. Annual Review of Neuroscience, 24(1), 167-202. DOI: 10.1146/annurev.neuro.24.1.167
Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63(2), 81-97. DOI: 10.1037/h0043158
Muraven, M., & Baumeister, R. F. (2000). Self-regulation and depletion of limited resources. Psychological Bulletin, 126(2), 247-259. DOI: 10.1037/0033-2909.126.2.247
Nakkiran, P., et al. (2021). Deep double descent: Where bigger models and more data hurt. Journal of Statistical Mechanics: Theory and Experiment, 2021(12), 124003. DOI: 10.1088/1742-5468/ac3a74
Newell, A., & Simon, H. A. (1972). Human problem solving. Prentice-Hall. ISBN: 9780134454030
Nørretranders, T. (1998). The user illusion: Cutting consciousness down to size. Viking Press. ISBN: 9780670875795
Nye, M., et al. (2021). Show your work: Scratchpads for intermediate computation with language models. arXiv preprint. arXiv:2112.00114
Ogata, K. (2010). Modern control engineering (5th ed.). Prentice Hall. ISBN: 9780136156734
Oizumi, M., et al. (2014). From the phenomenology to the mechanisms of consciousness: Integrated information theory 3.0. PLoS Computational Biology, 10(5), e1003588. DOI: 10.1371/journal.pcbi.1003588
Olah, C., et al. (2020). Zoom in: An introduction to circuits. Distill, 5(3), e00024.001. DOI: 10.23915/distill.00024.001
O'Neil, C. (2016). Weapons of math destruction: How big data increases inequality and threatens democracy. Crown. ISBN: 9780553418811
Ouyang, L., et al. (2022). Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35, 27730-27744. arXiv:2203.02155
Paas, F., et al. (2003). Cognitive load theory and instructional design: Recent developments. Educational Psychologist, 38(1), 1-4. DOI: 10.1207/S15326985EP3801_1
Pearl, J. (2009). Causality: Models, reasoning, and inference (2nd ed.). Cambridge University Press. DOI: 10.1017/CBO9780511803161
Perez, E., et al. (2022). Discovering language model behaviors with model-written evaluations. arXiv preprint. arXiv:2212.09251
Pezzulo, G., et al. (2015). Active inference, homeostatic regulation and adaptive behavioural control. Progress in Neurobiology, 134, 17-35. DOI: 10.1016/j.pneurobio.2015.09.001
Pierce, J. R. (1980). An introduction to information theory: Symbols, signals and noise (2nd ed.). Dover Publications. ISBN: 9780486240619
Popper, K. R. (1959). The logic of scientific discovery. Hutchinson. ISBN: 9780415278447
Posner, M. I., & Petersen, S. E. (1990). The attention system of the human brain. Annual Review of Neuroscience, 13, 25-42. DOI: 10.1146/annurev.ne.13.030190.000325
Prigogine, I. (1977). Time, structure, and fluctuations. Science, 201(4358), 777-785. DOI: 10.1126/science.201.4358.777
Prigogine, I., & Stengers, I. (1984). Order out of chaos: Man's new dialogue with nature. Bantam Books. ISBN: 9780553343632
Putnam, H. (1981). Reason, truth and history. Cambridge University Press. DOI: 10.1017/CBO9780511625398
Quastler, H. (Ed.). (1956). Information theory in psychology: Problems and methods. Free Press. ISBN: 9780029255605
Quine, W. V. (1951). Two dogmas of empiricism. The Philosophical Review, 60(1), 20-43. DOI: 10.2307/2181906
Quiñonero-Candela, J., Sugiyama, M., Schwaighofer, A., & Lawrence, N. D. (2009). Dataset shift in machine learning. MIT Press. ISBN: 9780262170055
Russell, S. (2019). Human compatible: Artificial intelligence and the problem of control. Viking Press. ISBN: 9780525558613
Russell, S. J., & Norvig, P. (2020). Artificial intelligence: A modern approach (4th ed.). Pearson. ISBN: 9780134610993
Schrödinger, E. (1944). What is life? The physical aspect of the living cell. Cambridge University Press. ISBN: 9780521427081
Searle, J. R. (1980). Minds, brains, and programs. Behavioral and Brain Sciences, 3(3), 417-424. DOI: 10.1017/S0140525X00005756
Settles, B. (2012). Active learning. Morgan & Claypool Publishers. DOI: 10.2200/S00429ED1V01Y201207AIM018
Shalev-Shwartz, S., & Ben-David, S. (2014). Understanding machine learning: From theory to algorithms. Cambridge University Press. ISBN: 9781107057135
Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27(3), 379-423. DOI: 10.1002/j.1538-7305.1948.tb01338.x
Shumailov, I., et al. (2024). AI models can collapse when trained on recursively generated data. Nature, 631(8022), 755-759. DOI: 10.1038/s41586-024-07566-y
Silver, D., et al. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484-489. DOI: 10.1038/nature16961
Silver, D., et al. (2017). Mastering the game of Go without human knowledge. Nature, 550(7676), 354-359. DOI: 10.1038/nature24270
Spirtes, P., Glymour, C. N., & Scheines, R. (2000). Causation, prediction, and search (2nd ed.). MIT Press. ISBN: 9780262194402
Sterling, P. (2012). Allostasis: A model of predictive regulation. Physiology & Behavior, 106(1), 5-15. DOI: 10.1016/j.physbeh.2011.06.004
Stiennon, N., et al. (2020). Learning to summarize with human feedback. Advances in Neural Information Processing Systems, 33, 3008-3021. arXiv:2009.01325
Stigler, S. M. (1997). Regression towards the mean, historically considered. Statistical Methods in Medical Research, 6(2), 103-114. DOI: 10.1177/096228029700600202
Sutskever, I. (2019). An observation on generalization. Bounded Regret.
Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction (2nd ed.). MIT Press. ISBN: 9780262039246
Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive Science, 12(2), 257-285. DOI: 10.1207/s15516709cog1202_4
Szegedy, C. (2020). A promising path towards autoformalization and general artificial intelligence. Intelligent Computer Mathematics, 3-20. DOI: 10.1007/978-3-030-53518-6_1
Thanh-Tung, H., & Tran, T. (2020). Catastrophic forgetting and mode collapse in GANs. 2020 International Joint Conference on Neural Networks (IJCNN), 1-10. DOI: 10.1109/IJCNN48605.2020.9207181
Thayer, J. F., et al. (2012). A meta-analysis of heart rate variability and neuroimaging studies. Neuroscience & Biobehavioral Reviews, 36(2), 747-756. DOI: 10.1016/j.neubiorev.2011.11.009
Thoppilan, R., et al. (2022). LaMDA: Language models for dialog applications. arXiv preprint. arXiv:2201.08239
Tishby, N., & Zaslavsky, N. (2015). Deep learning and the information bottleneck principle. 2015 IEEE Information Theory Workshop (ITW), 1-5. arXiv:1503.02406
Tononi, G., et al. (2016). Integrated information theory: From consciousness to its physical substrate. Nature Reviews Neuroscience, 17(7), 450-461. DOI: 10.1038/nrn.2016.44
Turing, A. M. (1936). On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 2(1), 230-265. DOI: 10.1112/plms/s2-42.1.230
Vapnik, V. N. (1998). Statistical learning theory. Wiley. ISBN: 978-0-471-03003-4
Vernon, D., et al. (2007). A survey of artificial cognitive systems. IEEE Transactions on Evolutionary Computation, 11(2), 151-180. DOI: 10.1109/TEVC.2006.890274
Vogel, S. (1994). Life in moving fluids: The physical biology of flow (2nd ed.). Princeton University Press. ISBN: 9780691026169
von Foerster, H. (1981). Observing systems. Intersystems Publications. ISBN: 9780914105190
Walton, D. N. (1991). Begging the question: Circular reasoning as a tactic of argumentation. Greenwood Press. ISBN: 9780313275999
Weidinger, L., et al. (2021). Ethical and social risks of harm from language models. arXiv preprint. arXiv:2112.04359
Whewell, W. (1840). The philosophy of the inductive sciences, founded upon their history. John W. Parker. Access: Internet Archive
Wiener, N. (1948). Cybernetics: Or control and communication in the animal and the machine. MIT Press. ISBN: 9780262730099
Williams, G. C. (1966). Adaptation and natural selection: A critique of some current evolutionary thought. Princeton University Press. ISBN: 9780691026152
Wilson, E. O. (1998). Consilience: The unity of knowledge. Knopf. ISBN: 9780679450771
Winner, L. (1980). Do artifacts have politics? Daedalus, 109(1), 121-136. Access: JSTOR
Wolpert, D. H. (2018). The stochastic thermodynamics of computation. Journal of Physics A: Mathematical and Theoretical, 52(19), 193001. DOI: 10.1088/1751-8121/ab073d
Yudkowsky, E. (2001). Creating friendly AI. The Singularity Institute.
Zhou, C., et al. (2024). LIMA: Less is more for alignment. arXiv preprint. arXiv:2305.11206
Zuboff, S. (2019). The age of surveillance capitalism: The fight for a human future at the new frontier of power. PublicAffairs. ISBN: 9781610395694
