Architectural Constraints in LLM-Simulated Cognitive Decline: In Silico Dissociation of Memory Deficits and Generative Language as Candidate Digital Biomarkers
Authors/Creators
- Pérez Elvira, Rubén (Contact person)
- Oltra Cucarella, Javier (Researcher)
- Agudo Juan, María (Researcher)
- Polo Ferrero, Luis (Researcher)
- Quintana Díaz, Manuel (Researcher)
- Bosch Bayard, Jorge Francisco (Researcher)
- Salgado Ruiz, Alfonso (Researcher)
- Mamun-or-Rashid, A.N.M. (Researcher)
- Juárez Vela, Raúl (Researcher)
Description
This study examined whether large language models (LLMs) can generate clinically realistic profiles of cognitive decline and whether simulated deficits reflect architectural constraints rather than superficial role-playing artifacts. Using GPT-4o-mini, we generated synthetic cohorts (n = 10 per group) representing healthy aging, mild cognitive impairment (MCI), and Alzheimer’s disease (AD), assessed through a conversational neuropsychological battery covering episodic memory, verbal fluency, narrative production, orientation, naming, and comprehension. Experiment 1 tested whether synthetic subjects exhibited graded cognitive profiles consistent with clinical progression (Control > MCI > AD). Experiment 2 systematically manipulated prompt context in AD subjects (short, rich biographical, and few-shot prompts) to dissociate robust from manipulable deficits. Significant cognitive gradients emerged (p < 0.001) across eight of thirteen domains. AD subjects showed impaired episodic memory (Cohen’s d = 4.71), increased memory intrusions, and reduced narrative length (d = 3.07). Critically, structurally constrained memory tasks (episodic recall, digit span) were invariant to prompting (p > 0.05), whereas generative tasks (narrative length, verbal fluency) showed high sensitivity (F > 100, p < 0.001). Rich biographical prompts paradoxically increased memory intrusions by 343%, indicating semantic interference rather than cognitive rescue. These results demonstrate that LLMs can serve as in silico test benches for exploring candidate digital biomarkers and clinical training protocols, while highlighting architectural constraints that may inform computational hypotheses about memory and language processing.
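The effect sizes and the percentage change quoted in the description can be illustrated with a minimal sketch. The record does not state which variant of Cohen's d was used, so the pooled-standard-deviation form below is an assumption, as is the reading of "increased by 343%" as a change relative to the baseline condition. The function names and the score lists are hypothetical and are not the study's data.

```python
import math
from statistics import mean, stdev


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled sample standard deviation (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (mean(group_a) - mean(group_b)) / pooled_sd


def percent_increase(baseline, new):
    """Relative change from baseline to new, expressed in percent."""
    return 100.0 * (new - baseline) / baseline


if __name__ == "__main__":
    # Hypothetical recall scores for two synthetic groups of n = 10.
    # These values are placeholders for illustration only.
    control_recall = [9, 8, 9, 10, 8, 9, 9, 8, 10, 9]
    ad_recall = [3, 4, 2, 3, 5, 3, 4, 2, 3, 4]
    print(f"Cohen's d (control vs. AD): {cohens_d(control_recall, ad_recall):.2f}")

    # Hypothetical mean intrusion counts under two prompt conditions.
    print(f"Change in intrusions: {percent_increase(1.0, 4.43):.0f}%")
```

With only ten synthetic subjects per group and low within-group variance, very large d values are arithmetically easy to obtain, which is worth keeping in mind when comparing such figures against effect sizes from human cohorts.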
Files
| Name | Size |
|---|---|
| ai-07-00069.pdf (md5:9d7662240a0857643650390f258d0b21) | 911.9 kB |