The impact of fine-tuning LLMs on the quality of automated therapy assessed by digital patients
Authors/Creators
- 1. Efi Arazi School of Computer Science, Reichman University, Herzliya, Israel
- 2. Baruch Ivcher School of Psychology, Reichman University, Herzliya, Israel
- 3. Sammy Ofer School of Communications, Reichman University, Herzliya, Israel
Description
The use of generative large language models (LLMs) in mental health applications is gaining traction, with some proposals even suggesting LLM-based automated therapists. In this study, we assess the impact of fine-tuning therapist LLMs to improve the quality of therapy sessions, addressing a critical question in LLM-based mental health research. Specifically, we demonstrate that fine-tuning with datasets focused on specific therapeutic techniques significantly enhances the performance of LLM therapists. To facilitate this assessment, we introduce a novel evaluation system based on digital patients, powered by LLMs, which engage in text-based therapy sessions and provide session evaluations through questionnaires designed for human patients.
This method addresses the shortcomings of traditional text-similarity metrics, which cannot capture the quality of therapeutic interactions. This study centers on motivational interviewing (MI), a structured and goal-oriented therapeutic approach; however, our digital therapists and patients can be adapted to other forms of therapy. We believe that our digital patients offer a standardized method for assessing automated therapists, and that this work showcases the potential of LLMs in mental health care.
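The evaluation loop described above — a therapist model and a digital-patient model alternating turns, after which the digital patient answers a post-session questionnaire — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `therapist`, `patient`, and `rater` callables are hypothetical stand-ins that would, in practice, each be backed by an LLM (fine-tuned or not), and the questionnaire items are placeholders.

```python
# Sketch of a digital-patient evaluation loop (hypothetical stand-ins,
# not the authors' code). Each role would be an LLM call in practice.

def run_session(therapist, patient, opening, turns=3):
    """Alternate therapist/patient messages and return the transcript."""
    transcript = [("patient", opening)]
    for _ in range(turns):
        reply = therapist(transcript)           # therapist LLM turn
        transcript.append(("therapist", reply))
        message = patient(transcript)           # digital-patient LLM turn
        transcript.append(("patient", message))
    return transcript

def evaluate(rater, transcript, questionnaire):
    """Have the digital patient score the session on questionnaire items
    (e.g. Likert items adapted from instruments designed for humans)."""
    return {item: rater(transcript, item) for item in questionnaire}

# Toy stand-ins so the sketch runs end to end.
therapist = lambda t: "What makes that change feel important to you?"
patient = lambda t: "I guess I want to feel healthier."
rater = lambda t, item: 4  # fixed Likert score, for illustration only

transcript = run_session(therapist, patient, "I want to cut down on smoking.")
scores = evaluate(rater, transcript, ["empathy", "collaboration"])
```

Because session quality is scored by the patient model rather than by token overlap with a reference transcript, this setup avoids the text-similarity-metric problem noted above: two very different therapist replies can be equally good therapy.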
Files
- s44184-025-00159-1.pdf (2.4 MB, md5:40e499f44290544373633b8c0be84b43)