Published January 28, 2026 | Version v1
Preprint | Open Access

Responsible Evaluation of AI for Mental Health

  • 1. Technical University of Darmstadt
  • 2. Vanderbilt University
  • 3. Universität Trier
  • 4. Trinity College Dublin
  • 5. Universität Trier, Fach Psychologie
  • 6. University of Washington
  • 7. Georgia Institute of Technology
  • 8. Philipps-Universität Marburg
  • 9. Leiden University, Leiden Institute of Chemistry
  • 10. Bocconi University
  • 11. Queen Mary University of London
  • 12. University of Warwick

Description

Although artificial intelligence (AI) shows growing promise for mental health care, current approaches to evaluating AI tools in this domain remain fragmented and poorly aligned with clinical practice, social context, and first-hand user experience. This paper argues for a rethinking of responsible evaluation -- what is measured, by whom, and for what purpose -- by introducing an interdisciplinary framework that integrates clinical soundness, social context, and equity, providing a structured basis for evaluation. Through an analysis of 135 recent *CL publications, we identify recurring limitations: over-reliance on generic metrics that do not capture clinical validity, therapeutic appropriateness, or user experience; limited participation from mental health professionals; and insufficient attention to safety and equity. To address these gaps, we propose a taxonomy of AI mental health support types -- assessment-, intervention-, and information synthesis-oriented -- each with distinct risks and evaluative requirements, and illustrate its use through case studies.
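A minimal sketch of how the proposed taxonomy and its three evaluation axes might be represented programmatically, assuming hypothetical dimension names for illustration; the placeholder criteria below are not taken from the paper and are not its actual evaluation rubric.

```python
from dataclasses import dataclass, field
from enum import Enum


class SupportType(Enum):
    """The three support types named in the proposed taxonomy."""
    ASSESSMENT = "assessment-oriented"
    INTERVENTION = "intervention-oriented"
    INFORMATION_SYNTHESIS = "information synthesis-oriented"


@dataclass
class EvaluationProfile:
    """Pairs a support type with the three evaluation axes from the abstract.

    The concrete dimension strings are hypothetical placeholders, not the
    paper's criteria.
    """
    support_type: SupportType
    clinical_soundness: list[str] = field(default_factory=list)
    social_context: list[str] = field(default_factory=list)
    equity_and_safety: list[str] = field(default_factory=list)


# Illustrative example only: an assessment-oriented tool with placeholder dimensions.
example = EvaluationProfile(
    support_type=SupportType.ASSESSMENT,
    clinical_soundness=["agreement with clinician judgement", "use of validated instruments"],
    social_context=["deployment setting", "first-hand user experience"],
    equity_and_safety=["performance across demographic groups", "harms from misclassification"],
)

print(example.support_type.value, example.clinical_soundness)
```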

Files

ai_mental_health_evaluation.pdf (295.2 kB)
md5:15e29c0bab61b38c38adbd05c2b226ab