MLTE: A process and tool for test and evaluation of machine learning models
Description
In January 2025, Alex Derr from the Software Engineering Institute at Carnegie Mellon University joined the HiRSE Seminar Series to talk about “MLTE: A process and tool for test and evaluation of machine learning models”.
Abstract:
Test and Evaluation (T&E) of ML models largely focuses on model performance (e.g., accuracy) and often does not consider system aspects, which leads to models that fail in production. We will present MLTE, a semi-automated process and tool that enables negotiation, specification, and testing of ML model functional and non-functional requirements. A Negotiation Card records results of stakeholder discussions, which drive model development decisions and relevant test cases. MLTE automates test case execution and stores results that can be shared with stakeholders to provide evidence of testing that guides future iterations and system-level decisions.
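To make the idea of turning negotiated requirements into executable test cases more concrete, here is a minimal sketch in plain Python. It does not use the MLTE library or its API. The `Requirement` class, the `evaluate` function, the property names, the thresholds, and the measured values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Requirement:
    """One negotiated requirement: a named property and its pass condition."""
    name: str
    description: str
    passes: Callable[[float], bool]


def evaluate(requirements: List[Requirement],
             measurements: Dict[str, float]) -> Dict[str, str]:
    """Compare each measured value with its requirement and record the outcome."""
    results = {}
    for req in requirements:
        if req.name not in measurements:
            # A requirement without evidence is reported, not silently skipped.
            results[req.name] = "not measured"
        elif req.passes(measurements[req.name]):
            results[req.name] = "pass"
        else:
            results[req.name] = "fail"
    return results


# Hypothetical requirements, as they might come out of a stakeholder discussion.
# They mix model performance (accuracy) with system aspects (latency, size).
requirements = [
    Requirement("accuracy", "Top-1 accuracy on held-out data", lambda v: v >= 0.90),
    Requirement("latency_ms", "Mean inference latency per request", lambda v: v <= 50.0),
    Requirement("model_size_mb", "Size of the serialized model", lambda v: v <= 100.0),
]

# Illustrative measurements only, not real results.
measurements = {"accuracy": 0.93, "latency_ms": 72.5}

report = evaluate(requirements, measurements)
for name, outcome in report.items():
    print(f"{name}: {outcome}")
```

In this sketch the model meets the accuracy threshold but misses the latency threshold, and the size requirement has no measurement at all. A report of this kind is the sort of evidence that could be shared with stakeholders to guide the next iteration.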
The presentation recording is available on the HiRSE YouTube Channel: https://www.youtube.com/watch?v=ivkwKwCx_mQ
Learn more about the HiRSE Seminar Series: https://www.helmholtz-hirse.de/series.html