TALENT LLM: Multi-Label Talent Prediction in Children Using Fine-Tuned Large Language Models with Calibrated Baselines
Description
We present TALENT LLM, a system for multi-label talent prediction in children based on artifact analysis. Using a dataset of 5,173 analyses across 479 children (both synthetic and real platform users), we compare fine-tuned LLM predictions against calibrated classical baselines (Logistic Regression, LightGBM) across 7 talent categories. Our experiments demonstrate strong baseline performance (ROC-AUC 0.991–0.999, F1-macro 0.973–0.997) with effective probability calibration via Platt scaling, achieving ECE as low as 0.002. We introduce temporal evaluation (S1→S2 prediction) on 349 children with 2+ analyses, achieving F1-macro 0.833 for predicting future talent profiles from earlier assessments. The dataset includes 306 fine-grained talent categories and 8 artifact types (text, image, music, audio, video, PDF, and others). SHAP analysis reveals interpretable feature importance patterns strongly aligned with educational theory.
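For context, the sketch below shows what a Platt-scaled multi-label baseline with an ECE check could look like; it is not the authors' implementation. It uses scikit-learn's sigmoid calibration (Platt scaling) inside a one-vs-rest logistic regression over 7 label columns, with synthetic data standing in for the real feature matrix, and an illustrative `expected_calibration_error` helper. All names and parameters are assumptions for demonstration only.

```python
# Minimal sketch (assumed, not the released code): calibrated multi-label baseline + ECE.
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV
from sklearn.metrics import roc_auc_score, f1_score


def expected_calibration_error(y_true, y_prob, n_bins=10):
    """ECE: weighted average |empirical accuracy - mean confidence| over probability bins."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.digitize(y_prob, bins[1:-1])  # bin index 0..n_bins-1 per prediction
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            ece += mask.mean() * abs(y_true[mask].mean() - y_prob[mask].mean())
    return ece


# Synthetic stand-in for the 7-category multi-label task (real artifact features not shown).
X, Y = make_multilabel_classification(n_samples=2000, n_features=50, n_classes=7, random_state=0)
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.2, random_state=0)

# Platt scaling = sigmoid calibration fitted on top of each per-category logistic regression.
base = LogisticRegression(max_iter=1000)
clf = OneVsRestClassifier(CalibratedClassifierCV(base, method="sigmoid", cv=5))
clf.fit(X_tr, Y_tr)

probs = clf.predict_proba(X_te)        # (n_samples, 7) positive-class probabilities
preds = (probs >= 0.5).astype(int)

print("ROC-AUC (macro):", roc_auc_score(Y_te, probs, average="macro"))
print("F1 (macro):     ", f1_score(Y_te, preds, average="macro"))
print("ECE (category 0):", expected_calibration_error(Y_te[:, 0], probs[:, 0]))
```

The same evaluation loop would apply to a LightGBM base estimator or to LLM-derived probabilities; only the classifier and the feature source change.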
Files
- TALENT_LLM_paper.pdf (468.2 kB, md5:895dd05920b56b07bc7ab08086a9e939)
Additional details
Related works
- Is supplemented by (Software): https://github.com/Talents-kids/talent-llm
Dates
- Created: 2025-11-27
Software
- Repository URL: https://github.com/Talents-kids/talent-llm
- Programming language: Python
- Development status: Active