Published November 28, 2025 | Version 1.0
Preprint · Open Access

TALENT LLM: Multi-Label Talent Prediction in Children Using Fine-Tuned Large Language Models with Calibrated Baselines

  • TEMNIKOVA LDA

Description

We present TALENT LLM, a system for multi-label talent prediction in children based on artifact analysis. Using a dataset of 5,173 analyses across 479 children (both synthetic and real platform users), we compare fine-tuned LLM predictions against calibrated classical baselines (Logistic Regression, LightGBM) across 7 talent categories. Our experiments demonstrate exceptional baseline performance (ROC-AUC 0.991–0.999, F1-macro 0.973–0.997) with effective probability calibration via Platt scaling, achieving ECE as low as 0.002. We introduce temporal evaluation (S1→S2 prediction) on 349 children with two or more analyses, achieving F1-macro 0.833 for predicting future talent profiles from earlier assessments. The dataset includes 306 fine-grained talent categories and 8 artifact types (text, image, music, audio, video, PDF, and others). SHAP analysis reveals interpretable feature importance patterns strongly aligned with educational theory.
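
To illustrate the calibrated-baseline setup summarized above, the sketch below (not the authors' implementation) trains one Platt-scaled Logistic Regression per talent label with scikit-learn and reports macro ROC-AUC, macro F1, and a binned Expected Calibration Error. The synthetic feature matrix, the 7-label layout, and the 10-bin ECE are illustrative assumptions.

# Minimal sketch of a calibrated multi-label baseline, assuming a scikit-learn
# pipeline: per-label Logistic Regression with Platt scaling (sigmoid
# calibration) and a binned ECE estimate. Data and dimensions are synthetic
# stand-ins, not the paper's dataset.
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier


def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Binned ECE: mean |accuracy - confidence| gap, weighted by bin size."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (y_prob > lo) & (y_prob <= hi)
        if mask.any():
            gap = abs(y_true[mask].mean() - y_prob[mask].mean())
            ece += mask.mean() * gap
    return ece


# Synthetic stand-in for artifact-derived features and 7 binary talent labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 32))
Y = (X[:, :7] + 0.3 * rng.normal(size=(2000, 7)) > 0).astype(int)
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.25, random_state=0)

# One calibrated Logistic Regression per label; method="sigmoid" is Platt scaling.
base = LogisticRegression(max_iter=1000)
model = MultiOutputClassifier(CalibratedClassifierCV(base, method="sigmoid", cv=3))
model.fit(X_tr, Y_tr)

prob = np.column_stack([p[:, 1] for p in model.predict_proba(X_te)])
pred = (prob >= 0.5).astype(int)

print("ROC-AUC (macro):", roc_auc_score(Y_te, prob, average="macro"))
print("F1 (macro):     ", f1_score(Y_te, pred, average="macro"))
print("ECE per label:  ", [round(expected_calibration_error(Y_te[:, j], prob[:, j]), 3)
                           for j in range(Y.shape[1])])

Platt scaling here corresponds to CalibratedClassifierCV(method="sigmoid"), which fits a logistic link on held-out folds to map raw scores to calibrated probabilities; a LightGBM classifier could be swapped in as the base estimator in the same wrapper.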

Files

TALENT_LLM_paper.pdf (468.2 kB)
md5:895dd05920b56b07bc7ab08086a9e939

Additional details

Related works

Is supplemented by
Software: https://github.com/Talents-kids/talent-llm

Dates

Created
2025-11-27

Software

Repository URL
https://github.com/Talents-kids/talent-llm
Programming language
Python
Development Status
Active