Scaling Real Data Proportion in Mixed Pretraining for TabMWP Evaluation
Description
Generative models have revolutionized multiple domains, yet their application to tabular data remains underexplored. Evaluating generative models for tabular data presents unique challenges due to structural complexity, large-scale variability, and mixed data types, making it difficult to intuitively capture intricate patterns. Existing evaluation metrics offer only partial insights, lacking a comprehensive measure of generative performance. To address this limitation, we propose three novel evaluation metrics: FAED, FPCAD, and RFIS. Our extensive experimental analysis, conducted on three stan
Research goal: Does scaling the proportion of real data in mixed pretraining improve TabMWP evaluation scores proportionally, and does this scaling effect generalize across different model architectures (e.g., VAEs vs. GANs)?
Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 8.4/10.
Notes
Files
| Name | Size |
|---|---|
| paper.pdf (md5:56a6f8201717feae0934a266ace00bb1) | 83.9 kB |