Correlation between Tabular Data Generative Metrics and Downstream Classifier Accuracy
Description
Abstract: Tabular data, spreadsheets organized in rows and columns, are ubiquitous across scientific fields, from biomedicine to particle physics to economics and climate science [1,2]. The fundamental prediction task of filling in missing values of a label column based on the rest of the columns is essential for various applications as diverse as biomedical risk models, drug discovery and materials science. Although deep learning has revolutionized learning from raw data and led to numerous high-profile success stories [3–5], gradient-boosted decision trees [6–9] have dominated tabular data for th
Research goal: What is the correlation between novel tabular data generative metrics and downstream classifier accuracy across mixed data types?
Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 9.0/10.
Files
paper.pdf (71.1 kB)
md5:d9a5380524f41e679b27043187cee313