How does the F1-score of diffusion-based tabular generative models compare to CTGAN when augmenting data for t
Description
Class imbalance in tabular datasets is a challenge for machine learning classification, often producing biased models that underperform on minority-class instances. This study presents a comparative analysis of synthetic data generation methods for addressing class imbalance in tabular data. We evaluate four augmentation approaches, namely Synthetic Minority Over-sampling Technique (SMOTE), Gaussian Copula, Tabular Variational Autoencoder (TVAE), and Conditional Tabular Generative Adversarial Network (CTGAN), using the University of California Irvine (UCI) Bank Marketing dataset, w
Research goal: How does the F1-score of diffusion-based tabular generative models compare to CTGAN when augmenting data for training LLMs on imbalanced text classification benchmarks using the HAN benchmark?
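Since the research goal is framed around the F1-score, a minimal Python sketch of that metric may help show why it is usually preferred over accuracy on imbalanced data. The function names and the toy labels below are hypothetical and purely illustrative; they are not taken from the study or its datasets.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score_binary(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels: 90 majority-class (0) and 10 minority-class (1) rows.
y_true = [0] * 90 + [1] * 10

# A degenerate classifier that always predicts the majority class.
always_majority = [0] * 100

# A classifier that finds 6 of the 10 minority rows at the cost of
# 4 false positives among the majority rows.
finds_minority = [0] * 86 + [1] * 4 + [1] * 6 + [0] * 4

for name, y_pred in [("always_majority", always_majority),
                     ("finds_minority", finds_minority)]:
    print(name,
          "accuracy=%.2f" % accuracy(y_true, y_pred),
          "f1=%.2f" % f1_score_binary(y_true, y_pred))
```

In this toy setup the two classifiers have nearly the same accuracy, while the minority-class F1-score separates them clearly, which is the usual motivation for reporting F1 when comparing augmentation methods.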
Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 7.5/10.
Files
paper.pdf (87.9 kB, md5:799a508c85344a286f8a29fbcf82624c)