Robustness of TabMNAR Metrics Across Missing Data Mechanisms and Domains via Synthetic and Real-World Benchmarking
Description
Incomplete data is a persistent challenge in real-world datasets, often governed by complex and unobservable missing mechanisms. Simulating missingness has become a standard approach for understanding its impact on learning and analysis. However, existing tools are fragmented, mechanism-limited, and typically focus only on numerical variables, overlooking the heterogeneous nature of real-world tabular data. We present MissMecha, an open-source Python toolkit for simulating, visualizing, and evaluating missing data under MCAR, MAR, and MNAR assumptions. MissMecha supports both numerical and cat
Research goal: How robust are TabMNAR's metrics when applied to tabular data with different types of missing data mechanisms (MCAR, MAR, MNAR) across domains such as finance, healthcare, and social sciences, as measured by benchmarking against synthetic and real-world datasets?
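To make the three mechanisms named in the research goal concrete, the following is a minimal NumPy sketch of how missingness masks could be generated for one numerical column. It is not MissMecha's or TabMNAR's API. The function names (`make_masks`, `_value_dependent_mask`), the logistic link, the 30% target rate, and the synthetic two-column data are all assumptions made for illustration.

```python
import numpy as np


def _value_dependent_mask(values, rate, rng):
    """Hypothetical helper: larger values are more likely to be masked.

    Uses a logistic link scaled so the average masking probability is
    roughly `rate` when `values` is approximately symmetric.
    """
    if not 0.0 < rate <= 0.5:
        raise ValueError("rate must be in (0, 0.5] for this sketch")
    z = (values - values.mean()) / values.std()
    prob = 2.0 * rate / (1.0 + np.exp(-2.0 * z))
    return rng.random(values.shape[0]) < prob


def make_masks(X, rate=0.3, seed=0):
    """Return boolean masks (True = missing) for column 0 of X.

    Column 0 is the variable that loses values; column 1 is a fully
    observed covariate. Both roles are assumptions of this example.
    """
    rng = np.random.default_rng(seed)
    target, observed = X[:, 0], X[:, 1]
    return {
        # MCAR: missingness is independent of every data value.
        "MCAR": rng.random(X.shape[0]) < rate,
        # MAR: missingness depends only on the fully observed column.
        "MAR": _value_dependent_mask(observed, rate, rng),
        # MNAR: missingness depends on the value that goes missing.
        "MNAR": _value_dependent_mask(target, rate, rng),
    }


# Synthetic demo data: two independent standard-normal columns.
X = np.random.default_rng(42).standard_normal((5000, 2))
masks = make_masks(X, rate=0.3, seed=0)

for name, mask in masks.items():
    observed_mean = X[~mask, 0].mean()
    print(f"{name}: missing rate={mask.mean():.3f}, "
          f"mean of remaining values={observed_mean:.3f}")
```

In this setup, the mean of the remaining values shifts under MNAR because the masked entries are disproportionately the large ones. Under MCAR, and under MAR with an independent covariate, it stays close to the full-sample mean. This is the kind of mechanism-dependent bias a robustness benchmark would need to account for.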
Autonomous synthesis report generated by Assignee Research. Tribunal consensus score: 7.8/10.
Files
| Name | Size | MD5 |
|---|---|---|
| paper.pdf | 85.3 kB | ae453a21c3ebf46bf6751dfe38e39bd0 |