Scalability of SageMaker Autopilot's Preprocessing Pipeline and Fairness Metrics on Large-Scale Tabular Datasets
Description
The modern approach to artificial intelligence (AI) aims to design algorithms that learn directly from data. This approach has achieved impressive results and has contributed significantly to the progress of AI, particularly in supervised deep learning. It has also simplified the design of machine learning systems, since the learning process is highly automated. However, not all data processing tasks in conventional deep learning pipelines have been automated. In most cases, data has to be manually collected, preprocessed, and further extended through data augmentation before it can be e
Research goal: How does the scalability of SageMaker Autopilot's preprocessing pipeline affect fairness metrics (e.g., group fairness) when applied to large-scale tabular datasets (e.g., Criteo, Kaggle datasets) compared to distributed fairness-aware preprocessing frameworks like Turi Create?
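The record does not say which group fairness metric is meant. As a minimal sketch, assuming demographic parity difference (the gap between the highest and lowest positive-prediction rates across groups) is a representative choice, the quantity could be computed as follows. The function names and the toy data are hypothetical and are not taken from the report, SageMaker Autopilot, or Turi Create.

```python
from collections import defaultdict


def positive_rates(predictions, groups):
    """Return the share of positive (1) predictions within each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_difference(predictions, groups):
    """Gap between the highest and lowest per-group positive rates.

    0.0 means every group receives positive predictions at the same rate.
    """
    rates = positive_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Toy, made-up data for illustration only (not results from the report).
preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = positive_rates(preds, groups)
gap = demographic_parity_difference(preds, groups)
print(rates, gap)
```

Computing the metric once on the outputs of each preprocessing pipeline, with the same model and the same protected attribute, would give one comparable number per pipeline. That experimental setup is an assumption here, not something the record describes.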
Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 8.1/10.
Notes
Files
| Name | Size | Checksum |
|---|---|---|
| paper.pdf | 74.2 kB | md5:9a21356128b516bce7accce620784d71 |