The Prototype Fairness Illusion: Why Prototype-Based Fair Representations Fail on Image Data
Description
This repository contains the research paper studying the behavior of prototype-based fairness methods when applied to high-dimensional visual data.
Abstract
Prototype-based fair representation learning, introduced by Zemel et al. (2013) through Learning Fair Representations (LFR), has demonstrated strong fairness guarantees on tabular data by enforcing statistical parity in prototype assignments. In this work, we systematically study what happens when this framework is extended to high-dimensional image data.
We make three primary contributions:
- Geometric analysis of vanilla LFR on images. We provide both a theoretical argument and empirical evidence that vanilla LFR fails fundamentally on image data: Euclidean distance in pixel space is not semantically meaningful for faces, so prototype assignments become nearly uniform. As a result, the fairness constraint (L_z) approaches zero without actually removing sensitive-attribute information from the learned representation.
- Deep Semantic LFR (DS-LFR). We propose an extension that moves the prototypes into a learned convolutional semantic latent space and replaces the pixel-wise reconstruction loss with a VGG-based perceptual loss. This modification substantially improves classification performance, raising accuracy from 51.95% to 92.08% on the CelebA benchmark.
- Fairness Activation Threshold. We identify a previously unreported optimization phenomenon: during early training (epochs 1–14), the model exhibits complete collapse, with (L_z = 0) and (L_y = 0.693) (i.e., ln 2, the cross-entropy of random guessing on a binary task). At epoch 15, a sharp phase transition occurs in which the classification and fairness objectives activate simultaneously.
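The geometric failure described in the first contribution can be illustrated with a small self-contained sketch. This is not the paper's code: it uses toy Gaussian vectors standing in for pixels, and it normalizes squared Euclidean distances by their per-sample mean (an assumption made here so scales are comparable across dimensions). As dimensionality grows, distances concentrate and LFR-style softmax assignments over prototypes approach the uniform distribution, which drives (L_z) toward zero for free:

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_assignment_entropy(dim, n_prototypes=8, n_samples=100):
    """LFR-style soft assignments M[n, k] proportional to exp(-d(x_n, v_k)),
    with squared Euclidean distances normalized by their per-sample mean."""
    x = rng.standard_normal((n_samples, dim))
    v = rng.standard_normal((n_prototypes, dim))
    d = (x**2).sum(1)[:, None] + (v**2).sum(1)[None, :] - 2.0 * x @ v.T
    d /= d.mean(axis=1, keepdims=True)      # assumed normalization
    logits = -d
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    # mean entropy in bits; uniform assignment over 8 prototypes gives 3.0
    return float(-(p * np.log2(p + 1e-12)).sum(axis=1).mean())

for dim in (2, 100, 10_000):  # 10_000 ~ a small 100x100 grayscale image
    print(dim, round(mean_assignment_entropy(dim), 3))
```

At dimension 2 the assignments are clearly peaked; by dimension 10,000 their entropy is essentially at the 3-bit maximum, i.e., nearly uniform.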
Despite the architectural improvements, sensitive attribute accuracy (sAcc) remains approximately 0.926 across all hyperparameter configurations. This demonstrates that statistical parity in prototype assignments is fundamentally weaker than representation-level disentanglement.
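The gap between assignment-level parity and representation-level leakage can be made concrete with a toy example (hypothetical 2-D representations, not the paper's CelebA setup): soft prototype assignments can satisfy statistical parity almost exactly while the sensitive attribute remains perfectly decodable from the representation itself.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 2-D "representation": dim 0 stores the sensitive attribute s exactly,
# dim 1 is attribute-independent noise (synthetic data, not CelebA).
n = 1000
s = rng.integers(0, 2, n)
z = np.stack([s.astype(float), rng.standard_normal(n)], axis=1)

# Two prototypes that differ only along the noise dimension, so the soft
# assignments M[n, k] proportional to exp(-||z_n - v_k||^2) ignore dim 0.
protos = np.array([[0.5, -1.0], [0.5, 1.0]])
d = ((z[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
m = np.exp(-d)
m /= m.sum(axis=1, keepdims=True)

# L_z-style parity gap: difference of mean assignments between groups
l_z = np.abs(m[s == 0].mean(axis=0) - m[s == 1].mean(axis=0)).sum()

# ... yet a trivial probe on the representation recovers s perfectly.
s_acc = ((z[:, 0] > 0.5) == (s == 1)).mean()
print(f"parity gap ~ {l_z:.3f}, probe sAcc = {s_acc:.3f}")
```

Here the parity gap is near zero (only sampling noise), while a one-dimensional threshold probe achieves sAcc = 1.0 — the same dissociation the sAcc ≈ 0.926 result exhibits at scale.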
These findings suggest that the (L_z) objective, regardless of weighting, cannot remove protected attribute information from useful visual representations. Instead, adversarial disentanglement mechanisms are likely required to achieve true representation-level fairness in image models.
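As a hypothetical sketch of the adversarial-disentanglement direction suggested above (not part of this work), a gradient reversal layer in the style of Ganin & Lempitsky lets an encoder be trained against an adversary that predicts the sensitive attribute; the adversary name and loss wrapper below are illustrative, not from the paper:

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies gradients by -lam in the
    backward pass, so the encoder upstream is trained to fool the adversary."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

def adversarial_fairness_loss(z, s, adversary, lam=1.0):
    """Adversary predicts s from representation z; reversed gradients push
    the encoder toward representations from which s is not decodable."""
    logits = adversary(GradReverse.apply(z, lam))
    return torch.nn.functional.binary_cross_entropy_with_logits(
        logits.squeeze(-1), s.float())
```

Minimizing this loss trains the adversary to detect s while the reversed gradient simultaneously trains the encoder to erase it — a representation-level pressure that the assignment-level (L_z) objective never applies.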
Dataset: CelebA (Liu et al., 2015)
Code and experiments: https://www.kaggle.com/code/proprak01/prototype-fairness-main-image-data
DOI: https://doi.org/10.5281/zenodo.19016833
Files
| Name | Size | MD5 |
|---|---|---|
| LEARNING_FAIR_REPRESENTATIONS.pdf | 19.5 MB | 210ad9f5b3e05eaab7d56930293a264d |
Additional details
Software
- Repository URL
- https://www.kaggle.com/code/proprak01/prototype-fairness-main-image-data
- Programming language
- Python
- Development Status
- Active
References
- Zemel, R., Wu, Y., Swersky, K., Pitassi, T., & Dwork, C. (2013). Learning Fair Representations. ICML.
- Liu, Z., Luo, P., Wang, X., & Tang, X. (2015). Deep Learning Face Attributes in the Wild. ICCV.