Achievable Fairness on Your Data With Utility Guarantees
Muhammad Faaiz Taufiq, Jean-Francois Ton, Yang Liu
arXiv:2402.17106v4
In machine learning fairness, training models that minimize disparity across
different sensitive groups often leads to diminished accuracy, a phenomenon
known as the fairness-accuracy trade-off. The severity of this trade-off
inherently depends on dataset characteristics such as imbalances or biases, and
therefore imposing a uniform fairness requirement across diverse
datasets remains questionable. To address this, we present a computationally
efficient approach to approximate the fairness-accuracy trade-off curve
tailored to individual datasets, backed by rigorous statistical guarantees. By
utilizing the You-Only-Train-Once (YOTO) framework, our approach mitigates the
computational burden of training multiple models when approximating the
trade-off curve. Crucially, we introduce a novel methodology for quantifying
uncertainty in our estimates, thereby providing practitioners with a robust
framework for auditing model fairness while avoiding false conclusions due to
estimation errors. Our experiments, spanning tabular (e.g., Adult), image
(CelebA), and language (Jigsaw) datasets, underscore that our approach not only
reliably quantifies the optimal achievable trade-offs across various data
modalities but also helps detect suboptimality in SOTA fairness methods.
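To make the YOTO idea above concrete, here is a minimal, hedged sketch of a single lambda-conditioned classifier that is trained once and then queried at many fairness weights to trace an approximate fairness-accuracy trade-off curve. Everything in it (the FiLM-style conditioning, the soft demographic-parity penalty, the synthetic data, and names such as CondNet and dp_gap) is an illustrative assumption, not the paper's actual implementation.

```python
# Hedged sketch of a YOTO-style, lambda-conditioned classifier (assumptions only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CondNet(nn.Module):
    """Binary classifier whose hidden layer is FiLM-modulated by the fairness
    weight lambda, so a single model covers a range of trade-off points."""
    def __init__(self, d_in, d_hid=64):
        super().__init__()
        self.body = nn.Linear(d_in, d_hid)
        self.film = nn.Linear(1, 2 * d_hid)   # maps lambda -> (scale, shift)
        self.head = nn.Linear(d_hid, 1)

    def forward(self, x, lam):
        h = F.relu(self.body(x))
        scale, shift = self.film(lam.log1p().unsqueeze(-1)).chunk(2, dim=-1)
        return self.head(h * (1 + scale) + shift).squeeze(-1)

def dp_gap(scores, group):
    """Soft demographic-parity gap: difference in mean predicted positive
    rate between the two sensitive groups."""
    p = torch.sigmoid(scores)
    return (p[group == 1].mean() - p[group == 0].mean()).abs()

# Toy synthetic data (assumption: binary label y, binary sensitive attribute a).
torch.manual_seed(0)
n, d = 4000, 10
x = torch.randn(n, d)
a = (torch.rand(n) < 0.3).long()
y = ((x[:, 0] + 0.8 * a - 0.2 * torch.randn(n)) > 0).float()

model = CondNet(d)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(2000):
    idx = torch.randint(0, n, (256,))
    # Sample a fairness weight per batch so the model learns the whole curve.
    lam = 10 ** (torch.rand(1) * 2 - 1)               # log-uniform in [0.1, 10]
    scores = model(x[idx], lam.expand(256))
    loss = F.binary_cross_entropy_with_logits(scores, y[idx]) \
           + lam.item() * dp_gap(scores, a[idx])
    opt.zero_grad(); loss.backward(); opt.step()

# At evaluation, sweep lambda through the single trained model to trace an
# (accuracy, fairness-violation) trade-off curve without any retraining.
with torch.no_grad():
    for lam_val in [0.1, 0.5, 1.0, 2.0, 5.0, 10.0]:
        lam = torch.full((n,), lam_val)
        scores = model(x, lam)
        acc = ((scores > 0).float() == y).float().mean().item()
        gap = dp_gap(scores, a).item()
        print(f"lambda={lam_val:5.1f}  acc={acc:.3f}  DP gap={gap:.3f}")
```

The design point this illustrates is that lambda is swept only at evaluation time, so the per-dataset trade-off curve comes from one training run rather than one model per fairness level.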
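The uncertainty quantification mentioned in the abstract can likewise be illustrated only as a hedged sketch: attach finite-sample confidence intervals to the estimated accuracy and fairness gap before comparing a candidate model against the estimated curve. A basic Hoeffding bound is used here purely as a stand-in; the paper's own construction may differ.

```python
# Hedged sketch: finite-sample confidence intervals for audit estimates
# (a stand-in bound, not the paper's actual methodology).
import math

def hoeffding_interval(estimate, n_samples, delta=0.05):
    """Two-sided (1 - delta) confidence interval for a mean of
    [0, 1]-bounded terms, via Hoeffding's inequality."""
    eps = math.sqrt(math.log(2 / delta) / (2 * n_samples))
    return max(0.0, estimate - eps), min(1.0, estimate + eps)

# Example: intervals around a candidate model's estimated accuracy and
# demographic-parity gap on a held-out audit set of 4000 examples.
# (Treating the gap as a single bounded mean is a simplification; it is
# really a difference of two group-wise means.)
acc_lo, acc_hi = hoeffding_interval(0.87, n_samples=4000)
gap_lo, gap_hi = hoeffding_interval(0.04, n_samples=4000)
print(f"accuracy in [{acc_lo:.3f}, {acc_hi:.3f}], DP gap in [{gap_lo:.3f}, {gap_hi:.3f}]")
```

If the interval around a candidate model's operating point lies strictly inside the region dominated by the estimated trade-off curve, a practitioner can flag it as suboptimal without the conclusion resting on estimation noise.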