From Parity to Preference-based Notions of Fairness in Classification

TLDR

Automated decision systems increasingly raise fairness concerns, prompting studies on defining, detecting, and mitigating unfairness. Parity-based notions of fairness, however, are often so strict that they limit decision accuracy. This paper proposes preference-based fairness notions, inspired by the fair-division and envy-freeness literature, under which each group collectively prefers its own treatment or outcomes, regardless of parity with other groups. The authors introduce tractable proxies for designing margin-based classifiers that satisfy these preference-based notions. Experiments on synthetic and real-world datasets show that preference-based fairness allows for greater decision accuracy than parity-based fairness.

Abstract

The adoption of automated, data-driven decision making in an ever-expanding range of applications has raised concerns about its potential unfairness towards certain social groups. In this context, a number of recent studies have focused on defining, detecting, and removing unfairness from data-driven decision systems. However, the existing notions of fairness, based on parity (equality) in treatment or outcomes for different social groups, tend to be quite stringent, limiting the overall decision-making accuracy. In this paper, we draw inspiration from the fair-division and envy-freeness literature in economics and game theory and propose preference-based notions of fairness: given the choice between various sets of decision treatments or outcomes, any group of users would collectively prefer its own treatment or outcomes, regardless of the (dis)parity as compared to the other groups. We then introduce tractable proxies to design margin-based classifiers that satisfy these preference-based notions of fairness. Finally, we experiment with a variety of synthetic and real-world datasets and show that preference-based fairness allows for greater decision accuracy than parity-based fairness.
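
To make the preference-based idea concrete, the following minimal sketch checks an envy-freeness style condition for group-specific linear classifiers: no group would receive a larger benefit under another group's classifier than under its own. It rests on assumptions not stated in the abstract: benefit is measured as the fraction of a group's users who receive the positive decision, and the function names, group labels, weights, and toy data are all hypothetical. It only verifies the condition for given classifiers; it does not implement the tractable training proxies mentioned above.

```python
import numpy as np


def group_benefit(X, w):
    """Fraction of rows in X that the linear classifier w labels positive.

    Assumption: "benefit" means the share of positive decisions in the group.
    """
    return float(np.mean(X @ w > 0))


def satisfies_preferred_treatment(X_by_group, w_by_group):
    """Return True if no group gets a strictly larger benefit under another
    group's classifier than under its own (an envy-freeness style check)."""
    for z, X in X_by_group.items():
        own = group_benefit(X, w_by_group[z])
        for other, w in w_by_group.items():
            if other != z and group_benefit(X, w) > own:
                return False
    return True


# Hypothetical toy data: two groups, features = [bias term, one attribute].
X_by_group = {
    "a": np.array([[1.0, 2.0], [1.0, -1.0], [1.0, 0.5]]),
    "b": np.array([[1.0, -2.0], [1.0, 1.0], [1.0, -0.5]]),
}

# Hypothetical group-specific linear classifiers (weight vectors).
w_by_group = {
    "a": np.array([0.0, 1.0]),
    "b": np.array([0.0, -1.0]),
}

print(satisfies_preferred_treatment(X_by_group, w_by_group))
```

In this toy setup the two classifiers treat the groups differently, so treatment parity does not hold, yet each group fares at least as well under its own classifier as under the other's. Swapping the two weight vectors makes the check fail.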