Concepedia

TLDR

The notion of discrimination is oblivious: it is defined solely by the joint statistics of the predictor, target, and protected attribute, independent of how individual features are interpreted. The authors propose a criterion for detecting and eliminating discrimination against a specified sensitive attribute in supervised learning, and analyze the inherent limits of such oblivious bias tests. They show that, given data on the predictor, target, and protected group membership, any learned predictor can be optimally adjusted to remove discrimination under their criterion, and they outline how to implement this adjustment. The framework shifts the cost of misclassification from disadvantaged groups to the decision maker, creating an incentive to improve classification accuracy, and is illustrated with a FICO credit-score case study.

Abstract

We propose a criterion for discrimination against a specified sensitive attribute in supervised learning, where the goal is to predict some target based on available features. Assuming data about the predictor, target, and membership in the protected group are available, we show how to optimally adjust any learned predictor so as to remove discrimination according to our definition. Our framework also improves incentives by shifting the cost of poor classification from disadvantaged groups to the decision maker, who can respond by improving the classification accuracy. In line with other studies, our notion is oblivious: it depends only on the joint statistics of the predictor, the target and the protected attribute, but not on interpretation of individual features. We study the inherent limits of defining and identifying biases based on such oblivious measures, outlining what can and cannot be inferred from different oblivious tests. We illustrate our notion using a case study of FICO credit scores.
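Because the criterion is oblivious, it can be evaluated from group-conditional statistics alone, without inspecting any features. A minimal sketch of such a test for a binary predictor, comparing true- and false-positive rates across two groups (the published criterion is equalized odds; the function name and 0/1 encoding here are illustrative):

```python
import numpy as np

def equalized_odds_gap(y_hat, y, a):
    """Largest gap in group-conditional positive rates.

    An oblivious test: it uses only the joint statistics of the
    predictor y_hat, the target y, and the protected attribute a
    (all binary 0/1 arrays). A gap of 0 means the predictor has
    equal true-positive and false-positive rates across groups.
    """
    y_hat, y, a = map(np.asarray, (y_hat, y, a))
    gaps = []
    for y_val in (0, 1):            # condition on the true target
        rates = [y_hat[(y == y_val) & (a == g)].mean()  # P(Yhat=1 | Y=y_val, A=g)
                 for g in (0, 1)]
        gaps.append(abs(rates[0] - rates[1]))
    return max(gaps)
```

A perfect predictor (y_hat equal to y) trivially passes this test with a gap of zero; a predictor whose error rates differ between groups shows a positive gap, which the paper's post-processing step is designed to remove.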