Publication | Closed Access

Discrimination-aware data mining

Citations: 672
References: 13
Year: 2008

TLDR

In civil rights law, discrimination refers to unfair treatment based on group membership, and rules derived by data mining can perpetuate such bias when used for decisions. The paper introduces and studies the concept of discriminatory classification rules. The authors formulate the redlining problem precisely and use background knowledge to relate discriminatory rules to apparently safe ones. They show that guaranteeing non-discrimination is non-trivial and that simply removing discriminatory attributes is insufficient, and they assess their results empirically on the German credit dataset.

Abstract

In the context of civil rights law, discrimination refers to unfair or unequal treatment of people based on membership in a category or a minority, without regard to individual merit. Rules extracted from databases by data mining techniques, such as classification or association rules, can be discriminatory in this sense when used for decision tasks such as benefit or credit approval. In this paper, the notion of discriminatory classification rules is introduced and studied. Providing a guarantee of non-discrimination is shown to be a non-trivial task. A naive approach, such as removing all discriminatory attributes, is shown to be insufficient when other background knowledge is available. Our approach leads to a precise formulation of the redlining problem, along with a formal result that relates discriminatory rules to apparently safe ones by means of background knowledge. An empirical assessment of the results on the German credit dataset is also provided.
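The abstract's point that removing discriminatory attributes is not enough can be illustrated with a small sketch. The code below is not taken from the paper: the toy records, the attribute names (`group`, `zip`, `decision`), and the helper functions `confidence` and `confidence_ratio` are all hypothetical, and the measure used (the confidence of a rule divided by the confidence of the same rule without the extra premise) is only one plausible way to quantify how much an attribute raises the chance of a negative decision.

```python
from typing import Dict, List

Record = Dict[str, str]

# Hypothetical toy data, invented purely for illustration.
TOY_RECORDS: List[Record] = [
    {"group": "b", "zip": "z1", "decision": "deny"},
    {"group": "b", "zip": "z1", "decision": "deny"},
    {"group": "b", "zip": "z1", "decision": "deny"},
    {"group": "a", "zip": "z1", "decision": "grant"},
    {"group": "a", "zip": "z2", "decision": "grant"},
    {"group": "a", "zip": "z2", "decision": "grant"},
    {"group": "a", "zip": "z2", "decision": "deny"},
    {"group": "b", "zip": "z2", "decision": "grant"},
]


def matches(record: Record, items: Dict[str, str]) -> bool:
    """True if the record has every attribute=value pair in items."""
    return all(record.get(key) == value for key, value in items.items())


def confidence(records: List[Record], premise: Dict[str, str],
               conclusion: Dict[str, str]) -> float:
    """Fraction of records satisfying the premise that also satisfy the conclusion."""
    covered = [r for r in records if matches(r, premise)]
    if not covered:
        raise ValueError("no record satisfies the premise")
    return sum(matches(r, conclusion) for r in covered) / len(covered)


def confidence_ratio(records: List[Record], extra: Dict[str, str],
                     context: Dict[str, str], conclusion: Dict[str, str]) -> float:
    """conf(extra, context -> conclusion) / conf(context -> conclusion)."""
    base = confidence(records, context, conclusion)
    if base == 0.0:
        raise ValueError("the conclusion never holds in this context")
    return confidence(records, {**extra, **context}, conclusion) / base


if __name__ == "__main__":
    deny = {"decision": "deny"}
    # Rule that uses the sensitive attribute directly.
    print(confidence_ratio(TOY_RECORDS, {"group": "b"}, {}, deny))
    # Rule that uses only an apparently neutral attribute.
    print(confidence_ratio(TOY_RECORDS, {"zip": "z1"}, {}, deny))
    # Background knowledge linking the neutral attribute to the group.
    print(confidence(TOY_RECORDS, {"zip": "z1"}, {"group": "b"}))
```

In this toy data the rule on `zip` raises the denial rate exactly as much as the rule on `group`, and `zip=z1` is strongly associated with `group=b`, so dropping the `group` column would leave the same disparity in place through the proxy attribute.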
