OSCaR: Orthogonal Subspace Correction and Rectification of Biases in Word Embeddings

Abstract

Language representations are known to carry certain associations (e.g., gendered connotations) that may lead to invalid and harmful predictions in downstream tasks. While existing methods are effective at mitigating such unwanted associations by linear projection, we argue that they are too aggressive: not only do they remove such associations, they also erase information that should be retained. To address this issue, we propose OSCaR (Orthogonal Subspace Correction and Rectification), a balanced approach to mitigation that focuses on disentangling associations between concepts that are deemed problematic, instead of removing concepts wholesale. We develop new measurements for evaluating information retention relevant to the debiasing goal. Our experiments on gender-occupation associations show that OSCaR is a well-balanced approach: semantic information is retained in the embeddings while unwanted associations are effectively mitigated.
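To make the "too aggressive" argument concrete, here is a minimal sketch of the linear-projection baseline the abstract refers to. It is not the OSCaR method itself. The 3-dimensional vectors, the word list, and the helper names (`normalize`, `dot`, `project_out`) are assumptions invented for illustration, not taken from the paper.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project_out(v, direction):
    """Remove the component of v along a unit-length direction."""
    scale = dot(v, direction)
    return [x - scale * d for x, d in zip(v, direction)]

# Toy vectors (hypothetical): the first axis plays the role of a gender direction.
gender_dir = normalize([1.0, 0.0, 0.0])
embeddings = {
    "engineer": [0.4, 0.9, 0.1],     # occupation with an unwanted gender component
    "grandfather": [0.8, 0.1, 0.6],  # word whose gender component is legitimate
}

# Projection is applied uniformly, so it cannot tell the two cases apart.
debiased = {w: project_out(v, gender_dir) for w, v in embeddings.items()}
for word, vec in debiased.items():
    print(word, round(dot(vec, gender_dir), 6))
```

In this toy setup the gender component of both words is driven to zero, so the information that "grandfather" is legitimately gendered is lost along with the unwanted association in "engineer". That loss is the kind of over-removal the abstract argues against.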
