Queens are Powerful too: Mitigating Gender Bias in Dialogue Generation

TLDR

Social biases present in data are often directly reflected in the predictions of models trained on that data. We analyze gender bias in dialogue data and examine how this bias is not only replicated but also amplified in subsequent generative chit-chat dialogue models. We measure bias in six dialogue datasets, select the most biased one, LIGHT, and apply counterfactual data augmentation, targeted data collection, and bias-controlled training to mitigate it, evaluating with quantitative metrics and human assessments. Our techniques mitigate gender bias by balancing the genderedness of generated dialogue utterances, are particularly effective in combination, and produce less gendered but equally engaging chit-chat responses.

Abstract

Social biases present in data are often directly reflected in the predictions of models trained on that data. We analyze gender bias in dialogue data and examine how this bias is not only replicated but also amplified in subsequent generative chit-chat dialogue models. We measure gender bias in six existing dialogue datasets before selecting the most biased one, the multi-player text-based fantasy adventure dataset LIGHT, as a testbed for bias mitigation techniques. We consider three techniques to mitigate gender bias: counterfactual data augmentation, targeted data collection, and bias-controlled training. We show that our proposed techniques mitigate gender bias by balancing the genderedness of generated dialogue utterances, and find that they are particularly effective in combination. We evaluate model performance with a variety of methods, including the quantity of gendered words, a dialogue safety classifier, and human assessments, all of which show that our models generate less gendered but equally engaging chit-chat responses.
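As a rough illustration of two ideas named above, counterfactual data augmentation and counting gendered words, the following minimal sketch swaps gendered words in an utterance and tallies how many tokens are gendered. The word-pair list, function names, and tokenization are assumptions made for this example and are not the lists or code used in the paper; ambiguous pronouns such as "her" (which can map to "him" or "his") are deliberately left out to keep the swap one-to-one.

```python
import re

# Hypothetical, deliberately tiny list of (masculine, feminine) word pairs.
# A real study would use a much larger curated list.
PAIRS = [
    ("king", "queen"),
    ("he", "she"),
    ("man", "woman"),
    ("prince", "princess"),
    ("father", "mother"),
]

# Build a symmetric swap table and the two gendered vocabularies.
SWAP = {}
for masc, fem in PAIRS:
    SWAP[masc] = fem
    SWAP[fem] = masc
MALE = {masc for masc, _ in PAIRS}
FEMALE = {fem for _, fem in PAIRS}

TOKEN = re.compile(r"[A-Za-z]+")


def counterfactual(utterance):
    """Return a copy of the utterance with each listed gendered word swapped."""
    def swap(match):
        word = match.group(0)
        replacement = SWAP.get(word.lower())
        if replacement is None:
            return word
        # Preserve an initial capital letter (e.g. "King" -> "Queen").
        return replacement.capitalize() if word[0].isupper() else replacement
    return TOKEN.sub(swap, utterance)


def augment(utterances):
    """Original utterances followed by their counterfactual copies."""
    return list(utterances) + [counterfactual(u) for u in utterances]


def gendered_counts(utterances):
    """Count masculine, feminine, and total word tokens."""
    male = female = total = 0
    for utterance in utterances:
        for token in TOKEN.findall(utterance.lower()):
            total += 1
            male += token in MALE
            female += token in FEMALE
    return {"male": male, "female": female, "total": total}


data = ["The king said he was a powerful man."]
print(counterfactual(data[0]))
print(gendered_counts(data))
print(gendered_counts(augment(data)))
```

In this toy setting, augmenting the data with swapped copies equalizes the masculine and feminine word counts, which is the balancing effect the augmentation is meant to have; it does not address bias that is expressed without explicitly gendered words.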
