Neutralizing Gender Bias in Word Embeddings with Latent Disentanglement and Counterfactual Generation
- Category
- Machine Learning
- Conference Name
- Findings of EMNLP (Findings@EMNLP 2020)
- Presentation Date
- Nov 16-20
- City
- Virtual Conference
- File
2297_Paper.pdf (4.6M), 40 downloads, DATE: 2023-11-10 00:20:02
Seungjae Shin, Kyungwoo Song, JoonHo Jang, Hyemi Kim, Weonyoung Joo, and Il-Chul Moon, Neutralizing Gender Bias in Word Embeddings with Latent Disentanglement and Counterfactual Generation, Findings of EMNLP (Findings@EMNLP 2020), Virtual Conference, Nov 16-20, 2020
Abstract
Recent research demonstrates that word embeddings trained on human-generated corpora have strong gender biases in their embedding spaces, and that these biases can lead to prejudiced results in downstream tasks, e.g., sentiment analysis. Whereas previous debiasing models project word embeddings into a linear subspace, we introduce a Latent Disentangling model with a siamese auto-encoder structure and a gradient reversal layer. Our siamese auto-encoder utilizes gender word pairs to disentangle the semantics and gender information of a given word, and the associated gradient reversal layer provides the negative gradient to distinguish the semantics from the gender. Afterwards, we introduce a Counterfactual Generation model to modify the gender information of words, so that the original and the modified embeddings can produce a gender-neutralized word embedding after geometric alignment without loss of semantic information. Experimental results quantitatively and qualitatively indicate that the introduced method is better at debiasing word embeddings and at minimizing semantic information loss for NLP downstream tasks.
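As a rough illustration of the gradient reversal layer mentioned in the abstract, the sketch below assumes the common formulation of such a layer: the forward pass is the identity, and the backward pass multiplies the incoming gradient by a negative factor. The class name `GradientReversal` and the scaling parameter `lam` are hypothetical and are not taken from the paper, whose actual implementation may differ.

```python
# Minimal, dependency-free sketch of a gradient reversal layer.
# Assumption: identity forward pass, gradient scaled by -lam on the way back.
# GradientReversal and lam are illustrative names, not the paper's identifiers.

class GradientReversal:
    """Identity in the forward pass; reverses gradients in the backward pass."""

    def __init__(self, lam=1.0):
        self.lam = lam  # strength of the reversal (hypothetical parameter)

    def forward(self, x):
        # The latent vector passes through unchanged.
        return list(x)

    def backward(self, grad_output):
        # Flip the sign of the gradient, so the upstream encoder is updated
        # to increase the loss of whatever classifier sits after this layer.
        return [-self.lam * g for g in grad_output]


grl = GradientReversal(lam=0.5)
latent = [0.2, -1.0, 3.0]
out = grl.forward(latent)                  # same values as latent
grad_in = grl.backward([1.0, 2.0, -4.0])   # sign-flipped, scaled by 0.5
```

In a setup of this kind, a classifier placed after the layer is trained normally, while the encoder before it receives the reversed gradient and is thereby discouraged from encoding the attribute the classifier predicts.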
@inproceedings{shin-etal-2020-neutralizing,
title = "Neutralizing Gender Bias in Word Embeddings with Latent Disentanglement and Counterfactual Generation",
author = "Shin, Seungjae and Song, Kyungwoo and Jang, JoonHo and Kim, Hyemi and Joo, Weonyoung and Moon, Il-Chul",
editor = "Cohn, Trevor and He, Yulan and Liu, Yang",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
month = nov,
year = "2020",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2020.findings-emnlp.280",
doi = "10.18653/v1/2020.findings-emnlp.280",
pages = "3126--3140"
}
Source Website:
https://aclanthology.org/2020.findings-emnlp.280/