IJAIEM
International Journal of Application or Innovation in Engineering and Management
ISSN: 2319-4847

Quantifying Bias in Word Embeddings: A Comparative Study of Mitigation Techniques

Christopher Kruegel

Abstract

Word embeddings such as Word2Vec and GloVe have significantly improved natural language processing tasks by capturing semantic relationships between words. However, they also encode societal biases present in the training data, particularly regarding gender, race, and profession. This paper quantifies such biases using established benchmarks like the Word Embedding Association Test (WEAT) and evaluates three mitigation techniques: Hard Debiasing, Gender-Neutral Word Embeddings (GN-GloVe), and Projection Removal. We apply these methods to embeddings pre-trained on the Google News and Wikipedia+Gigaword corpora. Bias reduction is measured alongside downstream task performance on analogy completion, sentiment analysis, and named entity recognition (NER). Results show that Hard Debiasing effectively reduces WEAT scores by over 80%, but sometimes degrades performance on syntactic tasks. GN-GloVe maintains competitive task performance while achieving moderate bias reduction. P
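As a brief illustration of the measurement and one of the mitigation steps described above, the sketch below computes a WEAT effect size and applies a simple projection-removal debiasing. It is a minimal sketch under stated assumptions, not the paper's implementation: the embedding dictionary, the seed pair ("he", "she"), and all function names are illustrative choices.

```python
# Illustrative sketch: WEAT effect size and projection-removal debiasing.
# Assumes `emb` is a dict mapping words to NumPy vectors (e.g. loaded from
# pre-trained Word2Vec or GloVe files); names and seed pairs are hypothetical.
import numpy as np

def cosine(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def association(w, A, B, emb):
    # s(w, A, B): mean cosine similarity to attribute set A minus attribute set B
    return (np.mean([cosine(emb[w], emb[a]) for a in A])
            - np.mean([cosine(emb[w], emb[b]) for b in B]))

def weat_effect_size(X, Y, A, B, emb):
    # Effect size d = (mean_x s(x,A,B) - mean_y s(y,A,B)) / std_{w in X∪Y} s(w,A,B)
    s_X = [association(x, A, B, emb) for x in X]
    s_Y = [association(y, A, B, emb) for y in Y]
    return (np.mean(s_X) - np.mean(s_Y)) / np.std(s_X + s_Y, ddof=1)

def remove_projection(emb, seed_pairs=(("he", "she"),)):
    # Projection removal: estimate a bias direction from seed word pairs and
    # subtract each vector's component along that direction.
    g = np.mean([emb[a] - emb[b] for a, b in seed_pairs], axis=0)
    g /= np.linalg.norm(g)
    return {w: v - np.dot(v, g) * g for w, v in emb.items()}
```

A typical use would be to compute the WEAT effect size before and after calling the debiasing routine on the same embedding dictionary, which mirrors how the bias-reduction percentages reported above are obtained.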
