DOI: 10.1145/3411763.3451587

WordBias: An Interactive Visual Tool for Discovering Intersectional Biases Encoded in Word Embeddings

Published: 08 May 2021

Abstract

Intersectional bias is bias arising from the overlap of multiple social factors such as gender, sexuality, race, disability, and religion. A recent study has shown that word embedding models can be laden with biases against intersectional groups such as African American females. The first step towards tackling intersectional biases is to identify them; however, discovering biases against different intersectional groups remains a challenging task. In this work, we present WordBias, an interactive visual tool designed to explore biases against intersectional groups encoded in static word embeddings. Given a pretrained static word embedding, WordBias computes the association of each word along different groups (e.g., race, age) and then visualizes these associations using a novel interactive interface. Using a case study, we demonstrate how WordBias can help uncover biases against intersectional groups such as Black Muslim Males and Poor Females encoded in a word embedding. In addition, we evaluate our tool using qualitative feedback from expert interviews. The source code for this tool is publicly available for reproducibility at github.com/bhavyaghai/WordBias.
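The abstract says WordBias "computes the association of each word along different groups." A common way to score such associations (used, for example, in Bolukbasi-style bias-direction work) is to project a word vector onto the difference of the mean vectors of two pole word sets. The sketch below illustrates that idea with tiny hypothetical 3-d vectors; it is an illustrative approximation, not necessarily WordBias's exact computation, and the `emb` table and word choices are invented for the example.

```python
import numpy as np

# Hypothetical toy embedding table; a real run would load pretrained
# GloVe or word2vec vectors instead of these made-up 3-d vectors.
emb = {
    "doctor": np.array([0.9, 0.1, 0.3]),
    "nurse":  np.array([0.2, 0.8, 0.4]),
    "he":     np.array([1.0, 0.0, 0.2]),
    "she":    np.array([0.0, 1.0, 0.2]),
}

def bias_score(word, pole_a, pole_b):
    """Signed association of `word` with pole A versus pole B:
    cosine similarity between the word vector and the difference
    of the two pole-set mean vectors. Positive leans toward A."""
    a = np.mean([emb[w] for w in pole_a], axis=0)
    b = np.mean([emb[w] for w in pole_b], axis=0)
    direction = a - b
    v = emb[word]
    return float(v @ direction / (np.linalg.norm(v) * np.linalg.norm(direction)))

# With these toy vectors, "nurse" leans toward the "she" pole and
# "doctor" toward the "he" pole.
print(bias_score("nurse", ["she"], ["he"]))
print(bias_score("doctor", ["she"], ["he"]))
```

Scores computed this way for several social axes (gender, race, religion, ...) can then be placed side by side, which is essentially what a parallel-coordinates view over multiple group axes visualizes.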

Supplemental Material

ZIP file




Published In

CHI EA '21: Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, May 2021, 2965 pages. ISBN: 9781450380959. DOI: 10.1145/3411763.

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. Algorithmic Fairness
  2. Visual Analytics
  3. Word Embeddings

Qualifiers

  • Poster
  • Research
  • Refereed limited

Conference

CHI '21

Acceptance Rates

Overall acceptance rate: 6,164 of 23,696 submissions, 26%


Cited By

  • (2024) Intersectionality of Disabled People through a Disability Studies, Ability-Based Studies, and Intersectional Pedagogy Lens: A Survey and a Scoping Review. Societies 14(9), 176. DOI: 10.3390/soc14090176. Online: 7 Sep 2024.
  • (2024) Intersectional Male-Centric and White-Centric Biases in Collective Concepts. Personality and Social Psychology Bulletin. DOI: 10.1177/01461672241232114. Online: 13 Apr 2024.
  • (2024) VERB: Visualizing and Interpreting Bias Mitigation Techniques Geometrically for Word Representations. ACM Transactions on Interactive Intelligent Systems 14(1), 1–34. DOI: 10.1145/3604433. Online: 9 Jan 2024.
  • (2024) Exploring Visualization for Fairness in AI Education. 2024 IEEE 17th Pacific Visualization Conference (PacificVis), 1–10. DOI: 10.1109/PacificVis60374.2024.00010. Online: 23 Apr 2024.
  • (2024) Language Research in Social Personality Psychology. Handbook of Research Methods in Social and Personality Psychology, 322–348. DOI: 10.1017/9781009170123.015. Online: 12 Dec 2024.
  • (2024) The Usage of Voice in Sexualized Interactions with Technologies and Sexual Health Communication: An Overview. Current Sexual Health Reports 16(2), 47–57. DOI: 10.1007/s11930-024-00383-4. Online: 27 Mar 2024.
  • (2023) A survey on intersectional fairness in machine learning. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 6619–6627. DOI: 10.24963/ijcai.2023/742. Online: 19 Aug 2023.
  • (2023) Ethical assessments and mitigation strategies for biases in AI-systems used during the COVID-19 pandemic. Big Data & Society 10(1). DOI: 10.1177/20539517231179199. Online: 12 Jun 2023.
  • (2023) PORDE: Explaining Data Poisoning Attacks Through Visual Analytics with Food Delivery App Reviews. Companion Proceedings of the 28th International Conference on Intelligent User Interfaces, 46–50. DOI: 10.1145/3581754.3584128. Online: 27 Mar 2023.
  • (2023) Transcending the "Male Code": Implicit Masculine Biases in NLP Contexts. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1–19. DOI: 10.1145/3544548.3581017. Online: 19 Apr 2023.
